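How to fix the Nutch 1.18 error java.lang.NoClassDefFoundError: org/apache/nutch/storage/WebPage$Field
I get the same error with every Nutch distribution I have tried: the binary release, the source release, and a git checkout. When building from source, the following messages always appear during compilation:
ant runtime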
...
resolve-default:
[ivy:resolve] impossible to define new type: class not found: org.apache.ivy.plugins.resolver.SshResolver in [] nor Ivy classloader
[ivy:resolve] impossible to define new type: class not found: org.apache.ivy.plugins.signer.bouncycastle.OpenPGPSignatureGenerator in [] nor Ivy classloader
[ivy:resolve] impossible to define new type: class not found: org.apache.ivy.plugins.resolver.SFTPResolver in [] nor Ivy classloader
[ivy:resolve] impossible to define new type: class not found: org.apache.ivy.plugins.resolver.VfsResolver in [] nor Ivy classloader
[ivy:resolve] :: loading settings :: file = /home/user/Téléchargements/nutch/branch-1.18/ivy/ivysettings.xml
...
but
BUILD SUCCESSFUL
Total time: 36 seconds
When I run: bin/nutch inject base/crawldb urls/
2021-05-07 23:00:18,306 WARN mapred.LocalJobRunner - job_local829319691_0001
java.lang.Exception: java.lang.NoClassDefFoundError: org/apache/nutch/storage/WebPage$Field
at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:492)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:552)
Caused by: java.lang.NoClassDefFoundError: org/apache/nutch/storage/WebPage$Field
at org.apache.nutch.scoring.opic.OPICScoringFilter.<clinit>(OPICScoringFilter.java:59)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.nutch.plugin.Extension.getExtensionInstance(Extension.java:164)
at org.apache.nutch.plugin.PluginRepository.getOrderedPlugins(PluginRepository.java:442)
at org.apache.nutch.scoring.ScoringFilters.<init>(ScoringFilters.java:46)
at org.apache.nutch.crawl.Injector$InjectMapper.setup(Injector.java:145)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:799)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:271)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ClassNotFoundException: org.apache.nutch.storage.WebPage$Field
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:355)
at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
at org.apache.nutch.plugin.PluginClassLoader.loadClassFromSystem(PluginClassLoader.java:104)
at org.apache.nutch.plugin.PluginClassLoader.loadClassFromParent(PluginClassLoader.java:92)
at org.apache.nutch.plugin.PluginClassLoader.loadClass(PluginClassLoader.java:72)
at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
... 18 more
2021-05-07 23:00:19,027 INFO mapreduce.Job - Job job_local829319691_0001 failed with state FAILED due to: NA
2021-05-07 23:00:19,490 INFO mapreduce.Job - Counters: 17
File System Counters
FILE: Number of bytes read=3158671488622
FILE: Number of bytes written=11186329660
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
Map-Reduce Framework
Map input records=0
Map output records=0
Map output bytes=0
Map output materialized bytes=81536
Input split bytes=982778
Combine input records=0
Spilled Records=0
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=1515713
Total committed heap usage (bytes)=20589437779968
File Input Format Counters
Bytes Read=0
2021-05-07 23:00:19,491 ERROR crawl.Injector - Injector job did not succeed,job status: FAILED,reason: NA
2021-05-07 23:00:19,498 ERROR crawl.Injector - Injector: java.lang.RuntimeException: Injector job did not succeed,reason: NA
at org.apache.nutch.crawl.Injector.inject(Injector.java:444)
at org.apache.nutch.crawl.Injector.run(Injector.java:571)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.nutch.crawl.Injector.main(Injector.java:535)
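I kept the default settings to rule out configuration mistakes as the cause. I get the same error with nutch-1.17 and nutch-1.16.
Do you have any idea that could help me?
Solution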
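The class org.apache.nutch.storage.WebPage is not part of Nutch 1.18 (nor of 1.16 or 1.17); it belongs to Nutch 2.x, which is no longer maintained. This means that part of a Nutch 2 code base is on the Java classpath. Specifically, the OPIC scoring plugin being loaded is not the one from Nutch 1.18 but one that belongs to Nutch 2.x. To fix this, make sure the Java classpath is "clean" and that the correct bin/nutch script is invoked (the one from Nutch 1.18, not another one found on PATH).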
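One way to check whether the classpath is "clean" is to search the directories Java might load from for jars that contain the suspicious classes. The following is only a minimal sketch, not a Nutch tool: the class name JarClassFinder and its methods are hypothetical, and it assumes the stray Nutch 2.x code is packaged in .jar files under a directory you pass in.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarFile;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper (not part of Nutch): lists the jars that contain a given class. */
class JarClassFinder {

    /** Converts a binary class name such as "a.b.C$D" into the jar entry name "a/b/C$D.class". */
    static String toEntryName(String className) {
        return className.replace('.', '/') + ".class";
    }

    /** Walks root recursively and returns every .jar file that contains className. */
    static List<Path> findJarsContaining(Path root, String className) throws IOException {
        String entry = toEntryName(className);

        // Collect all jar files first so the directory stream is closed before jars are opened.
        List<Path> jars;
        try (Stream<Path> paths = Files.walk(root)) {
            jars = paths
                    .filter(p -> Files.isRegularFile(p) && p.toString().endsWith(".jar"))
                    .sorted()
                    .collect(Collectors.toList());
        }

        List<Path> hits = new ArrayList<>();
        for (Path jar : jars) {
            try (JarFile jf = new JarFile(jar.toFile())) {
                if (jf.getJarEntry(entry) != null) {
                    hits.add(jar);
                }
            } catch (IOException e) {
                // Unreadable or corrupt archive: report it and keep scanning.
                System.err.println("Skipping " + jar + ": " + e.getMessage());
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java JarClassFinder <directory> <class name>");
            return;
        }
        for (Path jar : findJarsContaining(Paths.get(args[0]), args[1])) {
            System.out.println(jar);
        }
    }
}
```

Run it against the Nutch installation and any other directory that could end up on the classpath. Searching for org.apache.nutch.scoring.opic.OPICScoringFilter and getting more than one hit, or a hit outside the Nutch 1.18 tree, would point to a stale plugin copy. Any hit for org.apache.nutch.storage.WebPage would indicate a Nutch 2.x jar. When passing a nested class name such as org.apache.nutch.storage.WebPage$Field on the command line, put it in single quotes so the shell does not expand $Field.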