nutch-1.18 error java.lang.NoClassDefFoundError: org/apache/nutch/storage/WebPage$Field


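I get the same error with every variant of Nutch I have tried: the binary release, the source release, and a git checkout. When building from source, the following messages always appear: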

ant runtime

...
resolve-default:
[ivy:resolve] impossible to define new type: class not found: org.apache.ivy.plugins.resolver.SshResolver in [] nor Ivy classloader
[ivy:resolve] impossible to define new type: class not found: org.apache.ivy.plugins.signer.bouncycastle.OpenPGPSignatureGenerator in [] nor Ivy classloader
[ivy:resolve] impossible to define new type: class not found: org.apache.ivy.plugins.resolver.SFTPResolver in [] nor Ivy classloader
[ivy:resolve] impossible to define new type: class not found: org.apache.ivy.plugins.resolver.VfsResolver in [] nor Ivy classloader
[ivy:resolve] :: loading settings :: file = /home/user/Téléchargements/nutch/branch-1.18/ivy/ivysettings.xml
...
but 
BUILD SUCCESSFUL
Total time: 36 seconds

When I run: bin/nutch inject base/crawldb urls/

2021-05-07 23:00:18,306 WARN  mapred.LocalJobRunner - job_local829319691_0001
java.lang.Exception: java.lang.NoClassDefFoundError: org/apache/nutch/storage/WebPage$Field
    at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:492)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:552)
Caused by: java.lang.NoClassDefFoundError: org/apache/nutch/storage/WebPage$Field
    at org.apache.nutch.scoring.opic.OPICScoringFilter.<clinit>(OPICScoringFilter.java:59)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.nutch.plugin.Extension.getExtensionInstance(Extension.java:164)
    at org.apache.nutch.plugin.PluginRepository.getOrderedPlugins(PluginRepository.java:442)
    at org.apache.nutch.scoring.ScoringFilters.<init>(ScoringFilters.java:46)
    at org.apache.nutch.crawl.Injector$InjectMapper.setup(Injector.java:145)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:799)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:271)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ClassNotFoundException: org.apache.nutch.storage.WebPage$Field
    at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:355)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
    at org.apache.nutch.plugin.PluginClassLoader.loadClassFromSystem(PluginClassLoader.java:104)
    at org.apache.nutch.plugin.PluginClassLoader.loadClassFromParent(PluginClassLoader.java:92)
    at org.apache.nutch.plugin.PluginClassLoader.loadClass(PluginClassLoader.java:72)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
    ... 18 more
2021-05-07 23:00:19,027 INFO  mapreduce.Job - Job job_local829319691_0001 failed with state FAILED due to: NA
2021-05-07 23:00:19,490 INFO  mapreduce.Job - Counters: 17
    File System Counters
        FILE: Number of bytes read=3158671488622
        FILE: Number of bytes written=11186329660
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
    Map-Reduce Framework
        Map input records=0
        Map output records=0
        Map output bytes=0
        Map output materialized bytes=81536
        Input split bytes=982778
        Combine input records=0
        Spilled Records=0
        Failed Shuffles=0
        Merged Map outputs=0
        GC time elapsed (ms)=1515713
        Total committed heap usage (bytes)=20589437779968
    File Input Format Counters 
        Bytes Read=0
2021-05-07 23:00:19,491 ERROR crawl.Injector - Injector job did not succeed,job status: FAILED,reason: NA
2021-05-07 23:00:19,498 ERROR crawl.Injector - Injector: java.lang.RuntimeException: Injector job did not succeed,reason: NA
    at org.apache.nutch.crawl.Injector.inject(Injector.java:444)
    at org.apache.nutch.crawl.Injector.run(Injector.java:571)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
    at org.apache.nutch.crawl.Injector.main(Injector.java:535)

I am using the default settings, to rule out a configuration error as the cause. I get the same error with nutch-1.17 and nutch-1.16.

Do you have any idea that could help me?

Solution

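The class org.apache.nutch.storage.WebPage is not part of Nutch 1.18 (nor of 1.16 or 1.17); it belongs to Nutch 2.x, which is no longer maintained. This means that part of a Nutch 2 code base is on the Java classpath. More precisely, the OPIC scoring plugin being loaded is not the one from Nutch 1.18 but the one from Nutch 2.x. To fix this, make sure the Java classpath is "clean" and that the correct bin/nutch script is invoked (the one from Nutch 1.18, not another one on the PATH).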
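
One way to check for a stale plugin is to list every jar that ships the OPIC scoring filter class named in the stack trace. The following is a minimal sketch, not an official Nutch tool: the function name find_class_jars and the example path are hypothetical, and the approach assumes that jar (zip) entry names are stored uncompressed, so a fixed-string grep over the jar bytes can find them.

```shell
#!/bin/sh
# Print every *.jar under a directory whose entry table names the given class.
# Assumption: jar/zip entry names are stored uncompressed, so a plain
# fixed-string grep over the raw jar bytes is enough to spot them.
# find_class_jars is a hypothetical helper name, not part of Nutch.
find_class_jars() {
    dir=$1      # directory to scan, for example the Nutch runtime directory
    entry=$2    # class as a jar entry path, e.g. org/example/Foo.class
    # -l prints only the names of matching files; -F treats the entry
    # as a literal string rather than a regular expression.
    find "$dir" -type f -name '*.jar' -exec grep -lF -- "$entry" {} +
}
```

For example, find_class_jars /path/to/nutch/runtime/local org/apache/nutch/scoring/opic/OPICScoringFilter.class (the path is a placeholder) prints the jars that contain the filter. Any hit outside the Nutch 1.18 installation you intend to run is a candidate for the stale Nutch 2.x plugin. Separately, command -v nutch shows which script a bare nutch command resolves to on the PATH.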
