How to resolve the Tika parser error: Problem with writing the data, class org.apache.cxf.jaxrs.ext.multipart.MultipartBody
After we upgraded to Java 11, our Tika parser 0.7 stopped working. The parser throws this error:
Failed to add extension to 000000001_0
java.lang.IllegalArgumentException: org.apache.tika.parser.Parser is not an ImageIO SPI class
at java.desktop/javax.imageio.spi.ServiceRegistry.checkClassAllowed(ServiceRegistry.java:722)
at java.desktop/javax.imageio.spi.ServiceRegistry.lookupProviders(ServiceRegistry.java:207)
The same code ran successfully on Java 8.
I upgraded Tika to 1.24.1 (Java 11 should be supported from 1.19 onward) and added all the required jar files, such as the cxf* jars:
127509 Aug 26 13:52 javax.ws.rs-api-2.1.jar
346445 Aug 26 14:23 cxf-rt-frontend-jaxws-3.4.0.jar
382008 Aug 26 14:24 cxf-rt-transports-http-3.4.0.jar
1414830 Aug 26 14:24 cxf-core-3.4.0.jar
696050 Aug 26 14:27 cxf-rt-frontend-jaxrs-3.4.0.jar
187625 Aug 26 14:30 cxf-rt-rs-client-3.4.0.jar
59386 Aug 26 14:32 cxf-rt-rs-service-description-3.4.0.jar
365552 Aug 26 19:57 commons-compress-1.8.1.jar
276413 Aug 27 10:29 commons-io-2.7.jar
53820 Aug 27 10:48 commons-cli-1.4.jar
284220 Aug 27 10:53 commons-lang-2.6.jar
219146 Aug 27 10:55 javax.mail-api-1.6.2.jar
47929542 Aug 27 11:59 tika-bundle-1.24.1.jar
708157 Aug 27 11:59 tika-core-1.24.1.jar
1336431 Aug 27 11:59 tika-parsers-1.24.1.jar
39881 Aug 27 12:00 tika-translate-1.24.1.jar
18069 Aug 27 12:01 tika-serialization-1.24.1.jar
35100 Aug 27 12:01 tika-xmp-1.24.1.jar
But now I get the following error instead:
Caused by: javax.ws.rs.ProcessingException: Problem with writing the data, class org.apache.cxf.jaxrs.ext.multipart.MultipartBody, ContentType: multipart/form-data
at org.apache.cxf.jaxrs.client.AbstractClient.reportMessageHandlerProblem(AbstractClient.java:846)
at org.apache.cxf.jaxrs.client.AbstractClient.writeBody(AbstractClient.java:529)
at org.apache.cxf.jaxrs.client.WebClient$BodyWriter.doWriteBody(WebClient.java:1223)
at org.apache.cxf.jaxrs.client.AbstractClient$AbstractBodyWriter.handleMessage(AbstractClient.java:1222)
at org.apache.cxf.phase.PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:308)
at org.apache.cxf.jaxrs.client.AbstractClient.doRunInterceptorChain(AbstractClient.java:703)
at org.apache.cxf.jaxrs.client.WebClient.doChainedInvocation(WebClient.java:1086)
at org.apache.cxf.jaxrs.client.WebClient.doInvoke(WebClient.java:932)
at org.apache.cxf.jaxrs.client.WebClient.doInvoke(WebClient.java:901)
at org.apache.cxf.jaxrs.client.WebClient.invoke(WebClient.java:364)
at org.apache.cxf.jaxrs.client.WebClient.post(WebClient.java:373)
at org.apache.tika.parser.journal.GrobidRESTParser.parse(GrobidRESTParser.java:81)
at org.apache.tika.parser.journal.JournalParser.parse(JournalParser.java:60)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
... 50 more
Caused by: java.net.ConnectException: Connection refused (Connection refused)
at java.base/java.net.PlainSocketImpl.socketConnect(Native Method)
at java.base/java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:399)
at java.base/java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:242
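The innermost cause in the trace above is a java.net.ConnectException raised while GrobidRESTParser posts to a REST endpoint, which suggests that nothing was listening at the address it tried to reach. A minimal sketch for checking whether a TCP endpoint accepts connections is shown here. It uses only the JDK; the class name PortProbe and the method accepts are hypothetical, and the host and port to probe are assumptions, because the trace does not show which address was used.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical helper, not part of Tika or CXF.
public class PortProbe {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    public static boolean accepts(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // "Connection refused" (ConnectException) and timeouts both end up here
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Demo against a throwaway local listener instead of a real service
        try (ServerSocket listener = new ServerSocket(0, 1, loopback)) {
            boolean open = accepts(loopback.getHostAddress(), listener.getLocalPort(), 1000);
            System.out.println("listener reachable: " + open);
        }
    }
}
```

Running the same check against the address the REST parser is configured to use would show whether the refusal comes from a missing service rather than from the parsing code itself.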
The code we use to parse online documents:
try {
    // -1 disables the handler's default write limit on extracted text
    ContentHandler handler = new BodyContentHandler(-1);
    Metadata metadata = new Metadata();
    // AutoDetectParser picks a concrete parser based on the detected media type
    Parser parser = new AutoDetectParser();
    parser.parse(in, handler, metadata, new ParseContext());
    return handler.toString();
} catch (IOException e) {
    logger.warn("Problem in getting the content of " + filePath);
    throw new RuntimeException("Problem in getting the content of " + filePath, e);
} catch (SAXException e) {
    logger.warn("Problem in parsing the content of " + filePath);
    throw new RuntimeException("Problem in parsing the content of " + filePath, e);
} catch (TikaException e) {
    logger.warn("Problem in parsing the content of " + filePath);
    throw new RuntimeException("Problem in parsing the content of " + filePath, e);
}
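Because each catch block wraps the original exception in a RuntimeException, the failure that matters (here the ConnectException) ends up several "Caused by" levels down. A small sketch of walking the cause chain to log the innermost exception follows. It uses only the JDK; the class name RootCause and the wrapped exception built in main are illustrative assumptions, not the actual exception chain Tika produces.

```java
import java.io.IOException;
import java.net.ConnectException;

// Hypothetical helper for illustration only.
public class RootCause {

    /** Follows getCause() links until the innermost throwable is reached. */
    public static Throwable of(Throwable t) {
        Throwable current = t;
        // Stop when there is no cause or the cause refers back to itself
        while (current.getCause() != null && current.getCause() != current) {
            current = current.getCause();
        }
        return current;
    }

    public static void main(String[] args) {
        // Made-up chain that mimics the shape of the trace: wrapper -> wrapper -> ConnectException
        Throwable wrapped = new RuntimeException(
                "Problem in getting the content of <filePath>",
                new IOException("wrapper", new ConnectException("Connection refused")));
        System.out.println(of(wrapped).getClass().getName()); // prints java.net.ConnectException
    }
}
```

Logging RootCause.of(e) next to the existing warning would put the underlying error in the log line itself instead of only at the bottom of the stack trace.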