I am trying to load XML data into Hive, but I am getting an error:
java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"xmldata":""}
The XML file I am using is:
<?xml version="1.0" encoding="UTF-8"?>
<catalog>
  <book>
    <id>11</id>
    <genre>Computer</genre>
    <price>44</price>
  </book>
  <book>
    <id>44</id>
    <genre>Fantasy</genre>
    <price>5</price>
  </book>
</catalog>
The Hive queries I am using are:
1) CREATE TABLE xmltable(xmldata string) STORED AS TEXTFILE;
   LOAD DATA LOCAL INPATH '/home/user/xmlfile.xml' OVERWRITE INTO TABLE xmltable;
2) CREATE VIEW xmlview (id, genre, price) AS
   SELECT xpath(xmldata, '/catalog[1]/book[1]/id'),
          xpath(xmldata, '/catalog[1]/book[1]/genre'),
          xpath(xmldata, '/catalog[1]/book[1]/price')
   FROM xmltable;
3) CREATE TABLE xmlfinal AS SELECT * FROM xmlview;
4) SELECT * FROM xmlfinal WHERE id = '11';
Everything works up to the second query, but the third query fails.
The error is as follows:
java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"xmldata":"<?xml version=\"1.0\" encoding=\"UTF-8\"?>"}
    at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:159)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:417)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
    at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"xmldata":"<?xml version=\"1.0\" encoding=\"UTF-8\"?>"}
    at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:675)
    at org.apache.hadoop.hive.ql.exec
Failed: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
So what is going wrong? As far as I can tell, the XML file itself is valid.
Thanks,
Sri
Solution
Cause of the error:
1) Case 1 (your case): the XML content is fed into Hive line by line.
Input XML:
<?xml version="1.0" encoding="UTF-8"?>
<catalog>
  <book>
    <id>11</id>
    <genre>Computer</genre>
    <price>44</price>
  </book>
  <book>
    <id>44</id>
    <genre>Fantasy</genre>
    <price>5</price>
  </book>
</catalog>
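Check in Hive:
select count(*) from xmltable; -- returns 13: each line of the file became its own row in column xmldata
Why this fails:
With a TEXTFILE table, every line of the file is a separate row, so the XML is read as 13 disconnected fragments instead of one document. A single fragment such as the XML declaration is not valid XML on its own, which is why the row shown in the error message contains only the first line of the file.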
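To see the effect outside Hive, one can try parsing each line of the file on its own. The sketch below uses Python's standard xml.etree.ElementTree as a stand-in for Hive's XPath parser. That is an assumption: the two parsers differ in detail, but both need a well-formed document. The helper name is_well_formed is made up for illustration.

```python
import xml.etree.ElementTree as ET

# The file from the question, one tag per line (13 lines in total).
MULTILINE_XML = """<?xml version="1.0" encoding="UTF-8"?>
<catalog>
<book>
<id>11</id>
<genre>Computer</genre>
<price>44</price>
</book>
<book>
<id>44</id>
<genre>Fantasy</genre>
<price>5</price>
</book>
</catalog>"""


def is_well_formed(text):
    """Return True if text parses as a complete XML document.

    Hypothetical helper; encodes to UTF-8 because the sample declares UTF-8.
    """
    try:
        ET.fromstring(text.encode("utf-8"))
        return True
    except ET.ParseError:
        return False


# Mimic what a TEXTFILE table does: one row per line of the file.
rows = MULTILINE_XML.splitlines()
broken = [row for row in rows if not is_well_formed(row)]

print("whole file well-formed:", is_well_formed(MULTILINE_XML))
print(len(rows), "rows,", len(broken), "of them do not parse on their own")
print("first broken row:", broken[0])
```

The first broken row is the XML declaration, the same row reported in the Hive error. The few lines that do parse alone, such as <id>11</id>, are complete elements but contain no /catalog node, so the XPath expressions from the question would not match them either.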
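2) Case 2: the XML content is loaded as a single string, and then the XPath UDFs work.
Quoting the syntax from the Hive XPath UDF documentation: "All functions follow the form: xpath_*(xml_string, xpath_expression_string)."
input.xml: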
<?xml version="1.0" encoding="UTF-8"?><catalog><book><id>11</id><genre>Computer</genre><price>44</price></book><book><id>44</id><genre>Fantasy</genre><price>5</price></book></catalog>
Check in Hive:
select count(*) from xmltable; -- returns 1: the whole document was read as one complete XML string
This means:
xmldata = <?xml version="1.0" encoding="UTF-8"?><catalog><book> ...... </catalog>
Then apply your XPath UDF like this:
select xpath(xmldata, 'xpath_expression_string') from xmltable;
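One possible way to get from the multi-line file of case 1 to the single-line form of case 2 is to flatten the file before running LOAD DATA. The following is a minimal sketch, not the only approach. It assumes that every tag sits on its own line, as in the sample (an attribute list or text value broken across lines would be mangled by the per-line strip), and that the file is UTF-8 as declared. The function name flatten_xml is hypothetical.

```python
import xml.etree.ElementTree as ET

# Sample document from the question, indented and spread over 13 lines.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<catalog>
  <book>
    <id>11</id>
    <genre>Computer</genre>
    <price>44</price>
  </book>
  <book>
    <id>44</id>
    <genre>Fantasy</genre>
    <price>5</price>
  </book>
</catalog>
"""


def flatten_xml(text):
    """Join a multi-line XML document into one line (hypothetical helper).

    Leading and trailing whitespace of every line is dropped, which is only
    safe when no tag or text value spans more than one line.
    """
    one_line = "".join(line.strip() for line in text.splitlines())
    # Sanity check: raises xml.etree.ElementTree.ParseError if the
    # flattening produced something that is no longer well-formed.
    ET.fromstring(one_line.encode("utf-8"))
    return one_line


flat = flatten_xml(SAMPLE)
print(flat)
```

In practice you would read /home/user/xmlfile.xml, write the flattened text to a file, and load that file with the same LOAD DATA statement; select count(*) from xmltable should then return 1. If the xpath calls from the question still come back as empty arrays after that, the Hive XPath UDF documentation suggests selecting text nodes instead of elements, for example '/catalog[1]/book[1]/id/text()'.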