XPathFactoryImpl cannot find the root node when the XML document declares a namespace and a NamespaceContext is set

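I am new to XML and the Saxon API, and I am using the Saxon 10.3 HE jar to extract data from an XML file. I want to extract the country value from the currently active country_information node, which is selected using date functions. Sample input XML: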

<person xmlns="urn:my.poctest.com">
    <country_information>
        <country>FRA</country>
        <end_date>9999-12-31</end_date>
        <start_date>2009-12-01</start_date>
    </country_information>
    <country_information>
        <country>FRA</country>
        <end_date>9999-12-31</end_date>
        <start_date>2009-12-01</start_date>
    </country_information>
</person>

Code:

import java.io.IOException;
import java.io.StringReader;
import java.util.Iterator;
import java.util.Map;

import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import javax.xml.xpath.XPathFactoryConfigurationException;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

import net.sf.saxon.xpath.XPathFactoryImpl;

public class SaxonPoc {

    public static void main(String[] args) throws SAXException, IOException, ParserConfigurationException, XPathExpressionException, XPathFactoryConfigurationException {
        String xml = " <person xmlns=\"urn:my.poctest.com\">\r\n"
                + "       <country_information>\r\n"
                + "          <country>FRA</country>\r\n"
                + "          <end_date>9999-12-31</end_date>\r\n"
                + "          <start_date>2020-02-24</start_date>\r\n"
                + "       </country_information>\r\n" 
                + "       <country_information>\r\n"
                + "          <country>USA</country>\r\n"
                + "          <end_date>2020-02-23</end_date>\r\n"
                + "          <start_date>2009-12-01</start_date>\r\n"
                + "       </country_information>             \r\n" 
                + "       </person>";
        Document doc = SaxonPoc.getDocument(xml, false);
        NodeList matches = (NodeList) SaxonPoc.getXpathExpression("//person", null).evaluate(doc, XPathConstants.NODESET);
        if (matches != null) {
            Element node = (Element) matches.item(0);
            XPath xPath1 = SaxonPoc.getXpath(null);
            String xPathStatement = "/person/country_information[xs:date(start_date) le current-date() and  xs:date(end_date) ge current-date()]/country";
            NodeList childNodes = (NodeList) xPath1.evaluate(xPathStatement, node, XPathConstants.NODESET);
            if (childNodes.getLength() > 0) {
                String nodeName = childNodes.item(0).getFirstChild().getNodeName();
                System.out.println("Node :" + nodeName);
                String value = childNodes.item(0).getTextContent();
                System.out.println("Country Name :" + value);
            }

        }
        System.out.println("Finished");

    }

    public static Document getDocument(String xml, boolean isNamespaceAware)
            throws SAXException, IOException, ParserConfigurationException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(isNamespaceAware);
        DocumentBuilder builder = factory.newDocumentBuilder();
        InputSource is = new InputSource(new StringReader(xml));
        return builder.parse(is);
    }

    public static XPath getXpath(Map<String, String> namespaceMappings) throws XPathFactoryConfigurationException {
        XPathFactory xpathFactory = new XPathFactoryImpl();
        XPath xpath = xpathFactory.newXPath();
        NamespaceContext nsc = new NamespaceContext() {

            @Override
            public String getNamespaceURI(String prefix) {
                return (null != namespaceMappings) ? namespaceMappings.get(prefix) : null;
            }

            @Override
            public String getPrefix(String namespaceURI) {
                return null;
            }

            @Override
            public Iterator<String> getPrefixes(String namespaceURI) {
                return null;
            }

        };
        xpath.setNamespaceContext(nsc);

        return xpath;
    }

    public static XPathExpression getXpathExpression(String xpathExpr, Map<String, String> namespaceMappings)
            throws XPathExpressionException, XPathFactoryConfigurationException {
        XPath xpath = getXpath(namespaceMappings);
        return xpath.compile(xpathExpr);
    }

}

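I am getting a NullPointerException because the root node person cannot be found in the XML document. If I remove xmlns="urn:my.poctest.com", the root path is found, but at a later stage evaluation fails with javax.xml.xpath.XPathExpressionException: net.sf.saxon.trans.XPathException: Namespace prefix 'xs' has not been declared. If I remove both the namespace from the XML document and the NamespaceContext implementation, everything works. In practice, though, I do not want to remove either of them.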

Can someone point out what I am doing wrong here? Thanks in advance!

Solution

You might like to know that the latest version of Saxon includes an option to do the following:

((net.sf.saxon.xpath.XPathEvaluator) xpath).getStaticContext()
    .setUnprefixedElementMatchingPolicy(
        UnprefixedElementMatchingPolicy.ANY_NAMESPACE);

This causes an unprefixed element name in the XPath expression to match on local name only, regardless of namespace.

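This was introduced primarily for HTML, where there is total confusion about whether the elements in an HTML DOM are in a namespace or not; but it is useful more generally in cases where you really do not care about namespaces and just wish they were not there to make your life a misery.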
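
If the namespaces do matter, a common alternative with the standard JAXP API is to parse the document namespace-aware and bind a prefix to the document's default namespace in the NamespaceContext. The sketch below is not part of the original answer. It uses the JDK's built-in XPath 1.0 engine rather than Saxon so that it runs without extra jars, which means the xs:date filter from the question is not exercised. The class name NamespacePrefixSketch, the helpers contextOf and countMatches, and the prefix p are illustrative choices. The xs binding is included on the assumption that a custom NamespaceContext must resolve every prefix the expression uses, as the error message in the question suggests.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class NamespacePrefixSketch {

    /** Map-backed NamespaceContext; unknown prefixes resolve to the empty namespace URI. */
    static NamespaceContext contextOf(final Map<String, String> mappings) {
        return new NamespaceContext() {
            @Override
            public String getNamespaceURI(String prefix) {
                if (prefix == null) {
                    throw new IllegalArgumentException("prefix must not be null");
                }
                String uri = mappings.get(prefix);
                return uri != null ? uri : XMLConstants.NULL_NS_URI;
            }

            @Override
            public String getPrefix(String namespaceURI) {
                for (Map.Entry<String, String> entry : mappings.entrySet()) {
                    if (entry.getValue().equals(namespaceURI)) {
                        return entry.getKey();
                    }
                }
                return null;
            }

            @Override
            public Iterator<String> getPrefixes(String namespaceURI) {
                String prefix = getPrefix(namespaceURI);
                return prefix == null
                        ? Collections.<String>emptyIterator()
                        : Collections.singleton(prefix).iterator();
            }
        };
    }

    /** Parses xml namespace-aware and returns how many nodes the expression selects. */
    static int countMatches(String xml, String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // keep namespace information in the DOM
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            Map<String, String> mappings = new HashMap<>();
            mappings.put("p", "urn:my.poctest.com");                // illustrative prefix for the default namespace
            mappings.put("xs", XMLConstants.W3C_XML_SCHEMA_NS_URI); // needed once the expression uses xs:date

            XPath xpath = XPathFactory.newInstance().newXPath();    // JDK engine, not Saxon
            xpath.setNamespaceContext(contextOf(mappings));
            NodeList nodes = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<person xmlns=\"urn:my.poctest.com\">"
                + "<country_information><country>FRA</country></country_information>"
                + "<country_information><country>USA</country></country_information>"
                + "</person>";
        System.out.println(countMatches(xml, "//person"));   // prints 0: unprefixed name means no namespace
        System.out.println(countMatches(xml, "//p:person")); // prints 1: prefix bound to urn:my.poctest.com
        System.out.println(countMatches(xml, "/p:person/p:country_information/p:country")); // prints 2
    }
}
```

With Saxon on the classpath, the same NamespaceContext could presumably be set on the XPath object obtained from XPathFactoryImpl, and the expression from the question would then need prefixed steps, for example /p:person/p:country_information[...]/p:country.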
