XSLT: analyze-string and preserve child nodes

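I am trying to use regular-expression text matching to find statements that refer to other statements. It works when the text sits in a single node, but I am struggling with text that is held in child nodes or split across nodes. In addition, I want to ignore any text inside del tags.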

Starting with a document like this:

<doc>
    <sectionA>
        <statement id="1">
            <title>Titlle A</title>
            <statementtext id="a">This is referring to statement 2 about the stuff</statementtext>
            <!-- This is referring to statement <ref statementNumber="2">2</ref> about the stuff -->
        </statement>
        <statement id="2">
            <title>Title B</title>
            <statementtext id="b">This is <b>my</b> statement <b>1</b> referring to something else</statementtext>
            <!-- This is <b>my</b> statement <ref statementNumber="1"><b>1</b></ref> referring to something else -->
        </statement>
        <statement id="3">
            <title>Title 3</title>
            <statementtext id="c">This is another statement <b>1</b><i>2</i> about the stuff</statementtext>
            <!-- This is another statement <ref statementNumber="12"><b>1</b><i>2</i></ref> about the stuff -->
        </statement>
        <statement id="4">
            <title>Title 4</title>
            <statementtext id="d">This is corrected statement <del>1</del><ins>2</ins> about the stuff</statementtext>
            <!-- This is corrected statement <ref statementNumber="2"><del>1</del><ins>2</ins></ref> about the stuff -->
        </statement>        
        <statement id="5">
            <title>Title 5</title>
            <statementtext id="e">This is partially corrected statement 1<del>1</del><ins>5</ins> about the stuff</statementtext>
            <!-- This is partially corrected statement <ref statementNumber="15">1<del>1</del><ins>5</ins></ref> about the stuff -->
        </statement>
        <statement id="6">
            <title>Title 6</title>
            <statementtext  id="f">This is another
            <statementtext  id="g"> that contains a nested satementtext for statement <b>1</b><i>3</i> about </statementtext>
            the stuff</statementtext>
            <!-- This is another <statementtext id="g"> that contains a nested satementtext for statement <ref statementNumber="13"><b>1</b><i>3</i></ref> about </statementtext> -->
        </statement>
        <statement id="7">
            <title>Title 7</title>
            <statementtext id="h">This is <i>statement</i> <b>1</b> referring to something else</statementtext>
            <!-- This is my <i>statement</i> <ref statementNumber="1"><b>1</b></ref> referring to something else -->
        </statement>
        <statement id="8">
            <title>Title 8</title>
            <statementtext id="i">This is has no reference to another statement</statementtext>
            <!-- his is has no reference to another statement -->
        </statement>        
    </sectionA>     
</doc>

Using my current template:

  <xsl:template match="statementtext">
      <statementtext>
          <xsl:copy-of select="./@*" />
          <xsl:variable name="thisText">
              <xsl:value-of select="./descendant-or-self::text()"/>
          </xsl:variable>

          <xsl:variable name="thisTextFiltered">
              <xsl:value-of select="./descendant-or-self::text()[not(descendant-or-self::del and comment())]"/>
          </xsl:variable>

          <xsl:choose>
              <xsl:when test="matches($thisTextFiltered,'(statement\s*)(\d+)','i')">
                  <xsl:analyze-string select="$thisTextFiltered"
                                      regex="(statement\s*)(\d+)"
                                      flags="ix">
                      <xsl:matching-substring>
                          <xsl:value-of select="regex-group(1)"/>
                          <xsl:variable name="statementNumber">
                              <xsl:value-of select="regex-group(2)"></xsl:value-of>
                          </xsl:variable>
                          <ref>
                              <xsl:attribute name="statementNumber">
                                  <xsl:value-of select="$statementNumber" />
                              </xsl:attribute>
                              <xsl:value-of select="regex-group(2)"/>
                          </ref>
                      </xsl:matching-substring>
                      <xsl:non-matching-substring>
                          <xsl:value-of select="."/>
                      </xsl:non-matching-substring>
                  </xsl:analyze-string>
              </xsl:when>
              <xsl:otherwise>
                  <xsl:apply-templates />
              </xsl:otherwise>
          </xsl:choose>
      </statementtext>
  </xsl:template>

  <xsl:template match="@*|*|processing-instruction()|comment()">
      <xsl:copy>
          <xsl:apply-templates select="*|@*|text()|processing-instruction()|comment()" mode="#current"/>
      </xsl:copy>
  </xsl:template>

This is my output:

<!DOCTYPE HTML>
<doc>
   <sectionA>
      <statement id="1"><title>Titlle A</title><statementtext id="a">This is referring to statement 
            <ref statementNumber="2">2</ref> about the stuff
         </statementtext>
         <!-- This is referring to statement <ref statementNumber="2">2</ref> about the stuff -->
      </statement>
      <statement id="2"><title>Title B</title><statementtext id="b">This is my statement 
            <ref statementNumber="1">1</ref> referring to something else
         </statementtext>
         <!-- This is <b>my</b> statement <b><ref statementNumber="1">1</ref></b> referring to something else -->
      </statement>
      <statement id="3"><title>Title 3</title><statementtext id="c">This is another statement 
            <ref statementNumber="12">12</ref> about the stuff
         </statementtext>
         <!-- This is another statement <ref statementNumber="12"><b>1</b><i>2</i></ref> about the stuff -->
      </statement>
      <statement id="4"><title>Title 4</title><statementtext id="d">This is corrected statement 
            <ref statementNumber="12">12</ref> about the stuff
         </statementtext>
         <!-- This is corrected statement <ref statementNumber="2"><del>1</del><ins>2</ins></ref> about the stuff -->
      </statement>
      <statement id="5"><title>Title 5</title><statementtext id="e">This is partially corrected statement 
            <ref statementNumber="115">115</ref> about the stuff
         </statementtext>
         <!-- This is partially corrected statement <ref statementNumber="15">1<del>1</del><ins>5</ins></ref> about the stuff -->
      </statement>
      <statement id="6"><title>Title 6</title><statementtext id="f">This is another
                         that contains a nested satementtext for statement 
            <ref statementNumber="13">13</ref> about 
                        the stuff
         </statementtext>
         <!-- This is another <statementtext id="g"> that contains a nested satementtext for statement <ref statementNumber="13"><b>1</b><i>3</i></ref> about </statementtext> -->
      </statement>
      <statement id="7"><title>Title 7</title><statementtext id="h">This is statement
            <ref statementNumber="1">1</ref> referring to something else
         </statementtext>
         <!-- This is my <i>statement</i> <b><ref statementNumber="1">1</ref></b> referring to something else -->
      </statement>
      <statement id="8"><title>Title 8</title><statementtext id="i">This is has no reference to another statement</statementtext>
         <!-- his is has no reference to another statement -->
      </statement>
   </sectionA>
</doc>

Am I close, or should I change my approach completely?

Solution

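I tried a preprocessing step that wraps the numbers, followed by a mix of group-starting-with and group-adjacent. I think it now covers all the samples you provided, but the grouping code is rather convoluted and deeply nested: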

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:fn="http://www.w3.org/2005/xpath-functions"
    exclude-result-prefixes="#all"
    expand-text="yes"
    version="3.0">

  <xsl:mode on-no-match="shallow-copy"/>
  
  <xsl:template match="text()" mode="analyze">
      <xsl:apply-templates select="analyze-string(.,'statement\s*([0-9]+)')" mode="wrap"/>
  </xsl:template>
  
  <xsl:mode name="analyze" on-no-match="shallow-copy"/>
  
  <xsl:template match="fn:group[@nr = 1]" mode="wrap">
      <n>{.}</n>
  </xsl:template>

  <xsl:template match="statementtext">
      <xsl:copy>
          <xsl:variable name="wrapped" as="node()*">
              <xsl:apply-templates mode="analyze"/>
          </xsl:variable>
 
          <xsl:for-each-group select="$wrapped" group-starting-with="node()[matches(.,'statement\s*$','i')]">
              <xsl:choose>
                  <xsl:when test="matches(.,'statement\s*$','i')">
                      <xsl:apply-templates select="."/>
                      <xsl:for-each-group select="tail(current-group())" group-adjacent="matches(.,'^[0-9 ]+$')">
                          <xsl:choose>
                              <xsl:when test="current-grouping-key() and position() = 1 and matches(.,'^\s+$')">
                                  <xsl:apply-templates select="."/>
                                  <ref statementNumber="{string-join(tail(current-group())[not(self::del)])}">
                                      <xsl:apply-templates select="tail(current-group())"/>
                                  </ref>
                              </xsl:when>
                              <xsl:when test="current-grouping-key() and position() = 1">
                                  <ref statementNumber="{string-join(current-group()[not(self::del)])}">
                                      <xsl:apply-templates select="current-group()"/>
                                  </ref>
                              </xsl:when>
                              <xsl:otherwise>
                                  <xsl:apply-templates select="current-group()"/>
                              </xsl:otherwise>
                          </xsl:choose>
                      </xsl:for-each-group>
                  </xsl:when>
                  <xsl:otherwise>
                      <xsl:apply-templates select="current-group()"/>
                  </xsl:otherwise>
              </xsl:choose>
          </xsl:for-each-group>
      </xsl:copy>
  </xsl:template>
  
  <xsl:template match="n">
      <xsl:apply-templates/>
  </xsl:template>
  
</xsl:stylesheet>

https://xsltfiddle.liberty-development.net/bEJbVrL
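
To make the preprocessing step in the stylesheet above more concrete, the fragment below sketches what the fn:analyze-string function returns for the text of the first sample ("This is referring to statement 2 about the stuff") with the regex statement\s*([0-9]+). The structure follows the XPath 3.1 definition of the function; the indentation is added here only for readability and is not part of the real result.

```xml
<!-- Illustrative result of analyze-string(., 'statement\s*([0-9]+)') for the
     first sample; indentation added for readability only. -->
<fn:analyze-string-result xmlns:fn="http://www.w3.org/2005/xpath-functions">
  <fn:non-match>This is referring to </fn:non-match>
  <fn:match>statement <fn:group nr="1">2</fn:group></fn:match>
  <fn:non-match> about the stuff</fn:non-match>
</fn:analyze-string-result>
```

As far as I can tell, the wrap mode then turns the fn:group element into <n>2</n> and lets the remaining text through unchanged, so the grouping code sees digits that came from plain text as an n element, on the same footing as the b, i, del and ins elements already present in the input.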

I am not going to take the time to produce a working solution, but here are a few things you should fix in your code:

<xsl:variable name="statementNumber">
       <xsl:value-of select="regex-group(2)"></xsl:value-of>
</xsl:variable>
<ref>
    <xsl:attribute name="statementNumber">
      <xsl:value-of select="$statementNumber" />
    </xsl:attribute>
    <xsl:value-of select="regex-group(2)"/>
</ref> 

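This can be reduced to the following (the braces in the element content are an XSLT 3.0 text value template, so expand-text="yes" must be in effect):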

<ref statementNumber="{regex-group(2)}">{regex-group(2)}</ref>
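
For context, here is a minimal sketch of how that shorthand could sit inside a complete XSLT 3.0 stylesheet. It assumes an XSLT 3.0 processor and simply matches against the string value of each statementtext, so it still flattens child elements such as b or del; it only illustrates the shorter syntax, not a fix for the cross-node problem.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    expand-text="yes"
    version="3.0">

  <!-- Copy everything that has no more specific template. -->
  <xsl:mode on-no-match="shallow-copy"/>

  <xsl:template match="statementtext">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <!-- Sketch only: string(.) discards the child markup. -->
      <xsl:analyze-string select="string(.)"
                          regex="(statement\s*)(\d+)"
                          flags="i">
        <xsl:matching-substring>
          <xsl:value-of select="regex-group(1)"/>
          <!-- Attribute value template plus text value template. -->
          <ref statementNumber="{regex-group(2)}">{regex-group(2)}</ref>
        </xsl:matching-substring>
        <xsl:non-matching-substring>{.}</xsl:non-matching-substring>
      </xsl:analyze-string>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```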

And this:

<xsl:variable name="thisTextFiltered">
    <xsl:value-of select="./descendant-or-self::text()[not(descendant-or-self::del and comment())]"/>
</xsl:variable> 

cannot be right, because text nodes have no descendants (Saxon should give you a warning). But I am not sure what your real intention was.
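
If the intention was what the question describes, namely to ignore text inside del elements, one possible form of the filter looks up the ancestor axis from each text node instead. The stylesheet below is a small sketch under that assumption. The filtered and filtered-text element names are made up for the illustration. No test for comment() is needed, because the content of a comment is not a text node.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="2.0">

  <!-- Assumption: the goal is the concatenated text of each statementtext,
       leaving out any text node that has a del element as an ancestor. -->
  <xsl:template match="/">
    <filtered>
      <xsl:for-each select="//statementtext">
        <filtered-text id="{@id}">
          <xsl:value-of
              select="string-join(descendant::text()[not(ancestor::del)], '')"/>
        </filtered-text>
      </xsl:for-each>
    </filtered>
  </xsl:template>

</xsl:stylesheet>
```

For the statementtext with id="d" this gives "This is corrected statement 2 about the stuff", so a regex run over that string would pick up 2 rather than 12.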
