Find all instances and replace them, markers included

How to find all instances and replace them, markers included

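I am using Python 3.7.9 and I have some HTML-like code that contains data from a Pandas table. I want to color specific data in that table, so I want to take the text between string markers and replace it, markers included, with other tags (the ones Confluence uses to render text in a specific color).

My input text string is: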

text = 'some text Now important information starts decrease-123456decrease more text not to touch next marker increase7896278689increase and more text another marker decrease-12355decrease with important information'

The replacement strings are:

increase = '<span style=\"color: Red;\">'+val+'</span>'
decrease = '<span style=\"color: Green;\">'+val+'</span>'

where val is the information found between the markers.

So my expected output is:

output = some text Now important information starts <span style="color: Green;">-123456</span> more text not to touch next marker <span style="color: Red;">7896278689</span> and more text another marker <span style="color: Green;">-12355</span> with important information

This is what I tried:

import re

text = 'some text Now important information starts decrease-123456decrease more text not to touch next marker increase7896278689increase and more text another marker decrease-12355decrease with important information'
found_increase = re.findall('increase(.+?)increase',text)
found_decrease = re.findall('decrease(.+?)decrease',text)
output=''
for i,val in enumerate(found_increase):
    output=text.replace('increase'+val+'increase','<span style=\"color: Red;\">'+val+'</span>')
for i,val in enumerate(found_decrease):
    output=text.replace('decrease'+val+'decrease','<span style=\"color: Green;\">'+val+'</span>')
print(output)

I also tried the styling methods that ship with Pandas, but Confluence markup is not real HTML, so that approach did not work for me. With my example above I get the following output:

some text Now important information starts decrease-123456decrease more text not to touch next marker increase7896278689increase and more text another marker <span style="color: Green;">-12355</span> with important information

Solution

I found that this code works:

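print(re.sub(r"decrease(.*?)decrease", r'<span style="color: Green;">\1</span>', text))

What happens here is that we replace the pattern

r"decrease(.*?)decrease"

with

r'<span style="color: Green;">\1</span>'

where \1 is whatever (.*?) matched. Note the leading r before the strings: it makes them raw string literals, so the backslash in \1 reaches re.sub unchanged instead of being interpreted by Python as a string escape. Single quotes are used around the replacement so that the double quotes inside it need no backslashes; inside a raw string, \" would otherwise end up literally in the output.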

Obviously, you need to write the same substitution for the increase marker as well.
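Both markers can also be handled in a single pass. The following is only a sketch, assuming the marker words and colors from the question; the names COLORS, MARKER_RE and colorize are hypothetical and introduced here for illustration.

```python
import re

# Hypothetical mapping from marker word to color, taken from the question.
COLORS = {"increase": "Red", "decrease": "Green"}

# Group 1 captures the marker word, group 2 the value between the markers.
# The backreference \1 requires the closing marker to equal the opening one.
MARKER_RE = re.compile(r"(increase|decrease)(.+?)\1")


def colorize(text):
    """Replace every marked value with a colored span, dropping the markers."""
    def repl(match):
        color = COLORS[match.group(1)]
        return '<span style="color: {};">{}</span>'.format(color, match.group(2))
    return MARKER_RE.sub(repl, text)


sample = 'a decrease-1decrease b increase2increase c'
print(colorize(sample))
```

Passing a function as the replacement avoids any escaping concerns in the replacement string, because the returned text is inserted as-is.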

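Note that replace() replaces all occurrences, which your code does not seem to take into account. In addition, every loop iteration calls replace() on the original text and overwrites output, so only the very last replacement survives; that is why only -12355 is colored in your result.

Python's regular expression engine directly supports substitution with capture groups through re.sub / re.Pattern.sub. By default, all occurrences of the pattern are replaced.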
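For comparison, the loop from the question can be repaired by feeding each replacement into the next one instead of restarting from the original text. This is a sketch that keeps the question's variable names; the shortened text is illustrative only.

```python
import re

text = 'a decrease-1decrease b increase2increase c decrease-3decrease d'

output = text  # start from the input and keep updating the same string
for val in re.findall('increase(.+?)increase', output):
    output = output.replace('increase' + val + 'increase',
                            '<span style="color: Red;">' + val + '</span>')
for val in re.findall('decrease(.+?)decrease', output):
    output = output.replace('decrease' + val + 'decrease',
                            '<span style="color: Green;">' + val + '</span>')
print(output)
```

This works, but it scans the string once per value found, so the re.sub approach is shorter and does the same job in one call per marker.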

https://docs.python.org/3/library/re.html#re.sub

In the replacement string, the first capture group is referenced as r'\1' or, equivalently, '\\1'.
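A quick check that the two spellings are the same replacement template, together with the \g<1> form that re.sub also accepts. The sample string and the square brackets are made up for illustration.

```python
import re

sample = 'x decrease-42decrease y'
pattern = r'decrease(.+?)decrease'

raw_form = re.sub(pattern, r'[\1]', sample)       # raw string: backslash kept as typed
escaped_form = re.sub(pattern, '[\\1]', sample)   # normal string: backslash doubled
named_form = re.sub(pattern, r'[\g<1>]', sample)  # explicit group reference

print(raw_form)
print(r'\1' == '\\1')
```

The \g<1> form is handy when the group reference is immediately followed by a digit, where \1 would be ambiguous.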

import re
text = 'some text now important information starts decrease-123456decrease more text not to touch next marker increase7896278689increase and more text another marker decrease-12355decrease with important information'
# First color the increase values ...
inc_replaced = re.sub('increase(.+?)increase','<span style=\"color: Red;\">\\1</span>',text)
# ... then color the decrease values in the already-modified string, not in the original text.
output = re.sub('decrease(.+?)decrease','<span style=\"color: Green;\">\\1</span>',inc_replaced)

>>> output
'some text now important information starts <span style="color: Green;">-123456</span> more text not to touch next marker <span style="color: Red;">7896278689</span> and more text another marker <span style="color: Green;">-12355</span> with important information'
