Replacing spaces with underscores at specific positions

How to replace spaces with underscores at specific positions

I have strings like this:

strings = ['pic1.jpg siberian cat 24 25','pic2.jpg siemese cat 14 32','pic3.jpg american bobtail cat 8 13','pic4.jpg cat 9 1']

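I want to replace the spaces inside the cat breed name with underscores, while keeping the space between .jpg and the first word of the breed, and the spaces before the numbers.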

Expected output:

['pic1.jpg siberian_cat 24 25','pic2.jpg siemese_cat 14 32','pic3.jpg american_bobtail_cat 8 13','pic4.jpg cat 9 1']

I tried building a pattern like this:

[re.sub(r'(?<!jpg\s)([a-z])\s([a-z])\s([a-z])',r'\1_\2_\3',x) for x in strings ]

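However, I end up with an underscore between .jpg and the next word as well.

The difficulty is that "cat" is not always the last word of the breed name.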

Solution

Here is one approach that uses re.sub with a callback function:

import re

strings = ['pic1.jpg siberian cat 24 25','pic2.jpg siemese cat 14 32','pic3.jpg american bobtail cat 8 13','pic4.jpg cat 9 1']
# Match the run of words ending in 'cat', then replace the spaces inside that match only.
output = [re.sub(r'(?<!\S)\w+(?: \w+)* cat\b', lambda m: m.group().replace(' ', '_'), s) for s in strings]
print(output)

This prints:

['pic1.jpg siberian_cat 24 25', 'pic2.jpg siemese_cat 14 32', 'pic3.jpg american_bobtail_cat 8 13', 'pic4.jpg cat 9 1']

Here is an explanation of the regex pattern used:

(?<!\S)    assert that what precedes the first word is either whitespace or the start of the string
\w+        match a word, which is then followed by
(?: \w+)*  a space and another word, zero or more times
[ ]        match a single space (written as a literal space in the pattern above)
cat\b      followed by 'cat' and a word boundary

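In other words, taking the third list element as an example, the regex matches "american bobtail cat", and the lambda callback then replaces every space in that match with an underscore. In the fourth element "cat" directly follows "pic4.jpg", so there is no preceding word to match and the string is left unchanged.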
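To make the pieces of the pattern easier to inspect, here is a minimal sketch that writes the same regex with re.VERBOSE and wraps the substitution in a function. The names BREED and underscore_breed are hypothetical, introduced only for this illustration. Note that in verbose mode a bare space is ignored, so each literal space has to be written as [ ].

```python
import re

# Same pattern as in the answer, spelled out with re.VERBOSE.
# Bare whitespace is ignored in verbose mode, so a literal space is written as [ ].
BREED = re.compile(r"""
    (?<!\S)        # preceded by whitespace or the start of the string
    \w+            # first word of the breed
    (?:[ ]\w+)*    # zero or more further words, each preceded by one space
    [ ]cat\b       # one space, then 'cat' as a whole word
""", re.VERBOSE)

def underscore_breed(s):
    # Replace spaces only inside the matched breed phrase.
    return BREED.sub(lambda m: m.group().replace(' ', '_'), s)

strings = ['pic1.jpg siberian cat 24 25', 'pic2.jpg siemese cat 14 32',
           'pic3.jpg american bobtail cat 8 13', 'pic4.jpg cat 9 1']

for s in strings:
    m = BREED.search(s)
    # Show what the regex matched (None when there is no word before 'cat').
    print(repr(m.group() if m else None), '->', underscore_breed(s))
```

Printing the raw match next to the result makes it easy to confirm that the file name and the trailing numbers are never part of the matched text.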

Try this:

[re.sub(r'jpg\s((\S+\s)+)cat',"jpg " + "_".join(x.split('jpg')[1].split('cat')[0].strip().split()) + "_cat",x) for x in strings ]
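The one-liner above builds its replacement by splitting the whole string by hand. As a possible variation (not the answerer's code), the same idea can be sketched with capture groups and a callback, so the replacement is derived from the match itself. The function name join_breed is hypothetical.

```python
import re

def join_breed(s):
    # Group 1 keeps 'jpg' plus its trailing whitespace; group 2 holds the
    # breed words that come before 'cat', each followed by whitespace.
    return re.sub(
        r'(jpg\s)((?:\S+\s)+)cat\b',
        lambda m: m.group(1) + '_'.join(m.group(2).split()) + '_cat',
        s,
    )

strings = ['pic1.jpg siberian cat 24 25', 'pic2.jpg siemese cat 14 32',
           'pic3.jpg american bobtail cat 8 13', 'pic4.jpg cat 9 1']
result = [join_breed(s) for s in strings]
print(result)
```

Like the original one-liner, this assumes the file name ends in jpg and that at least one word sits between the file name and "cat"; strings without such a word are returned unchanged.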
