Why does this approach fail to find the index of a substring in a Python string?

Question

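I am trying to find the indices of the "\n" delimiter in a string so that I can split the string into a list of substrings that do not contain the "\n" characters. An answer in this post suggests using numpy.core.defchararray.find(...) (or the equivalent numpy.char.find(...)). I imagine there is a "better" way to do this with the re module, but I personally find it confusing and would prefer something more readable. So I tried rolling my own method to check for these indices. I am not getting the expected output and would like to understand why. In case it matters, the first and last characters of the string are [ and ], respectively; the string also contains " and ' characters.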

MWE

import numpy as np

## this is the string to search for substrings (delimited by "\n")
s = str(["Parameters:\t['vertical speed']\nConditions:\t['greater than or equal']\nValues:\t[300]\nModifiers:\t[None]\n"])

## s, as displayed:
## ["Parameters:\t['vertical speed']\nConditions:\t['greater than or equal']\nValues:\t[300]\nModifiers:\t[None]\n"]

## use numpy
loc = np.char.find(s,"\n") # '\n'
print(loc) ## outputs -1; why?

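For the approach below, the window size corresponds to the length of the delimiter "\n" (so window_size=2); every 2 consecutive characters are checked against "\n".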

## search using rolling window
## window size = len("\n")
size = len(s)
i = 0
substring = "\n"
window_size = len(substring)
while i < size - window_size - 1: ## avoid parsing substring smaller than window size
    _s = s[i:i+window_size+1]
    # print(" .. {},{}".format(i,_s)) ## verify sub-strings are correct size
    if _s == substring:
        print("\n i = {} - {}\n".format(i,i+window_size+1)) ## *should* print the indices of "\n" but if condition is never satisfied
    i += 1

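Neither solution works, and I do not know why. I commented out the print(" .. {},{}".format(i,_s)) line; uncommenting it prints every 2 consecutive characters in the string. In that output I can see the indices that correspond to the positions of "\n", but this is not reflected in the output of the print statement inside the if block: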

 .. 33,\n
 .. 73,\n
 .. 89,\n
 .. 109,\n

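Why do these methods fail? For the indices, my expected output would be something like [33,73,89,109], and for the list of split substrings, my expected output would be something like ["Parameters:\t['vertical speed']","Conditions:\t['greater than or equal']","Values:\t[300]","Modifiers:\t[None]"].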
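One likely explanation, offered as a sketch rather than a definitive answer: str() applied to a list uses repr() for each element, so the newline inside the original string is written out as two characters, a backslash followed by n. The search pattern "\n" is a single newline character, which never occurs in s. That would account for both symptoms: np.char.find(s,"\n") returns -1, and in the rolling window the slice s[i:i+window_size+1] is 2 characters long while substring is 1 character long, so the comparison can never be true. The code below checks this and searches for the two-character sequence instead. The helper name find_all and the variable delimiter are made up for this illustration and do not appear in the question.

```python
# Rebuild the string exactly as in the question. str() of a list uses repr()
# for its elements, so control characters are written out as escape sequences.
s = str(["Parameters:\t['vertical speed']\nConditions:\t['greater than or equal']\nValues:\t[300]\nModifiers:\t[None]\n"])

# "\n" is ONE character (a newline); s instead holds TWO characters,
# a backslash followed by the letter n.
print(len("\n"))     # 1
print("\n" in s)     # False
print("\\n" in s)    # True


def find_all(text, sub):
    """Return the start index of every non-overlapping occurrence of sub.

    Hypothetical helper written for this sketch; it relies only on str.find.
    """
    indices = []
    start = text.find(sub)
    while start != -1:
        indices.append(start)
        # Resume the search just past the match that was found.
        start = text.find(sub, start + len(sub))
    return indices


delimiter = "\\n"  # same as r"\n": a literal backslash followed by "n"
indices = find_all(s, delimiter)
print(indices)     # [33, 73, 89, 109]

# Split on the two-character delimiter; the last piece is only the closing
# quote and bracket, so it is dropped here.
parts = s.split(delimiter)[:-1]
print(parts)
```

Note that the first element of parts still begins with the [" that opens the string, and every tab still appears as a literal backslash followed by t, because s is the repr of the list rather than the original text. If the original list is available, calling splitlines() on its first element, without going through str() at all, avoids the escaping problem entirely.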
