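How to print the common lines from multiple files
I have 3 files, each containing an arbitrary number of lines (the count is given on the first line). I want to get all the lines that are common to these files. For example, each file has many lines, and each line holds four space-separated coordinates.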
file1.txt:
5
820.3 262.48 637.815 232.503
657.666 773.366 466.608 754.035
341.845 245.408 163.417 212.897
667.378 687.189 474.277 666.181
518.451 899.594 343.431 881.08
file2.txt:
3
1.52 6.878 9.5485
341.845 245.408 163.417 212.897
667.378 687.189 474.277 666.181
file3.txt:
4
657.666 773.366 466.608 754.035
341.845 245.408 163.417 212.897
667.378 687.189 474.277 666.181
518.451 899.594 343.431 881.08
My output file res.txt should be:
res.txt
2
341.845 245.408 163.417 212.897
667.378 687.189 474.277 666.181
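Here we have 2 common lines, so that count should be printed on the first line. How can this be scaled to many files?
I tried writing a Python script that handles two files, but I don't think it is efficient. The code I tried is: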
import numpy as np
l1 = []
l2 = []
with open('matchings1_2.txt','r') as f1:
    for line in f1:
        line = line.split()
        l1.append(line)
with open('matchings2_3.txt','r') as f2:
    for line in f2:
        line = line.split()
        l2.append(line)
l1 = np.array(l1[1:]).astype(float)
l2 = np.array(l2[1:]).astype(float)
l = []
for r in l1:
    if r in l2:
        l.append(list(r))
l.insert(0,[len(l)])
with open('Result.txt','w') as f:
    for item in l:
        s = ""
        for i in range(len(item)):
            if (i != len(item) - 1):
                s += str(item[i]) + " "
            else:
                s += str(item[i])
        f.write("%s\n" % s)
Solution
I wrote a shorter piece of code. I hope it isn't too complicated, and I think it does the job.
rows = []  # one set of rows per file
file_names = ["f1", "f2", "f3"]  # names of the text files, without the extension
for file in file_names:
    with open(file + ".txt", "r") as f1:
        temp = []  # rows of this file
        next(f1, None)  # skip the first line, which holds the row count
        for i in f1:
            s = i.rstrip()  # remove trailing whitespace, including the newline
            if s:  # ignore blank lines
                temp.append(s)
    rows.append(set(temp))  # store as a set so that we can use intersection
final_rows = rows[0]  # start with the rows of the first file
for i in range(1, len(rows)):
    final_rows = final_rows.intersection(rows[i])  # repeated intersection
with open("res.txt", "w") as f1:
    f1.write(str(len(final_rows)) + "\n")  # number of common rows
    for i in final_rows:
        f1.write(i + "\n")  # the common rows themselves
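Because the rows are held in sets, the order in which the common rows land in res.txt is not guaranteed. If the order matters, one option is to walk the first file's rows in order and keep those found in every other file. The following is only a sketch under that assumption; the helper names read_rows, common_rows and write_result are made up for illustration and are not part of the answer above. Rows are compared as text, so "1.0" and "1.00" count as different rows.

```python
from pathlib import Path


def read_rows(path):
    """Return the data rows of one file, skipping the count on line 1."""
    lines = Path(path).read_text().splitlines()
    return [line.strip() for line in lines[1:] if line.strip()]


def common_rows(paths):
    """Rows present in every file, in the order they appear in the first file.

    Expects at least one path.
    """
    first, *others = [read_rows(p) for p in paths]
    other_sets = [set(rows) for rows in others]
    seen = set()  # avoids emitting a row twice if the first file repeats it
    result = []
    for row in first:
        if row not in seen and all(row in s for s in other_sets):
            seen.add(row)
            result.append(row)
    return result


def write_result(rows, out_path="res.txt"):
    """Write the row count on the first line, then the rows themselves."""
    Path(out_path).write_text("\n".join([str(len(rows)), *rows]) + "\n")
```

It could be called as write_result(common_rows(["f1.txt", "f2.txt", "f3.txt"])), using the same file names as the answer.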
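If all the files are in the same directory and share the same format, a few changes are enough:
import os
# os.listdir() lists the current directory; use os.listdir("xyz/abc") if the files are elsewhere.
# Keep only the .txt inputs so the script itself and an earlier res.txt are not read.
file_names = [name for name in os.listdir() if name.endswith(".txt") and name != "res.txt"]
for file in file_names:
    f1 = open(file, "r")  # use file instead of file + ".txt"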
,
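As suggested in @Aryman's answer, set intersection is probably the way to go. To apply the operation to a sequence of unknown length, you can use functools.reduce:
from functools import reduce
from pathlib import Path
def lines(text_file):
    with open(text_file) as f:
        result = f.read().splitlines()
    return result
unique_lines = (set(lines(file)[1:])  # exclude the first line
                for file in Path('folder').glob('file*.txt'))
common_lines = reduce(lambda x, y: x & y, unique_lines)
print(list(common_lines))
Here x & y is equivalent to x.intersection(y). You can also use operator.and_ instead of the lambda.
With the sample files from the question placed in folder, this prints a list holding the two common lines (in arbitrary order, since sets are unordered).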
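To make the equivalence concrete, here is a small sketch comparing the lambda, operator.and_, and a third spelling, set.intersection(*sets), which the answer does not mention. The three sets are made-up stand-ins for the rows of three files, not data from the question.

```python
from functools import reduce
import operator

# Hypothetical in-memory stand-ins for the rows of three files.
sets = [
    {"a b", "c d", "e f"},
    {"c d", "e f"},
    {"e f", "c d", "x y"},
]

with_lambda = reduce(lambda x, y: x & y, sets)
with_operator = reduce(operator.and_, sets)
with_unpacking = set.intersection(*sets)  # unpack the list into arguments

# sorted() gives a stable order for display; the sets themselves are unordered.
print(sorted(with_unpacking))  # ['c d', 'e f']
```

All three forms need at least one set: with an empty list, reduce without an initial value and set.intersection() both raise TypeError.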
,
I was able to write a solution to the problem, which I have pasted below. Everything is commented, so I hope it is easy to read :)
import os  # a library for accessing the os
all_rows = []  # to load all lines into
res = []  # to load result into
number_files = 0
path_to_files = "."  # you can use "." if your files are in the same directory as the .py file
for file in os.listdir(path_to_files):  # put your path to files here, lists all files in that directory
    if file.startswith("file") and file.endswith(".txt"):
        number_files += 1  # keep a count of the number of files for later
        with open(os.path.join(path_to_files, file), "r") as f:  # the with block closes the file
            content = f.readlines()  # read all lines
        content = [x.strip() for x in content]  # remove \n from lines
        all_rows.extend(content)  # add all items of content to all_rows without creating a 2d list
for i in range(1, int(all_rows[0]) + 1):  # all rows of the first file that was read
    if all_rows.count(all_rows[i]) == number_files:  # row occurs in all files (assumes no duplicate rows within a file)
        res.append(all_rows[i])  # append to res
res.insert(0, str(len(res)))  # insert number of rows into res
with open(os.path.join(path_to_files, "res.txt"), "w+") as r:  # create new file in directory called res.txt
    for row in res:  # for every row which all files have in common
        r.write(row + "\n")  # add newline character
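The all_rows.count(...) call rescans the whole list for every row, and it miscounts if a row is repeated inside one file. A possible alternative, sketched here with collections.Counter, counts each distinct row once per file and then keeps the rows whose count equals the number of files. The function name common_lines_by_count and the sample data are hypothetical; the input is assumed to be one list of rows per file, with the count line already removed.

```python
from collections import Counter


def common_lines_by_count(files_rows):
    """Return rows found in every file, in first-file order.

    files_rows: list of per-file row lists (count line already removed).
    """
    if not files_rows:
        return []
    seen_in = Counter()
    for rows in files_rows:
        seen_in.update(set(rows))  # set(): count each row once per file
    n = len(files_rows)
    # dict.fromkeys drops repeats while keeping the first file's order
    return list(dict.fromkeys(r for r in files_rows[0] if seen_in[r] == n))


# Made-up rows standing in for three files.
example = [
    ["a b", "c d", "c d", "e f"],
    ["e f", "c d"],
    ["c d", "x y", "e f"],
]
print(common_lines_by_count(example))  # ['c d', 'e f']
```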