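How to iterate over two files in Rust
I am trying to read two files and compare every item in one file with every item in the other to see whether they are equal.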
use std::fs::File;
use std::io::{BufRead, BufReader};

fn main() {
    let filename1 = "file1.txt";
    let filename2 = "file2.txt";
    // Open the files in read-only mode (ignoring errors).
    let file = File::open(filename1).unwrap();
    let reader = BufReader::new(file);
    let file2 = File::open(filename2).unwrap();
    let mut reader2 = BufReader::new(file2);
    // Read the file line by line using the lines() iterator from std::io::BufRead.
    for line1 in reader.lines() {
        let line1 = line1.unwrap(); // Ignore errors.
        for line2 in reader2.lines() {
            let line2 = line2.unwrap(); // Ignore errors.
            if line2 == line1 {
                println!("{}", line2)
            }
        }
    }
}
However, this doesn't work. How can I run a loop inside another loop when both iterate over a buffered reader?
Solution
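Your first problem is a duplicate of this question. TL;DR: if you want to be able to reuse reader2 after calling its lines method (for example in the next loop iteration), you need to call by_ref, because lines takes the reader by value and would otherwise consume it.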
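The effect of by_ref can be seen in isolation with an in-memory reader. This is only a minimal sketch: the helper name first_then_rest is hypothetical, and std::io::Cursor stands in for the BufReader over a file used in the question.

```rust
use std::io::{BufRead, Cursor, Read};

// Hypothetical helper: reads the first line through a borrowed reader, then
// reads the remaining lines from the very same reader.
fn first_then_rest(input: &str) -> (String, Vec<String>) {
    let mut reader = Cursor::new(input.to_owned());

    // by_ref() gives lines() a mutable borrow, so `reader` is not moved.
    let first = reader
        .by_ref()
        .lines()
        .next()
        .transpose()
        .unwrap() // Ignore errors.
        .unwrap_or_default();

    // `reader` is still usable here and continues after the first line.
    let rest = reader.lines().collect::<Result<Vec<_>, _>>().unwrap();
    (first, rest)
}

fn main() {
    let (first, rest) = first_then_rest("a\nb\nc\n");
    println!("first = {}, rest = {:?}", first, rest);
}
```

Without the by_ref call, the second use of reader would be rejected by the compiler as a use of a moved value.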
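With by_ref in place, your code will compile but will not work as intended: once you have processed the first line of the first file, you are at the end of the second file, so the second file appears empty for every subsequent line. You can fix that by rewinding the second file for each line. The minimal set of changes to make your code work is: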
use std::fs::File;
use std::io::{BufRead, BufReader, Read, Seek, SeekFrom};

fn main() {
    let filename1 = "file1.txt";
    let filename2 = "file2.txt";
    // Open the files in read-only mode (ignoring errors).
    let file = File::open(filename1).unwrap();
    let reader = BufReader::new(file);
    let file2 = File::open(filename2).unwrap();
    let mut reader2 = BufReader::new(file2);
    // Read the file line by line using the lines() iterator from std::io::BufRead.
    for line1 in reader.lines() {
        let line1 = line1.unwrap(); // Ignore errors.
        reader2.seek(SeekFrom::Start(0)).unwrap(); // <-- Add this line
        for line2 in reader2.by_ref().lines() { // <-- Use by_ref here
            let line2 = line2.unwrap(); // Ignore errors.
            if line2 == line1 {
                println!("{}", line2)
            }
        }
    }
}
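But this will be slow, because the second file is read again in full for every line of the first. You can make it faster by reading one of the files into a HashSet and checking whether each line of the other file is in the set: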
use std::collections::HashSet;
use std::fs::File;
use std::io::{BufRead, BufReader};

fn main() {
    let filename1 = "file1.txt";
    let filename2 = "file2.txt";
    // Open the files in read-only mode (ignoring errors).
    let file = File::open(filename1).unwrap();
    let reader = BufReader::new(file);
    let file2 = File::open(filename2).unwrap();
    let reader2 = BufReader::new(file2);
    // Collect every line of the second file into a set for fast lookup.
    let lines2 = reader2.lines().collect::<Result<HashSet<_>, _>>().unwrap();
    // Read the file line by line using the lines() iterator from std::io::BufRead.
    for line1 in reader.lines() {
        let line1 = line1.unwrap(); // Ignore errors.
        if lines2.contains(&line1) {
            println!("{}", line1)
        }
    }
}
Finally, you could also read both files into HashSets and print out their intersection:
use std::collections::HashSet;
use std::fs::File;
use std::io::{BufRead, BufReader};

fn main() {
    let filename1 = "file1.txt";
    let filename2 = "file2.txt";
    // Open the files in read-only mode (ignoring errors).
    let file = File::open(filename1).unwrap();
    let reader = BufReader::new(file);
    let lines1 = reader.lines().collect::<Result<HashSet<_>, _>>().unwrap();
    let file2 = File::open(filename2).unwrap();
    let reader2 = BufReader::new(file2);
    let lines2 = reader2.lines().collect::<Result<HashSet<_>, _>>().unwrap();
    for l in lines1.intersection(&lines2) {
        println!("{}", l)
    }
}
As a bonus, this last solution removes duplicate lines. On the other hand, it does not preserve the order of the lines.
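If both properties matter (each common line printed once, and in the order of the first file), one possible variation is sketched below. It is not part of the original answer: the function name common_lines_in_order is hypothetical, and Cursor values stand in for the two BufReaders so that the example is self-contained.

```rust
use std::collections::HashSet;
use std::io::{self, BufRead, Cursor};

// Hypothetical helper: returns the lines of `first` that also occur in
// `second`, in the order of `first`, each reported only once.
fn common_lines_in_order<A: BufRead, B: BufRead>(first: A, second: B) -> io::Result<Vec<String>> {
    // All lines of the second input, for fast membership checks.
    let lookup = second.lines().collect::<io::Result<HashSet<String>>>()?;
    // Lines that have already been reported.
    let mut seen = HashSet::new();
    let mut common = Vec::new();
    for line in first.lines() {
        let line = line?;
        // insert() returns false when the line was reported before.
        if lookup.contains(&line) && seen.insert(line.clone()) {
            common.push(line);
        }
    }
    Ok(common)
}

fn main() -> io::Result<()> {
    let first = Cursor::new("b\na\nb\nc\n");
    let second = Cursor::new("c\nb\nx\n");
    for line in common_lines_in_order(first, second)? {
        println!("{}", line);
    }
    Ok(())
}
```

With real files, the two Cursor values would be replaced by BufReader::new(File::open(...)?). The cost is a second set holding the lines already printed.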
I did find a solution, but it is very slow. If anyone has a better way to find the matching items in the two files, please let me know.
use std::fs::File;
use std::io::{BufRead, BufReader};

fn main() {
    let vec2 = findvec("file1.txt");
    let vec3 = findvec("file2.txt");
    // Compare every line of the first file with every line of the second.
    for line in &vec2 {
        for line2 in &vec3 {
            if line == line2 {
                println!("{}", line);
            }
        }
    }
}

// Read all lines of a file into a vector.
fn findvec(filename: &str) -> Vec<String> {
    // Open the file in read-only mode (ignoring errors).
    let file = File::open(filename).unwrap();
    let reader = BufReader::new(file);
    // Start with an empty vector.
    let mut myvec = Vec::new();
    // Read the file line by line using the lines() iterator from std::io::BufRead.
    for line in reader.lines() {
        let line = line.unwrap(); // Ignore errors.
        myvec.push(line);
    }
    myvec
}