How do I put this AWK command in a for loop to extract columns?
I have dozens of files (for example fA.txt, fB.txt and fC.txt) and want to produce the output shown in fALL.txt.
fA.txt:
id V W X Y Z
a 1 2 4 8 16
b 3 6 13 17 18
c 5 1 20 4 8

fB.txt:
id F G H J K
a 2 5 9 7 12
b 4 9 12 3 19
c 6 13 2 40 7

fC.txt:
id L M N O P
a 7 2 19 8 16
b 8 6 12 23 47
c 91 11 15 19 80

Desired output, fALL.txt:
id fA_V fB_F fC_L
a 1 2 7
b 3 4 8
c 5 6 91

id fA_W fB_G fC_M
a 2 5 2
b 6 9 6
c 1 13 11

id fA_X fB_H fC_N
a 4 9 19
b 13 12 12
c 20 2 15

id fA_Y fB_J fC_O
a 8 7 8
b 17 3 23
c 4 40 19

id fA_Z fB_K fC_P
a 16 12 16
b 18 19 47
c 8 7 80
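I found the following AWK code on this site. It works for input files that have only two columns.
awk 'NR==FNR{a[FNR]=$0; next} {a[FNR] = a[FNR] OFS $2} END{for (i=1;i<=FNR;i++) print a[i]}' file1 file2 file3
For my case I modified it as follows, and it works for extracting the second column:
awk 'NR==FNR{a[FNR]=$1 OFS $2; next} {a[FNR] = a[FNR] OFS $2} END{for (i=1; i<=FNR; i++) print a[i]}' file1 file2 file3
I tried putting the above in a for loop to extract the subsequent columns, but without success. Any helpful hints would be greatly appreciated.
The first data block in the desired output is the second column of each input file, with each header formed by joining the file name to the column header from that file. The following blocks are the third, fourth, fifth and sixth columns of each input file.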
Solution
This GNU awk script should work for you.
cat tab.awk
BEGIN {
   OFS = "\t"
   for (i=1; i<ARGC; ++i) {
      fn = ARGV[i]
      sub(/\.[^.]+$/, "_", fn)
      fhdr[i] = fn
   }
}
!seen[$1]++ {
   keys[++k] = $1
}
{
   for (i=2; i<=NF; ++i)
      map[$1][i] = map[$1][i] (map[$1][i] == "" ? "" : OFS) (FNR == 1 ? fhdr[ARGIND] : "") $i
}
END {
   for (i=2; i<=NF; ++i) {
      for (j=1; j<=k; j++) {
         key = keys[j]
         print key, map[key][i]
      }
      print ""
   }
}
Then use it as:
awk -f tab.awk f{A,B,C}.txt
id fA_V fB_F fC_L
a 1 2 7
b 3 4 8
c 5 6 91

id fA_W fB_G fC_M
a 2 5 2
b 6 9 6
c 1 13 11

id fA_X fB_H fC_N
a 4 9 19
b 13 12 12
c 20 2 15

id fA_Y fB_J fC_O
a 8 7 8
b 17 3 23
c 4 40 19

id fA_Z fB_K fC_P
a 16 12 16
b 18 19 47
c 8 7 80
Explanation:
BEGIN {
   OFS = "\t"                          # use tab as the output field separator
   for (i=1; i<ARGC; ++i) {            # for each filename on the command line
      fn = ARGV[i]
      sub(/\.[^.]+$/, "_", fn)         # replace the extension (last dot onward) with _
      fhdr[i] = fn                     # and save it in the fhdr associative array
   }
}
!seen[$1]++ {                          # if this id has not been seen before
   keys[++k] = $1                      # record it in keys, preserving first-seen order
}
{
   for (i=2; i<=NF; ++i)               # for each field starting from the 2nd column
      map[$1][i] = map[$1][i] (map[$1][i] == "" ? "" : OFS) (FNR == 1 ? fhdr[ARGIND] : "") $i
      # build the two-dimensional array map[$1][i]: the text for id $1, column i
      # on the 1st record of each file, prefix the column header with the
      # filename part stored in fhdr
      # values from later files are appended with the OFS delimiter
}
END {                                  # do this at the end
   for (i=2; i<=NF; ++i) {             # for each column position from 2 onwards
      for (j=1; j<=k; j++) {           # for each id stored in the keys array
         key = keys[j]
         print key, map[key][i]        # print the id and the text built above
      }
      print ""                         # print a blank line between blocks
   }
}
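For comparison, the for-loop idea from the question can also be made to work by running the two-column one-liner once per column and passing the column number in with -v. This is a minimal sketch, not part of the answer above: the function name extract_by_loop and the variables ncols, c, cell and n are hypothetical, and it assumes every file lists the same ids in the same row order and has the same number of columns.

```shell
# Hypothetical helper (name is illustrative): one awk pass per column,
# driven by a shell loop, as attempted in the question.
extract_by_loop() {
    # Read the column count from the header line of the first file.
    ncols=$(awk 'NR == 1 { print NF; exit }' "$1")
    c=2
    while [ "$c" -le "$ncols" ]; do
        awk -v c="$c" '
            FNR == 1 { fn = FILENAME; sub(/\.[^.]+$/, "_", fn) }  # fA.txt -> fA_
            { cell = (FNR == 1 ? fn : "") $c }                    # prefix header cells only
            NR == FNR { a[FNR] = $1 OFS cell; n = FNR; next }     # first file: id plus column c
            { a[FNR] = a[FNR] OFS cell }                          # later files: append column c
            END { for (i = 1; i <= n; i++) print a[i]; print "" }
        ' "$@"
        c=$((c + 1))
    done
}
```

It would be used as extract_by_loop fA.txt fB.txt fC.txt > fALL.txt. This version rereads every input file once per column, so the single-pass script is likely the better choice for many or large files. Output fields are separated by a space because OFS is left at its default.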