Unix: how to parse variable-length name=value pairs into separate rows
Input file:
ID|Text
1|name1=value1;name3;name4=value2;name5=value5
2|name1=value1;name2=value2;name6=;name7=value7;name8=value8
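Here the Text column holds name=value pairs and varies in length. Note that a name in the Text column can itself contain a semicolon. We are trying to parse this input, but have not been able to handle it with AWK or Bash.
Desired output: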
1|name1=value1
1|name3;name4=value2
1|name5=value5
2|name1=value1
2|name2=value2
2|name6=
2|name7=value7
2|name8=value8
The snippets below work for ID = 2 but not for ID = 1:
echo "2|name1=value1;name2=value2;name6=;name7=value7;name8=value8" | while IFS="|"; read id text;do dsc=`echo $text|tr ';' '\n'`;echo "$dsc" >tmp;sed -i "s/^/${id}\|/g" tmp;done
cat tmp
2|name1=value1
2|name2=value2
2|name6=
2|name7=value7
2|name8=value8
echo "1|name1=value1;name3;name4=value2;name5=value5" | while IFS="|"; read id text;do dsc=`echo $text|tr ';' '\n'`;echo "$dsc" >tmp;sed -i "s/^/${id}\|/g" tmp;done
cat tmp
1|name1=value1
1|name3
1|name4=value2
1|name5=value5
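Any help is much appreciated.
Answers
Could you please try the following, written and tested with the shown samples in a recent version of GNU awk. Since the OP's awk version is old, anyone with an older awk should try changing the command to awk --re-interval so that the {1,} interval expression is recognized.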
awk '
BEGIN{
  FS=OFS="|"
}
FNR==1{ next }
{
  first=$1
  while(match($0,/(name[0-9]+;?){1,}=(value[0-9]+)?/)){
    print first,substr($0,RSTART,RLENGTH)
    $0=substr($0,RSTART+RLENGTH)
  }
}' Input_file
The output is as follows.
1|name1=value1
1|name3;name4=value2
1|name5=value5
2|name1=value1
2|name2=value2
2|name6=
2|name7=value7
2|name8=value8
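As a sketch for awks that do not recognize interval expressions: {1,} means "one or more", which is also what + means, so the same loop can be written without --re-interval. The function name parse_pairs and the variables id and rest are illustrative names, not part of the original answer, and this variant matches against the second field instead of rewriting $0.

```shell
#!/bin/sh
# Hypothetical helper: same matching loop as the answer above, with the
# interval expression {1,} written as + so that no --re-interval is needed.
parse_pairs() {
    awk '
    BEGIN { FS = OFS = "|" }
    FNR == 1 { next }                  # skip the header line
    {
        id = $1
        rest = $2
        # one or more names (optionally followed by ";"), then "=",
        # then an optional value
        while (match(rest, /(name[0-9]+;?)+=(value[0-9]+)?/)) {
            print id, substr(rest, RSTART, RLENGTH)
            rest = substr(rest, RSTART + RLENGTH)
        }
    }' "$@"
}

# Demo on the first sample row; with no file argument awk reads stdin.
printf '%s\n' 'ID|Text' '1|name1=value1;name3;name4=value2;name5=value5' | parse_pairs
```

Calling parse_pairs Input_file instead reads from the file, as in the original command.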
Explanation: a line-by-line explanation of the above (the following is for explanation purposes only).
awk '                                   ##Start the awk program.
BEGIN{                                  ##Start the BEGIN section.
  FS=OFS="|"                            ##Set FS and OFS to "|".
}
FNR==1{ next }                          ##Skip the first (header) line without printing anything.
{
  first=$1                              ##Save the first field (the ID) in the variable first.
  while(match($0,/(name[0-9]+;?){1,}=(value[0-9]+)?/)){
                                        ##Loop while match() finds one or more names, an "=" and an optional value.
    print first,substr($0,RSTART,RLENGTH)   ##Print first and the currently matched substring.
    $0=substr($0,RSTART+RLENGTH)        ##Keep only the rest of the line after the match.
  }
}' Input_file                           ##Name of the input file.
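To make the role of RSTART and RLENGTH more concrete, the loop can be instrumented to print both values on each iteration. This is only a tracing sketch: trace_match and line are hypothetical names, the regex uses + in place of {1,} (the two are equivalent) so that it also runs on awks without interval expressions, and the record is copied into a variable instead of overwriting $0.

```shell
#!/bin/sh
# Hypothetical tracing helper: prints "RSTART RLENGTH matched-text" for every
# match found while the remaining text is consumed from left to right.
trace_match() {
    awk '
    FNR == 1 { next }                  # skip the header line
    {
        line = $0                      # whole record, including the "ID|" prefix
        while (match(line, /(name[0-9]+;?)+=(value[0-9]+)?/)) {
            print RSTART, RLENGTH, substr(line, RSTART, RLENGTH)
            line = substr(line, RSTART + RLENGTH)   # drop everything up to the end of the match
        }
    }' "$@"
}

printf '%s\n' 'ID|Text' '1|name1=value1;name3;name4=value2;name5=value5' | trace_match
```

For the first sample row this prints 3 12 name1=value1, then 2 18 name3;name4=value2, then 2 12 name5=value5. The first match starts at position 3 because of the leading "1|", and the later ones start at position 2 because each remainder begins with a ";".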
Sample data:
$ cat name.dat
ID|Text
1|name1=value1;name3;name4=value2;name5=value5
2|name1=value1;name2=value2;name6=;name7=value7;name8=value8
One awk solution:
awk -F"[|;]" '                 # use "|" and ";" as input field delimiters
FNR==1 { next }                # skip header line
{
    pfx=$1 "|"                 # set output prefix to field 1 + "|"
    printpfx=1                 # set flag to print prefix
    for ( i=2 ; i<=NF ; i++ )  # for fields 2 to NF
    {
        if ( printpfx ) { printf "%s",pfx ; printpfx=0 }       # if the print flag is set, print the prefix and clear the flag
        if ( $(i) ~ /=/ ) { printf "%s\n",$(i) ; printpfx=1 }  # if the current field contains "=", print it, end this output line and set the print flag again
        if ( $(i) !~ /=/ ) { printf "%s;",$(i) }               # if the current field does not contain "=", print it followed by ";"
    }
}
' name.dat
The above generates:
1|name1=value1
1|name3;name4=value2
1|name5=value5
2|name1=value1
2|name2=value2
2|name6=
2|name7=value7
2|name8=value8
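The same idea can also be sketched with a buffer instead of a print flag: fields are collected until one containing "=" arrives, and the buffered text is then printed as one row. The names join_pairs and pending are illustrative only. Flushing a trailing field that has no "=" on a row of its own is an assumption about the wanted behaviour, since the sample data contains no such case.

```shell
#!/bin/sh
# Hypothetical helper: buffers fields that have no "=" and emits them
# together with the next field that does.
join_pairs() {
    awk -F'[|;]' '
    FNR == 1 { next }                  # skip the header line
    {
        pending = ""
        for (i = 2; i <= NF; i++) {
            # append the field to the buffer, re-inserting the ";" separator
            pending = (pending == "" ? $i : pending ";" $i)
            if ($i ~ /=/) {            # a pair is complete: print it and reset
                print $1 "|" pending
                pending = ""
            }
        }
        # assumption: leftover names without "=" are printed as their own row
        if (pending != "") print $1 "|" pending
    }' "$@"
}

printf '%s\n' 'ID|Text' '1|name1=value1;name3;name4=value2;name5=value5' | join_pairs
```

Unlike the printf-based loop, every output row here ends with a newline even when the last field has no "=".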
A Bash solution:
#!/usr/bin/env bash
while IFS=\| read -r id text || [ -n "$id" ]; do
IFS=\; read -r -a kv_arr < <(printf %s "$text")
printf "$id|%s\\n" "${kv_arr[@]}"
done < <(tail -n +2 a.txt)
A plain POSIX shell solution:
#!/usr/bin/env sh
# Chop the header line from the input file
tail -n +2 a.txt |
# While reading id and text Fields Separated by vertical bar
while IFS=\| read -r id text || [ -n "$id" ]; do
# Sets the separator to a semicolon
IFS=\;
# Print each semicolon separated field formatted on
# its own line with the ID
# shellcheck disable=SC2086 # Explicit split on semicolon
printf "$id|%s\\n" $text
done
Input a.txt:
ID|Text
1|name1=value1;name3;name4=value2;name5=value5
2|name1=value1;name2=value2;name6=;name7=value7;name8=value8
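Output (every semicolon starts a new row, so name3 and name4=value2 are split, unlike in the desired output):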
1|name1=value1
1|name3
1|name4=value2
1|name5=value5
2|name1=value1
2|name2=value2
2|name6=
2|name7=value7
2|name8=value8
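Since the shell loops above split on every semicolon, name3 is separated from name4=value2. One possible way to get the desired output while staying in POSIX shell is to hold back segments that contain no "=" and glue them to the next segment. The following is a sketch, not the answer author's code; split_rows, seg and pending are hypothetical names, and printing a trailing segment without "=" on its own row is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: reads "ID|Text" lines (header included) on stdin.
split_rows() {
    tail -n +2 |                       # chop the header line
    while IFS='|' read -r id text || [ -n "$id" ]; do
        pending=''
        old_ifs=$IFS
        IFS=';'                        # split the text on semicolons
        set -f                         # no pathname expansion while splitting
        for seg in $text; do
            # glue the segment onto any held-back segments
            pending=${pending:+$pending;}$seg
            case $seg in
                *=*)                   # pair complete: print one row
                    printf '%s|%s\n' "$id" "$pending"
                    pending=''
                    ;;
            esac
        done
        set +f
        IFS=$old_ifs
        # assumption: leftover names without "=" get a row of their own
        if [ -n "$pending" ]; then
            printf '%s|%s\n' "$id" "$pending"
        fi
    done
}

printf '%s\n' 'ID|Text' '1|name1=value1;name3;name4=value2;name5=value5' | split_rows
```

Passing the ID as a printf argument rather than embedding it in the format string also avoids surprises if an ID ever contains a % character. To run it on the sample file, use split_rows < a.txt.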
You already have some good answers, and one has been accepted. Here is a much shorter GNU awk command that also does the job:
awk -F '|' 'NR > 1 {
for (s=$2; match(s,/([^=]+=[^;]*)(;|$)/,m); s=substr(s,RLENGTH+1))
print $1 FS m[1]
}' file.txt
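The three-argument form of match() used above is a GNU awk extension. As a sketch for other awks, the same loop can rely on RSTART and RLENGTH alone; short_parse is a hypothetical name, and the regex drops the trailing (;|$) group because the ";" is skipped explicitly when the remaining text is computed.

```shell
#!/bin/sh
# Hypothetical helper: portable variant of the short gawk command, using
# only the two-argument match() together with RSTART and RLENGTH.
short_parse() {
    awk -F'|' 'NR > 1 {
        s = $2
        # everything up to the first "=", the "=", then the value up to ";"
        while (match(s, /[^=]+=[^;]*/)) {
            print $1 FS substr(s, RSTART, RLENGTH)
            s = substr(s, RSTART + RLENGTH + 1)   # +1 skips the ";" after the pair
        }
    }' "$@"
}

printf '%s\n' 'ID|Text' '2|name1=value1;name2=value2;name6=;name7=value7;name8=value8' | short_parse
```

Like the gawk version, this treats every character before the first "=" as part of the name, which is what keeps name3;name4 together.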