How do I convert bytes objects to strings that end at the '\x00' terminator in Python?
Consider this pandas DataFrame:
In [19]: a
Out[19]:
tk sec usec bp1 bp2 bp3 bp4 bp5 ap1 ap2 ... as1 as2 as3 as4 as5 lp amt ls vol oi
0 b'ZN2106' 1619743523 646104 21920.0 21915.0 21910.0 21905.0 21900.0 21930.0 21935.0 ... 11 5 3 8 3 21930.0 1.642792e+10 0 149210 96841
1 b'ZN2106\x0010250\x0009' 1619744401 684254 21935.0 21930.0 21925.0 21920.0 21910.0 21940.0 21945.0 ... 1 8 3 3 17 21940.0 1.642990e+10 0 149228 96843
2 b'ZN2106\x0016750\x009' 1619744402 319044 21940.0 21935.0 21930.0 21925.0 21920.0 21945.0 21950.0 ... 1 1 6 1 13 21940.0 1.643615e+10 0 149285 96829
3 b'ZN2106\x0014750\x0009' 1619744403 422966 21945.0 21940.0 21935.0 21930.0 21925.0 21950.0 21955.0 ... 7 5 11 4 15 21940.0 1.644120e+10 0 149331 96838
4 b'ZN2106\x0012750\x002' 1619744403 883381 21945.0 21940.0 21935.0 21930.0 21925.0 21955.0 21960.0 ... 3 7 6 16 59 21950.0 1.644647e+10 0 149379 96846
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... .. ... ...
20343 b'ZN2106\x0067000\x009' 1619765999 791039 21795.0 21790.0 21785.0 21780.0 21775.0 21800.0 21805.0 ... 95 12 2 11 14 21795.0 2.768403e+10 0 252355 85339
20344 b'ZN2106\x0061000\x00\x000' 1619766000 302063 21795.0 21790.0 21785.0 21780.0 21775.0 21800.0 21805.0 ... 93 13 2 11 14 21800.0 2.768424e+10 0 252357 85339
20345 b'ZN2106\x0013750\x0010' 1619766000 781186 21795.0 21790.0 21785.0 21780.0 21775.0 21800.0 21805.0 ... 93 13 2 11 14 21795.0 2.768435e+10 0 252358 85338
20346 b'ZN2106\x0019000\x009' 1619766001 317317 21795.0 21790.0 21785.0 21780.0 21775.0 21800.0 21805.0 ... 92 13 2 11 14 21795.0 2.768490e+10 0 252363 85338
20347 b'ZN2106\x0019000' 1619766002 518211 21795.0 21790.0 21785.0 21780.0 21775.0 21800.0 21805.0 ... 92 13 2 11 14 21795.0 2.768490e+10 0 252363 85338
[20348 rows x 28 columns]
The tk column holds bytes objects, and I want to convert them to strings.
I tried:
a['tk'].str.decode('utf-8')
But I got:
In [17]: a['tk'].str.decode('utf-8')
Out[17]:
0 ZN2106
1 ZN21061025009
2 ZN2106167509
3 ZN21061475009
4 ZN2106127502
...
20343 ZN2106670009
20344 ZN2106610000
20345 ZN21061375010
20346 ZN2106190009
20347 ZN210619000
Name: tk, Length: 20348, dtype: object
This is not what I want. As you can see in the second row, I want ZN2106, but it returns 'ZN21061025009'.
The decode does not stop at the '\x00' terminator. How can I fix this?
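A quick check may help explain the output above. This is a minimal sketch that assumes the same bytes as in row 1; the variable names raw and s are only for illustration. It suggests that decoding keeps the NUL characters in the string, and that they are simply not visible when the Series is printed:

```python
raw = b'ZN2106\x0010250\x0009'   # same bytes as row 1 above
s = raw.decode('utf-8')          # decoding does not stop at the first NUL

print(repr(s))        # repr makes the embedded '\x00' characters visible
print(len(s))         # the length counts the NUL characters as well
print('\x00' in s)    # True if the NULs survived decoding
```

So the decoded value is not really 'ZN21061025009'; it still contains the separators, which is why splitting on '\x00' can recover the first field.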
Solution
Try apply:
import pandas as pd
a = pd.DataFrame([b'ZN2106',b'ZN2106\x0010250\x0009'],columns=['tk'])
print(a)
print()
print(a['tk'].apply(lambda x: x.decode('utf8').split('\x00')[0]))
Output:
tk
0 b'ZN2106'
1 b'ZN2106\x0010250\x0009'
0 ZN2106
1 ZN2106
Name: tk, dtype: object
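Two variations on the answer above, as a sketch rather than a definitive fix. The helper name cstr is hypothetical. It cuts the bytes at the first NUL before decoding, which may be safer if the bytes after the terminator are leftover buffer contents that are not valid UTF-8. The second variant stays within pandas string methods by decoding first and then splitting on '\x00':

```python
import pandas as pd

a = pd.DataFrame([b'ZN2106', b'ZN2106\x0010250\x0009'], columns=['tk'])

def cstr(raw: bytes) -> str:
    """Hypothetical helper: keep the bytes before the first NUL, then decode."""
    return raw.split(b'\x00', 1)[0].decode('utf-8')

# Variant 1: truncate each bytes object first, then decode it.
by_apply = a['tk'].apply(cstr)

# Variant 2: decode the whole column, then split on NUL and keep the first piece.
by_str = a['tk'].str.decode('utf-8').str.split('\x00').str[0]

print(by_apply.tolist())
print(by_str.tolist())
```

To keep the result, assign it back, for example a['tk'] = by_apply.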