Why do I need a temporary file with pg_dump?

I kept getting errors from pgdumplib and narrowed the problem down to how the pg_dump output is redirected.

This is what I want to do, but it always fails with RuntimeError: Unsupported data format:

F=/tmp/test_Fc_format.dump
ssh codimd "sudo -u codimd bash -c 'cd /; pg_dump -d codimd -Fc'" >$F
python -c "import pgdumplib; dump = pgdumplib.load('$F')"

I can work around the problem by redirecting to a file on the remote machine first. This sequence always works:

F=/tmp/test_Fc_format.dump
ssh codimd "sudo -u codimd bash -c 'cd /; pg_dump -d codimd -Fc >/tmp/934354 && cat /tmp/934354'" >$F
python -c "import pgdumplib; dump = pgdumplib.load('$F')"

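Note that the only difference is that the second sequence adds >/tmp/934354 && cat /tmp/934354, i.e. it first redirects the pg_dump output to a file on the remote machine and then sends that file to stdout. In both cases the resulting files are about the same size (not identical, because the database is live).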

Both the local and the remote machine run Ubuntu 20.04.

Why is this extra step needed, and is there a better way to solve the problem?

Update 1: This also produces the error:

F=/tmp/test_Fc_format.dump
ssh codimd "sudo -u codimd bash -c 'cd /; pg_dump -d codimd -Fc |tee /tmp/934354 >/dev/null && cat /tmp/934354'" >$F
python -c "import pgdumplib; dump = pgdumplib.load('$F')"

In other words, to work properly, pg_dump seems to need its output redirected straight to a file, as with the -f option.

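Update 2: After running hd on each dump, here is the complete list of differences between the good and the bad version (note that pg_dump does not produce identical output twice):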

2c2
< 00000010  00 11 00 00 00 00 19 00  00 00 00 16 00 00 00 00  |................|
---
> 00000010  00 2e 00 00 00 00 19 00  00 00 00 16 00 00 00 00  |................|
349c349
< 000015c0  31 38 01 01 00 00 00 01  00 00 00 00 00 00 00 00  |18..............|
---
> 000015c0  31 38 01 01 00 00 00 02  b0 30 00 00 00 00 00 00  |18.......0......|
370c370
< 00001710  31 34 01 01 00 00 00 01  00 00 00 00 00 00 00 00  |14..............|
---
> 00001710  31 34 01 01 00 00 00 02  1e 5f 00 00 00 00 00 00  |14......._......|
387c387
< 00001820  00 00 01 00 00 00 00 00  00 00 00 00 b7 0b 00 00  |................|
---
> 00001820  00 00 02 21 d9 60 00 00  00 00 00 00 b7 0b 00 00  |...!.`..........|
399c399
< 000018e0  01 00 00 00 00 00 00 00  00 00 bf 0b 00 00 00 01  |................|
---
> 000018e0  02 91 0a 4c 01 00 00 00  00 00 bf 0b 00 00 00 01  |...L............|
412,413c412,413
< 000019b0  03 00 00 00 32 32 30 01  01 00 00 00 01 00 00 00  |....220.........|
< 000019c0  00 00 00 00 00 00 ba 0b  00 00 00 01 00 00 00 00  |................|
---
> 000019b0  03 00 00 00 32 32 30 01  01 00 00 00 02 ce 0b 4c  |....220........L|
> 000019c0  01 00 00 00 00 00 ba 0b  00 00 00 01 00 00 00 00  |................|
425c425
< 00001a80  35 01 01 00 00 00 01 00  00 00 00 00 00 00 00 00  |5...............|
---
> 00001a80  35 01 01 00 00 00 02 0b  ee 4c 01 00 00 00 00 00  |5........L......|
438c438
< 00001b50  00 00 01 00 00 00 00 00  00 00 00 00 b8 0b 00 00  |................|
---
> 00001b50  00 00 02 28 ee 4c 01 00  00 00 00 00 b8 0b 00 00  |...(.L..........|
456c456
< 00001c70  01 00 00 00 01 00 00 00  00 00 00 00 00 00 c7 0b  |................|
---
> 00001c70  01 00 00 00 02 45 ee 4c  01 00 00 00 00 00 c7 0b  |.....E.L........|
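One way to read the recurring pattern in this diff: wherever the two dumps differ in the body, the `<` side has a `01` byte followed by eight zero bytes, while the `>` side has `02` followed by eight bytes that look like a number. The sketch below decodes that 9-byte field under the assumption that it is a one-byte flag followed by an 8-byte little-endian file offset, which is how pg_dump's custom format is understood to store data offsets. The flag meanings (1 = position not set, 2 = position set) are an assumption taken from a reading of pg_dump's source, and `decode_offset_field` is a hypothetical helper, not part of pgdumplib.

```python
# Assumed flag values for the custom-format offset field (not confirmed
# by the original post; based on a reading of pg_dump's source).
FLAG_NAMES = {1: "position not set", 2: "position set"}


def decode_offset_field(field: bytes):
    """Split a 9-byte field into (flag, offset).

    Assumes: 1 flag byte, then an 8-byte little-endian offset.
    """
    if len(field) != 9:
        raise ValueError("expected exactly 9 bytes")
    flag = field[0]
    offset = int.from_bytes(field[1:9], "little")
    return flag, offset


# Bytes copied from the diff line at 000015c0, starting at the flag byte.
left_side = bytes.fromhex("01 00 00 00 00 00 00 00 00")   # "<" side
right_side = bytes.fromhex("02 b0 30 00 00 00 00 00 00")  # ">" side

for label, field in (("<", left_side), (">", right_side)):
    flag, offset = decode_offset_field(field)
    print(label, FLAG_NAMES.get(flag, "unknown"), offset)
```

If that reading is right, the `>` side records where each table's data starts in the file and the `<` side does not, which would be consistent with the `>` side having been written directly to a regular file.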

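Update 3: It turns out this has nothing to do with ssh. I think pg_dump needs a seekable file as its output. Below I demonstrate that inserting |cat before the redirection to the output file results in a corrupted file. If that is true, is it a bug in pg_dump?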

$ pg_dump -d codimd -Fc >/tmp/good
$ python3 -c "import pgdumplib; dump = pgdumplib.load('/tmp/good')"
$ # no error
$ pg_dump -d codimd -Fc |cat >/tmp/bad
$ python3 -c "import pgdumplib; dump = pgdumplib.load('/tmp/bad')"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/local/lib/python3.8/dist-packages/pgdumplib/__init__.py", line 24, in load
    return dump.Dump(converter=converter).load(filepath)
  File "/usr/local/lib/python3.8/dist-packages/pgdumplib/dump.py", line 254, in load
    raise RuntimeError('Unsupported data format')
RuntimeError: Unsupported data format
$ 
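The hypothesis in Update 3 hinges on whether the process's standard output is seekable: `>/tmp/good` hands pg_dump a regular file, while `|cat` hands it a pipe. The following is a minimal sketch of that distinction on a POSIX system, not a statement about what pg_dump does internally. `is_seekable` is a hypothetical helper introduced only for illustration.

```python
import os
import tempfile


def is_seekable(fd: int) -> bool:
    """Return True if lseek() succeeds on the file descriptor."""
    try:
        os.lseek(fd, 0, os.SEEK_CUR)  # query position without moving it
        return True
    except OSError:
        # On POSIX, pipes and sockets fail here with ESPIPE.
        return False


# A regular file, as a program would see with "> file" redirection.
with tempfile.TemporaryFile() as regular_file:
    file_seekable = is_seekable(regular_file.fileno())

# The write end of a pipe, as a program would see with "| cat".
read_end, write_end = os.pipe()
try:
    pipe_seekable = is_seekable(write_end)
finally:
    os.close(read_end)
    os.close(write_end)

print("regular file seekable:", file_seekable)
print("pipe seekable:", pipe_seekable)
```

A program that wants to go back and patch earlier parts of its output can only do so in the first case, which is why the same command can legitimately write different bytes depending on where its stdout points.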

Solution

Some testing on my end:

F=test_dmp.out

pg_dump -d test -U postgres -Fc > $F | ls -al test_dmp.out 
-rw-r--r-- 1 aklaver users 0 Sep 30 14:06 test_dmp.out

pg_dump -d test -U postgres -Fc > temp_file.out && cat temp_file.out > $F | ls -al test_dmp.out 
-rw-r--r-- 1 aklaver users 97488 Sep 30 14:08 test_dmp.out

pg_dump -d test -U postgres -Fc -f $F | ls -al test_dmp.out 
-rw-r--r-- 1 aklaver users 97488 Sep 30 14:08 test_dmp.out

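I know this does not answer why > fails, but it does offer an alternative. For some reason, > left an empty file; the other options did not. Note that | starts ls at the same time as pg_dump, so the 0-byte listing may only mean that ls ran before any data had been written.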
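The -f alternative can also be driven from Python, so that the dump is always written to a named file before anything tries to load it. This is only a sketch: `build_pg_dump_argv` and `dump_to_file` are hypothetical helpers, and the database and file names are the ones used in the test above. It uses only the pg_dump options that already appear in this post (-d, -Fc, -f).

```python
import subprocess


def build_pg_dump_argv(dbname: str, outfile: str) -> list:
    """Build a pg_dump command line that writes a custom-format dump
    directly to outfile via -f rather than through stdout."""
    return ["pg_dump", "-d", dbname, "-Fc", "-f", outfile]


def dump_to_file(dbname: str, outfile: str) -> None:
    """Run pg_dump and wait for it to finish.

    check=True raises CalledProcessError if pg_dump exits non-zero,
    so a failed dump is not silently passed on to the loader.
    """
    subprocess.run(build_pg_dump_argv(dbname, outfile), check=True)


# Show the command without running it.
print(" ".join(build_pg_dump_argv("test", "test_dmp.out")))
```

Once `dump_to_file("test", "test_dmp.out")` has returned, the file is complete and can be passed to `pgdumplib.load` as in the question.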

Update

It seems to be related to the SSH transfer:

F=test_dump.out
#Using local only.
pg_dump -d test -U postgres -Fc > $F
python -c "import pgdumplib; dump = pgdumplib.load('$F'); print('Database: {}'.format(dump.dbname))"
Database: test

#Using SSH,different database
ssh arkansas "sudo -u aklaver bash -c 'cd /; pg_dump -d redmine -U postgres -Fc'" >$F

python -c "import pgdumplib; dump = pgdumplib.load('$F'); print('Database: {}'.format(dump.dbname))"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/aklaver/py_virt/py37/lib/python3.7/site-packages/pgdumplib/__init__.py", line 24, in load
    return dump.Dump(converter=converter).load(filepath)
  File "/home/aklaver/py_virt/py37/lib/python3.7/site-packages/pgdumplib/dump.py", line 254, in load
    raise RuntimeError('Unsupported data format')
RuntimeError: Unsupported data format
#Though the file itself ends up being ok. The below does not error out.
#There seems to be something asynchronous going on. In other words 
#pgdumplib is reading the file before it is complete.
pg_restore  -f test.sql test_dump.out
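Because pg_restore accepts the file, it can help to separate "this is not a custom-format archive at all" from "pgdumplib does not accept this particular archive". The sketch below checks only the leading magic bytes; custom-format archives begin with the ASCII bytes PGDMP. `looks_like_custom_dump` is a hypothetical helper, and passing this check says nothing about whether the rest of the file is complete.

```python
import os
import tempfile

MAGIC = b"PGDMP"  # leading bytes of a pg_dump custom-format archive


def looks_like_custom_dump(path: str) -> bool:
    """Return True if the file starts with the custom-format magic."""
    with open(path, "rb") as fh:
        return fh.read(len(MAGIC)) == MAGIC


# Demonstrate on two throwaway files rather than a real dump.
with tempfile.TemporaryDirectory() as tmp:
    archive_like = os.path.join(tmp, "archive_like.out")
    plain_sql = os.path.join(tmp, "plain.sql")
    with open(archive_like, "wb") as fh:
        fh.write(MAGIC + b"\x01\x0e\x00")  # magic plus arbitrary bytes
    with open(plain_sql, "wb") as fh:
        fh.write(b"-- PostgreSQL database dump\n")

    archive_result = looks_like_custom_dump(archive_like)
    sql_result = looks_like_custom_dump(plain_sql)

print(archive_result, sql_result)
```

If a file passes this check and pg_restore reads it but pgdumplib still raises "Unsupported data format", the difference is more likely in how the two tools interpret the archive than in a truncated transfer.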
