How to stop Postgres in Docker from being OOM-killed by the host
I have a VM with 8 GB of memory running two Docker containers:
The data loads completely, but during the import I get stuck on a CREATE INDEX statement that adds a simple hash(ID) index to a table with roughly 800 million records.
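Concretely, the statement is of this form (the table and column names here are placeholders, not my actual schema):

CREATE INDEX ON big_table USING hash (id);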
The problem during index creation is that the postgres process keeps growing until the host kernel OOM-kills it.
The Docker container is configured, via the (Nomad) configuration parameters, to use about 6 GB (but I have also tried lower values):
config {
  memory_hard_limit = 6144
  ...
}

resources {
  memory = 6144
}
In addition, the host has vm.overcommit_memory = 1 (previously the default of 0) and vm.overcommit_ratio = 100 (previously the default of 50).
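For completeness, those kernel settings can be checked and applied with standard sysctl commands, e.g.:

# show the current overcommit policy and ratio
sysctl vm.overcommit_memory vm.overcommit_ratio
# apply the values described above (run as root; not persisted across reboots)
sysctl -w vm.overcommit_memory=1
sysctl -w vm.overcommit_ratio=100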
The memory limit is applied correctly, as the container's cgroup shows:
# more /sys/fs/cgroup/memory/memory.limit_in_bytes
6442450944
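It can also be cross-checked through the Docker CLI, for example (the container name is a placeholder):

# memory limit Docker applied, in bytes
docker inspect --format '{{.HostConfig.Memory}}' <container>
# live usage against the limit
docker stats --no-stream <container>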
But when Postgres runs the CREATE INDEX, it eventually gets OOM-killed:
Oct 1 15:17:46 prod kernel: [11515.764300] postgres invoked oom-killer: gfp_mask=0x14000c0(GFP_KERNEL), nodemask=(null), order=0, oom_score_adj=0
Oct 1 15:17:46 prod kernel: [11515.764302] postgres cpuset=d562f548ef35884fb77a41e53466a18205d63cce3d9ffa8a808d2e95b5609e3e mems_allowed=0
Oct 1 15:17:46 prod kernel: [11515.764308] CPU: 1 PID: 18485 Comm: postgres Not tainted 4.15.0-118-generic #119-Ubuntu
Oct 1 15:17:46 prod kernel: [11515.764309] Hardware name: OpenStack Foundation OpenStack Nova, BIOS 1.10.2-1ubuntu1 04/01/2014
Oct 1 15:17:46 prod kernel: [11515.764309] Call Trace:
Oct 1 15:17:46 prod kernel: [11515.764331] dump_stack+0x6d/0x8e
Oct 1 15:17:46 prod kernel: [11515.764335] dump_header+0x71/0x285
Oct 1 15:17:46 prod kernel: [11515.764337] oom_kill_process+0x21f/0x420
Oct 1 15:17:46 prod kernel: [11515.764339] out_of_memory+0x116/0x4e0
Oct 1 15:17:46 prod kernel: [11515.764343] mem_cgroup_out_of_memory+0xbb/0xd0
Oct 1 15:17:46 prod kernel: [11515.764345] mem_cgroup_oom_synchronize+0x2e8/0x320
Oct 1 15:17:46 prod kernel: [11515.764347] ? mem_cgroup_css_reset+0xe0/0xe0
Oct 1 15:17:46 prod kernel: [11515.764349] pagefault_out_of_memory+0x36/0x7b
Oct 1 15:17:46 prod kernel: [11515.764354] mm_fault_error+0x90/0x180
Oct 1 15:17:46 prod kernel: [11515.764355] __do_page_fault+0x46b/0x4b0
Oct 1 15:17:46 prod kernel: [11515.764357] do_page_fault+0x2e/0xe0
Oct 1 15:17:46 prod kernel: [11515.764363] ? async_page_fault+0x2f/0x50
Oct 1 15:17:46 prod kernel: [11515.764366] do_async_page_fault+0x51/0x80
Oct 1 15:17:46 prod kernel: [11515.764367] async_page_fault+0x45/0x50
Oct 1 15:17:46 prod kernel: [11515.764369] RIP: 0033:0x562aff7df26d
Oct 1 15:17:46 prod kernel: [11515.764370] RSP: 002b:00007ffef148b790 EFLAGS: 00010206
[...]
Oct 1 15:17:46 prod kernel: [11515.764374] R13: 0000000000000195 R14: 0000000000000000 R15: 0000000000000000
Oct 1 15:17:46 prod kernel: [11515.764376] Task in /docker/d562f548ef35884fb77a41e53466a18205d63cce3d9ffa8a808d2e95b5609e3e killed as a result of limit of /docker/d562f548ef35884fb77a41e53466a18205d63cce3d9ffa8a808d2e95b5609e3e
Oct 1 15:17:46 prod kernel: [11515.764382] memory: usage 6291456kB, limit 6291456kB, failcnt 6210203
Oct 1 15:17:46 prod kernel: [11515.764383] memory+swap: usage 0kB, limit 9007199254740988kB, failcnt 0
Oct 1 15:17:46 prod kernel: [11515.764384] kmem: usage 29264kB, failcnt 0
Oct 1 15:17:46 prod kernel: [11515.764384] Memory cgroup stats for /docker/d562f548ef35884fb77a41e53466a18205d63cce3d9ffa8a808d2e95b5609e3e: cache:1628316KB rss:4633876KB rss_huge:0KB shmem:1628260KB mapped_file:1628272KB dirty:0KB writeback:0KB inactive_anon:1617900KB active_anon:4644212KB inactive_file:0KB active_file:0KB unevictable:0KB
Oct 1 15:17:46 prod kernel: [11515.764391] [ pid ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
Oct 1 15:17:46 prod kernel: [11515.764512] [18146] 1001 18146 424925 16612 262144 0 0 postgres
Oct 1 15:17:46 prod kernel: [11515.764525] [18341] 1001 18341 425526 24841 954368 0 0 postgres
Oct 1 15:17:46 prod kernel: [11515.764527] [18342] 1001 18342 424958 388655 3293184 0 0 postgres
Oct 1 15:17:46 prod kernel: [11515.764529] [18343] 1001 18343 424958 5540 180224 0 0 postgres
Oct 1 15:17:46 prod kernel: [11515.764530] [18344] 1001 18344 425071 2106 180224 0 0 postgres
Oct 1 15:17:46 prod kernel: [11515.764531] [18345] 1001 18345 16582 1156 135168 0 0 postgres
Oct 1 15:17:46 prod kernel: [11515.764533] [18346] 1001 18346 425062 1743 159744 0 0 postgres
Oct 1 15:17:46 prod kernel: [11515.764537] [18418] 1001 18418 425631 5605 196608 0 0 postgres
Oct 1 15:17:46 prod kernel: [11515.764540] [18483] 1001 18483 428502 7794 225280 0 0 postgres
Oct 1 15:17:46 prod kernel: [11515.764541] [18485] 1001 18485 1578423 1561820 12652544 0 0 postgres
Oct 1 15:17:46 prod kernel: [11515.764545] Memory cgroup out of memory: Kill process 18485 (postgres) score 994 or sacrifice child
Oct 1 15:17:46 prod kernel: [11515.767788] Killed process 18485 (postgres) total-vm:6313692kB, anon-rss:4614856kB, file-rss:11372kB, shmem-rss:1621052kB
Oct 1 15:17:47 prod kernel: [11516.011553] oom_reaper: reaped process 18485 (postgres), now anon-rss:0kB, file-rss:0kB, shmem-rss:1621052kB
I have already tried various combinations of memory sizes on the host, in the Docker configuration, and in the Postgres memory settings, but every time memory usage keeps climbing until the oom-killer steps in.
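For reference, the Postgres memory settings mentioned above are the usual postgresql.conf parameters; the values below are illustrative for a ~6 GB container, not my exact configuration:

shared_buffers = 1536MB          # shared memory for the buffer cache
work_mem = 64MB                  # per-sort/per-hash memory for queries
maintenance_work_mem = 1GB       # used by CREATE INDEX and VACUUM
effective_cache_size = 4GB       # planner hint only, not an allocation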
Why does Postgres not seem to play nicely and stay within the memory it is, as far as I can tell, allowed to allocate?