linux – 'BUG: unable to handle kernel NULL pointer dereference' on Google Compute Engine

On a semi-regular basis, I see GCE instances freeze with the following error message (from the serial console):
g[1375589.784755] BUG: unable to handle kernel NULL pointer dereference at 0000000000000078
g[1375589.786206] IP: [<ffffffff810a67d9>] check_preempt_wakeup+0xd9/0x1d0
g[1375589.787341] PGD 5da04067 PUD db83067 PMD 0 
g[1375589.788607] Oops: 0000 [#1] SMP 
g[1375589.788705] Modules linked in: veth xt_addrtype xt_conntrack iptable_filter ipt_MASQUERADE iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 ip_tables x_tables nf_nat nf_conntrack bridge stp llc aufs(C) softdog crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel aes_x86_64 processor psmouse parport_pc parport i2c_piix4 i2c_core thermal_sys lrw virtio_net evdev pcspkr serio_raw gf128mul glue_helper ablk_helper cryptd button ext4 crc16 mbcache jbd2 sd_mod crc_t10dif crct10dif_common virtio_scsi scsi_mod virtio_pci virtio virtio_ring
g[1375589.788705] CPU: 1 PID: 1515 Comm: docker Tainted: G         C    3.16.0-0.bpo.4-amd64 #1 Debian 3.16.7-ckt9-3~deb8u1~bpo70+1
g[1375589.788705] Hardware name: Google Google, BIOS Google 01/01/2011
g[1375589.788705] task: ffff88006fffc110 ti: ffff880003ac4000 task.ti: ffff880003ac4000
g[1375589.788705] RIP: 0010:[<ffffffff810a67d9>]  [<ffffffff810a67d9>] check_preempt_wakeup+0xd9/0x1d0
g[1375589.788705] RSP: 0018:ffff880003ac7e30  EFLAGS: 00010002
g[1375589.788705] RAX: 0000000000000001 RBX: ffff880073112ec0 RCX: 0000000000000002
g[1375589.788705] RDX: 0000000000000001 RSI: ffff880009156d20 RDI: ffff880073112f38
g[1375589.788705] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
g[1375589.788705] R10: ffffffffffffffe0 R11: 0000000000000000 R12: ffff88006d2dcd00
g[1375589.788705] R13: ffff88006fffc110 R14: 0000000000000000 R15: 0000000000000000
g[1375589.788705] FS:  000000000323a880(0063) GS:ffff880073100000(0000) knlGS:0000000000000000
g[1375589.788705] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
g[1375589.788705] CR2: 0000000000000078 CR3: 0000000034bff000 CR4: 00000000000406e0
g[1375589.788705] Stack:
g[1375589.788705]  0000000000000000 ffffffff00000000 ffff88000000006e ffff880073112ec0
g[1375589.788705]  ffff8800091573a4 0000000000000286 0000000000012ec0 ffff880073112ec0
g[1375589.788705]  0000000000000002 ffffffff8109cef4 ffff880009156d20 ffffffff810a01a4
g[1375589.788705] Call Trace:
g[1375589.788705]  [<ffffffff8109cef4>] ? check_preempt_curr+0x84/0xa0
g[1375589.788705]  [<ffffffff810a01a4>] ? wake_up_new_task+0xf4/0x1b0
g[1375589.788705]  [<ffffffff8118516d>] ? mprotect_fixup+0x15d/0x250
g[1375589.788705]  [<ffffffff8106d10f>] ? do_fork+0xcf/0x340
g[1375589.788705]  [<ffffffff8154b779>] ? stub_clone+0x69/0x90
g[1375589.788705]  [<ffffffff8154b40d>] ? system_call_fast_compare_end+0x10/0x15
g[1375589.788705] Code: 00 00 83 e8 01 4d 8b 64 24 70 39 d0 7f f4 48 8b 7d 78 49 3b 7c 24 78 74 1d 66 0f 1f 84 00 00 00 00 00 48 8b 6d 70 4d 8b 64 24 70 <48> 8b 7d 78 49 3b 7c 24 78 75 ec 48 85 ff 74 e7 e8 f2 f9 ff ff 
g[1375589.788705] RIP  [<ffffffff810a67d9>] check_preempt_wakeup+0xd9/0x1d0
g[1375589.788705]  RSP <ffff880003ac7e30>
g[1375589.788705] CR2: 0000000000000078
g[1375589.788705] ---[ end trace 5fab7713cb2d171f ]---

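The only way I have been able to recover them is to log in to the web interface and reset them manually. Needless to say, that doesn't scale.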

I have already tried setting up a watchdog device and setting kernel.panic = 10, which in theory should reboot the VM.
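One thing worth checking here, as a hedged aside: on Linux, `kernel.panic` only sets the reboot delay after a panic, and an oops is escalated to a panic only when `kernel.panic_on_oops` is non-zero. Whether that explains the freeze in this case is an assumption, not something the log confirms. The sketch below reads both values from `/proc/sys`; the helper names `read_sysctl` and `oops_triggers_reboot` are made up for illustration.

```python
from pathlib import Path


def read_sysctl(name, root="/proc/sys"):
    """Return the integer value of a sysctl such as 'kernel.panic', or None if unreadable."""
    path = Path(root, *name.split("."))
    try:
        return int(path.read_text().split()[0])
    except (OSError, ValueError, IndexError):
        return None


def oops_triggers_reboot(panic_timeout, panic_on_oops):
    """True when an oops is escalated to a panic AND the panic timeout is non-zero.

    kernel.panic == 0 means "do not reboot after a panic";
    kernel.panic_on_oops == 0 means "try to continue after an oops".
    """
    return bool(panic_on_oops) and panic_timeout not in (None, 0)


if __name__ == "__main__":
    timeout = read_sysctl("kernel.panic")
    on_oops = read_sysctl("kernel.panic_on_oops")
    print("kernel.panic =", timeout)
    print("kernel.panic_on_oops =", on_oops)
    print("oops would lead to an automatic reboot:", oops_triggers_reboot(timeout, on_oops))
```

Even with both settings enabled, a crash inside the scheduler may leave the machine unable to complete the reboot, so this is a check to rule out rather than a guaranteed fix.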

For these VMs, I am using 'container-vm' as the OS flavor (i.e., more or less Debian with Docker pre-installed).

Has anyone seen this before?

Solution

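I don't have enough reputation to comment, so I am posting this as an answer instead.
I have the same problem. I went through bug reports on the internet and found that almost every kernel trace contains the do_fork() function.
After that I found: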

http://www.serverphorums.com/read.php?12,1053418

An updated version is here:

https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/kernel/sched/core.c?id=ea86cb4b7621e1298a37197005bf0abcc86348d4

I hope it helps someone.

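I would like to see this fixed in my distribution, but I don't know how to push the distribution maintainers to include this patch in the default kernel.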
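To compare your own oops output against the do_fork() observation above, the function names in the call trace can be pulled out mechanically. This is a minimal sketch: the regular expression assumes the `[<address>] ? symbol+0xoff/0xlen` frame format seen in the log in the question, and `call_trace_functions` is a hypothetical helper name.

```python
import re

# Matches frames such as "[<ffffffff8106d10f>] ? do_fork+0xcf/0x340"
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?:\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def call_trace_functions(log_text):
    """Return the function names listed under 'Call Trace:' in an oops log."""
    functions = []
    in_trace = False
    for line in log_text.splitlines():
        if "Call Trace:" in line:
            in_trace = True
            continue
        if in_trace:
            match = FRAME_RE.search(line)
            if match is None:
                break  # first non-frame line ends the trace
            functions.append(match.group(1))
    return functions


# Excerpt copied from the log in the question
SAMPLE = """\
g[1375589.788705] Call Trace:
g[1375589.788705]  [<ffffffff8109cef4>] ? check_preempt_curr+0x84/0xa0
g[1375589.788705]  [<ffffffff810a01a4>] ? wake_up_new_task+0xf4/0x1b0
g[1375589.788705]  [<ffffffff8106d10f>] ? do_fork+0xcf/0x340
g[1375589.788705] Code: 00 00 83 e8 01
"""

if __name__ == "__main__":
    names = call_trace_functions(SAMPLE)
    print(names)
    print("do_fork in trace:", "do_fork" in names)
```

If do_fork and wake_up_new_task both show up in your trace, you are plausibly hitting the same code path as the one in the question, though that alone does not prove the linked patch applies to your kernel.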
