Troubleshooting a Node Crash in a Production Kafka Cluster

Host and process information

Host: 6 cores, 64 GB memory, 5.3 TB disk
Kafka process: 4 GB memory, about 1K partitions, 3.7 TB of message data

This morning one of the Kafka nodes was found to be down. The logs on that host showed the following errors:

Java HotSpot(TM) 64-Bit Server VM warning: Attempt to deallocate stack guard pages failed.
Java HotSpot(TM) 64-Bit Server VM warning: INFO: os::commit_memory(0x00007f04249ed000, 12288, 0) failed; error='Cannot allocate memory' (errno=12)
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (mmap) failed to map 12288 bytes for committing reserved memory.
# An error report file with more information is saved as:
# /home/appweb/hs_err_pid5566.log
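
The post does not identify a root cause for the failed mmap. On Linux, mmap can return errno=12 either because memory is exhausted or because the process has reached its limit on the number of memory mappings (vm.max_map_count), and Kafka memory-maps the index files of its log segments. The sketch below checks only the second possibility, as one thing that could be ruled out; it is not the author's diagnosis. The function names are hypothetical, and it assumes a Linux host with /proc available.

```python
from pathlib import Path


def count_mappings(maps_text: str) -> int:
    """Count memory mappings: each non-empty line of /proc/<pid>/maps is one mapping."""
    return sum(1 for line in maps_text.splitlines() if line.strip())


def mapping_usage(maps_text: str, max_map_count: int) -> float:
    """Return the fraction of the vm.max_map_count limit that is in use."""
    return count_mappings(maps_text) / max_map_count


def check_process(pid: int) -> float:
    """Read the live values from /proc (Linux only) for the given process ID."""
    maps_text = Path(f"/proc/{pid}/maps").read_text()
    limit = int(Path("/proc/sys/vm/max_map_count").read_text())
    return mapping_usage(maps_text, limit)
```

Calling check_process with the PID of the running broker returns a value between 0 and 1. A result close to 1 would suggest the mapping limit is worth a closer look, while a low result points back toward plain memory exhaustion on the host.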

I backed up the logs and restarted the broker.

The startup log then showed a continuous stream of WARN entries like this one:

server.log.2021-07-22-11:[2021-07-22 11:37:18,003] WARN Found a corrupted index file due to requirement failed: Corrupt index found, index file (/data/kafka_2/ai_jl_simple-4/00000000000000000082.index) has non-zero size but the last offset is 82 which is no larger than the base offset 82.}. deleting /data/kafka_2/ai_jl_simple-4/00000000000000000082.timeindex, /data/kafka_2/ai_jl_simple-4/00000000000000000082.index and rebuilding index... (kafka.log.Log)

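These warnings continued until the broker finally came up. Judging by the timestamp of the last such WARN entry, the whole startup took 48 minutes.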
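
The startup duration above was read off the log by hand. As a rough sketch of how the same measurement could be scripted, the code below scans log lines for the "Found a corrupted index file" warning and reports how many there were, along with the first and last timestamps. The function name rebuild_window is hypothetical, and the timestamp pattern assumes the `[YYYY-MM-DD HH:MM:SS,mmm]` layout seen in the excerpt above.

```python
import re
from datetime import datetime

# Matches the bracketed timestamp, ignoring the milliseconds part.
TIMESTAMP = re.compile(r"\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3}\]")
MARKER = "Found a corrupted index file"


def rebuild_window(lines):
    """Return (count, first, last) for index-rebuild WARN lines, or (0, None, None)."""
    stamps = []
    for line in lines:
        if MARKER not in line:
            continue
        match = TIMESTAMP.search(line)
        if match:
            stamps.append(datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S"))
    if not stamps:
        return 0, None, None
    return len(stamps), min(stamps), max(stamps)
```

Passing an open log file to rebuild_window works because the function only iterates over lines. Subtracting the first timestamp from the last gives the span covered by index rebuilding, which is a lower bound on the total startup time.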
