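My Kafka service hangs every few days. After a kill -9 and a restart it works normally again, but a few days later it hangs again.
Could anyone help figure out the cause? Is it a performance problem, a network problem, or something else?
Version: Kafka 3.0.0 with the bundled ZooKeeper. The data volume is not small: each topic retains 5 GB, which is at most a few hours of data.
Authentication uses the built-in SASL_PLAINTEXT. It is a single node on a single server.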
broker.id=0
listeners=LOCAL://xxx:9092,JTKG://xxx:9093,GDKJ://xxx:9094 # clients connect across networks, so several listeners are mapped; xxx here is the same IP in all three
advertised.listeners=LOCAL://xxx:9092,JTKG://yyy:9093,GDKJ://zzz:29092 # xxx, yyy, zzz are different IPs
listener.security.protocol.map=LOCAL:SASL_PLAINTEXT,JTKG:SASL_PLAINTEXT,GDKJ:SASL_PLAINTEXT
inter.broker.listener.name=LOCAL
sasl.enabled.mechanisms=SCRAM-SHA-256
sasl.mechanism.inter.broker.protocol=SCRAM-SHA-256
authorizer.class.name=kafka.security.authorizer.AclAuthorizer
super.users=User:admin
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=1024000
socket.receive.buffer.bytes=1024000
socket.request.max.bytes=104857600
log.dirs=/home/kafka/kafka_2.12-3.0.0/kafka-logs
num.partitions=1
num.recovery.threads.per.data.dir=1
offsets.topic.replication.factor=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1
log.retention.hours=168
log.retention.bytes=5368709120
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000
zookeeper.connect=localhost:2181
zookeeper.connection.timeout.ms=60000
group.initial.rebalance.delay.ms=0
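With three named listeners it is easy to lose track of which advertised address belongs to which bound port. The sketch below is not a diagnosis of the hang. It only compares the `listeners` and `advertised.listeners` values from the config above, with the trailing comments stripped, and reports listener names whose ports differ or that appear on one side only. The helper names `parse_listeners` and `compare_listeners` are made up for illustration. A differing port can be perfectly intentional, for example when a NAT or port mapping sits in between.

```python
def parse_listeners(value):
    """Parse 'NAME://host:port,...' into {name: (host, port)}."""
    result = {}
    for entry in value.split(","):
        name, _, address = entry.strip().partition("://")
        host, _, port = address.rpartition(":")
        result[name] = (host, int(port))
    return result


def compare_listeners(listeners, advertised):
    """Return (listener name, note) pairs worth a second look."""
    bound = parse_listeners(listeners)
    adv = parse_listeners(advertised)
    report = []
    for name in sorted(set(bound) | set(adv)):
        if name not in bound or name not in adv:
            report.append((name, "missing on one side"))
        elif bound[name][1] != adv[name][1]:
            report.append(
                (name, f"port {bound[name][1]} advertised as {adv[name][1]}")
            )
    return report


# Values copied from the config above (placeholders xxx/yyy/zzz kept as posted).
listeners = "LOCAL://xxx:9092,JTKG://xxx:9093,GDKJ://xxx:9094"
advertised = "LOCAL://xxx:9092,JTKG://yyy:9093,GDKJ://zzz:29092"

for name, note in compare_listeners(listeners, advertised):
    print(name, note)
```

For the values as posted, only GDKJ is reported (bound on 9094, advertised on 29092). Whether that matches the actual port mapping on the GDKJ network is something only the poster can confirm.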
Below is the most recent log:
[2024-12-19 13:14:31,683] INFO [Controller id=0] Partitions undergoing preferred replica election: (kafka.controller.KafkaController)
[2024-12-19 13:14:31,683] INFO [Controller id=0] Partitions that completed preferred replica election: (kafka.controller.KafkaController)
[2024-12-19 13:14:31,692] INFO [Controller id=0] Skipping preferred replica election for partitions due to topic deletion: (kafka.controller.KafkaController)
[2024-12-19 13:14:31,692] INFO [Controller id=0] Resuming preferred replica election for partitions: (kafka.controller.KafkaController)
[2024-12-19 13:14:31,693] INFO [Controller id=0] Starting replica leader election (PREFERRED) for partitions triggered by ZkTriggered (kafka.controller.KafkaController)
[2024-12-19 13:14:31,823] INFO [Controller id=0] Starting the controller scheduler (kafka.controller.KafkaController)
[2024-12-19 13:14:31,847] DEBUG [Controller id=0] Resigning (kafka.controller.KafkaController)
[2024-12-19 13:14:31,847] DEBUG [Controller id=0] Unregister BrokerModifications handler for Set(0) (kafka.controller.KafkaController)
[2024-12-19 13:14:31,859] INFO [PartitionStateMachine controllerId=0] Stopped partition state machine (kafka.controller.ZkPartitionStateMachine)
[2024-12-19 13:14:31,859] INFO [ReplicaStateMachine controllerId=0] Stopped replica state machine (kafka.controller.ZkReplicaStateMachine)
[2024-12-19 13:14:32,002] INFO [RequestSendThread controllerId=0] Shutting down (kafka.controller.RequestSendThread)
[2024-12-19 13:14:32,051] WARN [RequestSendThread controllerId=0] Controller 0 epoch 60 fails to send request (xxx) to broker xxx:9092 (id: 0 rack: null). Reconnecting to broker. (kafka.controller.RequestSendThread)
java.io.IOException: Client was shutdown before response was read
    at org.apache.kafka.clients.NetworkClientUtils.sendAndReceive(NetworkClientUtils.java:109)
    at kafka.controller.RequestSendThread.doWork(ControllerChannelManager.scala:252)
    at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:96)
[2024-12-19 13:14:32,070] ERROR [RequestSendThread controllerId=0] Controller 0 fails to send a request to broker xxx:9092 (id: 0 rack: null) (kafka.controller.RequestSendThread)
java.lang.InterruptedException
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1326)
    at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)
    at kafka.utils.ShutdownableThread.pause(ShutdownableThread.scala:82)
    at kafka.controller.RequestSendThread.backoff$1(ControllerChannelManager.scala:233)
    at kafka.controller.RequestSendThread.doWork(ControllerChannelManager.scala:261)
    at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:96)
[2024-12-19 13:14:32,071] INFO [RequestSendThread controllerId=0] Stopped (kafka.controller.RequestSendThread)
[2024-12-19 13:14:32,076] INFO [RequestSendThread controllerId=0] Shutdown completed (kafka.controller.RequestSendThread)
[2024-12-19 13:14:32,146] INFO [Controller id=0] Resigned (kafka.controller.KafkaController)
[2024-12-19 13:14:32,316] DEBUG [Controller id=0] Broker 0 has been elected as the controller, so stopping the election process. (kafka.controller.KafkaController)
[2024-12-19 13:14:51,582] INFO [ControllerEventThread controllerId=0] Shutting down (kafka.controller.ControllerEventManager$ControllerEventThread)
[2024-12-19 13:14:51,582] INFO [ControllerEventThread controllerId=0] Stopped (kafka.controller.ControllerEventManager$ControllerEventThread)
[2024-12-19 13:14:51,582] INFO [ControllerEventThread controllerId=0] Shutdown completed (kafka.controller.ControllerEventManager$ControllerEventThread)
[2024-12-19 13:14:51,583] DEBUG [Controller id=0] Resigning (kafka.controller.KafkaController)
[2024-12-19 13:14:51,583] DEBUG [Controller id=0] Unregister BrokerModifications handler for Set() (kafka.controller.KafkaController)
[2024-12-19 13:14:51,583] INFO [PartitionStateMachine controllerId=0] Stopped partition state machine (kafka.controller.ZkPartitionStateMachine)
[2024-12-19 13:14:51,583] INFO [ReplicaStateMachine controllerId=0] Stopped replica state machine (kafka.controller.ZkReplicaStateMachine)
[2024-12-19 13:14:51,583] INFO [Controller id=0] Resigned (kafka.controller.KafkaController)
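Since the hang only shows up every few days, it may help to pull just the WARN and ERROR entries, with their timestamps, out of the broker logs and line them up against the time of each hang. Below is a minimal sketch that assumes the line layout seen in the log above (`[timestamp] LEVEL message`). The names `parse_line` and `problems` are made up for illustration, and stack-trace lines are simply skipped.

```python
import re
from datetime import datetime

# Matches lines such as: [2024-12-19 13:14:32,070] ERROR [RequestSendThread ...] ...
LINE_RE = re.compile(
    r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\] (\w+) (.*)$"
)


def parse_line(line):
    """Return (timestamp, level, message), or None for non-matching lines."""
    match = LINE_RE.match(line)
    if match is None:
        return None  # e.g. a stack-trace line
    ts = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S,%f")
    return ts, match.group(2), match.group(3)


def problems(lines, levels=("WARN", "ERROR")):
    """Keep only parsed entries whose level is in `levels`."""
    parsed = (parse_line(line) for line in lines)
    return [entry for entry in parsed if entry and entry[1] in levels]


# Three lines copied from the log above.
sample = [
    "[2024-12-19 13:14:32,002] INFO [RequestSendThread controllerId=0] Shutting down (kafka.controller.RequestSendThread)",
    "[2024-12-19 13:14:32,070] ERROR [RequestSendThread controllerId=0] Controller 0 fails to send a request to broker xxx:9092 (id: 0 rack: null) (kafka.controller.RequestSendThread)",
    "java.lang.InterruptedException",
]

for ts, level, message in problems(sample):
    print(ts.isoformat(), level, message)
```

To run it over a real file, pass an open file object instead of `sample`, for example `problems(open(path, encoding="utf-8"))`, where `path` points at whichever log file your installation writes.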