At 3 PM this afternoon I took the controller node, broker 1, offline, and broker 2 was elected as the new controller. I then noticed that the metadata for topic A partition 1 was wrong: originally its three replicas were on brokers 2, 3, 4 (with 2 as leader). But in the logs I saw the new controller 2 telling broker 6 that topic A partition 1's replicas are 3, 2, 6, while never propagating that metadata change to brokers 2, 3, 4. As a result broker 6 created a replica out of nowhere and kept failing to fetch from the leader.
The logs are below. Could anyone help me figure out why this happened? It feels like taking the controller offline scrambled the metadata.
Logs on broker 6:
(the 2 PM entry still looks normal)
[2021-03-16 14:12:39.979] TRACE [Broker id=6] Cached leader info UpdateMetadataPartitionState(topicName='A', partitionIndex=1, controllerEpoch=29, leader=2, leaderEpoch=13, isr=[2, 3, 4], zkVersion=25, replicas=[2, 3, 4], offlineReplicas=[]) for partition A-1 in response to UpdateMetadata request sent by controller 1 epoch 29 with correlation id 0 (state.change.logger)
(this entry is from after the controller was restarted)
[2021-03-16 15:00:46.600] TRACE [Broker id=6] Cached leader info UpdateMetadataPartitionState(topicName='A', partitionIndex=1, controllerEpoch=29, leader=2, leaderEpoch=13, isr=[2, 3, 4], zkVersion=25, replicas=[3, 2, 6], offlineReplicas=[]) for partition A-1 in response to UpdateMetadata request sent by controller 2 epoch 30 with correlation id 0 (state.change.logger)
Another thing that puzzles me about this entry: the request says controller 2 epoch 30, yet the controllerEpoch inside UpdateMetadataPartitionState is 29, the same as in the 2 PM entry. Is something wrong there?
Logs on broker 2:
[2021-03-16 15:07:26,164] WARN [ReplicaManager broker=2] Leader 2 failed to record follower 6's position 11679162 since the replica is not recognized to be one of the assigned replicas 3,4,2 for partition A-1. Empty records will be returned for this partition. (kafka.server.ReplicaManager)
By default in Kafka, once a topic is created and its partitions (with their replicas) are created, the brokers they are assigned to do not change.
So, did you run any operation or test that could have affected this?
No. At the time we only restarted the controller, and then spotted this problem while going through the logs.
That mysterious broker 6 is beyond my understanding.
In the end, I still stand by the view above.
PS: I suggest you clean up the dirty data.
I just went through the logs again. After the controller node was restarted at 3 PM, the replica state machine started up, and the startup-complete log already shows the replicas as 3, 2, 6. Can I take that to mean this is what the metadata in ZooKeeper actually holds? What I don't understand is where the pre-3 PM metadata with replicas 2, 3, 4 came from. The cluster had been running for a long time without any issue, and neither kafka-manager nor our monitoring metrics showed under-replicated partitions or any other errors. Also, at 2 PM we restarted an ordinary (non-controller) broker and the cluster stayed healthy. The anomaly only started after the controller was restarted at 3 PM.
Also, how exactly do I clean up the dirty data you mentioned? Which data counts as dirty?
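(For reference, a quick way to confirm what ZooKeeper actually stores for this partition, assuming a ZooKeeper-backed cluster; zk-host:2181 is a placeholder and the paths are the standard ones, so adjust them if the cluster uses a chroot:)

bin/zookeeper-shell.sh zk-host:2181 get /brokers/topics/A         # replica assignment the controller reads on startup, JSON like {"version":1,"partitions":{"1":[...],...}}
bin/zookeeper-shell.sh zk-host:2181 get /controller               # which broker currently holds the controller role
bin/zookeeper-shell.sh zk-host:2181 get /controller_epoch         # current controller epoch

(zkCli.sh from the ZooKeeper distribution does the same if this build's zookeeper-shell.sh does not accept a one-shot command.)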
Run this command to check topic A's partition assignment:
bin/kafka-topics.sh --describe --topic A --bootstrap-server localhost:9092
and see what it reports.
Is the topic otherwise working normally?
It is working normally now. Since this is a production cluster we did an emergency recovery at the time, and the current metadata shows the replicas as 3, 2, 6.
Also, from the monitoring, the topic kept producing and consuming normally after the 3 PM controller restart, and followers 3 and 4 kept replicating fine. Only replica 6 could not sync; the log reports: Leader 2 failed to record follower 6's position 11679162 since the replica is not recognized to be one of the assigned replicas 3,4,2 for partition A-1. Empty records will be returned for this partition. (kafka.server.ReplicaManager).
It was only after we later restarted another broker that under-replicated partitions started to appear in the cluster.
How did you recover it?
Leader 2 is ignoring follower 6 because 6 is not among partition A-1's assigned replicas (3, 4, 2).
What I'm curious about is broker 6: how did it end up holding data for A-1 that should not exist there? Did it have any before?
It was after broker 3 was restarted at around 3:10 PM that partition A-1's metadata started changing, and only then did we see that 6 was a follower. There was nothing before that; the directories on broker 6 were all created around 3 PM. Here are the logs from broker 6:
[2021-03-16 15:00:46.829] INFO Created log for partition A-1 in /data4/kafka-data/A-1 with properties {compression.type -> producer, message.downconversion.enable -> true, min.insync.replicas -> 2, segment.jitter.ms -> 0, cleanup.policy -> [delete], flush.ms -> 9223372036854775807, segment.bytes -> 1073741824, retention.ms -> 172800000, flush.messages -> 9223372036854775807, message.format.version -> 2.0-IV1, file.delete.delay.ms -> 60000, max.compaction.lag.ms -> 9223372036854775807, max.message.bytes -> 4194304, min.compaction.lag.ms -> 0, message.timestamp.type -> CreateTime, preallocate -> false, min.cleanable.dirty.ratio -> 0.5, index.interval.bytes -> 4096, unclean.leader.election.enable -> true, retention.bytes -> -1, delete.retention.ms -> 86400000, segment.ms -> 604800000, message.timestamp.difference.max.ms -> 9223372036854775807, segment.index.bytes -> 10485760}. (kafka.log.LogManager)
(These entries are from after broker 3 was restarted at around 3:10 PM; you can see the leaderEpoch changed.)
[2021-03-16 15:11:04.092] TRACE [Broker id=6] Cached leader info UpdateMetadataPartitionState(topicName='A', partitionIndex=1, controllerEpoch=30, leader=2, leaderEpoch=14, isr=[2, 4], zkVersion=26, replicas=[3, 2, 6], offlineReplicas=[3]) for partition A-1 in response to UpdateMetadata request sent by controller 2 epoch 30 with correlation id 23 (state.change.logger)
[2021-03-16 15:11:10.331] TRACE [Broker id=6] Cached leader info UpdateMetadataPartitionState(topicName='A', partitionIndex=1, controllerEpoch=30, leader=2, leaderEpoch=14, isr=[2, 4], zkVersion=26, replicas=[3, 2, 6], offlineReplicas=[3]) for partition A-1 in response to UpdateMetadata request sent by controller 2 epoch 30 with correlation id 29 (state.change.logger)
[2021-03-16 15:11:15.076] TRACE [Broker id=6] Cached leader info UpdateMetadataPartitionState(topicName='A', partitionIndex=1, controllerEpoch=30, leader=2, leaderEpoch=14, isr=[2, 4, 6], zkVersion=27, replicas=[3, 2, 6], offlineReplicas=[3]) for partition A-1 in response to UpdateMetadata request sent by controller 2 epoch 30 with correlation id 34 (state.change.logger)
[2021-03-16 15:11:15.236] TRACE [Broker id=6] Cached leader info UpdateMetadataPartitionState(topicName='A', partitionIndex=1, controllerEpoch=30, leader=2, leaderEpoch=14, isr=[2, 4, 6], zkVersion=27, replicas=[3, 2, 6], offlineReplicas=[3]) for partition A-1 in response to UpdateMetadata request sent by controller 2 epoch 30 with correlation id 35 (state.change.logger)
[2021-03-16 15:11:23.389] TRACE [Broker id=6] Cached leader info UpdateMetadataPartitionState(topicName='A', partitionIndex=1, controllerEpoch=30, leader=2, leaderEpoch=14, isr=[2, 6], zkVersion=28, replicas=[3, 2, 6], offlineReplicas=[3]) for partition A-1 in response to UpdateMetadata request sent by controller 2 epoch 30 with correlation id 38 (state.change.logger)
[2021-03-16 15:11:32.727] TRACE [Broker id=6] Cached leader info UpdateMetadataPartitionState(topicName='A', partitionIndex=1, controllerEpoch=30, leader=2, leaderEpoch=14, isr=[2, 6], zkVersion=28, replicas=[3, 2, 6], offlineReplicas=[3]) for partition A-1 in response to UpdateMetadata request sent by controller 2 epoch 30 with correlation id 40 (state.change.logger)
[2021-03-16 15:11:34.809] TRACE [Broker id=6] Cached leader info UpdateMetadataPartitionState(topicName='A', partitionIndex=1, controllerEpoch=30, leader=2, leaderEpoch=14, isr=[2, 6], zkVersion=28, replicas=[3, 2, 6], offlineReplicas=[]) for partition A-1 in response to UpdateMetadata request sent by controller 2 epoch 30 with correlation id 41 (state.change.logger)
[2021-03-16 15:11:34.884] TRACE [Broker id=6] Cached leader info UpdateMetadataPartitionState(topicName='A', partitionIndex=1, controllerEpoch=30, leader=2, leaderEpoch=14, isr=[2, 6], zkVersion=28, replicas=[3, 2, 6], offlineReplicas=[]) for partition A-1 in response to UpdateMetadata request sent by controller 2 epoch 30 with correlation id 42 (state.change.logger)
[2021-03-16 15:11:55.426] TRACE [Broker id=6] Cached leader info UpdateMetadataPartitionState(topicName='A', partitionIndex=1, controllerEpoch=30, leader=2, leaderEpoch=14, isr=[2, 6, 3], zkVersion=29, replicas=[3, 2, 6], offlineReplicas=[]) for partition A-1 in response to UpdateMetadata request sent by controller 2 epoch 30 with correlation id 46 (state.change.logger)
[2021-03-16 15:15:54.197] TRACE [Broker id=6] Cached leader info UpdateMetadataPartitionState(topicName='A', partitionIndex=1, controllerEpoch=30, leader=3, leaderEpoch=15, isr=[2, 6, 3], zkVersion=30, replicas=[3, 2, 6], offlineReplicas=[]) for partition A-1 in response to UpdateMetadata request sent by controller 2 epoch 30 with correlation id 50 (state.change.logger)
Actually, here is what puzzles me now: at those two points in time the leader epoch and zkVersion are identical, so how can the replicas field alone have changed? Doesn't that look wrong?
[2021-03-16 14:12:39.979] metis-main2-sg2-kafka TRACE [Broker id=10040] Cached leader info UpdateMetadataPartitionState(topicName='metis_mobile_live_stat', partitionIndex=1, controllerEpoch=29, leader=10045, leaderEpoch=13, isr=[10045, 10042, 10043], zkVersion=25, replicas=[10045, 10042, 10043], offlineReplicas=[]) for partition metis_mobile_live_stat-1 in response to UpdateMetadata request sent by controller 10041 epoch 29 with correlation id 0 (state.change.logger)
[2021-03-16 15:00:46.600] metis-main2-sg2-kafka TRACE [Broker id=10040] Cached leader info UpdateMetadataPartitionState(topicName='metis_mobile_live_stat', partitionIndex=1, controllerEpoch=29, leader=10045, leaderEpoch=13, isr=[10045, 10042, 10043], zkVersion=25, replicas=[10042, 10045, 10040], offlineReplicas=[]) for partition metis_mobile_live_stat-1 in response to UpdateMetadata request sent by controller 10045 epoch 30 with correlation id 0 (state.change.logger)
Sorry, I pasted the wrong log. Here is the right one:
[2021-03-16 14:12:39.979] TRACE [Broker id=6] Cached leader info UpdateMetadataPartitionState(topicName='A', partitionIndex=1, controllerEpoch=29, leader=2, leaderEpoch=13, isr=[2, 3, 4], zkVersion=25, replicas=[2, 3, 4], offlineReplicas=[]) for partition A-1 in response to UpdateMetadata request sent by controller 1 epoch 29 with correlation id 0 (state.change.logger)
[2021-03-16 15:00:46.600] TRACE [Broker id=6] Cached leader info UpdateMetadataPartitionState(topicName='A', partitionIndex=1, controllerEpoch=29, leader=2, leaderEpoch=13, isr=[2, 3, 4], zkVersion=25, replicas=[3, 2, 6], offlineReplicas=[]) for partition A-1 in response to UpdateMetadata request sent by controller 2 epoch 30 with correlation id 0 (state.change.logger)
Once partition replicas are created they should not change; this is beyond my understanding.
1. Use the topic partition command I gave above to take a snapshot yourself, then run the same check again a few days later and see whether anything has changed (see the sketch after this list).
2. As for the logs, they are all at INFO level, there is nothing at ERROR level. Some of the messages do look unreasonable, but as long as they are only evidence and do not affect the use of the topic, you can ignore them.
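(A minimal sketch of that snapshot-and-compare idea, using the describe command from above; the output paths and the two dates in the diff are just placeholders:)

# snapshot today's assignment for topic A
bin/kafka-topics.sh --describe --topic A --bootstrap-server localhost:9092 > /tmp/topic-A-$(date +%F).txt
# a few days later, take another snapshot the same way and compare the two files
bin/kafka-topics.sh --describe --topic A --bootstrap-server localhost:9092 > /tmp/topic-A-$(date +%F).txt
diff /tmp/topic-A-2021-03-16.txt /tmp/topic-A-2021-03-20.txt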
A new finding: the PartitionReassignmentRateAndTimeMs metric in our monitoring actually has values. Yet when broker 2 became controller, the log it printed said there were no partitions being reassigned (that is read from /admin/reassign_partitions):
[2021-03-16 15:00:46,246] INFO [Controller id=2] Partitions being reassigned: Map() (kafka.controller.KafkaController)
We did not call any API to trigger a reassignment at the time either, so I wonder whether some internal Kafka mechanism could cause this?
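(One way to rule out a lingering reassignment is to read the same ZooKeeper node the controller checks on startup; a rough sketch, with zk-host:2181 as a placeholder:)

# If a reassignment is in flight this node exists and lists the partitions being moved;
# "Node does not exist" means nothing is pending, which would match the Map() in the controller log.
bin/zookeeper-shell.sh zk-host:2181 get /admin/reassign_partitions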
I'm digging into this now because I want to understand the root cause; otherwise I won't be able to proceed with the rolling upgrade, and I don't dare restart the other machines.
Any good advice on having to restart the Kafka machines one by one during the upgrade?
Which Kafka version? If it's a major version upgrade, you need to read the upgrade notes.
If you are not using the newer features (e.g. Streams, Connect), the older versions are actually quite stable (because they do less).
The upgrade doc has not been updated in a while; I'll find time to update it: https://www.orchome.com/505
We are upgrading from 2.1.1 to 2.6.1. The official docs just say to do a rolling upgrade, and the procedure they describe is basically the same for every version.
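(For what it's worth, the documented rolling procedure for a 2.1.1 to 2.6.1 jump boils down to roughly the steps below; the property values follow from those two versions, and the health check is just one option among several:)

# Phase 1: on every broker, pin the wire/message formats to the old version in server.properties
#   inter.broker.protocol.version=2.1
#   log.message.format.version=2.1
# then swap in the 2.6.1 binaries and restart the brokers one at a time,
# waiting for under-replicated partitions to drain before moving to the next broker:
bin/kafka-topics.sh --describe --under-replicated-partitions --bootstrap-server localhost:9092

# Phase 2: once every broker runs 2.6.1 and the cluster looks healthy, raise
#   inter.broker.protocol.version=2.6
# and do a second rolling restart; log.message.format.version can be bumped last,
# after all clients are upgraded, with one more rolling restart.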