Some Kafka cluster nodes are down: the producer can keep producing, but the consumer cannot consume normally

Detailed description of the problem: I deployed a 3-node Kafka cluster (91, 92, 93), managed with ZooKeeper, to verify Kafka's fault tolerance. After killing the Kafka process on node 91, both producing and consuming still worked; after additionally killing the Kafka process on node 92, producing still worked but consuming did not. The consumer client reports the following error:


15:14:49,487 DEBUG AbstractCoordinator:561 - Sending GroupCoordinator request for group test_1 to broker 192.168.1.84:9092 (id: 0 rack: null)
15:14:49,489 DEBUG AbstractCoordinator:572 - Received GroupCoordinator response ClientResponse(receivedTimeMs=1541142889489, latencyMs=1, disconnected=false, requestHeader={api_key=10,api_version=0,correlation_id=239,client_id=consumer-1}, responseBody={error_code=15,coordinator={node_id=-1,host=,port=-1}}) for group test_1
15:14:49,489 DEBUG AbstractCoordinator:594 - Group coordinator lookup for group test_1 failed: The group coordinator is not available.
15:14:49,489 DEBUG AbstractCoordinator:215 - Coordinator discovery failed for group test_1, refreshing metadata
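
For reference, error_code=15 in the GroupCoordinator/FindCoordinator response is COORDINATOR_NOT_AVAILABLE: the broker could not find a live leader for the partition of the internal __consumer_offsets topic that hosts this consumer group's coordinator. A quick check (a sketch, reusing the ZooKeeper addresses that appear later in this thread) is to describe that internal topic and look at its ReplicationFactor and Isr columns:

bin/kafka-topics.sh --describe --zookeeper 192.168.1.84:2181,192.168.1.85:2181,192.168.1.86:2181 --topic __consumer_offsets

If the offsets topic was created with a replication factor of 1, killing the broker that holds a group's offsets partition makes the coordinator for that group unavailable, even though the data topics themselves are fully replicated.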





  • How many replicas does your topic have?
    • Given Kafka's own high availability and fault tolerance, with 3 nodes and my topic set to 3 replicas, the cluster should keep working and consuming even if any 2 of the nodes go down. Could you point out where my configuration is wrong?
        • [skyon10@server6 kafka_2.11-2.0.0]$ bin/kafka-topics.sh --describe --zookeeper 192.168.1.84:2181,192.168.1.85:2181,192.168.1.86:2181 --topic spdb-cal
          Topic:spdb-cal  PartitionCount:3        ReplicationFactor:3     Configs:
                  Topic: spdb-cal Partition: 0    Leader: 0       Replicas: 0,1,2 Isr: 0
                  Topic: spdb-cal Partition: 1    Leader: 0       Replicas: 1,2,0 Isr: 0
                  Topic: spdb-cal Partition: 2    Leader: 0       Replicas: 2,0,1 Isr: 0
          [skyon10@server6 kafka_2.11-2.0.0]$ bin/kafka-topics.sh --describe --zookeeper 192.168.1.84:2181,192.168.1.85:2181,192.168.1.86:2181 --topic spdb-cal
          Topic:spdb-cal  PartitionCount:3        ReplicationFactor:3     Configs:
                  Topic: spdb-cal Partition: 0    Leader: 0       Replicas: 0,1,2 Isr: 0,1,2
                  Topic: spdb-cal Partition: 1    Leader: 0       Replicas: 1,2,0 Isr: 0,1,2
                  Topic: spdb-cal Partition: 2    Leader: 0       Replicas: 2,0,1 Isr: 0,1,2
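
          Side note: with the same kafka-topics.sh tool, the --under-replicated-partitions and --unavailable-partitions options of --describe list only the partitions whose ISR has shrunk or that currently have no live leader, which is a quick way to survey the whole cluster after killing brokers:

          bin/kafka-topics.sh --describe --zookeeper 192.168.1.84:2181,192.168.1.85:2181,192.168.1.86:2181 --under-replicated-partitions
          bin/kafka-topics.sh --describe --zookeeper 192.168.1.84:2181,192.168.1.85:2181,192.168.1.86:2181 --unavailable-partitions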
            • # The replication factor for the group metadata internal topics "__consumer_offsets" and "__transaction_state"
              # For anything other than development testing, a value greater than 1 is recommended to ensure availability, such as 3.
              offsets.topic.replication.factor=3
              transaction.state.log.replication.factor=3
              transaction.state.log.min.isr=3

              I changed these settings, but it still doesn't work.
                • I created a new topic, changed the replication settings to 3 in server.properties, and restarted the Kafka cluster:
                  offsets.topic.replication.factor=3
                  transaction.state.log.replication.factor=3
                  transaction.state.log.min.isr=3

                  It still reports the following error; producing works fine but consuming does not:

                  09:25:34,252 DEBUG AbstractCoordinator:651 - [Consumer clientId=consumer-1, groupId=test_3] Sending FindCoordinator request to broker 192.168.1.86:9092 (id: 2 rack: null)
                  09:25:34,256 DEBUG AbstractCoordinator:662 - [Consumer clientId=consumer-1, groupId=test_3] Received FindCoordinator response ClientResponse(receivedTimeMs=1541381134256, latencyMs=3, disconnected=false, requestHeader=RequestHeader(apiKey=FIND_COORDINATOR, apiVersion=2, clientId=consumer-1, correlationId=372), responseBody=FindCoordinatorResponse(throttleTimeMs=0, errorMessage='null', error=COORDINATOR_NOT_AVAILABLE, node=:-1 (id: -1 rack: null)))
                  09:25:34,257 DEBUG AbstractCoordinator:685 - [Consumer clientId=consumer-1, groupId=test_3] Group coordinator lookup failed: The coordinator is not available.
                  09:25:34,258 DEBUG AbstractCoordinator:242 - [Consumer clientId=consumer-1, groupId=test_3] Coordinator discovery failed, refreshing metadata
                    • [skyon10@server5 kafka_2.11-2.0.0]$ bin/kafka-topics.sh --describe --zookeeper 192.168.1.84:2181,192.168.1.85:2181,192.168.1.86:2181 --topic spdb-test
                      Topic:spdb-test PartitionCount:3        ReplicationFactor:3     Configs:
                              Topic: spdb-test        Partition: 0    Leader: 2       Replicas: 1,2,0 Isr: 2,1
                              Topic: spdb-test        Partition: 1    Leader: 2       Replicas: 2,0,1 Isr: 2,1
                              Topic: spdb-test        Partition: 2    Leader: 1       Replicas: 0,1,2 Isr: 2,1
                        • The newly created topic still has the same problem. I have 3 nodes, and the topic has 3 replicas and 3 partitions. During the availability test, if I kill the Kafka process on broker 1, on broker 2, or on both, consumption still works; but as soon as I kill the Kafka process on broker 0, consumption stops and the Java client reports the following error:
                          09:25:34,252 DEBUG AbstractCoordinator:651 - [Consumer clientId=consumer-1, groupId=test_3] Sending FindCoordinator request to broker 192.168.1.86:9092 (id: 2 rack: null)
                          09:25:34,256 DEBUG AbstractCoordinator:662 - [Consumer clientId=consumer-1, groupId=test_3] Received FindCoordinator response ClientResponse(receivedTimeMs=1541381134256, latencyMs=3, disconnected=false, requestHeader=RequestHeader(apiKey=FIND_COORDINATOR, apiVersion=2, clientId=consumer-1, correlationId=372), responseBody=FindCoordinatorResponse(throttleTimeMs=0, errorMessage='null', error=COORDINATOR_NOT_AVAILABLE, node=:-1 (id: -1 rack: null)))
                          09:25:34,257 DEBUG AbstractCoordinator:685 - [Consumer clientId=consumer-1, groupId=test_3] Group coordinator lookup failed: The coordinator is not available.
                          09:25:34,258 DEBUG AbstractCoordinator:242 - [Consumer clientId=consumer-1, groupId=test_3] Coordinator discovery failed, refreshing metadata
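
                          Why broker 0 is special here: Kafka picks a group's coordinator by hashing the group.id onto one of the __consumer_offsets partitions (abs(hash(group.id)) % number of offsets partitions, 50 by default) and using that partition's current leader as the coordinator. If the offsets partition that test_3 maps to has its only replica on broker 0, killing broker 0 leaves the group with no coordinator even though the data topic itself is fully replicated. While the coordinator is reachable, something like the following shows which broker is acting as coordinator for the group (a sketch; check that your kafka-consumer-groups.sh version supports --state):

                          bin/kafka-consumer-groups.sh --bootstrap-server 192.168.1.86:9092 --describe --group test_3 --state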
                            • Sorry to bother you again, but does the following output look wrong?
                              bin/kafka-topics.sh --describe --zookeeper 192.168.1.84:2181,192.168.1.85:2181,192.168.1.86:2181|grep consumer_offsets
                              Topic:__consumer_offsets        PartitionCount:50       ReplicationFactor:1     Configs:segment.bytes=104857600,cleanup.policy=compact,compression.type=producer
                                      Topic: __consumer_offsets       Partition: 0    Leader: 2       Replicas: 2     Isr: 2
                                      Topic: __consumer_offsets       Partition: 1    Leader: -1      Replicas: 0     Isr: 0
                                      Topic: __consumer_offsets       Partition: 2    Leader: 1       Replicas: 1     Isr: 1
                                      Topic: __consumer_offsets       Partition: 3    Leader: 2       Replicas: 2     Isr: 2
                                      Topic: __consumer_offsets       Partition: 4    Leader: -1      Replicas: 0     Isr: 0
                                      Topic: __consumer_offsets       Partition: 5    Leader: 1       Replicas: 1     Isr: 1
                                      Topic: __consumer_offsets       Partition: 6    Leader: 2       Replicas: 2     Isr: 2
                                      Topic: __consumer_offsets       Partition: 7    Leader: -1      Replicas: 0     Isr: 0
                                      Topic: __consumer_offsets       Partition: 8    Leader: 1       Replicas: 1     Isr: 1
                                      Topic: __consumer_offsets       Partition: 9    Leader: 2       Replicas: 2     Isr: 2
                                      Topic: __consumer_offsets       Partition: 10   Leader: -1      Replicas: 0     Isr: 0
                                      ...........
                                • In your current situation, the only option is to fix it manually (see the sketch below).
                                  For a brand-new cluster, as long as the default parameters are set before any producing or consuming happens, the automatically created __consumer_offsets topic will pick up those default values.
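
                                  For the manual route, the usual tool is kafka-reassign-partitions.sh with a JSON file that lists the desired replica set for each __consumer_offsets partition. A minimal sketch, assuming brokers 0, 1 and 2 (the file name is made up, and a real file must cover all 50 partitions, not only the two shown):

                                  increase-offsets-rf.json:
                                  {"version":1,"partitions":[
                                    {"topic":"__consumer_offsets","partition":0,"replicas":[0,1,2]},
                                    {"topic":"__consumer_offsets","partition":1,"replicas":[1,2,0]}
                                  ]}

                                  bin/kafka-reassign-partitions.sh --zookeeper 192.168.1.84:2181 --reassignment-json-file increase-offsets-rf.json --execute
                                  bin/kafka-reassign-partitions.sh --zookeeper 192.168.1.84:2181 --reassignment-json-file increase-offsets-rf.json --verify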
                                    • I just made the manual adjustment the way you described, but consumption still fails with the same error as before.

                                      Topic: __consumer_offsets       Partition: 0    Leader: 2       Replicas: 1,2   Isr: 2,1
                                              Topic: __consumer_offsets       Partition: 1    Leader: -1      Replicas: 1,2,0 Isr: 0
                                              Topic: __consumer_offsets       Partition: 2    Leader: 1       Replicas: 1,2   Isr: 1,2
                                              Topic: __consumer_offsets       Partition: 3    Leader: 2       Replicas: 1,2   Isr: 2,1
                                              Topic: __consumer_offsets       Partition: 4    Leader: -1      Replicas: 1,2,0 Isr: 0
                                              Topic: __consumer_offsets       Partition: 5    Leader: 1       Replicas: 1,2   Isr: 1,2
                                              Topic: __consumer_offsets       Partition: 6    Leader: 2       Replicas: 1,2   Isr: 2,1
                                              Topic: __consumer_offsets       Partition: 7    Leader: -1      Replicas: 1,2,0 Isr: 0
                                              Topic: __consumer_offsets       Partition: 8    Leader: 1       Replicas: 1,2   Isr: 1,2
                                              Topic: __consumer_offsets       Partition: 9    Leader: 2       Replicas: 1,2   Isr: 2,1
                                              Topic: __consumer_offsets       Partition: 10   Leader: -1      Replicas: 1,2,0 Isr: 0
                                        • I solved it: I restarted the node whose Kafka process had been killed, and after the manual fix kafka-reassign-partitions --verify reported everything as completed; all the leaders are back to normal. One more question: when building a new cluster or creating a new topic, how can I get all 50 __consumer_offsets partitions replicated onto every node, instead of spread out with one copy per node, so that this situation cannot happen again?
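
                                          On the last question: __consumer_offsets is auto-created the first time any consumer group connects, using the broker settings in force at that moment, so the relevant properties have to be in server.properties on every broker before the first consumer ever runs; changing them afterwards does not alter a topic that already exists. With 3 brokers, a replication factor of 3 already means every offsets partition has a copy on every node. A minimal sketch of the relevant entries (assuming a 3-broker cluster; the transaction settings echo the ones quoted earlier in this thread):

                                          offsets.topic.replication.factor=3
                                          offsets.topic.num.partitions=50
                                          transaction.state.log.replication.factor=3
                                          transaction.state.log.min.isr=2
                                          default.replication.factor=3

                                          Note that transaction.state.log.min.isr=3, as used earlier, would require all three brokers to be alive for transactional producers to commit; 2 leaves room for one broker to fail.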