Kafka cannot start

漂泊的美好 · Posted: 2018-02-08 · Last updated: 2018-02-08

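Our production cluster has had several failures recently. Kafka ends up in a fake "running" state: the pid still shows up in jps, but nothing is listening on port 9092 any more; the broker keeps writing to its log; other brokers cannot connect to it; and its registration path in zk is still present. Restarting does not help, because the path still exists in zk and the new process fails to start.

This puzzled me for a long time. I went through the Apache JIRA and found https://issues.apache.org/jira/browse/KAFKA-3410, which looks very similar to my problem. The gist: a follower is dropped by the leader for some reason; if the broker holding the leader partition then restarts, it may lose cached data, leaving leader LEO < follower LEO, at which point the broker hosting the follower partition exits. I checked the source code, and it does exit: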
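To confirm the "pid alive, port gone" symptom from another machine, a plain TCP connect is enough. The following is a minimal sketch, not part of Kafka; `PortProbe` and `isListening` are hypothetical names, and `localhost:9092` is an assumption to be replaced with the broker's real listener address.

```scala
import java.io.IOException
import java.net.{InetSocketAddress, Socket}

object PortProbe {
  // Returns true if a TCP connection to host:port succeeds within timeoutMs.
  def isListening(host: String, port: Int, timeoutMs: Int = 1000): Boolean = {
    val socket = new Socket()
    try {
      socket.connect(new InetSocketAddress(host, port), timeoutMs)
      true
    } catch {
      // Connection refused and connect timeouts are both IOExceptions.
      case _: IOException => false
    } finally {
      socket.close()
    }
  }

  def main(args: Array[String]): Unit = {
    // Assumed address; substitute the broker's actual host and port.
    println(s"9092 listening: ${isListening("localhost", 9092)}")
  }
}
```

If this reports `false` while jps still lists the Kafka pid, the broker is in the state described above.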

// we should never encounter this situation since a non-ISR leader cannot be elected if disallowed by the broker configuration.
      if (!LogConfig.fromProps(brokerConfig.originals, AdminUtils.fetchEntityConfig(replicaMgr.zkUtils,
        ConfigType.Topic, topicPartition.topic)).uncleanLeaderElectionEnable) {
        // Log a fatal error and shutdown the broker to ensure that data loss does not unexpectedly occur.
        fatal("Exiting because log truncation is not allowed for partition %s,".format(topicPartition) +
          " Current leader %d's latest offset %d is less than replica %d's latest offset %d"
          .format(sourceBroker.id, leaderEndOffset, brokerConfig.brokerId, replica.logEndOffset.messageOffset))
        System.exit(1)
      }

Relevant log:


[2016-08-30 16:51:03,374] FATAL [ReplicaFetcherThread-0-1], Exiting because log truncation is not allowed for topic test, Current leader 1's latest offset 0 is less than replica 2's latest offset 1 (kafka.server.ReplicaFetcherThread)
[2016-08-30 16:51:03,374] INFO [Kafka Server 2], shutting down (kafka.server.KafkaServer)
[2016-08-30 16:51:03,375] INFO [Kafka Server 2], Starting controlled shutdown (kafka.server.KafkaServer)
[2016-08-30 16:51:03,397] INFO [Kafka Server 2], Controlled shutdown succeeded (kafka.server.KafkaServer)
[2016-08-30 16:51:03,399] INFO [Socket Server on Broker 2], Shutting down (kafka.network.SocketServer)
[2016-08-30 16:51:03,403] INFO [Socket Server on Broker 2], Shutdown completed (kafka.network.SocketServer)
[2016-08-30 16:51:03,404] INFO [Kafka Request Handler on Broker 2], shutting down (kafka.server.KafkaRequestHandlerPool)
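The exit condition can be reduced to a small predicate. This is only a sketch of the check as described in this post (leader LEO below follower LEO, with unclean leader election disabled); `TruncationCheckSketch` and `wouldExit` are hypothetical names, and what the real code does in its other branches is not shown in the excerpt and is not modelled here.

```scala
object TruncationCheckSketch {
  // True when the follower's broker would hit the fatal exit path:
  // the leader's log end offset is behind the follower's, and unclean
  // leader election (which would permit truncation) is disabled.
  def wouldExit(leaderEndOffset: Long,
                followerEndOffset: Long,
                uncleanLeaderElectionEnable: Boolean): Boolean =
    leaderEndOffset < followerEndOffset && !uncleanLeaderElectionEnable

  def main(args: Array[String]): Unit = {
    // Offsets taken from the FATAL log line: leader 1 at 0, replica 2 at 1.
    println(wouldExit(0L, 1L, uncleanLeaderElectionEnable = false)) // true
    println(wouldExit(0L, 1L, uncleanLeaderElectionEnable = true))  // false
  }
}
```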


So here is my question:
System.exit(1) should make the JVM exit, so why does the pid still exist, and why is the path still present in zk?
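For context on this question: `System.exit` is documented to run the JVM's registered shutdown hooks and to wait for them before the process ends, so the pid can outlive the call. One possible explanation, not confirmed for this cluster, is that a hook ends up waiting on the very thread that called `System.exit`. The sketch below is not Kafka code; `ExitHookSketch`, `waitingHook` and the thread names are hypothetical, and the wait is bounded at 3 seconds so the demo terminates.

```scala
object ExitHookSketch {
  // Set by the hook so the outcome can be inspected afterwards.
  @volatile var callerStillAlive: Option[Boolean] = None

  // Builds a hook that waits (bounded) for `caller` to finish, the way a
  // shutdown routine might wait for its worker threads to stop.
  def waitingHook(caller: Thread, waitMs: Long): Thread =
    new Thread(new Runnable {
      def run(): Unit = {
        caller.join(waitMs)
        val alive = caller.isAlive
        callerStillAlive = Some(alive)
        println(s"hook: caller still alive after $waitMs ms = $alive")
      }
    }, "shutdown-hook-sketch")

  def main(args: Array[String]): Unit = {
    val worker = new Thread(new Runnable {
      def run(): Unit = {
        println("worker: calling System.exit(1)")
        System.exit(1) // blocks here until all shutdown hooks have finished
        println("worker: not reached")
      }
    }, "worker-sketch")
    Runtime.getRuntime.addShutdownHook(waitingHook(worker, 3000L))
    worker.start()
    worker.join()
  }
}
```

Running `main` should keep the process alive for about three seconds after the `System.exit(1)` call, with the hook reporting that the calling thread is still alive. With an unbounded `join()` the two threads would wait on each other and the process would never exit. A thread dump of the stuck broker (for example with `jstack <pid>`) would show whether something like this is happening.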






