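Topic replication sync failing, urgent!!
Problem: a topic has 3 replicas, but synchronization between the replicas is failing and only one broker is available.
The controller logs on broker1 and broker2 show this error: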
[2016-05-16 13:44:41,617] INFO [SessionExpirationListener on 0], ZK expired; shut down all controller components and try to re-elect (kafka.controller.KafkaController$SessionExpirationListener)
server.log shows:
PartitionFetchInfo(23122,104857400). Possible cause: java.io.EOFException: Received -1 when reading from channel, socket has likely been closed. (kafka.server.ReplicaFetcherThread)
[2016-05-16 13:45:07,118] INFO [ReplicaFetcherManager on broker 0] Removed fetcher for partitions (kafka.server.ReplicaFetcherManager)
[2016-05-16 13:45:07,129] INFO [ReplicaFetcherManager on broker 0] Added fetcher for partitions List() (kafka.server.ReplicaFetcherManager)
server.log on broker3:
kafka.common.KafkaException: This operation cannot be completed on a complete request.
    at kafka.network.Transmission$class.expectIncomplete(Transmission.scala:34)
    at kafka.api.TopicDataSend.expectIncomplete(FetchResponse.scala:102)
    at kafka.api.TopicDataSend.writeTo(FetchResponse.scala:120)
    at kafka.network.MultiSend.writeTo(Transmission.scala:101)
    at kafka.api.FetchResponseSend.writeTo(FetchResponse.scala:231)
    at kafka.network.Processor.write(SocketServer.scala:472)
    at kafka.network.Processor.run(SocketServer.scala:342)
    at java.lang.Thread.run(Thread.java:745)
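Judging from these errors, the sockets between the brokers are not healthy: connections are being closed mid-read, which raises java.io.EOFException on the fetcher side and leaves messages incomplete.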
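To see how often each of these errors occurs, a small script can count them across the log files. This is only a rough sketch: the three patterns are copied from the excerpts in this thread, and the names `SIGNATURES` and `count_signatures` are made up for illustration.

```python
import re
from collections import Counter

# Patterns taken from the log excerpts quoted in this thread.
# The dictionary keys are illustrative labels, not Kafka terminology.
SIGNATURES = {
    "zk_session_expired": re.compile(r"ZK expired"),
    "fetcher_eof": re.compile(r"java\.io\.EOFException"),
    "incomplete_send": re.compile(r"cannot be completed on a complete request"),
}


def count_signatures(lines):
    """Count how many lines match each known error signature."""
    counts = Counter()
    for line in lines:
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[name] += 1
    return counts
```

For example, `with open("server.log", errors="replace") as f: print(dict(count_signatures(f)))` prints the totals for one file. Comparing the totals per broker may help show which side is dropping connections.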
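Troubleshooting
1. Check the security mechanisms between the servers (firewalls and the like). Some servers automatically drop long-lived connections, so pay attention to this.
2. Check the servers' resources and disk performance.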
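As a first step for point 1, you can check from each broker whether the other brokers and ZooKeeper accept TCP connections at all. A minimal sketch follows; the function names are made up, and the host names and ports in the usage note are assumptions (9092 and 2181 are only the usual defaults, so substitute your own). It only tests whether a connection can be opened, so it will not catch a firewall that drops idle long-lived connections later.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def probe(endpoints, timeout=3.0):
    """Map each (host, port) pair to whether it is currently reachable."""
    return {(host, port): can_connect(host, port, timeout) for host, port in endpoints}
```

Usage, with hypothetical host names: `probe([("broker1", 9092), ("broker2", 9092), ("broker3", 9092), ("zk1", 2181)])`. Run it on every broker, since a filtering rule may block traffic in one direction only.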
Thanks!
Check the state-change log.