ighack

0 reputation

This user is too lazy to have left anything here.

Activity
  • [deleted user] downvoted a comment on "WARN Attempting to send response via channel for which there is no open connection"

    My Kafka often logs this too:

    [2019-05-21 09:45:04,120] WARN Attempting to send response via channel for which there is no open connection, connection id 7 (kafka.network.Processor)
    [2019-05-21 09:45:11,446] WARN Attempting to send response via channel for which there is no open connection, connection id 0 (kafka.network.Processor)
    [2019-05-21 09:45:25,075] WARN Attempting to send response via channel for which there is no open connection, connection id 7 (kafka.network.Processor)
    [2019-05-21 09:50:02,121] WARN Attempting to send response via channel for which there is no open connection, connection id 0 (kafka.network.Processor)
    [2019-05-21 09:50:13,734] WARN Attempting to send response via channel for which there is no open connection, connection id 5 (kafka.network.Processor)

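    3 years ago
  • 冰海落花 followed this user · 4 years ago
  • 冰海落花 replied to ighack in "Kafka error: too many open files causes Kafka to shut down?":

    Has this warning been resolved?

    4 years ago
  • ighack replied to ighack in "Kafka cluster: replicas get kicked out of the ISR every 30-odd days":

    My partitions should be fairly balanced: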

    Topic: mdb_Fd_Route_GD    Partition: 0    Leader: 2    Replicas: 2,3,4    Isr: 2,4,3
    Topic: mdb_Fd_Route_GD    Partition: 1    Leader: 3    Replicas: 3,4,0    Isr: 4,3,0
    Topic: mdb_Fd_Route_GD    Partition: 2    Leader: 4    Replicas: 4,0,1    Isr: 4,1,0
    Topic: mdb_Fd_Route_GD    Partition: 3    Leader: 0    Replicas: 0,1,2    Isr: 2,1,0
    Topic: mdb_Fd_Route_GD    Partition: 4    Leader: 1    Replicas: 1,2,3    Isr: 2,3,1
    Topic: mdb_Fd_Route_GD    Partition: 5    Leader: 2    Replicas: 2,4,0    Isr: 2,4,0
    Topic: mdb_Fd_Route_GD    Partition: 6    Leader: 3    Replicas: 3,0,1    Isr: 3,1,0
    Topic: mdb_Fd_Route_GD    Partition: 7    Leader: 4    Replicas: 4,1,2    Isr: 4,2,1
    Topic: mdb_Fd_Route_GD    Partition: 8    Leader: 0    Replicas: 0,2,3    Isr: 2,3,0
    Topic: mdb_Fd_Route_GD    Partition: 9    Leader: 1    Replicas: 1,3,4    Isr: 4,3,1
    

    Most of the topics look like this.
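A quick way to check the "fairly balanced" claim across many topics is to parse the describe rows and count leaders per broker, flagging any partition whose ISR is missing an assigned replica. The following is only a minimal sketch: it assumes the row format shown above, and the names `parse_rows`, `leaders_per_broker`, and `under_replicated` are hypothetical helpers, not part of any Kafka tooling.

```python
import re
from collections import Counter

# Rows copied from the describe output quoted in the post above.
DESCRIBE_OUTPUT = """\
Topic: mdb_Fd_Route_GD    Partition: 0    Leader: 2    Replicas: 2,3,4    Isr: 2,4,3
Topic: mdb_Fd_Route_GD    Partition: 1    Leader: 3    Replicas: 3,4,0    Isr: 4,3,0
Topic: mdb_Fd_Route_GD    Partition: 2    Leader: 4    Replicas: 4,0,1    Isr: 4,1,0
Topic: mdb_Fd_Route_GD    Partition: 3    Leader: 0    Replicas: 0,1,2    Isr: 2,1,0
Topic: mdb_Fd_Route_GD    Partition: 4    Leader: 1    Replicas: 1,2,3    Isr: 2,3,1
Topic: mdb_Fd_Route_GD    Partition: 5    Leader: 2    Replicas: 2,4,0    Isr: 2,4,0
Topic: mdb_Fd_Route_GD    Partition: 6    Leader: 3    Replicas: 3,0,1    Isr: 3,1,0
Topic: mdb_Fd_Route_GD    Partition: 7    Leader: 4    Replicas: 4,1,2    Isr: 4,2,1
Topic: mdb_Fd_Route_GD    Partition: 8    Leader: 0    Replicas: 0,2,3    Isr: 2,3,0
Topic: mdb_Fd_Route_GD    Partition: 9    Leader: 1    Replicas: 1,3,4    Isr: 4,3,1
"""

# One describe row: topic, partition, leader, assigned replicas, in-sync replicas.
ROW = re.compile(
    r"Topic:\s*(?P<topic>\S+)\s+Partition:\s*(?P<partition>\d+)\s+"
    r"Leader:\s*(?P<leader>-?\d+)\s+Replicas:\s*(?P<replicas>[\d,]+)\s+"
    r"Isr:\s*(?P<isr>[\d,]*)",
    re.IGNORECASE,
)


def _ids(csv):
    """Turn '2,4,3' into [2, 4, 3]; an empty string becomes []."""
    return [int(x) for x in csv.split(",") if x]


def parse_rows(text):
    """Parse every describe row found in the text into a dict."""
    return [
        {
            "topic": m["topic"],
            "partition": int(m["partition"]),
            "leader": int(m["leader"]),
            "replicas": _ids(m["replicas"]),
            "isr": _ids(m["isr"]),
        }
        for m in ROW.finditer(text)
    ]


def leaders_per_broker(rows):
    """Count how many partitions each broker currently leads."""
    return Counter(row["leader"] for row in rows)


def under_replicated(rows):
    """List (topic, partition, missing replica ids) where the ISR is incomplete."""
    result = []
    for row in rows:
        missing = sorted(set(row["replicas"]) - set(row["isr"]))
        if missing:
            result.append((row["topic"], row["partition"], missing))
    return result


if __name__ == "__main__":
    rows = parse_rows(DESCRIBE_OUTPUT)
    print(dict(leaders_per_broker(rows)))
    print(under_replicated(rows))
```

For the ten rows above, each of the five brokers leads two partitions and no ISR is missing a replica, which matches the description in the post.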

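    4 years ago
  • ighack replied to 半兽人 in "Kafka cluster: replicas get kicked out of the ISR every 30-odd days":

    On machine 27 I see many files like controller.log.2020-03-25-08, with content such as: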

    [2020-03-25 08:00:01,455] DEBUG [Controller 4]: topics not in preferred replica Map() (kafka.controller.KafkaController)
    [2020-03-25 08:00:01,455] TRACE [Controller 4]: leader imbalance ratio for broker 0 is 0.000000 (kafka.controller.KafkaController)
    [2020-03-25 08:00:01,455] DEBUG [Controller 4]: topics not in preferred replica Map() (kafka.controller.KafkaController)
    [2020-03-25 08:00:01,455] TRACE [Controller 4]: leader imbalance ratio for broker 1 is 0.000000 (kafka.controller.KafkaController)
    [2020-03-25 08:00:01,455] DEBUG [Controller 4]: topics not in preferred replica Map() (kafka.controller.KafkaController)
    [2020-03-25 08:00:01,455] TRACE [Controller 4]: leader imbalance ratio for broker 2 is 0.000000 (kafka.controller.KafkaController)
    [2020-03-25 08:00:01,455] DEBUG [Controller 4]: topics not in preferred replica Map() (kafka.controller.KafkaController)
    [2020-03-25 08:00:01,455] TRACE [Controller 4]: leader imbalance ratio for broker 3 is 0.000000 (kafka.controller.KafkaController)
    [2020-03-25 08:00:01,455] DEBUG [Controller 4]: topics not in preferred replica Map() (kafka.controller.KafkaController)
    [2020-03-25 08:00:01,455] TRACE [Controller 4]: leader imbalance ratio for broker 4 is 0.000000 (kafka.controller.KafkaController)
    [2020-03-25 08:05:01,450] TRACE [Controller 4]: checking need to trigger partition rebalance (kafka.controller.KafkaController)
    [2020-03-25 08:05:01,454] DEBUG [Controller 4]: preferred replicas by broker Map(0 -> Map([gtp_data_log,1] -> List(0, 3, 4), [wlpt_to_mdb,2] -> List(0, 2, 3), [JLP_TO_LMIS_CHONGQ1,3] -> List(0, 4, 1), [JLP_TO_LMIS_SHANGH,5] -> List(0, 1, 2), [mdb_Fd_Route_NM,4] -> List(0, 2, 3), [TMP_TO_LMIS_SD,1] -> List(0, 3, 4), [JLP_TO_LMIS_GD,0] -> List(0, 1, 2), [__consumer_offsets,30] -> List(0, 2, 3), [JLP_TO_LMIS_HEN,7] -> List(0, 2, 3), [TMP_TO_LMIS_GD,7] -> List(0, 4, 1), [TMP_TO_LMIS_LZ,9] -> List(0, 2, 3), [gtp_data_log,6] -> List(0, 4, 1), [TMP_TO_LMIS_CHONGQ,4] -> List(0, 4, 1), [TMP_TO_LMIS_HAIN,7] -> List(0, 2, 3), [JTmdb_Fd_Good,2] -> List(0, 3, 4), [JLP_TO_LMIS_FJ,6] -> List(0, 2, 3), [sendemail,2] -> List(0, 4), [JLP_TO_LMIS_SHANGH,0] -> List(0, 4, 1), [mdb_Fd_Route_LZ,0] -> List(0, 2, 3), [__consumer_offsets,10] -> List(0, 2, 3), [JLP_TO_LMIS_FJ1,2] -> List(0, 3, 4), [mdb_Fd_Route_HAIN,2] -> List(0, 1, 2), [JLP_TO_LMIS_HEN,2] -> List(0, 1, 2), [TMP_TO_LMIS_FJ,6] -> List(0, 4, 1), [TMP_TO_LMIS_XM,0] -> List(0, 1, 2), [JLP_TO_LMIS_SD,2] -> List(0, 3, 4), [TMP_TO_LMIS_JIANGX,1] -> List(0, 1, 2), [__consumer_offsets,40] -> List(0, 4, 1), [TMP_TO_LMIS_BEIJ,4] -> List(0, 3, 4), [Parallel_Computing_Stock,0]
    

    The other machines also have logs named like controller.log.2020-03-, but they are not generated every hour, and their content does not look like the above:

    [2020-03-03 10:57:28,711] INFO [Controller 1]: Controller startup complete (kafka.controller.KafkaController)
    [2020-03-03 10:57:31,354] DEBUG [Controller 1]: Controller resigning, broker id 1 (kafka.controller.KafkaController)
    [2020-03-03 10:57:31,354] DEBUG [Controller 1]: De-registering IsrChangeNotificationListener (kafka.controller.KafkaController)
    [2020-03-03 10:57:31,356] INFO [Partition state machine on Controller 1]: Stopped partition state machine (kafka.controller.PartitionStateMachine)
    [2020-03-03 10:57:31,357] INFO [Replica state machine on controller 1]: Stopped replica state machine (kafka.controller.ReplicaStateMachine)
    [2020-03-03 10:57:31,358] INFO [Controller 1]: Broker 1 resigned as the controller (kafka.controller.KafkaController)
    [2020-03-03 10:57:33,325] INFO [Controller 1]: Controller starting up (kafka.controller.KafkaController)
    [2020-03-03 10:57:33,342] INFO [Controller 1]: Controller startup complete (kafka.controller.KafkaController)
    

    These look fairly normal. Only when replicas are kicked out of the ISR do some machines have logs like this:

    [2020-03-03 10:59:54,553] DEBUG [Controller 2]: Removing replica 1 from ISR 3,0 for partition [TMP_TO_LMIS_SHANGH,6]. (kafka.controller.KafkaController)
    [2020-03-03 10:59:54,554] WARN [Controller 2]: Cannot remove replica 1 from ISR of partition [TMP_TO_LMIS_SHANGH,6] since it is not in the ISR. Leader = 3 ; ISR = List(3, 0) (kafka.controller.KafkaController)
    [2020-03-03 10:59:54,554] DEBUG The stop replica request (delete = true) sent to broker 1 is  (kafka.controller.ControllerBrokerRequestBatch)
    [2020-03-03 10:59:54,554] DEBUG The stop replica request (delete = false) sent to broker 1 is [Topic=TMP_TO_LMIS_SHANGH,Partition=6,Replica=1] (kafka.controller.ControllerBrokerRequestBatch)
    [2020-03-03 10:59:54,554] DEBUG The stop replica request (delete = true) sent to broker 1 is  (kafka.controller.ControllerBrokerRequestBatch)
    [2020-03-03 10:59:54,554] DEBUG The stop replica request (delete = false) sent to broker 1 is [Topic=__consumer_offsets,Partition=17,Replica=1] (kafka.controller.ControllerBrokerRequestBatch)
    [2020-03-03 10:59:54,554] INFO [Replica state machine on controller 2]: Invoking state change to OfflineReplica for replicas [Topic=__consumer_offsets,Partition=17,Replica=1] (kafka.controller.ReplicaStateMachine)
    [2020-03-03 10:59:54,554] DEBUG [Controller 2]: Removing replica 1 from ISR 2,0 for partition [__consumer_offsets,17]. (kafka.controller.KafkaController)
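
    In ZooKeeper, /controller_epoch records how many times the controller has changed, that is, how many failovers have happened. A large value means the cluster is unstable and the controller keeps getting re-elected.
    Mine is 225, but I don't know where the instability is.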
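Because /controller_epoch only gives a running total, one way to see when and where the controller role moved is to build a timeline from the controller.log files on each broker. The sketch below is a rough illustration that assumes the two message formats quoted above ("Controller startup complete" and "Broker N resigned as the controller"); `controller_events` is a hypothetical helper, and a startup message on its own does not prove that the broker won an election.

```python
import re
from collections import Counter
from datetime import datetime

# Sample lines copied from the controller.log excerpt quoted above.
SAMPLE_LOG = """\
[2020-03-03 10:57:28,711] INFO [Controller 1]: Controller startup complete (kafka.controller.KafkaController)
[2020-03-03 10:57:31,354] DEBUG [Controller 1]: Controller resigning, broker id 1 (kafka.controller.KafkaController)
[2020-03-03 10:57:31,354] DEBUG [Controller 1]: De-registering IsrChangeNotificationListener (kafka.controller.KafkaController)
[2020-03-03 10:57:31,356] INFO [Partition state machine on Controller 1]: Stopped partition state machine (kafka.controller.PartitionStateMachine)
[2020-03-03 10:57:31,357] INFO [Replica state machine on controller 1]: Stopped replica state machine (kafka.controller.ReplicaStateMachine)
[2020-03-03 10:57:31,358] INFO [Controller 1]: Broker 1 resigned as the controller (kafka.controller.KafkaController)
[2020-03-03 10:57:33,325] INFO [Controller 1]: Controller starting up (kafka.controller.KafkaController)
[2020-03-03 10:57:33,342] INFO [Controller 1]: Controller startup complete (kafka.controller.KafkaController)
"""

# Matches only lines logged by kafka.controller.KafkaController itself.
LINE = re.compile(
    r"^\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\] \w+ "
    r"\[Controller (?P<broker>\d+)\]: (?P<msg>.*) "
    r"\(kafka\.controller\.KafkaController\)\s*$"
)


def controller_events(lines):
    """Return (timestamp, broker id, kind) for startup and resignation messages."""
    events = []
    for line in lines:
        m = LINE.match(line.strip())
        if not m:
            continue
        msg = m["msg"]
        if msg == "Controller startup complete":
            kind = "startup"
        elif msg.endswith("resigned as the controller"):
            kind = "resigned"
        else:
            continue  # other controller messages are ignored in this sketch
        ts = datetime.strptime(m["ts"], "%Y-%m-%d %H:%M:%S,%f")
        events.append((ts, int(m["broker"]), kind))
    return events


if __name__ == "__main__":
    events = controller_events(SAMPLE_LOG.splitlines())
    for ts, broker, kind in events:
        print(ts.isoformat(sep=" "), f"broker={broker}", kind)
    print(Counter((broker, kind) for _, broker, kind in events))
```

Running the same function over the controller.log files from all five brokers and sorting the merged events by timestamp would show whether the resignations cluster around the times replicas drop out of the ISR.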

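    4 years ago
  • ighack replied to ighack in "Kafka cluster: replicas get kicked out of the ISR every 30-odd days":

    On machine 27 I see many files like controller.log.2020-03-25-08, with content such as: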

    [2020-03-25 08:00:01,455] DEBUG [Controller 4]: topics not in preferred replica Map() (kafka.controller.KafkaController)
    [2020-03-25 08:00:01,455] TRACE [Controller 4]: leader imbalance ratio for broker 0 is 0.000000 (kafka.controller.KafkaController)
    [2020-03-25 08:00:01,455] DEBUG [Controller 4]: topics not in preferred replica Map() (kafka.controller.KafkaController)
    [2020-03-25 08:00:01,455] TRACE [Controller 4]: leader imbalance ratio for broker 1 is 0.000000 (kafka.controller.KafkaController)
    [2020-03-25 08:00:01,455] DEBUG [Controller 4]: topics not in preferred replica Map() (kafka.controller.KafkaController)
    [2020-03-25 08:00:01,455] TRACE [Controller 4]: leader imbalance ratio for broker 2 is 0.000000 (kafka.controller.KafkaController)
    [2020-03-25 08:00:01,455] DEBUG [Controller 4]: topics not in preferred replica Map() (kafka.controller.KafkaController)
    [2020-03-25 08:00:01,455] TRACE [Controller 4]: leader imbalance ratio for broker 3 is 0.000000 (kafka.controller.KafkaController)
    [2020-03-25 08:00:01,455] DEBUG [Controller 4]: topics not in preferred replica Map() (kafka.controller.KafkaController)
    [2020-03-25 08:00:01,455] TRACE [Controller 4]: leader imbalance ratio for broker 4 is 0.000000 (kafka.controller.KafkaController)
    [2020-03-25 08:05:01,450] TRACE [Controller 4]: checking need to trigger partition rebalance (kafka.controller.KafkaController)
    [2020-03-25 08:05:01,454] DEBUG [Controller 4]: preferred replicas by broker Map(0 -> Map([gtp_data_log,1] -> List(0, 3, 4), [wlpt_to_mdb,2] -> List(0, 2, 3), [JLP_TO_LMIS_CHONGQ1,3] -> List(0, 4, 1), [JLP_TO_LMIS_SHANGH,5] -> List(0, 1, 2), [mdb_Fd_Route_NM,4] -> List(0, 2, 3), [TMP_TO_LMIS_SD,1] -> List(0, 3, 4), [JLP_TO_LMIS_GD,0] -> List(0, 1, 2), [__consumer_offsets,30] -> List(0, 2, 3), [JLP_TO_LMIS_HEN,7] -> List(0, 2, 3), [TMP_TO_LMIS_GD,7] -> List(0, 4, 1), [TMP_TO_LMIS_LZ,9] -> List(0, 2, 3), [gtp_data_log,6] -> List(0, 4, 1), [TMP_TO_LMIS_CHONGQ,4] -> List(0, 4, 1), [TMP_TO_LMIS_HAIN,7] -> List(0, 2, 3), [JTmdb_Fd_Good,2] -> List(0, 3, 4), [JLP_TO_LMIS_FJ,6] -> List(0, 2, 3), [sendemail,2] -> List(0, 4), [JLP_TO_LMIS_SHANGH,0] -> List(0, 4, 1), [mdb_Fd_Route_LZ,0] -> List(0, 2, 3), [__consumer_offsets,10] -> List(0, 2, 3), [JLP_TO_LMIS_FJ1,2] -> List(0, 3, 4), [mdb_Fd_Route_HAIN,2] -> List(0, 1, 2), [JLP_TO_LMIS_HEN,2] -> List(0, 1, 2), [TMP_TO_LMIS_FJ,6] -> List(0, 4, 1), [TMP_TO_LMIS_XM,0] -> List(0, 1, 2), [JLP_TO_LMIS_SD,2] -> List(0, 3, 4), [TMP_TO_LMIS_JIANGX,1] -> List(0, 1, 2), [__consumer_offsets,40] -> List(0, 4, 1), [TMP_TO_LMIS_BEIJ,4] -> List(0, 3, 4), [Parallel_Computing_Stock,0]
    

    The other machines also have logs named like controller.log.2020-03-, but they are not generated every hour, and their content does not look like the above.
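The "leader imbalance ratio for broker N" lines come from the controller's periodic preferred-leader check, which is why they appear every five minutes on the active controller only. As far as I understand the controller in Kafka versions of this era, the ratio for a broker is the share of partitions whose preferred replica (the first entry in the replica list) is that broker but whose current leader is someone else, and a rebalance is triggered when it exceeds `leader.imbalance.per.broker.percentage`. The sketch below illustrates that calculation; the function names and the leader map are hypothetical, and only the three replica lists are taken from the log above.

```python
from collections import defaultdict


def leader_imbalance_ratios(assignments, leaders):
    """Compute the per-broker leader imbalance ratio.

    assignments: {(topic, partition): [replica ids]}, first id is the preferred leader.
    leaders:     {(topic, partition): current leader id}.
    """
    preferred_total = defaultdict(int)  # partitions each broker should lead
    not_led = defaultdict(int)          # of those, partitions it does not lead
    for tp, replicas in assignments.items():
        preferred = replicas[0]
        preferred_total[preferred] += 1
        if leaders.get(tp) != preferred:
            not_led[preferred] += 1
    return {b: not_led[b] / total for b, total in preferred_total.items()}


def brokers_over_threshold(ratios, threshold_percent=10):
    """Brokers whose ratio exceeds the threshold (10 mirrors Kafka's default)."""
    return sorted(b for b, r in ratios.items() if r > threshold_percent / 100)


if __name__ == "__main__":
    # Replica lists copied from the "preferred replicas by broker" log line.
    assignments = {
        ("gtp_data_log", 1): [0, 3, 4],
        ("wlpt_to_mdb", 2): [0, 2, 3],
        ("sendemail", 2): [0, 4],
    }
    # Hypothetical leader map: every partition is led by its preferred replica.
    balanced = {tp: replicas[0] for tp, replicas in assignments.items()}
    print(leader_imbalance_ratios(assignments, balanced))

    # Hypothetical failover: leadership of one partition moved to broker 3.
    moved = dict(balanced)
    moved[("gtp_data_log", 1)] = 3
    ratios = leader_imbalance_ratios(assignments, moved)
    print(ratios, brokers_over_threshold(ratios))
```

A ratio of 0.000000 for every broker, as in the log, means every partition is currently led by its preferred replica, so these lines by themselves do not indicate a problem.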

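    4 years ago
  • 半兽人 replied to ighack in "Kafka cluster: replicas get kicked out of the ISR every 30-odd days":

    You need to start by checking the broker logs.

    4 years ago
  • 张乘辉 followed this user · 5 years ago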