Kafka cluster performance testing question

剑枫寒   Posted: 2019-08-17   Last updated: 2019-08-18 22:56:11   2,569 views

Kafka cluster performance test

1. The Linux servers are configured as follows:

Distributor ID:    Ubuntu
Description:    Ubuntu 16.04 LTS
Release:    16.04
Codename:    xenial
physical cpu :   2
processor:   24
model name  : Intel(R) Xeon(R) CPU E5-2620 v2 @ 2.10GHz

2. I started a cluster of three brokers, each on a different server, and tested it with the script that ships with Kafka (./kafka-producer-perf-test.sh):

@ubuntu:/data/svckafka/kafka/bin$ ./kafka-producer-perf-test.sh -h
usage: producer-performance [-h] --topic TOPIC --num-records NUM-RECORDS [--payload-delimiter PAYLOAD-DELIMITER] --throughput THROUGHPUT
                            [--producer-props PROP-NAME=PROP-VALUE [PROP-NAME=PROP-VALUE ...]] [--producer.config CONFIG-FILE]
                            [--print-metrics] [--transactional-id TRANSACTIONAL-ID] [--transaction-duration-ms TRANSACTION-DURATION]
                            (--record-size RECORD-SIZE | --payload-file PAYLOAD-FILE)

This tool is used to verify the producer performance.

optional arguments:
  -h, --help             show this help message and exit
  --topic TOPIC          produce messages to this topic
  --num-records NUM-RECORDS
                         number of messages to produce
  --payload-delimiter PAYLOAD-DELIMITER
                         provides delimiter to be used when --payload-file is provided.  Defaults  to new line. Note that this parameter will
                         be ignored if --payload-file is not provided. (default: \n)
  --throughput THROUGHPUT
                         throttle maximum  message  throughput  to  *approximately*  THROUGHPUT  messages/sec.  Set  this  to  -1  to disable
                         throttling.
  --producer-props PROP-NAME=PROP-VALUE [PROP-NAME=PROP-VALUE ...]
                         kafka producer related configuration properties like  bootstrap.servers,client.id etc. These configs take precedence
                         over those passed via --producer.config.
  --producer.config CONFIG-FILE
                         producer config properties file.
  --print-metrics        print out metrics at the end of the test. (default: false)
  --transactional-id TRANSACTIONAL-ID
                         The transactionalId to use if transaction-duration-ms  is  >  0.  Useful  when testing the performance of concurrent
                         transactions. (default: performance-producer-default-transactional-id)
  --transaction-duration-ms TRANSACTION-DURATION
                         The max age of each transaction. The commitTransaction will  be called after this time has elapsed. Transactions are
                         only enabled if this value is positive. (default: 0)

  either --record-size or --payload-file must be specified but not both.

  --record-size RECORD-SIZE
                         message size in bytes. Note that you must provide exactly one of --record-size or --payload-file.
  --payload-file PAYLOAD-FILE
                         file to read the message payloads from. This works  only  for  UTF-8  encoded text files. Payloads will be read from
                         this file and a payload will be randomly selected when  sending  messages. Note that you must provide exactly one of
                         --record-size or --payload-file.

3. Questions:

(1) As the number of partitions grows, throughput should keep dropping, but my results show otherwise, as in the figure below:

screenshot

(2) As the number of replicas grows, throughput should also keep dropping, but my results hardly change, as in the figure below:

screenshot

Posted on 2019-08-17

Test command:
./kafka-producer-perf-test.sh --topic test --num-records 10000 --record-size 1000 --throughput -1 --producer-props bootstrap.servers="server1:9292,server2:9292,server3:9292"
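Note that 10,000 records of 1,000 bytes is only about 10 MB in total, so a run this short may be dominated by warm-up noise. To compare partition counts under otherwise identical conditions, the command above could be wrapped in a small loop. This is only a sketch: the topic names `perf-p<N>`, the partition counts, the replication factor of 1, and the larger `--num-records` value are assumptions for illustration, and `kafka-topics.sh --bootstrap-server` assumes Kafka 2.2 or later (older releases take `--zookeeper` instead). With `DRY_RUN=1`, the default here, the script only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch: run the same producer perf test against topics with different partition counts.
# Assumptions (not from the original post): topic names, partition counts, record count.
BOOTSTRAP="${BOOTSTRAP:-server1:9292,server2:9292,server3:9292}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# For each partition count given, create a topic and run the perf test against it.
sweep_partitions() {
  local p
  for p in "$@"; do
    run ./kafka-topics.sh --create --bootstrap-server "$BOOTSTRAP" \
      --topic "perf-p$p" --partitions "$p" --replication-factor 1
    run ./kafka-producer-perf-test.sh --topic "perf-p$p" \
      --num-records 1000000 --record-size 1000 --throughput -1 \
      --producer-props bootstrap.servers="$BOOTSTRAP"
  done
}

sweep_partitions 1 3 6 12
```

Run it from the Kafka `bin` directory with `DRY_RUN=0` to execute the commands, then compare the records/sec figure that each perf-test run reports.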

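1. More partitions increase throughput, because writes are spread out; the more machines you have, the more noticeable the effect.
2. More replicas should, in theory, lower throughput.
The key reason behind both observations is that Kafka does not flush data to disk in real time; writes sit in the cache first. Kafka's design assumes that holding the data in cache on several machines is safer than flushing to disk synchronously.
So as long as internal communication between the brokers and the cache have not become the bottleneck in your cluster, the results will be about the same.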
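One way to check this explanation is to make the producer wait for replication. With `acks=1` (the producer default before Kafka 3.0) the leader acknowledges a write without waiting for the followers, so the replica count has little effect on the measured throughput; with `acks=all` the leader waits for the in-sync replicas. The sketch below only prints the two commands to compare. The topic name `test-rf3` is hypothetical and is assumed to have been created with replication factor 3; the record count is likewise an assumption.

```shell
#!/usr/bin/env bash
# Sketch: build the perf-test command for a given acks setting.
# "test-rf3" is a hypothetical topic assumed to exist with replication factor 3.
BOOTSTRAP="${BOOTSTRAP:-server1:9292,server2:9292,server3:9292}"

perf_cmd() {
  local acks="$1"
  echo ./kafka-producer-perf-test.sh --topic test-rf3 \
    --num-records 1000000 --record-size 1000 --throughput -1 \
    --producer-props "bootstrap.servers=$BOOTSTRAP" "acks=$acks"
}

# Print both variants; run each one and compare the reported records/sec.
perf_cmd 1
perf_cmd all
```

If the `acks=all` run is clearly slower than the `acks=1` run on the same topic, replication is costing throughput and was simply hidden by the acknowledgement setting in the earlier test.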
