3.3 Producer Configs
Essential configuration properties for the producer include:
- metadata.broker.list
- request.required.acks
- producer.type
- serializer.class
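
As a minimal sketch of how these essential properties fit together, the example below configures a synchronous producer with the old Scala/Java client (kafka.javaapi.producer.Producer). The broker addresses, topic name, key, and payload are hypothetical placeholders:

```java
import java.util.Properties;

import kafka.javaapi.producer.Producer;
import kafka.producer.KeyedMessage;
import kafka.producer.ProducerConfig;

public class SyncProducerExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Bootstrap list, used only to fetch metadata (topics, partitions, replicas).
        props.put("metadata.broker.list", "broker1:9092,broker2:9092"); // hypothetical hosts
        // Wait for the leader to acknowledge each produce request.
        props.put("request.required.acks", "1");
        // Synchronous send (the default for producer.type).
        props.put("producer.type", "sync");
        // Encode String payloads instead of the default byte[] pass-through.
        props.put("serializer.class", "kafka.serializer.StringEncoder");

        Producer<String, String> producer =
                new Producer<String, String>(new ProducerConfig(props));
        // The key ("key1") is fed to partitioner.class to pick a partition.
        producer.send(new KeyedMessage<String, String>("my-topic", "key1", "hello"));
        producer.close();
    }
}
```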
Property | Default | Description |
---|---|---|
metadata.broker.list | | This is for bootstrapping, and the producer will only use it for getting metadata (topics, partitions and replicas). The socket connections for sending the actual data will be established based on the broker information returned in the metadata. The format is host1:port1,host2:port2, and the list can be a subset of brokers or a VIP pointing to a subset of brokers. |
request.required.acks | 0 | This value controls when a produce request is considered completed. Specifically, how many other brokers must have committed the data to their log and acknowledged this to the leader? Typical values are 0 (never wait for a broker acknowledgement; lowest latency but weakest durability), 1 (wait until the leader has acknowledged the data; only messages written to a leader that dies before replication are lost) and -1 (wait until all in-sync replicas have acknowledged the data; strongest durability). |
request.timeout.ms | 10000 | The amount of time the broker will wait trying to meet the request.required.acks requirement before sending back an error to the client. |
producer.type | sync | This parameter specifies whether the messages are sent asynchronously in a background thread. Valid values are (1) async for asynchronous send and (2) sync for synchronous send. Setting the producer to async allows batching together of requests (which is great for throughput), but opens the possibility of a client machine failure dropping unsent data; see the async sketch after this table. |
serializer.class | kafka.serializer.DefaultEncoder | The serializer class for messages. The default encoder takes a byte[] and returns the same byte[]. |
key.serializer.class | | The serializer class for keys (defaults to the same as for messages if nothing is given). |
partitioner.class | kafka.producer.DefaultPartitioner | The partitioner class for partitioning messages amongst sub-topics. The default partitioner is based on the hash of the key. |
compression.codec | none | This parameter allows you to specify the compression codec for all data generated by this producer. Valid values are "none", "gzip" and "snappy". |
compressed.topics | null | This parameter allows you to set whether compression should be turned on for particular topics. If the compression codec is anything other than NoCompressionCodec, enable compression only for the specified topics, if any. If the list of compressed topics is empty, enable the specified compression codec for all topics. If the compression codec is NoCompressionCodec, compression is disabled for all topics. |
message.send.max.retries | 3 | This property will cause the producer to automatically retry a failed send request, and it specifies the number of retries when such failures occur. Note that setting a non-zero value here can lead to duplicates in the case of network errors that cause a message to be sent but the acknowledgement to be lost. |
retry.backoff.ms | 100 | Before each retry, the producer refreshes the metadata of relevant topics to see if a new leader has been elected. Since leader election takes a bit of time, this property specifies the amount of time that the producer waits before refreshing the metadata. |
topic.metadata.refresh.interval.ms | 600 * 1000 | The producer generally refreshes the topic metadata from brokers when there is a failure (partition missing, leader not available...). It will also poll regularly (default: every 10 min, i.e. 600000 ms). If you set this to a negative value, metadata will only get refreshed on failure. If you set this to zero, the metadata will get refreshed after each message sent (not recommended). Important note: the refresh happens only AFTER the message is sent, so if the producer never sends a message the metadata is never refreshed. |
queue.buffering.max.ms | 5000 | Maximum time to buffer data when using async mode. For example, a setting of 100 will try to batch together 100 ms of messages to send at once. This will improve throughput but adds message delivery latency due to the buffering. |
queue.buffering.max.messages | 10000 | The maximum number of unsent messages that can be queued up by the producer when using async mode before either the producer must be blocked or data must be dropped. |
queue.enqueue.timeout.ms | -1 | The amount of time to block before dropping messages when running in async mode and the buffer has reached queue.buffering.max.messages. If set to 0, events will be enqueued immediately or dropped if the queue is full (the producer send call will never block). If set to -1, the producer will block indefinitely and never willingly drop a send. |
batch.num.messages | 200 | The number of messages to send in one batch when using async mode. The producer will wait until either this number of messages is ready to send or queue.buffering.max.ms is reached. |
send.buffer.bytes | 100 * 1024 | Socket write buffer size. |
client.id | "" |
The client id is a user-specified string sent in each request
to help trace calls. It should logically identify the application making
the request. (这个client id是用户特定的字符串,在每次请求中包含用来追踪调用,他应该逻辑上可以确认是那个应用发出了这个请求。) |
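
The async-mode properties above interact: when producer.type is async, a batch is flushed once either batch.num.messages messages have accumulated or queue.buffering.max.ms has elapsed, whichever comes first, while queue.buffering.max.messages and queue.enqueue.timeout.ms bound the in-memory queue. The sketch below shows one async configuration; the hosts are hypothetical and the values are illustrative, not recommendations:

```java
import java.util.Properties;

import kafka.javaapi.producer.Producer;
import kafka.producer.ProducerConfig;

public class AsyncProducerConfigExample {
    public static Producer<String, String> build() {
        Properties props = new Properties();
        props.put("metadata.broker.list", "broker1:9092,broker2:9092"); // hypothetical hosts
        props.put("serializer.class", "kafka.serializer.StringEncoder");
        props.put("producer.type", "async");                // batch sends in a background thread
        props.put("batch.num.messages", "500");             // flush after 500 messages...
        props.put("queue.buffering.max.ms", "100");         // ...or after 100 ms, whichever comes first
        props.put("queue.buffering.max.messages", "20000"); // queue bound before blocking or dropping
        props.put("queue.enqueue.timeout.ms", "-1");        // block (never drop) when the queue is full
        props.put("compression.codec", "snappy");           // compress all data from this producer
        return new Producer<String, String>(new ProducerConfig(props));
    }
}
```

Trading latency for throughput this way is the usual reason to pick async mode; setting queue.enqueue.timeout.ms to -1 favors blocking the caller over silently dropping data.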
More details about producer configuration can be found in the Scala class kafka.producer.ProducerConfig.