3.3 Producer 配置

Essential configuration properties for the producer include:

  • request.required.acks
  • producer.type
  • serializer.class
Property Default Description

This is for bootstrapping and the producer will only use it for getting metadata (topics, partitions and replicas). The socket connections for sending the actual data will be established based on the broker information returned in the metadata. The format is host1:port1,host2:port2, and the list can be a subset of brokers or a VIP pointing to a subset of brokers.

request.required.acks 0

This value controls when a produce request is considered completed. Specifically, how many other brokers must have committed the data to their log and acknowledged this to the leader? Typical values are

  • 0, which means that the producer never waits for an acknowledgement from the broker (the same behavior as 0.7). This option provides the lowest latency but the weakest durability guarantees (some data will be lost when a server fails).
  • 1, which means that the producer gets an acknowledgement after the leader replica has received the data. This option provides better durability as the client waits until the server acknowledges the request as successful (only messages that were written to the now-dead leader but not yet replicated will be lost).
  • -1, The producer gets an acknowledgement after all in-sync replicas have received the data. This option provides the greatest level of durability. However, it does not completely eliminate the risk of message loss because the number of in sync replicas may, in rare cases, shrink to 1. If you want to ensure that some minimum number of replicas (typically a majority) receive a write, then you must set the topic-level min.insync.replicas setting. Please read the Replication section of the design documentation for a more in-depth discussion.
    0: 表示producer从来不等待来自broker的确认信息(和0.7一样的行为)。这个选择提供了最小的时延但同时风险最大(因为当server宕机时,数据将会丢失)。
    1:表示获得leader replica已经接收了数据的确认信息。这个选择时延较小同时确保了server确认接收成功。
    -1:producer会获得所有同步replicas都收到数据的确认。同时时延最大,然而,这种方式并没有完全消除丢失消息的风险,因为同步replicas的数量可能是1.如果你想确保某些replicas接收到数据,那么你应该在topic-level设置中选项min.insync.replicas设置一下。请阅读一下设计文档,可以获得更深入的讨论。 10000 The amount of time the broker will wait trying to meet the request.required.acks requirement before sending back an error to the client.
producer.type sync

This parameter specifies whether the messages are sent asynchronously in a background thread. Valid values are (1) async for asynchronous send and (2) sync for synchronous send. By setting the producer to async we allow batching together of requests (which is great for throughput) but open the possibility of a failure of the client machine dropping unsent data.
(1)  async: 异步发送
(2)  sync: 同步发送

serializer.class kafka.serializer.DefaultEncoder The serializer class for messages. The default encoder takes a byte[] and returns the same byte[].
The serializer class for keys (defaults to the same as for messages if nothing is given).
partitioner.class kafka.producer.DefaultPartitioner The partitioner class for partitioning messages amongst sub-topics. The default partitioner is based on the hash of the key.
partitioner 类,用于在subtopics之间划分消息。默认partitioner基于key的hash表
compression.codec none

This parameter allows you to specify the compression codec for all data generated by this producer. Valid values are "none", "gzip" and "snappy".
此项参数可以设置压缩数据的codec,可选codec为:“none”, “gzip”, “snappy”

compressed.topics null

This parameter allows you to set whether compression should be turned on for particular topics. If the compression codec is anything other than NoCompressionCodec, enable compression only for specified topics if any. If the list of compressed topics is empty, then enable the specified compression codec for all topics. If the compression codec is NoCompressionCodec, compression is disabled for all topics
此项参数可以设置某些特定的topics是否进行压缩。如果压缩codec是NoCompressCodec之外的codec,则对指定的topics数 据应用这些codec。如果压缩topics列表是空,则将特定的压缩codec应用于所有topics。如果压缩的codec是 NoCompressionCodec,压缩对所有topics军不可用。

message.send.max.retries 3

This property will cause the producer to automatically retry a failed send request. This property specifies the number of retries when such failures occur. Note that setting a non-zero value here can lead to duplicates in the case of network errors that cause a message to be sent but the acknowledgement to be lost.
此项参数将使producer自动重试失败的发送请求。此项参数将置顶重试的次数。注意:设定非0值将导致重复某些网络错误:引起一条发送并引起确认丢失 100

Before each retry, the producer refreshes the metadata of relevant topics to see if a new leader has been elected. Since leader election takes a bit of time, this property specifies the amount of time that the producer waits before refreshing the metadata.
在每次重试之前,producer会更新相关topic的metadata,以此进行查看新的leader是否分配好了。因为leader的选择需要一点时间,此选项指定更新metadata之前producer需要等待的时间。 600 * 1000

The producer generally refreshes the topic metadata from brokers when there is a failure (partition missing, leader not available...). It will also poll regularly (default: every 10min so 600000ms). If you set this to a negative value, metadata will only get refreshed on failure. If you set this to zero, the metadata will get refreshed after each message sent (not recommended). Important note: the refresh happen only AFTER the message is sent, so if the producer never sends a message the metadata is never refreshed
producer一般会在某些失败的情况下(partition missing,leader不可用等)更新topic的metadata。他将会规律的循环。如果你设置为负值,metadata只有在失败的情况下才 更新。如果设置为0,metadata会在每次消息发送后就会更新(不建议这种选择,系统消耗太大)。重要提示: 更新是有在消息发送后才会发生,因此,如果producer从来不发送消息,则metadata从来也不会更新。 5000 Maximum time to buffer data when using async mode. For example a setting of 100 will try to batch together 100ms of messages to send at once. This will improve throughput but adds message delivery latency due to the buffering.
queue.buffering.max.messages 10000 The maximum number of unsent messages that can be queued up the producer when using async mode before either the producer must be blocked or data must be dropped.
当使用async模式时,在在producer必须被阻塞或者数据必须丢失之前,可以缓存到队列中的未发送的最大消息条数 -1

The amount of time to block before dropping messages when running in async mode and the buffer has reached queue.buffering.max.messages. If set to 0 events will be enqueued immediately or dropped if the queue is full (the producer send call will never block). If set to -1 the producer will block indefinitely and never willingly drop a send.

batch.num.messages 200 The number of messages to send in one batch when using async mode. The producer will wait until either this number of messages are ready to send or is reached.
send.buffer.bytes 100 * 1024 Socket write buffer size
(socket 写缓存的大小) "" The client id is a user-specified string sent in each request to help trace calls. It should logically identify the application making the request.
这个client  id是用户特定的字符串,在每次请求中包含用来追踪调用,他应该逻辑上可以确认是那个应用发出了这个请求。

More details about producer configuration can be found in the scala classkafka.producer.ProducerConfig.
更多细节需要查看scala类 kafka.producer.ProducerConfig

发表于: 4年前   最后更新时间: 1年前   游览量:11052
上一条: Consumer配置
下一条: Kafka New Producer配置