3.3 Producer Configuration

Essential configuration properties for the producer include (a minimal setup sketch follows this list):

  • metadata.broker.list


  • request.required.acks


  • producer.type


  • serializer.class

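As a point of reference, here is a minimal sketch that wires these four properties into the 0.8-era Scala producer API this section documents. The broker addresses, topic name, key, and payload are illustrative placeholders, not values from the original text.

    import java.util.Properties
    import kafka.producer.{KeyedMessage, Producer, ProducerConfig}

    object MinimalProducer extends App {
      val props = new Properties()
      // Bootstrap list: used only to fetch metadata; data connections are
      // opened to whichever brokers the returned metadata names.
      props.put("metadata.broker.list", "host1:9092,host2:9092") // placeholder hosts
      props.put("request.required.acks", "1")  // wait for the leader's acknowledgement
      props.put("producer.type", "sync")       // send on the calling thread
      props.put("serializer.class", "kafka.serializer.StringEncoder")

      val producer = new Producer[String, String](new ProducerConfig(props))
      // KeyedMessage(topic, key, message); the key is what the partitioner hashes.
      producer.send(new KeyedMessage[String, String]("test-topic", "key1", "hello"))
      producer.close()
    }

Note that serializer.class is overridden to StringEncoder here so that plain strings can be sent; the default encoder works on byte arrays, as described below.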

Each property below is listed with its default value, where one exists.

metadata.broker.list (default: none)

This is for bootstrapping, and the producer uses it only for fetching metadata (topics, partitions, and replicas). The socket connections for sending the actual data are established based on the broker information returned in the metadata. The format is host1:port1,host2:port2, and the list can be a subset of brokers or a VIP pointing to a subset of brokers.

request.required.acks (default: 0)

This value controls when a produce request is considered completed. Specifically, how many other brokers must have committed the data to their log and acknowledged this to the leader? Typical values are:

  • 0, which means that the producer never waits for an acknowledgement from the broker (the same behavior as 0.7). This option provides the lowest latency but the weakest durability guarantees (some data will be lost when a server fails).

  • 1, which means that the producer gets an acknowledgement after the leader replica has received the data. This option provides better durability, as the client waits until the server acknowledges the request as successful (only messages that were written to the now-dead leader but not yet replicated will be lost).

  • -1, which means that the producer gets an acknowledgement after all in-sync replicas have received the data. This option provides the greatest level of durability. However, it does not completely eliminate the risk of message loss, because the number of in-sync replicas may, in rare cases, shrink to 1. If you want to ensure that some minimum number of replicas (typically a majority) receive a write, then you must set the topic-level min.insync.replicas setting. Please read the Replication section of the design documentation for a more in-depth discussion. A durability-oriented sketch follows this entry.

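For the strongest guarantee this producer offers, -1 is typically paired with a topic-level min.insync.replicas, as the description above notes. A hedged sketch, continuing the props object from the minimal example:

    // Continuing the props object from the minimal sketch above.
    props.put("request.required.acks", "-1") // wait for all in-sync replicas
    // min.insync.replicas is a topic-level setting, not a producer property;
    // it is applied when creating or altering the topic (with the 0.8.1+
    // topic tool this is, to my knowledge, --config min.insync.replicas=2).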



request.timeout.ms (default: 10000)

The amount of time the broker will wait trying to meet the request.required.acks requirement before sending back an error to the client.


producer.type (default: sync)

This parameter specifies whether the messages are sent asynchronously in a background thread. Valid values are (1) async for asynchronous send and (2) sync for synchronous send. Setting the producer to async allows requests to be batched together (which is great for throughput), but opens the possibility that a failure of the client machine drops unsent data. An illustrative set of async settings follows this entry.

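The batching knobs that async mode interacts with (queue.buffering.max.ms, queue.buffering.max.messages, queue.enqueue.timeout.ms, and batch.num.messages) are described further down this reference. As an illustration only, with example values rather than recommendations, they might be combined like this, again continuing the same props object:

    // Continuing the props object from the minimal sketch above.
    props.put("producer.type", "async")
    props.put("queue.buffering.max.ms", "100")         // batch up to 100 ms of messages
    props.put("queue.buffering.max.messages", "10000") // buffer at most 10000 unsent messages
    props.put("queue.enqueue.timeout.ms", "-1")        // block, never drop, when the buffer is full
    props.put("batch.num.messages", "200")             // flush once 200 messages are queued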


serializer.class (default: kafka.serializer.DefaultEncoder)

The serializer class for messages. The default encoder takes a byte[] and returns the same byte[].


key.serializer.class (default: same as serializer.class)

The serializer class for keys (defaults to the same as for messages if nothing is given). A sketch of a custom encoder follows this entry.

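A custom serializer implements the kafka.serializer.Encoder trait. The sketch below leans on two assumptions about the 0.8-era API: the trait's single toBytes method, and the convention that the producer instantiates the class reflectively, passing a VerifiableProperties to its constructor.

    import kafka.serializer.Encoder
    import kafka.utils.VerifiableProperties

    // Hypothetical encoder that upper-cases strings before encoding them.
    // The VerifiableProperties parameter exists because the producer
    // constructs the encoder via reflection.
    class UpperCaseEncoder(props: VerifiableProperties = null) extends Encoder[String] {
      override def toBytes(s: String): Array[Byte] = s.toUpperCase.getBytes("UTF-8")
    }

It would then be selected with props.put("serializer.class", "UpperCaseEncoder"), where the class name is hypothetical and would in practice be fully qualified with its package.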

partitioner.class (default: kafka.producer.DefaultPartitioner)

The partitioner class for partitioning messages amongst sub-topics. The default partitioner is based on the hash of the key. A sketch of a custom partitioner follows this entry.

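A custom partitioner implements kafka.producer.Partitioner. Another hedged sketch; note that in the earliest 0.8 releases the trait was generic (Partitioner[T]), so the exact signature depends on the minor version:

    import kafka.producer.Partitioner
    import kafka.utils.VerifiableProperties

    // Hypothetical partitioner: fold the key's hash into the partition range,
    // avoiding the negative result that a raw hashCode % n can produce.
    class ModuloPartitioner(props: VerifiableProperties = null) extends Partitioner {
      override def partition(key: Any, numPartitions: Int): Int = {
        val h = key.hashCode % numPartitions
        if (h < 0) h + numPartitions else h
      }
    }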

compression.codec (default: none)

This parameter allows you to specify the compression codec for all data generated by this producer. Valid values are "none", "gzip" and "snappy".



compressed.topics (default: null)

This parameter allows you to set whether compression should be turned on for particular topics. If the compression codec is anything other than NoCompressionCodec, compression is enabled only for the specified topics, if any. If the list of compressed topics is empty, the specified compression codec is applied to all topics. If the compression codec is NoCompressionCodec, compression is disabled for all topics.



message.send.max.retries (default: 3)

This property will cause the producer to automatically retry a failed send request. It specifies the number of retries when such failures occur. Note that setting a non-zero value here can lead to duplicates in the case of network errors that cause a message to be sent but the acknowledgement to be lost.



retry.backoff.ms (default: 100)

Before each retry, the producer refreshes the metadata of relevant topics to see if a new leader has been elected. Since leader election takes a bit of time, this property specifies the amount of time that the producer waits before refreshing the metadata. A sketch combining the two retry properties follows this entry.

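Taken together, the two retry properties might be tuned like this (values illustrative, continuing the same props object). Keep in mind the caveat above: retries plus a lost acknowledgement can resend a message that was in fact already written.

    // Continuing the props object from the minimal sketch above.
    props.put("message.send.max.retries", "5") // retry a bit harder than the default 3
    props.put("retry.backoff.ms", "200")       // give leader election more time before
                                               // the pre-retry metadata refresh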


topic.metadata.refresh.interval.ms (default: 600 * 1000)

The producer generally refreshes the topic metadata from brokers when there is a failure (partition missing, leader not available, ...). It will also poll regularly (default: every 10 min, i.e. 600000 ms). If you set this to a negative value, metadata will only get refreshed on failure. If you set this to zero, the metadata will get refreshed after each message is sent (not recommended, as the overhead is high). Important note: the refresh happens only AFTER the message is sent, so if the producer never sends a message the metadata is never refreshed.



queue.buffering.max.ms (default: 5000)

Maximum time to buffer data when using async mode. For example, a setting of 100 will try to batch together 100 ms worth of messages to send at once. This improves throughput but adds message delivery latency due to the buffering.


queue.buffering.max.messages (default: 10000)

The maximum number of unsent messages that can be queued up by the producer when using async mode before either the producer must be blocked or data must be dropped.


queue.enqueue.timeout.ms (default: -1)

The amount of time to block before dropping messages when running in async mode and the buffer has reached queue.buffering.max.messages. If set to 0, events will be enqueued immediately, or dropped if the queue is full (the producer send call will never block). If set to -1, the producer will block indefinitely and never willingly drop a send.



batch.num.messages (default: 200)

The number of messages to send in one batch when using async mode. The producer will wait until either this number of messages is ready to send or queue.buffering.max.ms is reached.


send.buffer.bytes (default: 100 * 1024)

Socket write buffer size.


client.id (default: "")

The client id is a user-specified string sent in each request to help trace calls. It should logically identify the application making the request.



More details about producer configuration can be found in the Scala class kafka.producer.ProducerConfig.