Kafka: Application vs. OS Flush Management

Posted by 半兽人 on 2015-03-10. Last updated: 2016-04-09.



Kafka always writes all data to the filesystem immediately, and it lets you configure a flush policy that controls when data is forced out of the OS cache and onto disk. The flush policy can force data to disk after a period of time or after a certain number of messages have been written. There are several choices in this configuration.
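
The two triggers described above (elapsed time and message count) can be sketched as a small append-only log. This is only a toy model, not Kafka's implementation: the class `FlushPolicyLog` and its parameters `max_messages` and `max_interval_s` are hypothetical names chosen for illustration, and the time trigger is only evaluated when a message is appended, whereas a real system would more likely check it from a background timer.

```python
import os
import time


class FlushPolicyLog:
    """Toy append-only log that fsyncs after N messages or T seconds.

    All names here are illustrative assumptions, not Kafka APIs.
    """

    def __init__(self, path, max_messages=None, max_interval_s=None,
                 clock=time.monotonic):
        self._file = open(path, "ab")
        self._max_messages = max_messages      # None disables the count trigger
        self._max_interval_s = max_interval_s  # None disables the time trigger
        self._clock = clock
        self._unflushed = 0
        self._last_flush = clock()
        self.fsync_calls = 0

    def append(self, payload: bytes) -> None:
        # The write always reaches the filesystem (OS page cache) right away.
        self._file.write(payload + b"\n")
        self._file.flush()
        self._unflushed += 1
        if self._should_fsync():
            self.force()

    def _should_fsync(self) -> bool:
        by_count = (self._max_messages is not None
                    and self._unflushed >= self._max_messages)
        by_time = (self._max_interval_s is not None
                   and self._clock() - self._last_flush >= self._max_interval_s)
        return by_count or by_time

    def force(self) -> None:
        # Force the data out of the OS cache and onto disk.
        os.fsync(self._file.fileno())
        self.fsync_calls += 1
        self._unflushed = 0
        self._last_flush = self._clock()

    def close(self) -> None:
        self._file.close()
```

With both limits left at `None`, the log never calls fsync itself and relies entirely on the operating system's background flush.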


Kafka must eventually call fsync to know that data was flushed. When recovering from a crash, for any log segment not known to be fsync'd, Kafka checks the integrity of each message by verifying its CRC and also rebuilds the accompanying offset index file as part of the recovery process executed on startup.
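
The recovery idea (verify each message's CRC and rebuild the offset index while scanning) can be illustrated with a deliberately simplified record layout. The format below (a 4-byte length, a 4-byte CRC32, then the payload) is an assumption made for this sketch and is not Kafka's on-disk format; the helpers `encode_record` and `recover_segment` are hypothetical. Stopping at the first invalid record and keeping only the valid prefix is likewise a choice of this sketch.

```python
import struct
import zlib

# Toy record header: payload length and CRC32 of the payload (big-endian).
HEADER = struct.Struct(">II")


def encode_record(payload: bytes) -> bytes:
    """Serialize one record in the toy format."""
    return HEADER.pack(len(payload), zlib.crc32(payload)) + payload


def recover_segment(data: bytes):
    """Scan a segment, validate CRCs, and rebuild the offset index.

    Returns (valid_length, index), where index[i] is the byte position
    of record i and valid_length is the size of the intact prefix.
    """
    index = []
    pos = 0
    while pos + HEADER.size <= len(data):
        length, crc = HEADER.unpack_from(data, pos)
        start = pos + HEADER.size
        end = start + length
        if end > len(data):
            break  # record was only partially written before the crash
        if zlib.crc32(data[start:end]) != crc:
            break  # payload does not match its checksum
        index.append(pos)
        pos = end
    return pos, index
```

A caller would truncate the segment file to `valid_length` and write `index` out as the new index file.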


Note that durability in Kafka does not require syncing data to disk, as a failed node will always recover from its replicas.


We recommend using the default flush settings, which disable application fsync entirely. This means relying on the background flush done by the OS and Kafka's own background flush. This provides the best of all worlds for most uses: no knobs to tune, great throughput and latency, and full recovery guarantees. We generally feel that the guarantees provided by replication are stronger than sync to local disk; however, the paranoid may still prefer having both, and application-level fsync policies are still supported.


The drawback of using application-level flush settings is that they are less efficient in their disk usage pattern (they give the OS less leeway to re-order writes) and they can introduce latency, as fsync in most Linux filesystems blocks writes to the file, whereas background flushing does much more granular page-level locking.
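
To get a feel for the latency cost on your own machine, a rough comparison like the following can help. It is only a sketch: the function `write_messages` is a hypothetical helper, it does not show the file-level versus page-level locking itself, and the timings it returns depend heavily on the disk, filesystem, and OS.

```python
import os
import time


def write_messages(path, messages, fsync_each):
    """Write messages to path; optionally fsync after every write.

    Returns the elapsed wall-clock time in seconds.
    """
    start = time.perf_counter()
    with open(path, "wb") as f:
        for m in messages:
            f.write(m)
            if fsync_each:
                # Application-level flush: push this write to disk now.
                f.flush()
                os.fsync(f.fileno())
    # With fsync_each=False the OS decides when pages reach the disk.
    return time.perf_counter() - start
```

Calling it twice with the same messages, once with `fsync_each=True` and once with `fsync_each=False`, produces identical files; only the elapsed time differs.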


In general you don't need to do any low-level tuning of the filesystem, but in the next few sections we will go over some of this in case it is useful.






