Kafka: Notes on the ext4 Filesystem

半兽人, posted 2015-03-10, last updated 2016-04-09



Ext4 may or may not be the best filesystem for Kafka. Filesystems such as XFS reportedly handle locking during fsync better, but we have only tried Ext4.


None of these settings has to be tuned, but those who want to optimize performance have a few knobs that will help:

  • data=writeback: Ext4 defaults to data=ordered, which puts a strong order on some writes. Kafka does not require this ordering because it performs very paranoid data recovery on all unflushed log data. This setting removes the ordering constraint and seems to significantly reduce latency.
  • Disabling journaling: Journaling is a tradeoff. It makes reboots faster after server crashes, but it introduces a great deal of additional locking, which adds variance to write performance. Those who don't care about reboot time and want to reduce a major source of write latency spikes can turn off journaling entirely.
  • commit=num_secs: This tunes how often ext4 commits to its metadata journal. A lower value reduces the loss of unflushed data during a crash. A higher value improves throughput.
  • nobh: This setting controls additional ordering guarantees when using data=writeback mode. It should be safe with Kafka, as we do not depend on write ordering, and it improves throughput and latency.
  • delalloc: Delayed allocation means that the filesystem avoids allocating any blocks until the physical write occurs. This allows ext4 to allocate a large extent instead of smaller pages and helps ensure the data is written sequentially. This feature is great for throughput. It does seem to involve some locking in the filesystem, which adds a bit of latency variance.
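
To see which of these options a Kafka log volume is actually mounted with, one option is to read its entry in /proc/mounts. The following is only a minimal sketch: the device name, mount point, sample line, and function names are hypothetical, and it assumes the usual /proc/mounts field order (device, mount point, filesystem type, options).

```python
# Sketch: inspect one /proc/mounts-style line for the ext4 options listed above.
# The sample line, device name, and mount point are hypothetical.

def parse_mount_line(line):
    """Return (device, mountpoint, fstype, options) for one /proc/mounts line."""
    device, mountpoint, fstype, raw_opts = line.split()[:4]
    options = {}
    for item in raw_opts.split(","):
        key, sep, value = item.partition("=")
        # Flags such as "noatime" carry no value; store True for them.
        options[key] = value if sep else True
    return device, mountpoint, fstype, options


def summarize_ext4_options(line):
    """Summarize the data=, commit= and delalloc settings of an ext4 mount."""
    device, mountpoint, fstype, options = parse_mount_line(line)
    if fstype != "ext4":
        raise ValueError(f"{mountpoint} is {fstype}, not ext4")
    return {
        "mountpoint": mountpoint,
        # ext4 uses data=ordered when no data= option is given.
        "data_mode": options.get("data", "ordered"),
        # None means no explicit commit= interval was set.
        "commit_secs": int(options["commit"]) if "commit" in options else None,
        # Delayed allocation is on unless "nodelalloc" is present.
        "delalloc": "nodelalloc" not in options,
    }


sample = "/dev/sdb1 /var/kafka-logs ext4 rw,noatime,data=writeback,commit=30 0 0"
summary = summarize_ext4_options(sample)
print(summary)
```

On a real host the same function could be applied to each line read from /proc/mounts. Whether the journal itself is disabled is a filesystem feature rather than a mount option, so this sketch does not cover it.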





