半兽人 posted on: 2015-03-10   last updated: 2016-04-07


We recommend using multiple drives to get good throughput. To ensure good latency, do not share the drives used for Kafka data with application logs or other OS filesystem activity. As of 0.8 you can either RAID these drives together into a single volume or format and mount each drive as its own directory. Since Kafka has replication, the redundancy provided by RAID can also be provided at the application level. This choice has several tradeoffs.

If you configure multiple data directories, partitions will be assigned round-robin to data directories. Each partition will reside entirely in one of the data directories. If data is not well balanced among partitions, this can lead to load imbalance between disks.
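The imbalance described above can be illustrated with a small simulation. This is only a sketch of the round-robin idea as stated in the text, not Kafka's actual placement code; the directory paths (the kind of list you would give the broker's `log.dirs` setting), partition names, and sizes are all hypothetical.

```python
from itertools import cycle


def assign_round_robin(partition_sizes, data_dirs):
    """Place each partition in the next directory in turn.

    Returns (placement, usage): which directory each partition landed in,
    and the total size stored in each directory.
    """
    usage = {d: 0 for d in data_dirs}
    placement = {}
    # Each partition lives entirely in one directory; sizes are ignored
    # when choosing the directory, which is what causes the imbalance.
    for (name, size), directory in zip(partition_sizes.items(), cycle(data_dirs)):
        placement[name] = directory
        usage[directory] += size
    return placement, usage


# Hypothetical partition sizes in GB: two busy partitions, two quiet ones.
partitions = {"orders-0": 400, "clicks-0": 10, "orders-1": 380, "clicks-1": 12}
dirs = ["/data/kafka-1", "/data/kafka-2"]

placement, usage = assign_round_robin(partitions, dirs)
print(usage)
```

In this made-up case both large partitions land on the first drive, so one disk carries almost all of the data even though each directory holds the same number of partitions.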


RAID can potentially do better at balancing load between disks (although it doesn't always seem to) because it balances load at a lower level. The primary downside of RAID is that it usually carries a significant penalty for write throughput and reduces the available disk space.
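To make the disk-space cost concrete, here is a rough sketch that compares usable capacity for a few common layouts, using the standard capacity formulas for each RAID level. The drive count and size are hypothetical, and real arrays lose a little more to metadata and filesystem overhead. Note that RAID 0 (striping only) does not reduce capacity; the reduction applies to the redundant levels.

```python
def usable_capacity_tb(layout, n_disks, disk_tb):
    """Approximate usable capacity for n identical disks under a given layout."""
    if layout in ("separate-dirs", "raid0"):
        return n_disks * disk_tb          # no redundancy, full capacity
    if layout == "raid10":
        if n_disks % 2 != 0:
            raise ValueError("RAID 10 needs an even number of disks")
        return (n_disks // 2) * disk_tb   # every block is mirrored once
    if layout == "raid5":
        return (n_disks - 1) * disk_tb    # one disk's worth of parity
    if layout == "raid6":
        return (n_disks - 2) * disk_tb    # two disks' worth of parity
    raise ValueError(f"unknown layout: {layout}")


# Hypothetical broker with six 4 TB drives.
for layout in ("separate-dirs", "raid0", "raid10", "raid5", "raid6"):
    print(layout, usable_capacity_tb(layout, 6, 4), "TB")
```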


Another potential benefit of RAID is the ability to tolerate disk failures. However, our experience has been that rebuilding the RAID array is so I/O intensive that it effectively disables the server, so this does not provide much real availability improvement.

