Kafka Disks and Filesystem

半兽人   Posted: 2015-03-10   Last updated: 2016-04-07

Disks and Filesystem


We recommend using multiple drives to get good throughput. To ensure good latency, do not share the drives used for Kafka data with application logs or other OS filesystem activity. As of 0.8 you can either RAID these drives together into a single volume or format and mount each drive as its own directory. Since Kafka has replication, the redundancy provided by RAID can also be provided at the application level. This choice has several tradeoffs.


If you configure multiple data directories, partitions will be assigned round-robin to data directories. Each partition will reside entirely in one of the data directories. If data is not well balanced among partitions, this can lead to load imbalance between disks.
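The imbalance described above can be illustrated with a small simulation. This is only a simplified model of round-robin placement, not Kafka's actual implementation; the partition names, sizes, and directory paths are hypothetical. On a real broker, multiple data directories are listed in the `log.dirs` setting.

```python
from itertools import cycle


def assign_round_robin(partition_sizes, data_dirs):
    """Place each partition wholly in one directory, cycling through the directories in order.

    Simplified model for illustration only; not Kafka's actual placement code.
    """
    usage = {d: 0 for d in data_dirs}
    placement = {}
    for (name, size), directory in zip(partition_sizes.items(), cycle(data_dirs)):
        placement[name] = directory
        usage[directory] += size
    return placement, usage


# Hypothetical partition sizes in GB: two large partitions, two small ones.
partitions = {"events-0": 500, "events-1": 20, "clicks-0": 480, "clicks-1": 10}
# Hypothetical mount points, one per drive.
dirs = ["/data/disk1/kafka", "/data/disk2/kafka"]

placement, usage = assign_round_robin(partitions, dirs)
print(usage)  # {'/data/disk1/kafka': 980, '/data/disk2/kafka': 30}
```

Both directories hold the same number of partitions, yet one disk carries nearly all of the data, because the placement counts partitions and ignores their sizes.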


RAID can potentially do better at balancing load between disks (although it doesn't always seem to) because it balances load at a lower level. The primary downsides of RAID are that it usually carries a significant penalty for write throughput and that it reduces the available disk space.
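To make the disk space cost concrete, here is a rough sketch of usable capacity under a few common layouts. The text does not say which RAID level is meant, so the levels, the disk count, and the disk size below are assumptions chosen for illustration. The formulas are the standard nominal capacities and ignore filesystem and controller overhead.

```python
def usable_capacity(layout, n_disks, disk_size_tb):
    """Return nominal usable capacity in TB for n identical disks.

    Illustrative only: ignores filesystem overhead, hot spares, and controller details.
    """
    if layout in ("jbod", "raid0"):
        # Separate directories per drive, or striping: no space spent on redundancy.
        return n_disks * disk_size_tb
    if layout == "raid10":
        # Mirrored pairs: half the raw space holds copies.
        if n_disks < 4 or n_disks % 2:
            raise ValueError("raid10 needs an even number of disks, at least 4")
        return n_disks * disk_size_tb / 2
    if layout == "raid5":
        # One disk's worth of space holds parity.
        if n_disks < 3:
            raise ValueError("raid5 needs at least 3 disks")
        return (n_disks - 1) * disk_size_tb
    if layout == "raid6":
        # Two disks' worth of space hold parity.
        if n_disks < 4:
            raise ValueError("raid6 needs at least 4 disks")
        return (n_disks - 2) * disk_size_tb
    raise ValueError(f"unknown layout: {layout}")


# Hypothetical server: 6 drives of 4 TB each.
for layout in ("jbod", "raid10", "raid5", "raid6"):
    print(layout, usable_capacity(layout, 6, 4), "TB")
```

With these assumed numbers, mounting each drive as its own directory exposes 24 TB, while RAID 10 exposes 12 TB, RAID 5 exposes 20 TB, and RAID 6 exposes 16 TB.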


Another potential benefit of RAID is the ability to tolerate disk failures. However, our experience has been that rebuilding the RAID array is so I/O intensive that it effectively disables the server, so this does not provide much real availability improvement.





