Kafka: Understanding Linux OS Flush Behavior

半兽人, posted 2015-03-10, last updated 2019-11-09 14:10:10

In Linux, data written to the filesystem is held in the pagecache until it must be written out to disk, either because of an application-level fsync or because of the OS's own flush policy. The flushing is done by a set of background threads called pdflush (or, in post-2.6.32 kernels, "flusher threads").
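As a minimal sketch of the application-level fsync mentioned above, the following Python snippet writes a payload and then asks the kernel to push the file's dirty pages to disk instead of waiting for the background flush. The function name `write_durably` and the file name `segment.log` are hypothetical, chosen only for illustration; this is not how Kafka itself performs its writes.

```python
import os
import tempfile


def write_durably(path: str, payload: bytes) -> int:
    """Write payload to path and force it to disk; returns the file size."""
    with open(path, "wb") as f:
        f.write(payload)      # data goes to Python's buffer first
        f.flush()             # hand the buffered bytes to the kernel (pagecache)
        os.fsync(f.fileno())  # ask the kernel to write the dirty pages out now
    return os.path.getsize(path)


if __name__ == "__main__":
    # Hypothetical file name, used only for this demonstration.
    with tempfile.TemporaryDirectory() as tmp:
        size = write_durably(os.path.join(tmp, "segment.log"), b"hello")
        print("bytes on disk:", size)
```

Without the `os.fsync` call the write would still succeed, but the data would sit in pagecache until the OS flush policy decided to write it back.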

Pdflush has a configurable policy that controls how much dirty data may be held in cache, and for how long, before it must be written back to disk. This policy is described here. When pdflush cannot keep up with the rate at which data is being written, it eventually causes the writing process to block; the added write latency slows the accumulation of dirty data.
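To make the flush policy concrete, here is a hedged sketch that reads the kernel tunables usually associated with it. The four names under `/proc/sys/vm` are not given in the text above; they are the standard Linux `vm.dirty_*` settings, and which ones matter most for a given workload is an assumption the reader should verify. The helper `read_flush_policy` is hypothetical.

```python
from pathlib import Path

# Standard Linux writeback tunables (names are not taken from the article itself).
VM_KEYS = (
    "dirty_background_ratio",     # % of memory dirty before background flushing starts
    "dirty_ratio",                # % of memory dirty before writers are made to block
    "dirty_expire_centisecs",     # age at which dirty data becomes eligible for writeback
    "dirty_writeback_centisecs",  # how often the flusher threads wake up
)


def read_flush_policy(base: str = "/proc/sys/vm") -> dict:
    """Return each tunable as an int, or None if it cannot be read."""
    policy = {}
    for key in VM_KEYS:
        try:
            policy[key] = int((Path(base) / key).read_text().strip())
        except (OSError, ValueError):
            policy[key] = None  # not Linux, no permission, or unexpected content
    return policy


if __name__ == "__main__":
    for name, value in read_flush_policy().items():
        print(f"vm.{name} = {value}")
```

The blocking behaviour described above corresponds to the point where dirty memory reaches the `dirty_ratio` threshold.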

You can see the current state of OS memory usage by running:

> cat /proc/meminfo

The meaning of these values is described in the link above.
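The output of `/proc/meminfo` is long, and for flush behavior only a few fields are usually of interest. The sketch below parses the file and picks out `Dirty`, `Writeback`, and `Cached`; choosing these three fields is an assumption made for illustration, and the function names are hypothetical.

```python
def parse_meminfo(text: str) -> dict:
    """Parse /proc/meminfo-style text into {field name: integer value}.

    Most values are reported in kB; a few (such as HugePages_Total)
    are plain counts. The unit suffix is dropped here.
    """
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return values


def dirty_summary(path: str = "/proc/meminfo") -> dict:
    """Return the fields most relevant to writeback (None if absent)."""
    with open(path) as f:
        info = parse_meminfo(f.read())
    return {key: info.get(key) for key in ("Dirty", "Writeback", "Cached")}


if __name__ == "__main__":
    print(dirty_summary())
```

A steadily growing `Dirty` value under load suggests that data is being written faster than it is being flushed.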

Using pagecache has several advantages over an in-process cache for storing data that will be written out to disk:

  • The I/O scheduler batches consecutive small writes into larger physical writes, which improves throughput.

  • The I/O scheduler attempts to re-sequence writes to minimize movement of the disk head, which improves throughput.

  • It automatically uses all the free memory on the machine.
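The first two points in the list above can be illustrated with a toy model. The sketch below sorts pending writes by offset (re-sequencing) and merges those that touch or overlap (batching). It is only an illustration of the idea under simplified assumptions, not the algorithm any Linux I/O scheduler actually uses, and `coalesce_writes` is a hypothetical name.

```python
def coalesce_writes(writes):
    """Sort (offset, length) writes and merge contiguous or overlapping ones.

    Returns a list of (offset, length) tuples in ascending offset order.
    """
    merged = []
    for offset, length in sorted(writes):  # re-sequence by position
        if merged and offset <= merged[-1][0] + merged[-1][1]:
            # This write starts inside or right at the end of the previous
            # one, so extend the previous write instead of issuing a new one.
            start, prev_len = merged[-1]
            end = max(start + prev_len, offset + length)
            merged[-1] = (start, end - start)
        else:
            merged.append((offset, length))
    return merged


if __name__ == "__main__":
    # Four 4 KiB writes issued out of order; three of them are adjacent.
    pending = [(8192, 4096), (0, 4096), (4096, 4096), (65536, 4096)]
    print(coalesce_writes(pending))  # [(0, 12288), (65536, 4096)]
```

In this example four small requests become two, and they are issued in ascending position, which is the effect the list above credits for the throughput gain.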

