When reposting, please cite the source: http://www.cnblogs.com/dongxiao-yang/p/5206631.html
We recommend using multiple drives, for two reasons: (1) to get good throughput, and (2) to keep Kafka data off the drives used for application logs or other OS filesystem activity, which ensures good latency. You can either RAID these drives together into a single volume or format and mount each drive as its own directory. Since Kafka has replication, the redundancy that RAID provides can also be provided at the application level. This choice has several tradeoffs.
If you configure multiple data directories, partitions will be assigned round-robin to those directories. Each partition lives entirely in one data directory. If data is not well balanced among partitions, this can lead to load imbalance between disks.
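The imbalance described above is easy to see in a toy model. The following is a minimal sketch of round-robin placement, not Kafka's actual assignment code; the partition names, sizes, and mount points are all hypothetical values chosen for illustration.

```python
from itertools import cycle


def assign_round_robin(partition_sizes, data_dirs):
    """Place each partition wholly in one directory, cycling through the directories in order.

    This is a simplified model of the policy described in the text, not Kafka's implementation.
    """
    usage = {d: 0 for d in data_dirs}   # bytes (here: GB) stored per directory
    placement = {}                      # partition name -> directory
    dirs = cycle(data_dirs)
    for name, size_gb in partition_sizes.items():
        target = next(dirs)
        placement[name] = target
        usage[target] += size_gb
    return placement, usage


# Hypothetical partitions with deliberately skewed sizes (GB).
partition_sizes = {
    "orders-0": 400,
    "orders-1": 20,
    "clicks-0": 380,
    "clicks-1": 30,
}
data_dirs = ["/data/kafka1", "/data/kafka2"]  # hypothetical mount points

placement, usage = assign_round_robin(partition_sizes, data_dirs)
print(usage)
```

Both directories receive the same number of partitions, yet one disk ends up holding far more data than the other, because the assignment counts partitions and ignores their sizes.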
RAID can potentially do better at balancing load between disks (although it doesn't always seem to) because it balances load at a lower level. The primary downside of RAID is that it usually takes a big hit on write throughput and reduces the available disk space.
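To put rough numbers on the disk space point, here is a small sketch using the standard usable-capacity formulas for common RAID levels. The helper name `usable_capacity_tb` and the six 4 TB drives are assumptions for illustration, and the sketch ignores filesystem and controller overhead. It says nothing about write throughput, which depends on the workload and the controller.

```python
def usable_capacity_tb(level, n_disks, disk_tb):
    """Return usable capacity in TB for n_disks identical drives of disk_tb each.

    Hypothetical helper based on the textbook formulas for each layout.
    """
    if level in ("jbod", "raid0"):
        # Separate mounts or plain striping: no space spent on redundancy.
        return n_disks * disk_tb
    if level == "raid1":
        # All disks mirror one another, so only one disk's worth is usable.
        return disk_tb
    if level == "raid5":
        if n_disks < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        return (n_disks - 1) * disk_tb  # one disk's worth of parity
    if level == "raid6":
        if n_disks < 4:
            raise ValueError("RAID 6 needs at least 4 disks")
        return (n_disks - 2) * disk_tb  # two disks' worth of parity
    if level == "raid10":
        if n_disks < 4 or n_disks % 2:
            raise ValueError("RAID 10 needs an even number of disks, at least 4")
        return (n_disks // 2) * disk_tb  # every disk has a mirror
    raise ValueError(f"unknown level: {level}")


# Hypothetical server: six 4 TB drives.
for level in ("jbod", "raid5", "raid6", "raid10"):
    print(level, usable_capacity_tb(level, 6, 4), "TB")
```

With these assumed drives, mounting each disk as its own directory keeps all 24 TB available, while the redundant layouts give up between one disk's worth and half of the total.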
Another potential benefit of RAID is the ability to tolerate disk failures. However, our experience has been that rebuilding a RAID array is so I/O intensive that it effectively disables the server, so this does not provide much real availability improvement.