Apache HBase Performance Tuning: A Summary of the Official Documentation

Apache HBase Performance Tuning

RAM, RAM, RAM. Don't starve HBase.

Use a 64-bit platform.

Swappiness must be set to 0.

Use native, hardware-accelerated checksumming for HDFS; see: https://blogs.apache.org/hbase/entry/saving_cpu_using_native_hadoop

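Use the CMS collector for the old generation and set -XX:CMSInitiatingOccupancyFraction to 60 or 70 (the lower the value, the more often GC runs and the more CPU it consumes).
Use the ParNew collector (-XX:+UseParNewGC) for the young generation.
Use MSLAB to prevent the heap fragmentation caused by memstores: just set hbase.hregion.memstore.mslab.enabled to true. It has been true by default since 0.92.
HBASE-8163 describes the MSLAB pool mechanism, which lets MSLAB be used more efficiently.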
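Besides the mechanism described in HBASE-8163, you can also set -XX:PretenureSizeThreshold to a value smaller than hbase.hregion.memstore.mslab.chunksize, so that MSLAB chunks are allocated directly in the old generation and the unnecessary copy and promotion out of the young generation is avoided.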
For other general Java GC guidance, see "Eliminating Large JVM GC Pauses Caused by Background IO Traffic".
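
As a rough sketch, the MSLAB settings discussed above might look like this in hbase-site.xml. The chunk size shown (2 MB) is an assumption for illustration, believed to match the usual default; check the value for your HBase version before relying on it.

```xml
<configuration>
  <!-- Enable MSLAB to limit memstore-induced heap fragmentation -->
  <property>
    <name>hbase.hregion.memstore.mslab.enabled</name>
    <value>true</value>
  </property>
  <!-- Assumed chunk size: 2 MB, in bytes (2 * 1024 * 1024) -->
  <property>
    <name>hbase.hregion.memstore.mslab.chunksize</name>
    <value>2097152</value>
  </property>
</configuration>
```

The chunk size matters here because it is the number that -XX:PretenureSizeThreshold is chosen relative to.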
Important configurations

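`hbase.master.wait.on.regionservers.mintostart`	On large clusters, increase this setting so that regions are not assigned to just a few RegionServers.
`zookeeper.session.timeout`	The default is 3 minutes. With a well-tuned JVM it can be lowered, so that a crashed server is detected and handled sooner.
`dfs.datanode.failed.volumes.tolerated`	Number of failed data volumes tolerated. This is an HDFS setting and defaults to 0, which means a read or write failure on any volume under dfs.datanode.data.dir takes the DataNode down. Setting it to half the number of volumes is therefore recommended.
`hbase.regionserver.handler.count`	Number of threads the server uses to handle client requests. Size it according to client behavior: if each request carries a lot of data (large puts or scans), keep it low; if each request is small, raise it to increase throughput.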
`hbase.ipc.server.max.callqueue.size`	Size of the request queue. It can be increased for read-only workloads; when there is write load, be aware that an oversized queue can cause an OOM.
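
To make the settings above concrete, here is a hedged hbase-site.xml sketch. Every value is an illustrative assumption for a hypothetical mid-sized cluster, not a recommendation from the official guide; tune each one against your own workload.

```xml
<configuration>
  <!-- Assumed: wait for 10 RegionServers to check in before assigning regions -->
  <property>
    <name>hbase.master.wait.on.regionservers.mintostart</name>
    <value>10</value>
  </property>
  <!-- Assumed: 60 seconds, in milliseconds; lower only once GC pauses are under control -->
  <property>
    <name>zookeeper.session.timeout</name>
    <value>60000</value>
  </property>
  <!-- Assumed: small requests, so a moderate number of handler threads -->
  <property>
    <name>hbase.regionserver.handler.count</name>
    <value>30</value>
  </property>
</configuration>
```

Note that dfs.datanode.failed.volumes.tolerated is an HDFS property, so it would go in hdfs-site.xml on the DataNodes rather than in this file. For a DataNode with 12 data volumes, "half the number of volumes" works out to 6.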

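Enable compression on column families.
Set the WAL file size to less than the HDFS block size; the maximum number of WAL files can then be estimated as (RS heap * memstore factor) / WAL size.
If you know your workload well, you can turn off automatic splitting and split manually instead: set hbase.hregion.max.filesize to a very large value, for example 100 GB. Setting it to an unlimited value is not recommended.
For pre-splitting, about 10 pre-split regions per RegionServer is a reasonable starting point.
Control major compactions manually to reduce the pressure on production traffic.
When running MapReduce jobs on top of HBase, turn off speculative execution by setting mapreduce.map.speculative and mapreduce.reduce.speculative to false.
In the configuration, set ipc.server.tcpnodelay ==> true
hbase.ipc.client.tcpnodelay ==> true, to reduce RPC latency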
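
The settings in this list can be sketched as configuration properties. The numbers below are assumptions chosen only to make the arithmetic visible: a 16 GB RegionServer heap, a memstore factor of 0.4, and a WAL file of roughly 120 MB (just under a 128 MB HDFS block), which gives 16384 MB * 0.4 / 120 MB, or about 54 WAL files.

```xml
<configuration>
  <!-- Assumed sizing: (16384 MB heap * 0.4 memstore factor) / 120 MB per WAL = about 54 -->
  <property>
    <name>hbase.regionserver.maxlogs</name>
    <value>54</value>
  </property>
  <!-- 100 GB in bytes (100 * 1024^3): large enough that splits are effectively manual -->
  <property>
    <name>hbase.hregion.max.filesize</name>
    <value>107374182400</value>
  </property>
  <!-- Disable Nagle's algorithm on both sides of the RPC to reduce latency -->
  <property>
    <name>ipc.server.tcpnodelay</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.ipc.client.tcpnodelay</name>
    <value>true</value>
  </property>
  <!-- MapReduce over HBase: these two would normally live in mapred-site.xml
       or in the job's own configuration; shown here only for compactness -->
  <property>
    <name>mapreduce.map.speculative</name>
    <value>false</value>
  </property>
  <property>
    <name>mapreduce.reduce.speculative</name>
    <value>false</value>
  </property>
</configuration>
```

Treat the WAL count as a starting point: with a different heap size, memstore factor, or WAL size, redo the division rather than copying 54.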

MTTR settings:

Set the following in the RegionServer.

<property>
  <name>hbase.lease.recovery.dfs.timeout</name>
  <value>23000</value>
  <description>How much time we allow elapse between calls to recover lease. Should be larger than the dfs timeout.</description>
</property>
<property>
  <name>dfs.client.socket-timeout</name>
  <value>10000</value>
  <description>Down the DFS timeout from 60 to 10 seconds.</description>
</property>

And on the NameNode/DataNode side, set the following to enable 'staleness' introduced in HDFS-3703, HDFS-3912.

<property>
  <name>dfs.client.socket-timeout</name>
  <value>10000</value>
  <description>Down the DFS timeout from 60 to 10 seconds.</description>
</property>
<property>
  <name>dfs.datanode.socket.write.timeout</name>
  <value>10000</value>
  <description>Down the DFS timeout from 8 * 60 to 10 seconds.</description>
</property>
<property>
  <name>ipc.client.connect.timeout</name>
  <value>3000</value>
  <description>Down from 60 seconds to 3.</description>
</property>
<property>
  <name>ipc.client.connect.max.retries.on.timeouts</name>
  <value>2</value>
  <description>Down from 45 seconds to 3 (2 == 3 retries).</description>
</property>
<property>
  <name>dfs.namenode.avoid.read.stale.datanode</name>
  <value>true</value>
  <description>Enable stale state in hdfs</description>
</property>
<property>
  <name>dfs.namenode.stale.datanode.interval</name>
  <value>20000</value>
  <description>Down from default 30 seconds</description>
</property>
<property>
  <name>dfs.namenode.avoid.write.stale.datanode</name>
  <value>true</value>
  <description>Enable stale state in hdfs</description>
</property>
