My Hadoop cluster data needs no RAID!

One of the operational challenges in introducing Hadoop to traditional IT and Enterprise operations is understanding when to break one of our sacred IT mantras: Thou shalt always RAID your data. Never shalt thou install a system without RAID. One shall be your RAID if thou seekest performance and redundancy sparing no expense. Five shall […]

Improving Hadoop datanode disk fault tolerance

By design, Hadoop is meant to tolerate failures in a responsible manner. One of those failure modes is for an HDFS datanode to go off line because it lost a data disk. By default, the datanode process will not tolerate any disk failures before shutting itself off. When this happens, the HDFS namenode discovers that […]