I received an interesting question today from a reader regarding the use of USB thumb drives for the OS drives in Hadoop datanodes.
Have you ever put the OS for a Hadoop node on a USB thumb drive? (or considered it)
I have a smaller 8 node cluster and that would free up one of my drives.
I thought about it for a bit and wrote back. I figure this might be useful for others who are entertaining the idea. This is, by no means, an exhaustive list of things to consider.
I haven’t really looked into it in all honesty. If I did it, I would want to look at/be concerned about:
- USB, in my experience, is pretty flaky under load. You might end up with more disconnects than you would with a standard disk controller, manifesting in system failures.
- You’d need to control the amount of writes you’d be doing to the thumb drive to prevent wearing them out. This means minimizing what log4j writes to the drive and making sure you don’t have swap enabled at all which has its own performance issues[1].
- Not sure how much better/worse this would perform versus just netbooting the OS and having init scripts do the work of finding the data drives, mounting them, and so on. Since the datanode (more or less) walks the directories of its defined datanode drives each time it starts up, it doesn’t really matter if the drives shuffle around between reboots: the datanode will just find the blocks and report them back to the namenode as being available.
- Would it really be an appreciable gain in space on the datanode? Assuming you’ve already got the same size drives in each disk bay on the system, maybe it would be better to do a tiny software raid10 stripe across all drives? You gain a little redundancy at a very minor expense of disk space spread across all drives.
- If I was having to buy additional disks to fill out this newly available slot across all systems, would I be better off just expanding the cluster by adding more systems to make up for the space difference?
In my case, I run datanodes with 12 data disks and 2x raid1 disks for OS and logs. I knew our future expansions were always going to be via scaling the cluster horizontally through adding machines, not by changing what individual systems are capable of doing, so I designed the platform to be maximized from the start.
Using USB is certainly doable; ultimately, it would depend on the platform limitations (e.g. only having 2 disk slots), what my available budget was, and so on.
[1] MySQL Swap Insanity … I’ve experienced this on systems with swap disabled that are undergoing heavy memory pressure where the system has multi-second pauses while the kernel tries to find dirty pages to clean up and free. Turning off swap makes the kernel do something weird in this case that even a small amount of swap bypasses. So, now we always run with swap on a system, even if it’s only 50-100MB.