Introduction
It turns out this little tool is long overdue, as simple of a concept as it is, but also easy to misunderstand the use cases for, ours at Etsy however was very targeted. Several years ago we were hammering out our internal cloud infrastructure, using KVM/QEMU based solution that you can read about over here. We were populating our virtual machine frontend using Chef Ohai data as the canonical source of our system information. There is an ohai plugin that gathers KVM virtualization data using virsh commands which are part of libvirt. It was a perfect way to capture information about which guests existed on a given system and other information about them.
Our problem
We were hitting a bottleneck in that our chef clients were setup to run about every ten minutes. But within that ten minutes it would be possible that a virtual machine would be added or deleted, and therefore it was difficult to keep our interface in sync. Imagine creating a new virtual machine but not being able to display data about it until you waited around ten minutes, and to make matters worse these clients run at a splay interval, which means they don’t run all at the same time. Therefore, we started on a simple script that would let us run ohai quickly without needing to do the full chef run. While our chef runs are relatively quick (usually < 1 minute) it would introduce problems if we try to run chef while the client is already running.
Going to open source
It was supposed to be released a while ago, but has taken some time for various reasons. It’s a shockingly tiny amount of code but there were some barriers to releasing it. The majority of the code was written by Mark Roddy but he’d turned it over to me to open source. I went through the normal chef contribution process, which at the time required opening a Jira ticket that you can read if you’re interested in some of the details. In short there were some questions about the use case but when I explained that we weren’t trying to re-invent graphite in some horrible way, we were able to agree there could be some real world use cases. That being said, it was not accepted into the core yet because this does introduce very small race conditions since chef uses a read-modify-save model of changing the attribute data. There is a proposal to fix this, which divides attributes into different levels in which automated updates can access them without causing this issue. However in the wild this has not actually posed an issue for us even with several hundred nodes running it.
Installation
If you’re interested in this tool, you can install it with ruby gems using the command:
% gem install rehai
If you’d like to see the source, head on over to github.