A popular paradigm to consider when creating or retooling a complex development environment is bundling it in a virtual machine (VM). Tools such as Vagrant make this easy and appealing. This post explores some of the pros and cons of development environments contained within a virtual machine.
Let's explore some basic concepts regarding development environments in VMs. Ideally these should be considered thoroughly before starting the implementation:
- The VM provider: Will this be a local hypervisor (e.g., KVM, VirtualBox), or a remote one (e.g., VMware, OpenStack)?
- The VM provisioner: Generally, a layer of configuration needs to be applied to your virtual machine. This could be as simple as a bash script, or as complex as running a configuration management tool such as Chef, Puppet or Ansible.
- The VM base OS: You'll have to pick which operating system the guest VM is running; ideally this would be the same as in production.
- Delineation between host and guest: What stays on the host, as opposed to living on the guest VM? Where does the code live?
A Concrete Example
To make the advantages and challenges of this approach concrete, we'll look at a specific implementation of a development environment at a company I worked for. Specifically:
- VM Provider: The VM provider chosen was the default Vagrant provider, VirtualBox. This is the easiest to implement, as VirtualBox has support for many host operating systems and is proven software.
- VM Provisioner: Chef was picked primarily because of its ability to have automated pull capabilities from a central server. This allows the development environment to grow and receive updates which will automatically be rolled out to all developers, without requiring end-users to run any commands to pick up the updates.
- VM base OS: We decided to go with Red Hat Enterprise Linux since this was the chosen OS for the production environment.
- Host/guest split: This is the most complicated decision. In our implementation, the host OS runs the IDE and hosts the code. Keeping the code on the host OS ensures that any mishaps with the VM do not result in code loss. All other build tools and runtimes (JDK, Gradle, Node.js, Tomcat) run within the guest OS. Source code management (Git) is installed on both the host and the guest, for extra flexibility.
A few more nice-to-haves were implemented:
- Automatic SSH key forwarding: the host OS's SSH key is automatically forwarded to the VM, allowing Git identities to be preserved across boundaries.
- Automatic X11 forwarding: GUI applications can be started from the VM and forwarded to the host OS, if needed.
- Automatic shell variable setup: A file was created by convention and copied to the VM on every start, allowing a user to preserve configurations/shell variables even after a VM destroy/create.
One of the main advantages of this approach is the speed of first-time setup. For a new developer coming along, the setup process is essentially two commands: one to set up their host OS (install Vagrant/VirtualBox/Git and get the necessary files for setup), and one to bring the VM up.
After only a few minutes (target time being 3-5 minutes at most), the developer has a fully functional development environment.
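As a rough illustration of what the host setup step has to guarantee, here is a minimal Python sketch that checks whether the prerequisite tools are on the PATH. It is not the actual setup command we used; the function name `missing_tools` and the executable names (in particular `VBoxManage` as the VirtualBox command-line tool) are assumptions made for this example.

```python
import shutil

# Host tools the setup step is expected to install. The executable names
# are assumptions for this sketch (VBoxManage is VirtualBox's CLI).
REQUIRED_TOOLS = ("vagrant", "VBoxManage", "git")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]
```

A setup script could call `missing_tools()` first and only install what is reported missing, which keeps the host-side command safe to re-run.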
Ability to destroy and recreate quickly
Because first-time setup is so quick, a developer can also recreate their development environment quickly. If something goes wrong during development, many developers will choose to simply recreate the development environment instead of spending an hour troubleshooting the system.
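The destroy-and-recreate cycle can be wrapped in a small helper. The sketch below is one hypothetical way to do it in Python; `recreate_vm` and `recreate_commands` are names invented for this example, and it assumes the Vagrantfile lives in `project_dir`. The two commands it runs, `vagrant destroy -f` and `vagrant up`, are standard Vagrant CLI calls.

```python
import subprocess


def recreate_commands():
    """Commands that throw away the current VM and build a fresh one."""
    return [
        ["vagrant", "destroy", "-f"],  # -f skips the confirmation prompt
        ["vagrant", "up"],             # recreates and reprovisions the VM
    ]


def recreate_vm(project_dir, run=subprocess.run):
    """Run the recreate commands in the directory holding the Vagrantfile.

    `run` is injectable so the helper can be exercised without Vagrant.
    """
    for command in recreate_commands():
        run(command, cwd=project_dir, check=True)
```

Because the code lives on the host, nothing of value is lost when the VM is destroyed this way.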
Isolation from the host machine
Another advantage is isolation from the host machine. Many development tools may conflict with the host OS. For example, installing or upgrading Python packages using the system-level Python may have undesired consequences, and trying to undo those changes later can be really problematic. Hosting everything within a VM lets developers experiment freely without fear of damaging their host OS configurations or packages.
Support of multiple host OSes
One of the requirements we had was to support multiple host OSes. Most developers traditionally did development on Linux workstations, so we had to support Red Hat and Ubuntu. MacBooks were also being introduced, so support for macOS was a must. And finally, several QA engineers only had a Windows laptop but were still expected to develop tests, so Windows support was a nice-to-have.
Trying to tool or document all these setup steps for four different OSes would be near impossible. With a VM, this problem goes away almost completely.
Ease of troubleshooting
And finally, because everyone is using the same development OS and all resources have been set up the same way, when a developer needs support, that support can be provided much more efficiently. No more finding out which version of Java they installed or where their global npm node_modules folder is: everything is in the same place and configured identically for every developer.
Re-use during CI
Since the development environment is a virtual machine, it can be re-used during continuous integration for the build agents. That way, CI processes run on the same environment as developers, removing the "it builds on my box, why doesn't it build on CI?" problem! With Vagrant specifically, you can simply pick another provider, for example OpenStack instead of VirtualBox, and run your build agents in the cloud.
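To sketch how the provider switch might look in practice, the hypothetical helper below builds the `vagrant up` command and picks the provider based on whether it is running on CI. The `--provider` flag is part of the Vagrant CLI. Everything else is an assumption for illustration: that a `CI` environment variable marks CI runs, and that an OpenStack provider plugin is installed and registered under the name `openstack`.

```python
import os


def vagrant_up_command(environ=os.environ, ci_provider="openstack"):
    """Build the `vagrant up` command, switching provider on CI.

    Assumes CI runs set a CI environment variable and that a provider
    plugin named `ci_provider` is installed on the build agents.
    """
    if environ.get("CI"):
        return ["vagrant", "up", f"--provider={ci_provider}"]
    # Developers keep the local hypervisor used in this article.
    return ["vagrant", "up", "--provider=virtualbox"]
```

The same Vagrantfile and provisioning then apply in both places; only the machine that hosts the VM changes.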
NFS mount for code share
The previous article on Shopify Developer Environments touched on this slightly: since the IDE and the code were on the host machine, but all build and runtime tools were on the development environment VM, the code had to be shared. The default VirtualBox shared filesystem is notoriously slow, so the current fastest way to share code is with an NFS mount. This works and is pretty fast, but nowhere near as fast as having the code and the tools on the same physical drive.
Additionally, this extra NFS layer added some problems when using tools such as file watchers, and also exposed some buggy NFS implementations in macOS.
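File watchers tend to struggle here because kernel change notifications are generally not delivered for edits made on the other side of an NFS mount. One common workaround, which is not necessarily what we used, is to fall back to polling. The Python sketch below shows the idea; `snapshot` and `changed_paths` are hypothetical names chosen for this example.

```python
import os


def snapshot(root):
    """Map each file under `root` to its (mtime_ns, size)."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.stat(path)
            except OSError:
                continue  # file vanished between listing and stat
            state[path] = (info.st_mtime_ns, info.st_size)
    return state


def changed_paths(before, after):
    """Return sorted paths that were added, removed, or modified."""
    return sorted(
        path
        for path in before.keys() | after.keys()
        if before.get(path) != after.get(path)
    )
```

A watcher built on this would take a snapshot every second or so inside the VM and trigger a rebuild whenever `changed_paths` is non-empty. Polling costs more I/O than event-based watching, which is part of the price of the NFS layer.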
Since our code is now compiled and running in the VM, we need to reach out to a different local IP address to connect to it. This is fine locally, but if a coworker wants to reach your VM's private IP, it won't be reachable. This is where NAT (to enable VM -> outside connectivity) and port forwarding (to enable outside -> VM connectivity) need to be configured properly.
All of this can be configured, but it introduces more complexity to the runtime environments, as applications (and developers!) now need to work in a NAT'd environment and be aware of private and public IP addresses.
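When debugging this kind of setup, it helps to check quickly whether a forwarded port is actually reachable from where you are sitting. The following Python sketch does a plain TCP connection attempt; `can_connect` is a hypothetical helper, and the host and port values in the usage note are illustrative only.

```python
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For example, a coworker could run `can_connect("your-workstation", 8080)` against the forwarded port on your host, while you run the same check against the VM's private IP. If the first fails and the second succeeds, the port forwarding is the likely culprit.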
Finally, one specific downside of Chef is that each node that receives configurations must be registered centrally with the Chef server using a private key. We went with a convention of using a known VM name appended to the host machine's hostname to ensure uniqueness across the organization, and auto-registration on VM up. However, if a developer uncleanly deletes the VM, or if a workstation changes hostnames, the registration with the Chef server can be left behind, causing headaches for the developer.
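The naming convention and the cleanup of a stale registration can be sketched in Python as follows. The VM name `devenv`, the `-` separator, and the choice to lowercase and shorten the hostname are assumptions for this example rather than our exact convention. The cleanup commands use `knife node delete` and `knife client delete`, which remove the node and its API client from the Chef server.

```python
import socket


def chef_node_name(vm_name="devenv", hostname=None):
    """Derive a node name: the host's short hostname plus a known VM name.

    `devenv` and the "-" separator are hypothetical choices for this sketch.
    """
    host = hostname if hostname is not None else socket.gethostname()
    short_host = host.split(".")[0].lower()
    return f"{short_host}-{vm_name}"


def stale_registration_cleanup(node_name):
    """Commands that remove a leftover node and client from the Chef server."""
    return [
        ["knife", "node", "delete", node_name, "--yes"],
        ["knife", "client", "delete", node_name, "--yes"],
    ]
```

Note that a hostname change produces a different node name, which is exactly how the old registration ends up orphaned; the cleanup has to be run with the old name.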
Overall, this approach has been successful, although not perfect. The trade-offs can mostly be tooled around, and the advantages give us enough benefit to continue with this approach. Work is always ongoing to iterate and improve on it. Hopefully this overview of our experience with VMs for developer environments has been useful!