The words "vendor lock" worry me more than I'd like to admit. Whether it's having too many virtual machines in EC2, an expensive lambda in Google Functions, or any random offering that I have been using to augment my on-premises Raspberry Pi cluster, it's really something I've feared. Over time, I've realized it has affected the way I talk about off-premises services. Why? Because I got burned a few times.
A few months back I got a classic 3 AM call asking me to check on a service that was failing to report back to an on-premises Sensu server, and my superstitious mind immediately went to how that third-party service had let my coworkers down. After a quick check, it turned out nothing was badly broken; an unruly agent had simply hung on an on-premises virtual machine.
I’ve had other issues, and I want to help dispel some of the myths around adopting hybrid cloud solutions. So, to that end, what are some of these myths, and are they actually true?
Given some of the places I’ve worked, one of my memories is of using VMware to spin up new VMs, a process that could take up to ten minutes to reach baseline provisioning. This was eventually corrected by using Packer to create an almost perfect VM. Getting that into VMware images was time-consuming, but after boot the only thing left was informing the Salt master that a new node had come online.
In this example, I was using those VMs to start up a Scala http4s application that would begin crunching through a mounted drive containing chunks of data. The on-site solution worked, but it still took a lot of work to orchestrate, and I was bothered by the resources my task was taking. No one likes to talk with their coworker about their 75-machine VM cluster that bursts into existence in the middle of the workday and sets off resource alarms.
Thus, I began reshaping the application using containers and Hyper.sh, which has led to some incredible successes (and alarms that aren't as stressful). The approach was basically to take the data that needed to be crunched (slightly modified) and add it to S3, then push my single image to Hyper.sh, create 100 containers, crunch the data, remove those containers, and finally send the finalized results to an on-premises service. Not only was time saved, but the workflow has brought redundancy in data, better auditing, and less strain on the on-premises solution.
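To make the fan-out step a little more concrete, here is a minimal sketch of how the chunks sitting in a bucket might be divided among a fixed number of containers. It is plain Python and does not call S3 or Hyper.sh; the function name `assign_chunks` and the object keys are hypothetical, and round-robin is just one reasonable way to split the work, not necessarily the one used here.

```python
from typing import Dict, List


def assign_chunks(chunk_keys: List[str], workers: int) -> Dict[int, List[str]]:
    """Spread chunk keys across a fixed number of workers, round-robin."""
    if workers < 1:
        raise ValueError("workers must be at least 1")
    # One (possibly empty) work list per worker, keyed by worker index.
    plan: Dict[int, List[str]] = {w: [] for w in range(workers)}
    for index, key in enumerate(chunk_keys):
        plan[index % workers].append(key)
    return plan


if __name__ == "__main__":
    # Hypothetical object keys; the real bucket layout is an assumption.
    keys = [f"chunks/part-{n:04d}" for n in range(250)]
    plan = assign_chunks(keys, workers=100)
    # Each container would receive its own list of keys to crunch.
    print(len(plan), "workers,", sum(len(v) for v in plan.values()), "chunks")
```

Each container could then be started with its own list of keys, crunch only those chunks, and exit, which keeps the single image generic.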
So, while you can usually do all the work you need on-site, sometimes leveraging the options available from different vendors can create a nice web of redundancy and auditing. Buzzword bingo aside, the solution ended up being more cost-effective than using spot instances in EC2.
I’ll keep this response brief: monitoring is hard, no matter whether the service, VM, database, or container is on-site or off. The same can be said for alerting, resource allocation, and cost analysis. That said, these are all aspects of modern infrastructure that are just par for the course. Letting superstition get the better of you when experimenting with a hybrid solution would be a mistake.
The way I like to think of it is that as long as you have a way into your on-site servers that are locked down to those external nodes, you’re all set. If you need to set up more monitoring, go ahead; the slight modification to Nagios or Zabbix rules won’t take much coding, and the benefit will always be at hand for notifying on-call. An added benefit, depending on the off-site service, may be a level of resiliency that wasn't accounted for on-site, since the provider can make it more highly available.
For example, sometimes I use Monit to restart a service, or depend on systemd/upstart to restart a temperamental one. Using AWS, I can set up alarms that trigger different events to run predefined runbooks, which can handle a failure and save me from that aforementioned 3 AM wake-up. Note that both of these edge cases have their own solutions, which aren’t “taxing”, just par for the course.
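The alarm-to-runbook idea can be sketched as a simple lookup table. This is not the AWS API; it is plain Python with hypothetical alarm names and handlers, meant only to show the shape of "an alarm fires, a predefined set of steps runs, and anything unrecognized still pages a human."

```python
from typing import Callable, Dict, List

# A runbook step takes the affected resource and returns the action taken.
Runbook = Callable[[str], str]


def restart_service(resource: str) -> str:
    # Placeholder for a real restart (Monit, systemd, and so on).
    return f"restart {resource}"


def page_on_call(resource: str) -> str:
    # Placeholder for a real notification to whoever is on call.
    return f"page on-call about {resource}"


# Hypothetical alarm names mapped to the steps that should run for each.
RUNBOOKS: Dict[str, List[Runbook]] = {
    "agent-not-reporting": [restart_service],
    "disk-nearly-full": [page_on_call],
}


def handle_alarm(alarm: str, resource: str) -> List[str]:
    """Run the steps for a known alarm; fall back to paging a human."""
    steps = RUNBOOKS.get(alarm, [page_on_call])
    return [step(resource) for step in steps]
```

With this shape, a hung agent would map to an automatic restart, and only alarms without a runbook would wake anyone up.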
Too many tools, not enough adoption
You’re not wrong, but if your developers and operators are not embracing at least a rudimentary adoption of these new technologies, you may want to look at your culture. People should want to try to reduce cost through these new choices, even if that change is cautious. Whether it means taking a second look at that S3 bucket or Pivotal Cloud Foundry app, nothing should be immediately discounted, because taking the time to apply a solution to an off-site resource can often result in an immediate saving in manpower.
Think about it for a moment: given whatever internal infrastructure you’re dealing with, consider the number of people who are around to support that application. Sometimes it's nice to give them a break, to take that learning curve onto yourself and empower your team and wiki of choice to create a different solution from what is currently available in your local infrastructure.
Whether it's a Friday code jam or just tackling a pain point in a difficult deployment, crafting better ways of dealing with those common difficulties through a hybrid cloud solution can create more options. That, after all, is what a hybrid cloud is attempting to provide: options that can be used to reduce costs, increase general knowledge, and bolster an environment that invites more people to innovate.
Ben Neil is a polyglot engineer who has the privilege to fill a lot of critical roles, whether it's dealing with front/backend application development, system administration, integrating DevOps methodology, or writing. He has spent 10+ years creating solutions, services, and full lifecycle automation for medium to large companies. He is currently focused on Scala, container, and unikernel technology, following a bleeding-edge open source community that brings benefits to companies striving to be at the forefront of technology. He can usually be found either playing Dwarf Fortress or contributing on GitHub.