Setting up the development environment
In this book, we started with the assumption that the right cloud computing vendor has been chosen already. But, in general, when planning a cloud approach, these are useful questions to answer before you start working:
- What do I need from the cloud vendor?
- Which vendors have the stuff I need?
- What does it cost, compared to other major cloud vendors?
- Are there some pros/cons of one platform over another?
- Where is my business located (in relation to the cloud vendor's data centers) and, more importantly, where are my services going to be consumed by users?
These are just a few of the initial questions we need to answer before we go ahead with making any plans. To answer these questions, a little bit of foresight is required.
Several vendors compete in the IaaS and PaaS segments of the cloud computing market; the most relevant ones are named later in this section.
IaaS might suffice for putting a development infrastructure in the cloud. However, as the scenario evolves into a more complex cloud positioning made up of services for the customer base, the answer is no longer as simple as "We need just VMs."
These other questions can help identify which services are required:
- Is my company planning to sell Software-as-a-Service (SaaS)?
  - Building SaaS products on a PaaS can be effective once the pros are weighed against the cons. Spending less effort on managing the infrastructure can reduce costs and also sharpen the focus a company has on its core business.
  - A PaaS pricing model can also be translated into a SaaS pricing model, helping companies find the right price for an elastic service while avoiding the complexity of the CAPEX approach.
- Does my company have dedicated IT staff capable of maintaining infrastructures?
  - If the answer is no, you should probably consider only PaaS and avoid IaaS, which introduces some administrative effort, however small.
- Does my company have a preference between make and buy choices?
  - If make choices are always preferred (although always is a limitation in itself), PaaS can be avoided and IaaS should be enough to build end-to-end services.
  - If buy is the choice, PaaS should be optimal. There are ready-to-go PaaS services where users just have to deploy some existing custom code in order to go live. Of course, since it is a value-added service, it costs more.
  - If the correct balance between make and buy choices has been made, both IaaS and PaaS are useful depending on the specific situation.
Note
This is the preferable approach, since there is no absolute best between IaaS and PaaS. PaaS is probably growing much faster because it is a natural fit for this business, but companies often need IaaS to manage specific business areas, so we think both are here to stay for a long time.
This is not the right place to compare cloud vendors: first, because it is really hard to do, and second, because it is beyond the scope of this book. However, the following vendors are widely known to be among the most important players in the public cloud market (in alphabetical order):
- Amazon Web Services
- Google Cloud Platform
- Microsoft Azure
At the time of writing, the three platforms have broadly comparable cloud offerings, so the choice requires some in-depth analysis.
If we need IaaS, these can be valid questions:
- Does the platform provide virtual machines?
- Can the VM environment (network, attached disks, or load balancer) be customized?
  - How is it customizable and does it fit our needs?
- Is there any appliance we need (or should need) in order to work better with the chosen public cloud?
  - Is that a mandatory choice or is it up to the user?
If, instead, we need PaaS, these can be valid questions:
- Do we have an existing product to deploy? In which language has it been written? Does it have constraints for deployment?
  - Does the cloud vendor's PaaS fit this product?
- Does the provided PaaS accelerate our existing development process?
  - Does it propose a new one?
- Is there good integration between services of the same cloud vendor?
- Is there good integration from/to other cloud vendors (that is, to enable multicloud solutions)?
- Is there a lock-in while choosing a given PaaS product?
  - How much could it cost in the case of an outside migration?
- Does it have a learning curve to go live?
  - How much could it cost in the case of adoption?
We prefer to use questions to analyze problems, since we should never make a decision without evaluating the pros and cons.
Finally, we should look at an aspect that seems to be secondary (but isn't): the data center location.
In the public cloud, there is often no private data center, so resources are almost always shared between different customers from around the world. This points to several potential issues about security, isolation, and more, but it is up to the cloud vendor to solve these issues.
Sooner or later, however, we need to decide where to place our data, our code, and our VMs. If the business is primarily located in the US and we choose a cloud vendor that has a single data center in Australia, this is probably not the best choice in terms of latency.
Note
Latency is the time an input takes to arrive at the other side of the communication channel. If we send a signal to a satellite, an average latency could be around 600 ms, which means that more than a second is required to make a round trip (for example, an HTTP request). Latency depends on the physical constraints of the channel plus the distance, so the latter is of primary importance when evaluating a data center location.
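As a rough illustration of how distance alone bounds latency, the following Python sketch estimates propagation delay. It assumes that signals travel through optical fiber at about two thirds of the speed of light (a common approximation), and the 15,000 km path is a purely illustrative distance. Real-world latency also includes routing, queuing, and processing delays, so these numbers are only a lower bound.

```python
# Rough lower bound on latency caused by distance alone (a sketch, not a measurement).
SPEED_OF_LIGHT_KM_S = 299_792.458               # speed of light in vacuum
FIBER_SPEED_KM_S = SPEED_OF_LIGHT_KM_S * 2 / 3  # assumed approximation for optical fiber


def one_way_latency_ms(distance_km: float, speed_km_s: float = FIBER_SPEED_KM_S) -> float:
    """Propagation delay in milliseconds for a signal covering distance_km."""
    if distance_km < 0:
        raise ValueError("distance_km must be non-negative")
    return distance_km / speed_km_s * 1000


def round_trip_ms(one_way_ms: float) -> float:
    """A request/response pair crosses the channel twice."""
    return 2 * one_way_ms


if __name__ == "__main__":
    # The satellite figure from the note: 600 ms one way, 1200 ms round trip.
    print(round_trip_ms(600))
    # Hypothetical 15,000 km fiber path (illustrative distance only).
    print(round(round_trip_ms(one_way_latency_ms(15_000)), 1))
```

Even under these optimistic assumptions, a long intercontinental path adds a noticeable delay to every request, which is why the distance to the data center matters.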
Likewise, if we choose a cloud vendor that has a data center near us (the company) but far from the customers (for example, a company located in the UK with the data center in the UK but users in the US), this is not a good choice either. The best option is a cloud vendor that provides different locations for a service, so that companies can choose appropriately where to place (or replicate) services.
Content Delivery Networks (CDNs), covered later on in the book, represent a solution to shorten the distance between content and the end users. While Data Centers (DCs) are few, a CDN has hundreds of nodes distributed in regions even outside the Azure perimeter. For instance, a web application serving photos for users throughout the whole planet can have this topology:
- The deployment of the web application code into a few DCs (or even just one)
- The replication of the contents (the photos) all around the globe with a CDN
The goal is to give users faster access to resources: the content is served from the nearest CDN location, while the page is served by the central DC.
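The idea of serving content from the nearest location can be sketched in a few lines of Python. This is only an illustration: the node list is hypothetical, and real CDNs usually route requests through DNS or anycast based on network conditions rather than pure geographic distance.

```python
import math

EARTH_RADIUS_KM = 6371.0

# Hypothetical CDN nodes: name -> (latitude, longitude) in degrees.
CDN_NODES = {
    "London": (51.5074, -0.1278),
    "New York": (40.7128, -74.0060),
    "Singapore": (1.3521, 103.8198),
    "Sydney": (-33.8688, 151.2093),
}


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points on the Earth's surface."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_node(user: tuple, nodes: dict) -> str:
    """Return the name of the node geographically closest to the user."""
    return min(nodes, key=lambda name: haversine_km(*user, *nodes[name]))


if __name__ == "__main__":
    paris = (48.8566, 2.3522)
    print(nearest_node(paris, CDN_NODES))  # London
```

In this sketch the page itself could still come from a single central DC, while each photo request is answered by whichever node `nearest_node` selects.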
Building a development machine
A good starting point for building the company's development environment is to build a development machine for employees. In the last few years, developing through a well-configured virtual machine (either on-premises or in the cloud) has been a growing trend with considerable advantages:
- It reduces the need to maintain several pieces of hardware in order to comply with performance needs
- It enables the use of legacy hardware (thin clients and netbooks, among others) to act as a workstation proxy
- It centralizes the governance of the involved VMs into a single place where it should be easier to provide support
- It provides an efficient method to build a new workstation, with the required software, to let a new employee work immediately
In Microsoft Azure, VMs are provided with two main operating system families: Windows and Linux. A VM has the following requirements:
- An underlying storage account where the VM's disks are saved
- A containing Virtual Network (VN)
- A public IP if the VM should be accessed from outside the VN
- A Network Security Group (NSG), which defines the firewall/security rules between VMs and VNs in Azure
Since VNs, NSGs, and IPs will be covered in the next chapters, we'll now introduce the concept of a storage account. Like the other requirements discussed earlier, the storage account can be created during VM provisioning, but it is better to plan it in advance based on a requirements analysis:
- How many storage accounts should we create?
  - One per VM? One per division? Just one?
- What are the constraints of the storage account service?
  - Can I create an indefinite number of accounts?
  - Does each account have its own limits in terms of capacity and bandwidth?
  - Is there a different type of storage we can use?
These questions should help you understand how important it is to stop and think before creating a new resource in the cloud. It is true that one of the main pillars of cloud computing is to be "elastic" (and in fact you can create or destroy almost everything in a few seconds or clicks); however, since different services are often interdependent, it is very important to avoid a blind Next | Next approach.
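To make the planning questions concrete, here is a minimal Python sketch that estimates how many storage accounts a set of VM disks would need if the only constraint were the request rate of each account. The default figures (500 IOPS per standard disk and 20,000 IOPS per standard account) are assumptions used for illustration; check them against the limits currently published for the service before relying on the result.

```python
import math


def accounts_needed(disk_count: int,
                    iops_per_disk: int = 500,           # assumed per-disk figure
                    account_iops_limit: int = 20_000    # assumed per-account figure
                    ) -> int:
    """Estimate the number of storage accounts needed to host disk_count busy disks."""
    if disk_count < 0:
        raise ValueError("disk_count must be non-negative")
    if iops_per_disk <= 0 or account_iops_limit < iops_per_disk:
        raise ValueError("an account must be able to host at least one disk")
    # How many fully used disks fit in one account before hitting its limit.
    disks_per_account = account_iops_limit // iops_per_disk
    return math.ceil(disk_count / disks_per_account)


if __name__ == "__main__":
    # With the assumed defaults, one account hosts up to 40 busy disks.
    print(accounts_needed(40))   # 1
    print(accounts_needed(100))  # 3
```

A real plan would also weigh capacity, bandwidth, billing separation, and the life cycle of each group of resources, which is why the answer is rarely just a number.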
Let's start with creating a new storage account:
- Go to https://portal.azure.com (the new portal) and locate your desired directory/subscription in the upper-right corner.
- Locate the New action and choose Data + Storage (or another semantically equivalent group).
Note
As the Portal's layout and internal links can change frequently, it is good advice to abstract from the exact steps needed to complete an action. Rigid step-by-step guidance quickly becomes outdated, so a reasoning-based approach is more effective.
- Look up the Storage account item (with the green icon) and follow the steps required to create it. Ensure that you specify Resource Manager as the deployment model before starting the creation process.
Note
Azure started transitioning from the original deployment model (classic) to a new one (Resource Manager) a few years ago. The transition to the new deployment model is almost complete, but we are still in an intermediate phase. For the rest of the chapter and the book, we always use the Resource Manager model (as well as the new portal) when available.
During the creation process, some inputs are required:
- Name: This is the account name as well as the DNS prefix of the root service domain (*.core.windows.net). Choose it according to the requirements (https://msdn.microsoft.com/en-us/library/azure/dd135715.aspx) and know that we cannot change it after the creation process.
- Type/pricing: This is the type of offer (and consequently, the feature set) of the storage account. It is primarily divided into two areas:
  - Premium: This is designed only to host virtual machine disks, and it offers great performance since it is an SSD-based service. Of course, it is more expensive.
  - Standard: This is a general-purpose service to host disks as well as generic resources, and it includes three other services (Queues, Tables, and Files) that we will explain later.
- The resource group: For now, think about this as a logical group of resources. So, depending on the planned life cycle of the storage account, choose the appropriate existing group, or, if new, name it appropriately (for example, TestResourceGroup, CM_Storage, or CM_Development).
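Since the account name cannot be changed later and also becomes a DNS prefix, it can be worth validating a candidate name before using it. The following Python sketch assumes the commonly documented naming rule (3 to 24 characters, lowercase letters and digits only) and uses a hypothetical account name; the linked requirements page remains the authoritative source.

```python
import re

# Assumed rule: 3 to 24 characters, lowercase letters and digits only.
_NAME_PATTERN = re.compile(r"[a-z0-9]{3,24}")


def is_valid_storage_account_name(name: str) -> bool:
    """Check a candidate storage account name against the assumed naming rule."""
    return _NAME_PATTERN.fullmatch(name) is not None


def blob_endpoint(name: str) -> str:
    """Build the Blob service URL that the account name becomes a DNS prefix of."""
    if not is_valid_storage_account_name(name):
        raise ValueError(f"invalid storage account name: {name!r}")
    return f"https://{name}.blob.core.windows.net/"


if __name__ == "__main__":
    # "cmdevstorage" is a hypothetical name used only for illustration.
    print(is_valid_storage_account_name("cmdevstorage"))  # True
    print(is_valid_storage_account_name("CM_Storage"))    # False
    print(blob_endpoint("cmdevstorage"))
```

Note that a name such as CM_Storage is acceptable for a resource group but would fail this check for a storage account, because of the uppercase letters and the underscore.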
In the next few steps, we provision a fresh Windows Server environment, on which we will later configure the applications required for the reference workstation:
- Locate the New action and navigate to Compute | Visual Studio.
- Select the appropriate image type for your needs (as shown in the following screenshot):
- Again, ensure that you choose Resource Manager as the deployment model before clicking on the Create button.
Following the creation steps, you will be prompted with some input requests, such as the following:
- Basics: This includes information about the administrator username and password, location, and the resource group.
- Size: This includes the actual size of the VM. There are several options, since the available sizes span from small machines to very large ones. Some instance types are named DSX (where X is a number), indicating that they support the usage of Premium Storage and provisioned IOPS.
  As for the storage account (and for every single Azure service, actually), we can choose the service type, which determines the feature set of the service. In the case of VMs, the differences between them are as follows:
  - Cores, memory, and data disks: Primarily, these are metrics that define the performance
  - Load balancing: This provides the support of a load balancer on top of the VMs in the same set (in the case of just one VM, this option is useless)
  - Auto scaling: This is the capability to dynamically create new VM instances based on some triggers
  - Advanced features: Provisioned IOPS, SSD local disks, and RDMA support are included in some of the most powerful (and often, the most expensive) pricing tiers
- Settings: As mentioned earlier in this chapter, we use default settings for VNs, IPs, and NSGs, covering them in further detail later in the chapter.
The creation of a VM also implies the creation of one (or more) Virtual Hard Disks (VHDs) in order to store the operating system and the data of the running instance. It is possible to look at the contents of a storage account (containing the VHDs of the VMs) in the vhds container of the Blob service directly from the Portal, as shown in the following figures:
When the VM is ready for use, it is possible to connect through RDP to customize it by deploying local software, such as SDKs, development tools, productivity suites, and so on. This is not in the scope of the book, so we assume it has already been done. What we want to do, instead, is create a generalized image of the VM in order to enable further image-based VM creation with minimal administrative effort.
In Windows-based operating systems, the sysprep command is used to generalize the operating system instance, removing specialized details, such as users, keys, and profiles. In Linux, we need to use the Azure Agent with the waagent command.
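As a hedged illustration, the Python sketch below only builds the command lines typically used for this generalization step; it does not execute anything. The flags shown (/generalize, /oobe, and /shutdown for sysprep, and -deprovision+user for waagent) are the ones commonly documented for preparing Azure images, and the Windows path assumes a default installation; verify both against the documentation for your operating system version.

```python
# Builds (but does not run) the generalization command for each OS family.
# The sysprep path assumes a default Windows installation directory.
SYSPREP_PATH = r"C:\Windows\System32\Sysprep\sysprep.exe"


def generalize_command(os_family: str) -> list:
    """Return the argument list to run inside the VM to generalize it."""
    family = os_family.lower()
    if family == "windows":
        # Remove machine-specific details, then shut down at the welcome-screen stage.
        return [SYSPREP_PATH, "/generalize", "/oobe", "/shutdown"]
    if family == "linux":
        # Deprovision the Azure Agent and remove the last provisioned user account.
        return ["sudo", "waagent", "-deprovision+user"]
    raise ValueError(f"unsupported OS family: {os_family!r}")


if __name__ == "__main__":
    for family in ("windows", "linux"):
        print(family, "->", " ".join(generalize_command(family)))
```

Both commands are destructive for the running instance, which is why the sketch stops at printing them: they should be run deliberately, inside the VM that is about to be captured.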
Earlier in this chapter, we used the portal to perform various administrative activities. However, Azure has a fully featured REST management API. In addition, Microsoft provides a .NET management library wrapped by PowerShell in order to perform almost every operation from the command line. We will make proper use of PowerShell later in the book.
But now, we use a third option to operate on Azure, the ARM Explorer. During the creation of the storage account, we briefly went over the concept of resource groups, the logical container for a group of service instances. The ARM Explorer (available here in preview: https://resources.azure.com/) is a lightweight tool used to inspect and administer Microsoft Azure using the Resource Manager deployment model (which is actually JSON-based).
ARM Explorer is a great tool for learning what's under the hood of the ARM model. For example, for the VM created previously, this is what the explorer shows:
As we can see in the preceding image, ARM Explorer lets us explore the various objects in the Azure subscriptions, and it also provides sample PowerShell code to perform basic contextual operations.
In the previous figure, we selected the virtualMachines node. However, if we select the specific VM (in the preceding case, the CMDevReference tree node), the Action tab becomes available with a series of options. To perform the capture of the VM's image, we can proceed as follows:
- First, deallocate the VM by invoking the deallocate command.
Tip
Note that the ARM Explorer must run in read/write mode in order to perform many configuration actions. Also, keep in mind that the tool just invokes the REST Management API on your behalf.
- After the status of the VM has passed from Stopped to Stopped (deallocated) (you can even check this in the ARM Explorer itself or in the portal), launch the generalize action.
- Now the VM is ready to be captured. The capture operation needs some arguments in order to be executed, as specified here:
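Because ARM Explorer just calls the REST Management API, the three actions above can be pictured as three POST requests against the VM resource. The Python sketch below only builds the URLs and the capture request body without sending anything. The API version, the subscription ID placeholder, the resource group, and the argument values are assumptions for illustration; the body field names follow the capture operation as commonly documented for the Resource Manager compute API, so check them against the version you actually target.

```python
import json

MANAGEMENT_ENDPOINT = "https://management.azure.com"
API_VERSION = "2016-03-30"  # assumed API version; use the one your subscription supports


def vm_action_url(subscription_id: str, resource_group: str, vm_name: str, action: str) -> str:
    """Build the URL of a POST action (deallocate, generalize, capture) on a VM."""
    return (
        f"{MANAGEMENT_ENDPOINT}/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Compute/virtualMachines/{vm_name}"
        f"/{action}?api-version={API_VERSION}"
    )


def capture_body(vhd_prefix: str, container: str, overwrite: bool = False) -> dict:
    """Arguments of the capture action: where and how to store the captured VHD."""
    return {
        "vhdPrefix": vhd_prefix,
        "destinationContainerName": container,
        "overwriteVhds": overwrite,
    }


if __name__ == "__main__":
    # Placeholder subscription and hypothetical resource group and argument values.
    subscription = "00000000-0000-0000-0000-000000000000"
    for action in ("deallocate", "generalize", "capture"):
        print("POST", vm_action_url(subscription, "CM_Development", "CMDevReference", action))
    print(json.dumps(capture_body("cmdev", "images"), indent=2))
```

Sending these requests would also require an authenticated bearer token, which is outside the scope of this sketch.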
We can verify that the image has been properly created by navigating the storage account and looking for the VHD we just created. Along with the VHD, Azure created a JSON file representing the command to be used with ARM in order to create a fresh VM based on the captured image.
CloudMakers now has a generalized Windows-based image for development machines. In the Automating repetitive operations section, we'll learn how to automate many of these related activities.