Distributed computing is having a real impact on the way companies look at the cloud. The "Most Promising Jobs 2018" report published by LinkedIn pointed out that distributed and cloud computing rank among the top 10 most in-demand skills.
Distributed computing solves many of the challenges that centralized computing systems pose today. These centralized systems - like IBM mainframes - have been around for decades, but they’re beginning to lose favor, because centralized computing is ineffective and expensive in the context of growing data and workloads. A single central computer controlling a massive number of computations at the same time puts enormous strain on the system, even one that’s particularly powerful. Centralized systems simply aren’t capable of processing huge volumes of transactional data and supporting large numbers of concurrent online users.
There’s also a big issue with reliability: if your centralized server fails and you have no disaster recovery strategy, all of your data could be permanently lost.
Fortunately, distributed computing offers solutions to many of these issues.
Distributed computing comprises a group of systems located in different places, all connected over a network and working on a single problem or toward a common goal. Each of these systems is autonomous, programmable, asynchronous and failure-prone.
These systems provide a better price/performance ratio than a centralized system, because it’s more economical to add microprocessors to your network than mainframes. Together, they also offer more computational power than a comparable centralized (mainframe) computing system.
Another major plus point of distributed computing systems is that they provide much greater agility than centralized computing systems. Without centralization, organizations can add and change software and computational power according to the demands and needs of the business. With the reduction in price for computing power and storage thanks to the rise of public cloud services like AWS, organizations all over the world have begun using distributed systems and service-oriented architectures, like microservices.
A perfect example of distributed computing in action is Google search. When a user submits a query, Google will use data from a number of different servers to deliver results, based on things like location, past searches, semantic keywords - and much, much more. These servers are located all around the world and are able to return search results in seconds, and at times in milliseconds.
Central to this adoption is the cloud. Today, cloud is mainstream, and it opens up the possibility of distributed systems to organizations in a number of different ways. Arguably, you’re not really seeing the full potential of cloud until you’ve moved to a distributed system.
Let’s take a look at the different ways cloud services are helping companies feel confident enough to successfully leverage distributed computing.
IaaS makes distributed systems accessible to many organizations by allowing them to host their infrastructure on either a private or public cloud. Essentially, it gives an organization control over the operating system and platform that form the foundation of their software infrastructure, while giving an external cloud provider control over the servers and virtualization technologies that make it possible to deploy that infrastructure.
In the context of a distributed system, this means organizations have less to worry about. As you can imagine, without IaaS, the process of developing and deploying a distributed system becomes far more complex and costly.
If IaaS effectively splits responsibilities between the organization and the cloud provider (the ‘service’), Platform as a Service (PaaS) ‘outsources’ even more to the cloud provider. Essentially, an organization simply has to handle its applications and data, leaving every other aspect of its infrastructure to the platform.
This brings many benefits and, in theory, should allow even relatively small engineering teams to take advantage of a distributed system. The underlying complexity and heavy lifting that a distributed system brings rests with the cloud provider, allowing an organization’s engineers to focus on what matters most - shipping code.
If you’re thinking about speed and innovation, then a PaaS opens that right up, provided you’re happy to let your cloud provider manage the bulk of your infrastructure.
SaaS solutions are perhaps the clearest example of a distributed system. Arguably, given the way we use SaaS today, it’s easy to forget that it can be part of a distributed system. The concept is simple: it’s a complete software solution delivered to the end user.
If you’re trying to accomplish something particularly complex - something you simply don’t have the resources to do yourself - a SaaS solution can be effective. Users don’t need to worry about installing and maintaining software; they can simply access it via the internet.
Distributed computing opens up your options when it comes to system architecture. Although you might rely on an external cloud service for some resources (like compute or storage), the architectural decisions are ultimately yours. This means that you can make decisions based on exactly what your organization needs and how it works.
In a sense, this is why distributed computing can bring you agility - but it’s not just about being agile in the strict sense; it’s agility in a broader sense of the word. It allows you to prioritize according to your own needs and demands.
Tasks can be partitioned into sub-computations that run concurrently. This, in turn, speeds up overall task completion.
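As a rough illustration, here’s a minimal Python sketch of that idea using only the standard library; the analyze function, chunk size and data are placeholders rather than anything from a real workload:

```python
# A minimal sketch of task partitioning: split the input into chunks and
# run each sub-computation concurrently on its own worker process.
from concurrent.futures import ProcessPoolExecutor

def analyze(chunk):
    """Stand-in for a CPU-bound sub-computation on one partition."""
    return sum(x * x for x in chunk)

def partition(data, n_chunks):
    """Split `data` into roughly equal contiguous chunks."""
    size = max(1, len(data) // n_chunks)
    return [data[i:i + size] for i in range(0, len(data), size)]

if __name__ == "__main__":
    data = list(range(1_000_000))
    chunks = partition(data, n_chunks=8)
    with ProcessPoolExecutor() as pool:
        # Each chunk is processed concurrently; results are combined at the end.
        total = sum(pool.map(analyze, chunks))
    print(total)
```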
What’s more, if a particular site is currently overloaded with jobs, some of them can be moved to lightly loaded sites. This technique of ‘load sharing’ can boost the performance of your system. Essentially, distributed systems minimize latency and response time while increasing throughput.
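To make the idea concrete, here’s a hedged Python sketch of a simple load-sharing dispatcher; the node names and the ‘pending jobs’ counter are invented for illustration, and a real system would track actual load reported by each site:

```python
# A toy load-sharing scheduler: always route the next job to the node
# with the fewest pending jobs, keeping all sites lightly loaded.
import heapq

class LoadSharingScheduler:
    def __init__(self, nodes):
        # Min-heap of (pending_jobs, node_name); the lightest node is first.
        self._heap = [(0, node) for node in nodes]
        heapq.heapify(self._heap)

    def assign(self, job_id):
        load, node = heapq.heappop(self._heap)
        heapq.heappush(self._heap, (load + 1, node))
        return node  # in a real system, the job would now be sent to `node`

scheduler = LoadSharingScheduler(["node-a", "node-b", "node-c"])
for job_id in range(6):
    print(f"job {job_id} -> {scheduler.assign(job_id)}")
```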
Distributed networks offer a better price/performance ratio compared to centralized mainframe computers. This is because decentralized and modular applications can share expensive peripherals, such as high-capacity file servers and high-resolution printers.
Similarly, multiple components can run on nodes with specialized processing capabilities, which further reduces the cost of running multiple specialized processing systems.
Distributed systems involve services communicating across different machines. This is where message integrity, confidentiality and authentication come into play. In such cases, distributed computing gives organizations the flexibility to deploy a four-way mechanism to keep operations secure.
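As one illustration of the integrity and authentication pieces of that mechanism, here’s a minimal Python sketch using an HMAC over a shared secret; the key and messages are placeholder values, not a recommendation for any specific deployment:

```python
# Message integrity and authentication between two services: the sender
# signs each message with a shared secret, and the receiver verifies it.
import hashlib
import hmac

SHARED_KEY = b"replace-with-a-real-secret"  # hypothetical shared secret

def sign(message: bytes) -> bytes:
    """The sender attaches this tag so the receiver can verify the message."""
    return hmac.new(SHARED_KEY, message, hashlib.sha256).digest()

def verify(message: bytes, tag: bytes) -> bool:
    """The receiver recomputes the tag; compare_digest resists timing attacks."""
    return hmac.compare_digest(sign(message), tag)

msg = b"transfer:42"
tag = sign(msg)
assert verify(msg, tag)                   # untampered message is accepted
assert not verify(b"transfer:9000", tag)  # tampered message is rejected
```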
Another aspect of disaster recovery is reliability. If computation and the associated data are effectively tied to a single machine, and that machine goes down, the entire service goes with it. With a distributed system, specific services might go down instead, but the whole thing should, in theory at least, stay standing.
Even so, if specific services can go down within a distributed system, you still need to take steps to increase resilience. You do this by replicating services across multiple nodes, minimizing potential points of failure.
This is what’s known as fault tolerance: individual component failures are absorbed without bringing down the system as a whole, which improves reliability. It’s also worth pointing out that the hardware a distributed system is built on is replaceable - far better than depending on centralized hardware which, if it fails, takes everything with it.
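Here’s a rough Python sketch of that replication idea; the replica addresses are invented, and fetch_from stands in for a real network call that might fail:

```python
# Fault tolerance through replication: try each replica in turn, so one
# node going down doesn't take the whole service with it.
import random

REPLICAS = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]  # hypothetical nodes

def fetch_from(node, key):
    """Stand-in for a network call; fails randomly to simulate outages."""
    if random.random() < 0.3:
        raise ConnectionError(f"{node} is down")
    return f"value-of-{key}@{node}"

def fetch(key):
    for node in REPLICAS:
        try:
            return fetch_from(node, key)
        except ConnectionError:
            continue  # this replica failed, but the service stays up
    raise RuntimeError("all replicas unavailable")

print(fetch("user:123"))
```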
A good example of a distributed system is SETI’s SETI@home project. SETI collects massive amounts of data from observatories around the world on activity in the sky, in a bid to identify possible signs of extraterrestrial life.
This information is then sliced into smaller pieces of data for analysis by a distributed computing application that runs as a screensaver on individual users’ PCs around the world. While a PC is idle, the screensaver downloads a small data slice from SETI, runs the analysis, and, when the analysis is complete, uploads the analyzed slice back to SETI.
Data analytics at this scale is possible only because of distributed computing.
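As a conceptual sketch only (not SETI’s actual client code), the worker loop described above might look something like this, with every function a placeholder:

```python
# A volunteer-computing worker loop: fetch a slice of work, analyze it
# while the machine is idle, and upload the result.
import time

def download_slice():
    """Stand-in for fetching a small unit of work from the project server."""
    return list(range(100))

def analyze(data_slice):
    """Stand-in for the real signal-analysis code."""
    return sum(data_slice)

def upload_result(result):
    print(f"uploading result: {result}")

def pc_is_idle():
    return True  # a real client would check for user activity

for _ in range(3):  # a real client would loop indefinitely
    if pc_is_idle():
        result = analyze(download_slice())
        upload_result(result)
    time.sleep(1)  # pause before requesting more work
```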
So, although distributed computing has become a bit of a buzzword, the technology is also gaining real traction among customers and service providers. Beyond the hype and debate, these services will ultimately help companies become more responsive to market conditions while restraining IT costs.