Search icon CANCEL
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Conferences
Free Learning
Arrow right icon

How-To Tutorials - DevOps

49 Articles
article-image-mastering-promql-a-comprehensive-guide-to-prometheus-query-language
Rob Chapman, Peter Holmes
07 Nov 2024
15 min read
Save for later

Mastering PromQL: A Comprehensive Guide to Prometheus Query Language

Rob Chapman, Peter Holmes
07 Nov 2024
15 min read
This article is an excerpt from the book, "Observability with Grafana", by Rob Chapman, Peter Holmes. This book provides a holistic understanding of observability concepts using the Grafana Labs tools, teaching you how to fully leverage the LGTM stack.Introduction PromQL, or Prometheus Query Language, is a powerful tool designed to work with Prometheus, an open-source systems monitoring and alerting toolkit. Initially developed by SoundCloud in 2012 and later accepted by the Cloud Native Computing Foundation in 2016, Prometheus has become a crucial component of modern infrastructure monitoring. PromQL allows users to query data stored in Prometheus, enabling the creation of insightful dashboards and setting up alerts based on the performance metrics of applications and systems. This article will explore the core functionalities of PromQL, including how it interacts with metrics data and how it can be used to effectively monitor and analyze system performance. Introducing PromQL Prometheus was initially developed by SoundCloud in 2012; the project was accepted by the Cloud Native Computing Foundation in 2016 as the second incubated project (after Kubernetes), and version 1.0 was released shortly after. PromQL is an integral part of Prometheus, which is used to query stored data and produce dashboards and alerts. Before we delve into the details of the language, let’s briefly look at the following ways in which Prometheus-compatible systems  interact with metrics data: Ingesting metrics: Prometheus-compatible systems accept a timestamp, key-value labels, and a sample value. As the details of the Prometheus Time Series Database (TSDB) are  quite complicated, the following diagram shows a simplified example of how an individual sample for a metric is stored once it has been ingested:           Figure 5.1 – A simplified view of metric data stored in the TSDB The labels or dimensions of a metric: Prometheus labels provide metadata to identify data of interest. These labels create metrics, time series, and samples: * Each unique __name__ value creates a metric. In the preceding figure, the metric is app_ frontend_requests. * Each unique set of labels creates a time series. In the preceding figure, the set of all labels is the time series. * A time series will contain multiple samples, each with a unique timestamp. The preceding figure shows a single sample, but over time, multiple samples will be collected for each  time series. * The number of unique values for a metric label is referred to as the cardinality of the l abel. Highly cardinal labels should be avoided, as they signifi cantly increase the storage costs of the metric. The following diagram shows a single metric containing two time series and five samples:        Figure 5.2 – An example of samples from multiple time series In Grafana, we can see a representation of the time series and samples from a metric. To do this, follow these steps: 1. In your Grafana instance, select Explore in the menu. 2. Choose your Prometheus data source, which will be labeled as grafanacloud-<team>prom (default). 3. In the Metric dropdown, choose app_frontend_requests_total, and under Options, set Format to Table, and then click on Run query. Th is will show you all the samples and time series in the metric over the selected time range. You should see data like this:    Figure 5.3 – Visualizing the samples and time series that make up a metric Now that we understand the data structure, let’s explore PromQL. An overview of PromQL features In this section, we will take you through the features that PromQL has. We will start with an explanation of the data types, and then we will look at how to select data, how to work on multiple datasets, and how to use functions. As PromQL is a query language, it’s important to know how to manipulate data to produce alerts and dashboards. Data types PromQL offers three data types, which are important, as the functions and operators in PromQL will work diff erently depending on the data types presented: Instant vectors are a data type that stores a set of time series containing a single sample, all sharing the same timestamp – that is, it presents values at a specifi c instant in time:                             Figure 5.4 – An instant vector Range vectors store a set of time series, each containing a range of samples with different timestamps:                              Figure 5.5 – Range vectors Scalars are simple numeric values, with no labels or timestamps involved. Selecting data PromQL offers several tools for you to select data to show in a dashboard or a list, or just to understand a system’s state. Some of these are described in the following table: Table 5.1 – The selection operators available in PromQL In addition to the operators that allow us to select data, PromQL offers a selection of operators to compare multiple sets of data. Operators between two datasets Some data is easily provided by a single metric, while other useful information needs to be created from multiple metrics. The following operators allow you to combine datasets. Table 5.2 – The comparison operators available in PromQL Vector matching is an initially confusing topic; to clarify it, let’s consider examples for the three cases of vector matching – one-to-one, one-to-many/many-to-one, and many-to-many. By default, when combining vectors, all label names and values are matched. This means that for each element of the vector, the operator will try to find a single matching element from the second vector.  Let’s consider a simple example: Vector A: 10{color=blue,smell=ocean} 31{color=red,smell=cinnamon} 27{color=green,smell=grass} Vector B: 19{color=blue,smell=ocean} 8{color=red,smell=cinnamon} ‚ 14{color=green,smell=jungle} A{} + B{}: 29{color=blue,smell=ocean} 39 {color=red,smell=cinnamon} A{} + on (color) B{} or A{} + ignoring (smell) B{}: 29{color=blue} 39{color=red} 41{color=green} When color=blue and smell=ocean, A{} + B{} gives 10 + 19 = 29, and when color=red and smell=cinnamon, A{} + B{} gives 31 + 8 = 29. The other elements do not match the two vectors so are ignored. When we sum the vectors using on (color), we will only match on the color label; so now, the two green elements match and are summed. This example works when there is a one-to-one relationship of labels between vector A and vector B. However, sometimes there may be a many-to-one or one-to-many relationship – that is, vector A or vector B may have more than one element that matches the other vector. In these cases, Prometheus will give an error, and grouping syntax must be used. Let’s look at another example to illustrate this: Vector A: 7{color=blue,smell=ocean} 5{color=red,smell=cinamon} 2{color=blue,smell=powder} Vector B: 20{color=blue,smell=ocean} 8{color=red,smell=cinamon} ‚ 14{color=green,smell=jungle} A{} + on (color) group_left  B{}: 27{color=blue,smell=ocean} 13{color=red,smell=cinamon} 22{color=blue,smell=powder} Now, we have two different elements in vector A with color=blue. The group_left command will use the labels from vector A but only match on color. This leads to the third element of the combined vector having a value of 22, when the item matching in vector B has a different smell. The group_right operator will behave in the opposite direction. The final option is a many-to-many vector match. These matches use the logical operators and, unless, and or to combine parts of vectors A and B. Let’s see some examples: Vector A: 10{color=blue,smell=ocean} 31{color=red,smell=cinamon} 27{color=green,smell=grass} Vector B: 19{color=blue,smell=ocean} 8{color=red,smell=cinamon} ‚ 14{color=green,smell=jungle} A{} and B{}: 10{color=blue,smell=ocean} 31{color=red,smell=cinamon} A{} unless B{}: 27{color=green,smell=grass} A{} or B{}: 10{color=blue,smell=ocean} 31{color=red,smell=cinamon} 27{color=green,smell=grass} 14{color=green,smell=jungle} Unlike the previous examples, mathematical operators are not being used here, so the values of the elements are the values from vector A, but only the elements of A that match the logical condition in B are returned. ConclusionPromQL is an essential component of Prometheus, offering users a flexible and powerful means of querying and analyzing time-series data. By understanding its data types and operators, users can craft complex queries that provide deep insights into system performance. The language supports a variety of data selection and comparison operations, allowing for precise monitoring and alerting. Whether working with instant vectors, range vectors, or scalars, PromQL enables developers and operators to optimize their use of Prometheus for monitoring and alerting, ensuring systems remain performant and reliable. As organizations continue to embrace cloud-native architectures, mastering PromQL becomes increasingly vital for maintaining robust and efficient systems. Author BioRob Chapman is a creative IT engineer and founder at The Melt Cafe, with two decades of experience in the full application life cycle. Working over the years for companies such as the Environment Agency, BT Global Services, Microsoft, and Grafana, Rob has built a wealth of experience on large complex systems. More than anything, Rob loves saving energy, time, and money and has a track record for bringing production-related concerns forward so that they are addressed earlier in the development cycle, when they are cheaper and easier to solve. In his spare time, Rob is a Scout leader, and he enjoys hiking, climbing, and, most of all, spending time with his family and six children.Peter Holmes is a senior engineer with a deep interest in digital systems and how to use them to solve problems. With over 16 years of experience, he has worked in various roles in operations. Working at organizations such as Boots UK, Fujitsu Services, Anaplan, Thomson Reuters, and the NHS, he has experience in complex transformational projects, site reliability engineering, platform engineering, and leadership. Peter has a history of taking time to understand the customer and ensuring Day-2+ operations are as smooth and cost-effective as possible.
Read more
  • 0
  • 0
  • 1294

article-image-mastering-prometheus-sharding-boost-scalability-with-efficient-data-management
William Hegedus
28 Oct 2024
15 min read
Save for later

Mastering Prometheus Sharding: Boost Scalability with Efficient Data Management

William Hegedus
28 Oct 2024
15 min read
This article is an excerpt from the book, Mastering Prometheus, by William Hegedus. Become a Prometheus master with this guide that takes you from the fundamentals to advanced deployment in no time. Equipped with practical knowledge of Prometheus and its ecosystem, you’ll learn when, why, and how to scale it to meet your needs.IntroductionIn this article, readers will dive into techniques for optimizing Prometheus, a powerful open-source monitoring tool, by implementing sharding. As data volumes increase, so do the challenges associated with high cardinality, often resulting in strained single-instance setups. Instead of purging data to reduce load, sharding offers a viable solution by distributing scrape jobs across multiple Prometheus instances. This article explores two primary sharding methods: by service, which segments data by use case or team, and by dynamic relabeling, which provides a more flexible, albeit complex, approach to distributing data. By examining each method’s setup and trade-offs, the article offers practical insights for scaling Prometheus while maintaining efficient access to critical metrics across instances.Sharding Prometheus Chances are that if you’re looking to improve your Prometheus architecture through sharding, you’re hitting one of the limitations we talked about and it’s probably cardinality. You have a Prometheus instance that’s just got too much data in it, but… you don’t want to get rid of any data. So, the logical answer is… run another Prometheus instance! When you split data across Prometheus instances like this, it’s referred to as sharding. If you’re familiar with other database designs, it probably isn’t sharding in the traditional sense. As previously established, Prometheus TSDBs do not talk to each other, so it’s not as if they’re coordinating to shard data across instances. Instead, you predetermine where data will be placed by how you configure the scrape jobs on each instance. So, it’s more like sharding scrape jobs than sharding the data. Th ere are two main ways to accomplish this: sharding by service and sharding via relabeling. Sharding by service This is arguably the simpler of the two ways to shard data across your Prometheus instances. Essentially, you just separate your Prometheus instances by use case. This could be a Prometheus instance per team, where you have multiple Prometheus instances and each one covers services owned by a specific team so that each team still has a centralized location to see most of the data they care about. Or, you could arbitrarily shard it by some other criteria, such as one Prometheus instance for virtualized infrastructure, one for bare-metal, and one for containerized infrastructure. Regardless of the criteria, the idea is that you segment your Prometheus instances based on use case so that there is at least some unifi cation and consistency in which Prometheus gets which scrape targets. This makes it at least a little easier for other engineers and developers to reason when thinking about where the metrics they care about are located. From there, it’s fairly self-explanatory to get set up. It only entails setting up your scrape job in different locations. So, let’s take a look at the other, slightly more involved way of sharding your Prometheus instances. Sharding with relabeling Sharding via relabeling is a much more dynamic way of handling the sharding of your Prometheus scrape targets. However, it does have some tradeoff s. The biggest one is the added complexity of not necessarily knowing which Prometheus instance your scrape targets will end up on. As opposed to the sharding by service/team/domain example we already discussed, sharding via relabeling does not shard scrape jobs in a way that is predictable to users. Now, just because sharding is unpredictable to humans does not mean that it is not deterministic. It is consistent, but just not in a way that it will be clear to users which Prometheus they need to go to to find the metrics they want to see. There are ways to work around this with tools such as Th anos (which we’ll discuss later in this book) or federation (which we’ll discuss later in this chapter). The key to sharding via relabeling is the hashmod function, which is available during relabeling in Prometheus. The hashmod function works by taking a list of one or more source labels, concatenating them, producing an MD5 hash of it, and then applying a modulus to it. Then, you store the output of that and in your next step of relabeling, you keep or drop targets that have a specific hashmod value output. What’s relabeling again? For a refresher on relabeling in Prometheus, consult Chapter 4’s section on it. For this chapter, the type of relabeling we’re doing is standard relabeling (as opposed to metric relabeling) – it happens before a scrape occurs. Let’s look at an  example of how this works logically before diving into implementing it in our kubeprometheus stack. We’ll just use the Python REPL to keep it quick:  >>> from hashlib import md5 >>> SEPARATOR = ";" >>> MOD = 2 >>> targetA = ["app=nginx", "instance=node2"] >>> targetB = ["app=nginx", "instance=node23"] >>> hashA = int(md5(SEPARATOR.join(targetA).encode("utf-8")). hexdigest(), 16) >>> hashA 286540756315414729800303363796300532374 >>> hashB = int(md5(SEPARATOR.join(targetB).encode("utf-8")). hexdigest(), 16) >>> hashB 139861250730998106692854767707986305935 >>> print(f"{targetA} % {MOD} = ", hashA % MOD) ['app=nginx', 'instance=node2'] % 2 = 0 >>> print(f"{targetB} % {MOD} = ", hashB % MOD) ['app=nginx', 'instance=node23'] % 2 = 1As you can see, the hash of the app and instance labels has a modulus of 2 applied to it. For node2, the result is 0. For node23, the result is 1. Since the modulus is 2, those are the only possible values. Therefore, if we had two Prometheus instances, we would configure one to only keep targets where the result is 0, and the other would only keep targets where the result is 1 – that’s how we would shard our scrape jobs. The modulus value that you choose should generally correspond to the number of Prometheus instances that you wish to shard your scrape jobs across. Let’s look at how we can accomplish this type of sharding across two Prometheus instances using kube-prometheus. Luckily for us, kube-prometheus has built-in support for sharding Prometheus instances using relabeling by way of support via the Prometheus Operator. It’s a built-in option on Prometheus CRD objects. Enabling it is as simple as updating our prometheusSpec in our Helm values to specify the number of shards.  Additionally, we’ll need to clean up the names of our Prometheus instances; otherwise, Kubernetes won’t allow the new Pod to start due to character constraints. We can tell kube-prometheus to stop including kube-prometheus in the names of our resources, which will shorten the names. To do this, we’ll set cleanPrometheusOperatorObjectNames: true. The new values being added to our Helm values file from Chapter 2 look like this:  prometheus: prometheusSpec: shards: 2 cleanPrometheusOperatorObjectNames: trueThe full values file is available in this GitHub repository, which was linked at the beginning of this chapter. With that out of the way, we can apply these new values to get an additional Prometheus instance running to shard our scrape jobs across the two. The helm command to accomplish this is as follows:  $ helm upgrade --namespace prometheus \ --version 47.0.0 \ --values ch6/values.yaml \ mastering-prometheus \ prometheus-community/kube-prometheus-stackOnce that command completes, you should see a new pod named prometheus-masteringprometheus-kube-shard-1-0 in the output of kubectl get pods. Now, we can see the relabeling that’s taking place behind the scenes so that we can understand how it works and how to implement it in Prometheus instances not running via the Prometheus Operator. Port-forward to either of the two Prometheus instances (I chose the new one) and we can examine the configuration in our browsers at http://localhost:9090/config: $ kubectl port-forward \ pod/prometheus-mastering-prometheus-kube-shard-1-0 \ 9090The relevant section we’re looking for is the sequential parts of relabel_configs, where hashmod is applied and then a keep action is applied based on the output of hashmod and the shard number of the Prometheus instance. It should look like this:  relabel_configs: [ . . . ] - source_labels: [__address__] separator: ; regex: (.*) modulus: 2 target_label: __tmp_hash replacement: $1 action: hashmod - source_labels: [__tmp_hash] separator: ; regex: "1" replacement: $1 action: keepAs we can see, for each s crape job, a modulus of 2 is taken from the hash of the __address__ label, and its result is stored in a new label called __tmp_hash. You can store the result in whatever you want to name your label – there’s nothing special about __tmp_hash. Additionally, you can choose any one or more source labels you wish – it doesn’t have to be __address__. However, it’s recommended that you choose labels that will be unique per target – so instance and __address__ tend to be your best options. After calculating the modulus of the hash, the next step is the crucial one that determines which scrape targets the Prometheus shard will scrape. It takes the value of the __tmp_hash label and matches it against its shard number (shard numbers start at 0), and keeps only targets that match. The Prometheus Operator does the heavy lifting of automatically applying these two relabeling steps to all configured scrape jobs, but if you’re managing your own Prometheus configuration directly, then you will need to add them to every scrape job that you want to shard across Prometheus instances – there is currently no way to do it globally. It’s worth mentioning that sharding in this way does not guarantee that your scrape jobs are going to be evenly spread out across your number of shards. We can port-forward to the other Prometheus instance and run a quick PromQL query to easily see that they’re not evenly distributed across my two shards. I’ll port forward to port 9091 on my local host so that I can open both instances simultaneously: $ kubectl port-forward \ pod/prometheus-mastering-prometheus-kube-0 \ 9091:9090 Then, we can run this simple query to see how many scrape targets are assigned to each Prometheus instance: count(up) In my setup, there are eight scrape targets on shard 0 and 16 on shard 1. You can attempt to microoptimize scrape target sharding by including more unique labels in the source_label values for the hashmod operation, but it may not be worth the effort – as you add more unique scrape targets, they’ll begin to even out. One of the practical pain points you may have noticed already with sharding is that it’s honestly kind of a pain to have to navigate to multiple Prometheus instances to run queries. One of the ways we can try to make this easier is through federating our Prometheus instances. Conclusion In conclusion, sharding Prometheus is an effective way to manage the challenges posed by data volume and cardinality in your system. Whether you opt for sharding by service or through dynamic relabeling, both approaches offer ways to distribute scrape jobs across multiple Prometheus instances. While sharding via relabeling introduces more complexity, it also provides flexibility and scalability. However, it is important to consider the trade-offs, such as uneven distribution of scrape jobs and the need for tools like Thanos or federation to simplify querying across instances. By applying these strategies, you can ensure a more efficient and scalable Prometheus architecture. Author BioWill Hegedus has worked in tech for over a decade in a variety of roles, most recently in Site Reliability Engineering. After becoming the first SRE at Linode, an independent cloud provider, he came to Akamai Technologies by way of an acquisition.Now, Will manages a team of SREs focused on building an internal observability platform for Akamai&rsquo;s Connected Cloud. His team's responsibilities include managing a global fleet of Prometheus servers ingesting millions of data points every second.Will is an open-source advocate with contributions to Prometheus, Thanos, and other CNCF projects related to Kubernetes and observability. He lives in central Virginia with his wonderful wife, 4 kids, 3 cats, 2 dogs, and bearded dragon.
Read more
  • 0
  • 0
  • 622

article-image-rookout-and-appdynamics-team-up-to-help-enterprise-engineering-teams-debug-at-speed-with-deep-code-insights
Richard Gall
20 Feb 2020
3 min read
Save for later

Rookout and AppDynamics team up to help enterprise engineering teams debug at speed with Deep Code Insights

Richard Gall
20 Feb 2020
3 min read
It's not acknowledged enough that the real headache when it comes to software faults and performance problems isn't so much the problems themselves, but instead the process of actually identifying those problems. Sure, problems might slow you down, but wading though your application code to actually understand what's happened can sometimes grind engineering teams to a halt. For enterprise engineering teams, this can be particularly fatal. Agility is hard enough when you're dealing with complex applications and the burden of legacy software; but when things go wrong, any notion of velocity can be summarily discarded to the trashcan. However, a new partnership between debugging platform Rookout and APM company AppDynamics, announced at AppDynamics' Transform 2020 event, might just change that. The two organizations have teamed up, with Rookout's impressive debugging capabilities now available to AppDynamics customers in the form of a new product called Deep Code Insights. [caption id="attachment_31042" align="alignleft" width="696"] Live debugging of an application in production in Deep Code Insights[/caption]                 What is Deep Code Insights? Deep Code Insights is a new product for AppDynamics customers that combines the live-code debugging capabilities offered by Rookout with AppDynamic's APM platform. The advantage for developers could be substantial. Jerrie Pineda, Enterprise Software Architect at Maverik says that "Rookout helps me get the debugging data I need in seconds instead of waiting for several hours." This means, he explains, "[Maverik's] mean time to resolution (MTTR) for most issues is slashed up to 80%.” What does Deep Code Insights mean for AppDynamics? For AppDynamics, Deep Code Insights allows the organization to go one step further in its mission to "make it easier for businesses to understand their own software." At least that's how AppDynamics' VP of corporate development and strategy at Kevin Wagner puts it. "Together [with Rookout], we are narrowing the gaps between indicating a code-related problem impacting performance, pinpointing the direct issue within the line of code, and deploying a solution quickly for a seamless customer experience," he says. What does Deep Code Insights mean for Rookout? For Rookout, meanwhile, the partnership with AppDynamics is a great way for the company to reach out to a wider audience of users working at large enterprise organizations. The company received $8,000,000 in Series A funding back in August. This has provided a solid platform on which it is clearly looking to build and grow. Rookout's Co-Founder and CEO Or Weis describes the partnership as "obvious." "We want to bring the next-gen developer workflow to enterprise customers and help them increase product velocity," he says. Learn more about Rookout: www.rookout.com Learn more about AppDynamics: www.appdynamics.com  
Read more
  • 0
  • 0
  • 13041

article-image-10-tech-startups-for-2020-that-will-help-the-world-build-more-resilient-secure-and-observable-software
Richard Gall
30 Dec 2019
10 min read
Save for later

10 tech startups for 2020 that will help the world build more resilient, secure, and observable software

Richard Gall
30 Dec 2019
10 min read
The Datadog IPO in September marked an important moment for the tech industry. This wasn’t just because the company was the fourth tech startup to reach a $10 billion market cap in 2019, but also because it announced something that many people, particularly those in and around Silicon Valley, have been aware of for some time: the most valuable software products in the world aren’t just those that offer speed, and efficiency, they’re those that provide visibility, and security across our software systems. It shouldn’t come as a surprise. As software infrastructure becomes more complex, constantly shifting and changing according to the needs of users and businesses, the ability to assume some degree of control emerges as particularly precious. Indeed, the idea of control and stability might feel at odds with a decade or so that has prized innovation at speed. The mantra ‘move fast and break things’ is arguably one of the defining ones of the last decade. And while that lust for change might never disappear, it’s nevertheless the case that we’re starting to see a mindset shift in how business leaders think about technology. If everyone really is a tech company now, there’s now a growing acceptance that software needs to be treated with more respect and care. The Datadog IPO, then, is just the tip of an iceberg in which monitoring, observability, security, and resiliency tools have started to capture the imagination of technology leaders. While what follows is far from exhaustive, it does underline some of the key players in a growing field. Whether you're an investor or technology decision maker, here are ten tech startups you should watch out for in 2020 from across the cloud and DevOps space. Honeycomb Honeycomb has been at the center of the growing conversation around observability. Designed to help you “own production in hi-res,” what makes it unique in the market is that it allows you to understand and visualize your systems through high-cardinality dimensions (eg. at a user by user level, rather than, say, browser type or continent). The driving force behind Honeycomb is Charity Majors, its co-founder and former CEO. I was lucky enough to speak to her at the start of the year, and it was clear that she has an acute understanding of the challenges facing engineering teams. What was particularly striking in our conversation is how she sees Honeycomb as a tool for empowering developers. It gives them ownership over the code they write and the systems they build. “Ownership gives you the power to fix the thing you know you need to fix and the power to do a good job…” she told me. “People who find ownership is something to be avoided – that’s a terrible sign of a toxic culture.” Honeycomb’s investment status At the time of writing, Honeycomb has received $26.9 million in funding, with $11.4 million series A back in September. Firehydrant “You just got paged. Now what?” That’s the first line that greets you on the FireHydrant website. We think it sums up many of the companies on this list pretty well; many of the best tools in the DevOps space are designed to help tackle the challenges on-call developers face. FireHydrant isn't a tech startup with the profile of Honeycomb. However, as an incident management tool that integrates very neatly into a massive range of workflow tools, we’re likely to see it gain traction in 2020. We particularly like the one-click post mortem feature - it’s clear the product has been built in a way that allows developers to focus on the hard stuff and minimize the things that can just suck up time. FireHydrant’s investment status FireHydrant has raised $1.5 million in seed funding. NS1 Managing application traffic can be business-critical. That’s why NS1 exists; with DNS, DHCP and IP address management capabilities, it’s arguably one of the leading tools on the planet for dealing with the diverse and extensive challenges that come with managing massive amounts of traffic across complex interlocking software applications and systems. The company boasts an impressive roster of clients, including DropBox, The Guardian and LinkedIn, which makes it hard to bet against NS1 going from strength to strength in 2020. Like all software adoption, it might take some time to move beyond the realms of the largest and most technically forward-thinking organizations, but it’s surely only a matter of time until it the importance of smarter and more efficient becomes clear to even the smallest businesses. NS1’s investment status NS1 has raised an impressive $78.4 million in funding from investors (although it’s important to note that it’s one of the oldest companies on this list, founded all the way back in 2013). It received $33 million in series C funding at the beginning of October. Rookout “It’s time to liberate your data” Rookout implores us. For too long, the startup’s argument goes, data has been buried inside our applications where it’s useless for developers and engineers. Once it has been freed, it can help inform how we go about debugging and monitoring our systems. Designed to work for modern architectural and deployment patterns such as Kubernetes and serverless, Rookout is a tool that not only brings simplicity in the midst of complexity, it can also save engineering teams a serious amount of time when it comes to debugging and logging - the company claims by 80%. Like FireHydrant, this means engineers can focus on other areas of application performance and resilience. Rookout’s investment status Back in August, Rookout raised $8 million in Series A funding, taking its total funding amount to $12.2 million dollars. LaunchDarkly Feature flags or toggles are a concept that have started to gain traction in engineering teams in the last couple of years or so. They allow engineering teams to “modify system behavior without changing code” (thank you Martin Fowler). LaunchDarkly is a platform specifically built to allow engineers to use feature flags. At a fundamental level, the product allows DevOps teams to deploy code (ie. change features) quickly and with minimal risk. This allows for testing in production and experimentation on a large scale. With support for just about every programming language, it’s not surprising to see LaunchDarkly boast a wealth of global enterprises on its list of customers. This includes IBM and NBC. LaunchDarkly’s investment status LaunchDarkly raised $44 million in series C funding early in 2019. To date, it has raised $76.3 million. It’s certainly one to watch closely in 2020; it's ability to help teams walk the delicate line between innovation and instability is well-suited to the reality of engineering today. Gremlin Gremlin is a chaos engineering platform designed to help engineers to ‘stress test’ their software systems. This is important in today’s technology landscape. With system complexity making unpredictability a day-to-day reality, Gremlin lets you identify weaknesses before they impact customers and revenue. Gremlin’s mission is to “help build a more reliable internet.” That’s not just a noble aim, it’s an urgent one too. What’s more, you can see that the business is really living out its mission. With Gremlin Free launching at the start of 2019, and the second ChaosConf taking place in the fall, it’s clear that the company is thinking beyond the core product: they want to make chaos engineering more accessible to a world where resilience can feel impossible in the face of increasing complexity. Gremlin’s investment status Since being founded back in 2016 by CTO Matt Fornaciari and CEO Kolton Andrus, Gremlin has raised $26.8Million in funding from Redpoint Ventures, Index Ventures, and Amplify Partners. Cockroach Labs Cockroach Labs is the organization behind CockroachDB, the cloud-native distributed SQL database. CockroachDB’s popularity comes from two things: it’s ability to scale from a single instance to thousands, and it’s impressive resilience. Indeed, its resilience is where it takes its name from. Like a cockroach, CockroachDB is built to keep going even after everything else has burned to the ground. It’s been an interesting year for CockroachLabs and CockroachDB - in June the company changed the CockroachDB core licence from open source Apache license to the Business Source License (BSL), developed by the MariaDB team. The reason for this was ultimately to protect the product as it seeks to grow. The BSL still means the source code is accessible for any use other than for a DBaaS (you’ll need an enterprise license for that). A few months later, the company took another step in pushing forward in the market with $55 million series C funding. Both stories were evidence that CockroachLabs is setting itself up for a big 2020. Although the database market will always be seriously competitive, with resilience as a core USP it’s hard to bet against Cockroach Labs. Cockroaches find a way, right? CockroachLabs investment status CockroachLabs total investment, following on from that impressive round of series C funding is now $108.5 million. Logz.io Logz.io is another platform in the observability space that you really need to watch out for in 2020. Built on the ELK stack (ElasticSearch, Logstash, and Kibana), what makes Logz.io really stand out is the use of machine learning to help identify issues across thousands and thousands of logs. Logz.io has been on ‘ones to watch’ lists for a number of years now. This was, we think, largely down to the rising wave of AI hype. And while we wouldn’t want to underplay its machine learning capabilities, it’s perhaps with the increasing awareness of the need for more observable software systems that we’ll see it really pack a punch across the tech industry. Logz.io’s investment status To date, Logz.io has raised $98.9 million. FaunaDB Fauna is the organization behind FaunaDB. It describes itself as “a global serverless database that gives you ubiquitous, low latency access to app data, without sacrificing data correctness and scale.” The database could be big in 2020. With serverless likely to go from strength to strength, and JAMstack increasing as a dominant approach for web developers, everything the Fauna team have been doing looks as though it will be a great fit for the shape of the engineering landscape in the future. Fauna’s investment status In total, Fauna has raised $32.6 million in funding from investors. Clubhouse One thing that gets overlooked when talking about DevOps and other issues in software development processes is simple project management. That’s why Clubhouse is such a welcome entry on this list. Of course, there are a massive range of project management tools available at the moment. But one of the reasons Clubhouse is such an interesting product is that it’s very deliberately built with engineers in mind. And more importantly, it appears it’s been built with an acute sense of the importance of enjoyment in a project management product. Clubhouse’s investment status Clubhouse has, to date, raised $16 million. As we see a continuing emphasis on developer experience, the tool is definitely one to watch in a tough marketplace. Conclusion: embrace the unpredictable The tech industry feels as unpredictable as the software systems we're building and managing. But while there will undoubtedly be some surprises in 2020, the need for greater security and resilience are themes that no one should overlook. Similarly, the need to gain more transparency and build for observability are critical. Whether you're an investor, business leader, or even an engineer, then, exploring the products that are shaping and defining the space is vital.
Read more
  • 0
  • 0
  • 13566

article-image-beyond-kubernetes-key-skills-for-infrastructure-and-ops-engineers-in-2020
Richard Gall
20 Dec 2019
5 min read
Save for later

Beyond Kubernetes: Key skills for infrastructure and ops engineers in 2020

Richard Gall
20 Dec 2019
5 min read
For systems engineers and those working in operations, the move to cloud and the rise of containers in recent years has drastically changed working practices and even the nature of job roles. But that doesn’t mean you can just learn Kubernetes and then rest on your laurels. To a certain extent, the broad industry changes we’ve seen haven’t stabilised into some sort of consensus but rather created a field where change is only more likely - and where things are arguably even less stable. This isn’t to say that you have anything to fear as an engineer. But you should be open minded about the skills you learn in 2020. Here’s a list of 5 skills you should consider spending some time developing in the new year. Scripting and scripting languages Scripting is a well-established part of many engineers’ skill set. The reasons for this are obvious: they allow you to automate tasks and get things done quickly. If you don’t know scripting, then of course you should learn it. But even if you do it’s worth thinking about exploring some new programming languages. You might find that a fresh approach - like learning, for example, Go if you mainly use Python - will make you more productive, or will help you to tackle problems with greater ease than you have in the past. Learn Linux shell scripting with Learn Linux Shell Scripting: the Fundamentals of Bash 4.4. Find out how to script with Python in Mastering Python Scripting for System Administrators. Infrastructure automation tools and platforms With the rise of hybrid and multi-cloud, infrastructure automation platforms like Ansible and Puppet have been growing more and more important to many companies. While Kubernetes has perhaps dented their position in the wider DevOps tooling marketplace (if, indeed, that’s such a thing), they nevertheless remain relevant in a world where managing complexity appears to be a key engineering theme. With Puppet looking to continually evolve and Ansible retaining a strong position on the market, they remain two of the most important platforms to explore and learn. However, there are a wealth of other options too - Terraform in particular appears to be growing at an alarming pace even if it hasn’t reached critical mass, but Salt and Chef are also well worth learning too. Get started with Ansible, fast - learn with Ansible Quick Start Guide. Cloud architecture and design Gone are the days when cloud was just a rented server. Gone are the days when it offered a simple (or relatively simple, at least) solution to storage and compute problems. With trends like multi and hybrid cloud becoming the norm, serverless starting to gain traction at the cutting edge of software development, being able to piece together various different elements is absolutely crucial. Indeed, this isn’t a straightforward skill that you can just learn with some documentation and training materials. Of course those help, but it also requires sensitivity to business needs, an awareness of how developers work, as well as an eye for financial management. However, if you can develop the broad range of skills needed to architect cloud solutions, you will be a very valuable asset to a business. Become a certified cloud architect with Packt's new Professional Cloud Architect – Google Cloud Certification Guide. Security and resilience With the increase in architectural complexity, the ability to ensure security and resilience is now both vital but also incredibly challenging. Fortunately, there are many different tools and techniques available for doing this, each one relevant to different job roles - from service meshes to monitoring platforms, to chaos engineering, there are many ways that engineers can take on stability and security challenges head on. Whatever platforms you’re working with, make it your mission to learn what you need to improve the security and resilience of your systems. Learn how to automate cloud security with Cloud Security Automation. Pushing DevOps forward No one wants to hear about the growth of DevOps - we get that. It’s been growing for almost a decade now; it certainly doesn’t need to be treated to another wave of platitudes as the year ends. So, instead of telling you to simply embrace DevOps, a smarter thing to do would be to think about how you can do DevOps better. What do your development teams need in terms of support? And how could they help you? In theory the divide between dev and ops should now be well and truly broken - the question that remains is that how should things evolve once that silo has been broken down? Okay, so maybe this isn’t necessarily a single skill you can learn. But it’s something that starts with conversation - so make sure you and those around you are having better conversations in 2020. Search the latest DevOps eBooks and videos on the Packt store.
Read more
  • 0
  • 0
  • 5151

article-image-new-for-2020-in-operations-and-infrastructure-engineering
Richard Gall
19 Dec 2019
5 min read
Save for later

New for 2020 in operations and infrastructure engineering

Richard Gall
19 Dec 2019
5 min read
It’s an exciting time if you work in operations and software infrastructure. Indeed, you could even say that as the pace of change and innovation increases, your role only becomes more important. Operations and systems engineers, solution architects, everyone - you’re jobs are all about bringing stability, order and control into what can sometimes feel like chaos. As anyone that’s been working in the industry knows, managing change, from a personal perspective, requires a lot of effort. To keep on top of what’s happening in the industry - what tools are being released and updated, what approaches are gaining traction - you need to have one eye on the future and the wider industry. To help you with that challenge and get you ready for 2020, we’ve put together a list of what’s new for 2020 - and what you should start learning. Learn how to make Kubernetes work for you It goes without saying that Kubernetes was huge in 2019. But there are plenty of murmurs and grumblings that it’s too complicated and adds an additional burden for engineering and operations teams. To a certain extent there’s some truth in this - and arguably now would be a good time to accept that just because it seems like everyone is using Kubernetes, it doesn’t mean it’s the right solution for you. However, having said that, 2020 will be all about understanding how to make Kubernetes relevant to you. This doesn’t mean you should just drop the way you work and start using Kubernetes, but it does mean that spending some time with the platform and getting a better sense of how it could be used in the future is a useful way to spend your learning time in 2020. Explore Packt's extensive range of Kubernetes eBooks and videos on the Packt store. Learn how to architect If software has eaten the world, then by the same token perhaps complexity has well and truly eaten software as we know it. Indeed, Kubernetes is arguably just one of the symptoms and causes of this complexity. Another is the growing demand for architects in engineering and IT teams. There are a number of different ‘architecture’ job roles circulating across the industry, from solutions architect to application architect. While they each have their own subtle differences, and will even vary from company to company, they’re all roles that are about organizing and managing different pieces into something that is both stable and value-driving. Cloud has been particularly instrumental in making architect roles more prominent in the industry. As organizations look to resist the pitfalls of lock-in and better manage resources (financial and otherwise), it will be down to architects to balance business and technology concerns carefully. Learn how to architect cloud native applications. Read Architecting Cloud Computing Solutions. Get to grips with everything you need to know to be a software architect. Pick up Software Architect's Handbook. Artificial intelligence It’s strange that the hype around AI doesn’t seem to have reached the world of ops. Perhaps this is because the area is more resistant to the spin that comes with AI, preferring instead to focus more on the technical capabilities of tools and platforms. Whatever the case, it’s nevertheless true that AI will play an important part in how we manage and secure infrastructure. From monitoring system health, to automating infrastructure deployments and configuration, and even identifying security threats, artificial intelligence is already an important component for operations engineers and others. Indeed, artificial intelligence is being embedded inside products and platforms that ops teams are using - this means the need to ‘learn’ artificial intelligence is somewhat reduced. But it would be wrong to think it’s something that can just be managed from a dashboard. In 2020 it will be essential to better understand where and how artificial intelligence can fit into your operations and architectural toolchain. Find artificial intelligence eBooks and videos in Packt's collection of curated data science bundles. Observability, monitoring, tracing, and logging One of the challenges of software complexity is understanding exactly what’s going on under the hood. Yes, the network might be unreliable, as the saying goes, but what makes things even worse is that we’re not even sure why. This is where observability and the next generation of monitoring, logging and tracing all come into play. Having detailed insights into how applications and infrastructures are performing, how resources are being managed, and what things are actually causing problems is vitally important from a team perspective. Without the ability to understand these things, it can put pressure on teams as knowledge becomes siloed inside the brains of specific engineers. It makes you vulnerable to failure as you start to have points of failure at a personnel level. There are, of course, a wide range of tools and products available that can make monitoring and tracing easy (or easier, at least). But understanding which ones are right for your needs still requires some time learning and exploring the options out there. Make sure you do exactly that in 2020. Learn how to monitor distributed systems with Learn Centralized Logging and Monitoring with Kubernetes. Making serverless a reality We’ve talked about serverless a lot this year. But as a concept there’s still considerable confusion about what role it should play in modern DevOps processes. Indeed, even the nomenclature is a little confusing. Platforms using their own terminology, such as ‘lambdas’ and ‘functions’, only adds to the sense that serverless is something amorphous and hard to pin down. So, in 2020, we need to work out how to make serverless work for us. Just as we need to consider how Kubernetes might be relevant to our needs, we need to consider in what ways serverless represents both a technical and business opportunity. Search Packt's library for the latest serverless eBooks and videos. Explore more technology eBooks and videos on the Packt store.
Read more
  • 0
  • 0
  • 4242
Unlock access to the largest independent learning library in Tech for FREE!
Get unlimited access to 7500+ expert-authored eBooks and video courses covering every tech area you can think of.
Renews at €18.99/month. Cancel anytime
article-image-operations-and-infrastructure-engineering-in-2019-what-really-mattered
Richard Gall
18 Dec 2019
6 min read
Save for later

Operations and infrastructure engineering in 2019: what really mattered

Richard Gall
18 Dec 2019
6 min read
Everything is unreliable, right? If we didn’t realise it before, 2019 was the year when we fully had to accept the reality of the systems we’re building and managing. That was scary, sure, but it was also liberating. But we shouldn’t get carried away: given how highly distributed software systems are now part and parcel in a range of different industries, the issue of reliability and resilience isn’t purely an academic issue: in many instances, it’s urgent and critical. That makes the work of managing and building software infrastructure an incredibly vital role. Back in 2015 I wrote that Docker had turned us all into SysAdmins, but on reflection it may be more accurate to say that we’ve now entered a world where cloud and the infrastructure-as-code revolution has turned everyone into a software developer. Kubernetes is everywhere Kubernetes is arguably the definitive technology of 2019. With the move to containers now fully mainstream, Kubernetes is an integral in helping engineers to deploy and manage containers at scale. The other important element to Kubernetes is that it all but kills off dreaded infrastructure lock-in. It gives you the freedom to build across different environments, and inside a more heterogeneous software infrastructure. From a tooling and skill set perspective that’s a massive win. Although conversations about flexibility and agility have been ongoing in the tech industry for years, with Kubernetes we are finally getting to a place where that’s a reality. This isn’t to say it’s all plain sailing - Kubernetes’ complexity is a point of complaint for many, with many people suggesting that compared to, say, Docker, the developer experience leaves a lot to be desired. But insofar as DevOps and cloud-native have almost become the norm for many engineering teams, Kubernetes casts a huge shadow. Indeed, even if it’s not the right option for you right now, it’s hard to escape the fact that understanding it, and being open to using it in the future, is crucial. Find an extensive range of Kubernetes content in our new cloud bundles.  Serverless and NoOps This year serverless has really come into its own. Although it was certainly gaining traction in 2018, the last 12 months have demonstrated its value as more and more teams have been opting to forgo servers completely. There have been a few arguments about whether serverless is going to kill off containers. It’s not hard to see where this comes from, but in reality there’s no chance that this is going to happen. The way to think of serverless is to see it as an additional option that can be used when speed and agility are particularly important. For large-scale application development and deployment, containers running on ‘traditional’ cloud servers will be the dominant architectural approach. The companion trend to serverless is NoOps. Given the level of automation and abstraction that serverless can give you, the need to configure environments to ensure code runs properly all but disappears - code runs through ‘functions’ that get fired when needed. So, the thinking goes, the need for operations becomes very small indeed. But before anyone starts worrying about their jobs, the death of operations is greatly exaggerated. As noted above, serverless is just one option - it’s not redefining the architectural landscape. It might mean that the way we understand ‘ops’ evolves (just as ‘dev’ has), but it certainly won’t kill it off. Discover and search serverless eBooks and videos on the Packt store. Chaos engineering In the introduction I mentioned that one of the strange quandaries of our contemporary distributed software world is that we’ve essentially made things more unreliable at a time when software systems are being used in ever more critical applications. From healthcare to self-driving cars, we’re entering a world where unreliability is both more common and potentially more damaging. This is where chaos engineering comes in. Although it first appeared on ThoughtWorks Radar back in November 2017 and hasn’t yet moved out of its ‘Trial’ quadrant, in reality chaos engineering has been manifesting itself in a whole host of ways in 2019. Indeed, it’s possible that the term itself is misleading. While it suggests a wholesale methodology, in truth, there are different ways in which the core principles behind it - essentially stress-testing your software in order to manage unpredictability and improve resilience - are being used in different ways for both testing and security purposes. Tools like Gremlin have done a lot to help promote chaos engineering and make it more accessible to organizations that maybe wouldn't see themselves as having the resources to perform cutting-edge approaches. It appears the ground-work has been done, which means it will be interesting to see how it evolves in 2020. Observability: service meshes and tracing One of the biggest challenges when dealing with complex software systems - and one of the reasons why they are necessarily unreliable - is because it can be difficult (sometimes impossible) to get an understanding of what’s actually going on. This is why the debate around observability and monitoring has moved on. It’s no longer enough to have a set of discrete logs and metrics. Chances are that they won’t capture the subtleties of what’s happening, or won’t be able to provide you with context that helps you to actually understand where errors are coming from. What’s more, a lack of observability and the wrong monitoring set up can cause all sorts of issues inside a team. At a time when the role of the on call developer has never been more discussed and, indeed, important, ensuring there’s a level of transparency is the only way to guarantee that all developers are able to support each other and solve problems as they emerge. From this perspective, then, observability has a cultural impact as much as it does a technical one. Learn distributed tracing with Yuri Shkuro from Uber's observability engineering team: find Mastering Distributed Tracing on the Packt store.         Not sure what to learn for 2020? Start exploring thousands of tech eBooks and videos on the Packt store.
Read more
  • 0
  • 0
  • 3466

article-image-was-2019-the-year-the-world-caught-the-kubernetes-fever
Guest Contributor
17 Dec 2019
8 min read
Save for later

Was 2019 the year the world caught the Kubernetes fever?

Guest Contributor
17 Dec 2019
8 min read
In the current IT landscape, phrases such as “containerized applications” and “container deployment” are thrown around so often, that the meanings and connotations behind them often get tampered, and ultimately forgotten. In the case of Kubernetes, however, the opposite seems to be coming true. Although it might seem hyperbolic to refer to the modern interaction with software management as being heavily influenced by the “Age of Kubernetes”-  the accelerating growth of Kubernetes as one of the most widely adopted open-source project, with over 2300 active contributors to Kubernetes’s repository on GitHub bears witness to the massive influence that the orchestration platform has had. Originally developed by Google, and launched in 2014- Kubernetes has come a really long way since it’s advent. Although there are other similar container orchestration platforms available on the market, the most notable ones being Docker Swarm and Apache Mesos; Kubernetes has established itself as the de-facto orchestration platform in use today. Having said that, as a quick Google search might reveal- with a whopping 26,400,000 results- Kubernetes has risen to the top of the totem pole over the course of the year. However, before we can get into rationalizing the reasons that drive the world’s obsession with the container orchestration platform, we’d like to provide our readers with a quick snapshot of everything Kubernetes is and everything that it is not. Kubernetes: A Brief Overview The transition from the traditional deployment era, where organizations used to rely on applications being run on physical servers to the virtual deployment era, in which the highly popular concept of virtualization was introduced- to the container deployment era, which saw the employment of  ‘containers’ that are significantly lighter in weight, as compared to virtual machines (VMs)- these changes ultimately led to the creation of a container orchestration market, which is a huge contributing factor to the growing popularity of Kubernetes and other similar platforms. Having said that, however, as we’ve already mentioned above- the features that Kubernetes offers to organizations enable it to have a certain edge over its competition. Originally developed by Google in 2014, having descended from an old-school container orchestration platform called ‘Borg,’ Kubernetes is an open-source container orchestration platform that reduces the workload for both large and small companies, by automating the deployment, scaling and management of containerized applications. Bearing witness to the effectiveness and reliability of the container orchestration application is the fact that it is imbursed by gigantic digital entities such as Google, Microsoft, Cisco, Intel, and Red Hat. Furthermore, on their website, Kubernetes cites several testimonials from colossal corporations such as Spotify, Nav, Capital One, Comcast- which further goes on to demonstrate the reliability of the benefits offered by the container orchestration platform. What functions does Kubernetes perform? Taking into consideration the fact that most organizations, regardless of how large or small they might be, are deploying hundreds and thousands of containerized instances daily- the complexity of the situation requires platforms such as Kubernetes to step in and help organizations manage and automate containerized processes while taking into account the context of the microservice architecture as well. Kubernetes aids development teams by deploying applications and helping in the management of the containerized applications by performing the following functions: Deployment: Perhaps the most significant function that Kubernetes performs includes the deployment of a specified number of containers to a host, along with ensuring that the containers are functioning as they are supposed to, that is, without any malfunctions, etc. Rollouts: A rollout refers to a change in the original deployment of a container. Kubernetes allows development teams to take the management of their containerized tasks to the next level, by automating the initiation of the container deployment, along with offering them the option of pausing, resuming or rolling back any rollouts. Discovery of service: Kubernetes automates the exposure of a specified container to the internet, or to other containers, by allotting to containers a DNS name or an IP address. Since the increasing threats and risks of cyber-attacks, it has become essential to protect your IP address. To do so use a VPN as it not only hides the IP address but also provides protection against IP spoofing. Managing storage: A monumental advantage that Kubernetes offers organization is the liberty to allocate persistent local or cloud storage to specified containers as needed. Load scaling and balancing: Kubernetes allows for organizations to maintain stability across the network by automatically load balancing and scaling in the instance that traffic to a certain container increases. Self-healing: A feature unique to Kubernetes, the widely popular container orchestration platform seeks to improve the availability on the network through restarting or replacing a failed container. Moreover, Kubernetes can also automate the removal of containers that appear to be damaged, or fail to meet the health-check requirements. Are there any limitations to Kubernetes’s power? Up till now, we’ve done nothing but present facts regarding Kubernetes. Often times, however, organizations tend to overlook the limitations of an effective management tool. Despite the numerous advantages that organizations get to reap with the integration of Kubernetes, the fact that Kubernetes is not a traditional software and functions on a container level, rather than at the hardware-level should always be kept in mind. In order to make the most effective use of the container orchestration platform, it is essential that companies take into account the limitations of Kubernetes- which consist of the following: Kubernetes does not build applications, neither does it deploy source code. Kubernetes is not responsible for providing organizations with services centric to applications. Examples of these application-level services include middleware (message buses) and other data-processing frameworks such as Spark, caches, amongst many others. Kubernetes does not offer to organizations logging, monitoring, and alerting solutions, instead it provides integrations and mechanisms which then enable organizations to collect and export metrics. In addition to these limitations, it should also be mentioned that despite the constant referral of Kubernetes as an orchestration tool- it is not just that. Instead of simply orchestrating or managing the containerized applications by propagating a defined workflow, Kubernetes eliminates the need for orchestration altogether and consists of components that constantly drive the current state of the network into providing the desired result to the organization. Furthermore, Kubernetes also gives rise to a system without any centralized control, which makes it much more easier to use. Explaining Kubernetes’s popularity Now that we’ve hopefully jogged up our reader’s memories by providing them with a rundown of everything Kubernetes- let’s get down to business. Taking into consideration the ever-increasing growth and popularity of the container orchestration platform, particularly it’s a spike in 2019- readers might be left wondering with the question; “Why is Kubernetes so popular?” Well, the short explanation behind Kubernetes’s popularity is simple- it’s highly effective. The longer explanation, on the other hand, however, can be broken down into the following main reasons: Kubernetes saves time: In the digital age, time is more crucial than ever. As more and more organizations get digitized, time plays a monumental role in routine operations, especially where development teams are concerned. The staggering popularity of Kubernetes is deeply rooted in how time-effective, a platform is since it allows organizations to effectively handle all facets of container orchestration without having to fill out forms or send emails to request new machines to run applications. 2. Kubernetes is highly cost-effective: For most enterprises, the driving force behind their operations is the knowledge that their business goal is being fulfilled. Kubernetes can actually contribute to that since it allows for organizations to partake in better resource utilization. As we’ve already mentioned above, Kubernetes is a much more improved alternative to VMs, since it focuses solely on containers, which are light-weight, and thus require less CPU and memory resources. 3. Kubernetes can run on the cloud, as well as on-premise: An unprecedented, but widely welcomed feature that Kubernetes offers is that it is cloud-agnostic. The term ‘cloud-agnostic’ implies that Kubernetes can run on cloud-based services, as well as on-premise. This offers organizations with the luxury of not having to redesign or alter their infrastructure or applications to accommodate Kubernetes. Additionally, companies are also providing software that helps organizations manage the running of Kubernetes, whether it is on a cloud-based server or on-premise. Final Words We hope that we’ve made it clear what Kubernetes does, and the reasons that led to its rise in popularity. Having said that, however, it is still equally important that organizations take into consideration the limitations of the container orchestration system, and integrate it within their companies smartly- which ultimately enables organizations to leverage better benefits! Author Bio Rebecca James is an enthusiastic cybersecurity journalist. A creative team leader, editor of PrivacyCrypts. DevOps mistakes which developers should avoid! Chaos engineering comes to Kubernetes thanks to Gremlin Understanding the role AIOps plays in the present-day IT environment
Read more
  • 0
  • 0
  • 3219

article-image-devops-mistakes-which-developers-should-avoid
Guest Contributor
16 Dec 2019
9 min read
Save for later

DevOps mistakes which developers should avoid!

Guest Contributor
16 Dec 2019
9 min read
DevOps is becoming recognized as a vital pillar of digital transformation. Because of this, CIOs are becoming enthusiastic regarding how DevOps and open source can completely transform the enterprise culture. All organizations want to succeed and reach their development goals across all projects. However, in reality, the entire journey is not at all easy as it seems, and it often requires collective efforts and time. In this entire journey, there are some common failures which teams are likely to come across. In this post, we’ll discuss DevOps mistakes which everyone should know and must avoid. Before that, it is necessary to understand the importance of DevOps in today’s world. Importance of DevOps in today’s world DevOps often describes a culture and a set of processes which brings development and operations together to complete software development. It enables organizations not just to create but also improve products at a faster pace than they can with some approaches to software development. DevOps adoption rate is increasing by each passing day. According to Statista many business organizations are shifting towards the DevOps culture and there is an increase of 17% in 2018 from the previous year.  DevOps culture is instrumental in today’s world. The following points briefly highlight the need for DevOps in this era. It reduces costs and other IT stuff. Results in greater competencies. It provides better communication and cooperation opportunities. The development cycle is fast and innovative. Deployment failures are reduced to a great extent. Eight DevOps Mistakes Many people still don’t fully understand what DevOps means. Without prior knowledge and understanding, many DevOps initiatives fail to get off the ground successfully. Following is a brief description of DevOps’ mistakes, and how they can be avoided to start a successful DevOps journey. 1. Rigid DevOps Process Compliance with core DevOps tenets is vital for DevOps success; organizations have to make adjustments in active response to meet organization demands. Enterprises have to make sure that while the main DevOps pillars remain stable while implementing DevOps; they make the internal adjustments needed in internal benchmarking of the expected consequences. Instrumenting codebases in a gritty manner and making them more and more partitioned results in more flexibility and provide DevOps team the ultimate power to backtrack and recognize the root cause of diversion in the event of failed outcomes. But, all adjustments have to be made while staying within the boundaries defined by DevOps. 2. Oversimplification of the process Indeed DevOps is a complex process. To implement DevOps, enterprises often go on a DevOps Engineer hiring spree or at times, create a new and isolated one. The DevOps department is then responsible for managing the DevOps framework and strategy, and it needlessly adds new processes that are often lengthy and complicated. Instead of creating an isolated DevOps department, organizations should focus on optimizing their processes to make operational products that leverage the right set of resources. For successful implementation of DevOps, organizations must be capable enough to manage the DevOps framework, leverage functional experts, and other resources that can manage DevOps related tasks like budgeting goals, resource management, and process tracking. DevOps requires a cultural overhaul. Organizations must consider a phased and measured transition to DevOps implementation by educating and training employees on these new processes. Also, they should have the right frameworks to enable careful collaboration. 3. Not preparing for a cultural change When you have the right tools for DevOps practices, you likely might come across a new challenge. The challenge will be trying to make your teams use the tools for fast development, continuous delivery, automated testing, and monitoring. Is your DevOps culture ready for all such things? For example, agile methodologies usually mandate that you ship new code once a week, or once a day. It results in the failure of agile methods. You might also face the same conceptual issues with DevOps. It can be like pulling on a smooth road with a car with no fuel in it. To prevent this situation, plan for a transition period. Leave enough time for the development and operational team to get used to new practices. Also, make sure that they have a chance to gain experience with the new processes and tools. Ensure that before adopting DevOps, you've got a matured Dev and Ops culture. 4. Creating a single DevOps team The most common mistake which most organizations and enterprises make is to create a brand-new team and task them with addressing all the burdens of a DevOps initiative. It is challenging and complicated for both development and operations to deal with a new group that coordinates with everyone. DevOps started with the idea of enhancing collaborations between teams involved in the development of software like security, DBMS, and QA. However, it is not only about development and operations. If you create a new side to address DevOps, you’re making things more complicated. The secret ingredient here is simplicity. Focus on culture by encouraging a mindset of automation, quality, and stability. For instance, you might involve everyone in a conversation regarding your architecture, or about some problems found in production environments in which all the relevant players need to be well aware of how their work influences others. “DevOps is not about a single dedicated team but about organizations that progress together as a DevOps team.” 5. Not including the security team DevOps is about more than merely putting the development and operations teams together. It is a continuous process of automation and software development, including audit, compliance, and security. Many organizations make the mistake of not following their security practices in advance. According to a CA Technologies survey, security concerns were the number-one obstacle to DevOps as cited by 38% of the respondents. Similarly, the Puppet survey found that high-performing DevOps teams spend 50% less time remediating security issues than low performers. These high performing teams found different ways to communicate their security objectives and to establish security in the early phases of their development process. All DevOps practitioners should evaluate the controls, recognize the risks, and understand the processes. In the end, security is always an integral part of DevOps practices, such as DevSecOps (a practice in which development and operations is integrated with security). For example, if you have some security issues in production, you can address them within your DevOps pipeline through the tools which the security team already uses. DevOps and security practices should be followed strictly, and there should be no compromises. Moreover, other measures should be adopted to avoid cyber-criminals invading the DevOps culture. Invest in cybersecurity markets has become a necessity to avoid situations where attacker can carry out attacks like that of spear phishing and phishing. It is found that out of all attacks on various organizations, 95% of them were a result of spear phishing. 6. Incorrect use of incident management DevOps teams must have a robust incident management process in place. The incident management needs to be utterly proactive and an ongoing process. It means that having a documented incident management process is imperative to define the incident responses. For example, a total downtime event will have a different response workflow in comparison to a minor latency problem. The failure to not do so can often lead to missed timelines and preventable projects delay. 7. Not utilizing purposeful automation DevOps needs organizations to adopt and implement purposeful automation. For DevOps, it is essential to take automation across the complete development lifecycle. It includes continuous delivery, continuous integration, and deployment for velocity and quality outcomes. Purposeful end-to-end automation is a crucial successful DevOps implementation. Therefore, organizations should look at the complete automation of the CI and CD pipeline. However, at the same time, organizations need to identify various opportunities for automation across functions and processes. This helps to reduce the need for manual handoffs for complicated integrations that need new management in multiple format deployments. Editor’s Note: Do you use or plan to use Azure for DevOps? If you want to know all about Azure DevOps services, we recommend our latest cookbook, ‘Azure DevOps Server 2019 Cookbook - Second Edition’ written by Tarun Arora and Utkarsh Shigihalli. The recipes in this book will help you achieve skills you need to break down the invisible silos between your software development teams and transform them into a modern cross-functional software development team. 8. Wrong-way to measure project success DevOps promises for faster delivery. But, if that acceleration comes at the cost of quality, then the DevOps program is a failure. Enterprises looking at deploying DevOps should use the right metrics to understand project growth and success. For this reason, it is imperative to consider metrics that align velocity with success. Do focus on the right parameters as it is essential to drive intelligent automation decisions. Conclusion Now organizations are rapidly running towards DevOps to stand with competition and become successful but they often make big mistakes. There are mistakes that people commit while implementing a DevOps culture. However, all these mistakes are avoidable, and hopefully, the points mentioned above have successfully cleared your vision to a great extent. After you overcome all the mistakes and adopt DevOps practices, your organization will surely enjoy improved client satisfaction, and employee morale increased productivity, and agility- all of which helps in growing your business. If you plan to accelerate deployment of high-quality software by automating build and releases using CI/CD pipelines in Azure, we suggest you to check out Azure DevOps Server 2019 Cookbook - Second Edition, which will help you create and release extensions to the Azure DevOps marketplace and reach the million-strong developer ecosystem for feedback. Author Bio Rebecca James is an enthusiastic cybersecurity journalist. A creative team leader, editor of PrivacyCrypts. Abel Wang explains the relationship between DevOps and Cloud-Native Can DevOps promote empathy in software engineering? 7 crucial DevOps metrics that you need to track
Read more
  • 0
  • 0
  • 4391

article-image-abel-wang-explains-the-relationship-between-devops-and-cloud-native
Savia Lobo
12 Dec 2019
5 min read
Save for later

Abel Wang explains the relationship between DevOps and Cloud-Native

Savia Lobo
12 Dec 2019
5 min read
Cloud-native is microservices containers and serverless apps that run in multi-cloud environments and are managed by DevOps processes. However, the relationship between these is not always clearly defined. Shayne Boyer, Principal Cloud Advocate, in a conversation with Abel Wang, Principal cloud Advocate, and DevOps lead discussed the relationship between DevOps and Cloud-Native on the Microsoft developer channel. Do you wish to further learn how to implement DevOps using Azure DevOps, and also want to learn the entire serverless stack available in Azure including Azure Event Grid, Azure Functions, and Azure Logic Apps, you should check out the book Azure for Architects - Second Edition to know more. What is DevOps? Abel starts off by saying, DevOps at Microsoft is the union of people, processes, and products to enable the continuous delivery of value to our end-users. The reason one should care about DevOps when it comes to cloud native apps is that the key here is continuously delivering value. One of the powers of Cloud Native is because all of your infrastructures is out in the cloud, you're able to iterate it very quickly. This is what DevOps helps you do as well, continuously deliver value to your end-users. These days the speed of business is so quick that if we can't iterate quickly and give value quickly, our competitors will and once they do it the rest will be obsolete. Hence, it is extremely important to iterate quickly which Cloud-Native helps enable. The concept of continuously delivering value remains similar to the concept we carry out on our local machine during a standard deployment. Where Cloud-Native become completely unique is, All your infrastructures are out in the cloud. Hence, deploying to the cloud is easier to do than deploying on to like a mobile app. One of the most powerful things about cloud-native is that it is a microservice-based architecture. With these advantages, we're able to iterate quickly because instead of deploying this massive model if we make one tiny little change, we can just deploy that one service. This will simplify and speed up the process. With this every developer check-in can go through our gates, can go through our pipeline, reach production at a quicker rate, and so we're able to give value even faster and better. Key DevOps and Cloud-Native Apps concepts Wang says to aim for a CI/CD pipeline that can process code as soon as somebody adds it in. The pipeline should further make it easy to build and then finally deploy it to the infrastructure present. Wang demonstrates a cloud-native application with a slightly complicated infrastructure. The application consists of a static website that is held in Azure storage, it has a back-end written in .Net Core which is held in an Azure function and they both connect up to a Cosmos DB. If microservices are deployed independently, the services need to be smart enough to realize what version the other services are on so that the entire application is not disturbed if additional service is uploaded. He further demonstrates an instance for deploying the entire infrastructure all at once. You can check out the video to know more about the demonstration in detail. How to ensure quality across environments in a DevOps practice? In a cloud-native application, we need not worry about deploying similar infrastructures while moving from one environment to the next. This is because the dev environment will be exactly the same as the QA environment and further the same throughout all the way out into production. This will be cost-effective because we can just spin it up to run whatever we need to and as soon as it's done, tear it down so we're not paying for anything. Automating the Cloud Native processes include a few manual steps such as approving an email. Wang says one could technically automate everything, however, he prefers having manual approvers. Within Azure DevOps, there is a concept of an automated approval gate as well. So one can use automation to help decide if they should postpone approval or not. Wang says he uses an automated approval gate to conduct DNS checks that can inform him whether or not the DNS has propagated. Wang says, trying to keep quality in your pipelines is really difficult to do. You can do things like run all of the automated UI tests for a particular environment. “so by the time let's say I deployed this into a QA environment by the time my QA testers even look at it it could have run through like hundreds of thousands of automated UI tests already. So there's a lot less that a human needs to do,” Wang adds. To learn comprehensively how to develop Azure cloud architecture and a pipeline management system and also to know about some security best practices for your Azure deployment, you can check out the book, Azure for Architects - Second Edition by Ritesh Modi. Pivotal and Heroku team up to create Cloud Native Buildpacks for Kubernetes Can DevOps promote empathy in software engineering? Is DevOps really that different from Agile? No, says Viktor Farcic [Podcast]
Read more
  • 0
  • 0
  • 2854
article-image-chaos-engineering-comes-to-kubernetes-thanks-to-gremlin
Richard Gall
18 Nov 2019
2 min read
Save for later

Chaos engineering comes to Kubernetes thanks to Gremlin

Richard Gall
18 Nov 2019
2 min read
Kubernetes causes problems. Just last week Cindy Sridharan wrote on Twitter that while Docker "succeeded... because it was a great developer tool," Kubernetes "decided to be all things tech and not much by way of UX. It was and remains a hostile piece of software to learn, run, operate, maintain." https://twitter.com/copyconstruct/status/1194701905248673792?s=20 That's just a drop in the ocean - you don't have to look hard to find more hot takes, jokes, and memes about how complicated working with Kubernetes can feel. Despite all this, it's certainly here to stay. That makes chaos engineering platform Gremlin's announcement that the platform will offer native support for Kubernetes particularly welcome. Citing container orchestration research done by Datadog in the press release, which indicates the rapid rate of Kubernetes adoption, Gremlin is hoping that it can provide some additional support for users that might be concerned about the platforms complexity. From last year: Gremlin makes chaos engineering with Docker easier with new container discovery feature Gremlin CTO Matt Fornaciari said "our goal is to provide SRE and DevOps teams that are building and deploying modern applications with the tools and processes necessary to understand how their systems handle failure, before that failure has the chance to impact customers and business." The new feature is designed to help engineers do exactly that by allowing them "to automate the process of identifying Kubernetes primitives such as nodes and pods," and to select and attack traffic from different services within Kubernetes. The other important element to all this is that Gremlin wants to make things as straightforward as possible for engineering teams. With a neat and easy to use UI, it would seem that, to return to Sridharan's words, the team are eager to make sure their product is "a great developer tool."   The tool has already been tried and tested in the wild. Simon Govier, Expedia's Director of Program Management described how performing chaos experiments on Kubernetes with Gremlin "significantly reduces the amount of time it takes to do fault injection and increases our systems' resilience to failure." Learn more on the Gremlin website.
Read more
  • 0
  • 0
  • 3639

article-image-can-devops-promote-empathy-in-software-engineering
Richard Gall
14 Oct 2019
8 min read
Save for later

Can DevOps promote empathy in software engineering?

Richard Gall
14 Oct 2019
8 min read
If DevOps is, at a really basic level, about getting different teams to work together, then you could say that DevOps is a discipline that promotes empathy. It’s an interesting idea, and one that’s explored in Viktor Farcic’s book The DevOps Paradox. What’s particularly significant about the notion of empathy existing inside DevOps is that it could help us to focus on exactly what we’re trying to achieve by employing it. In turn, this could help us develop or evolve the way we actually do DevOps - so, instead of worrying about a new toolchain, or new platform purchases, we could, perhaps, just explore ways of getting people to simply understand what their respective needs are. However, in the DevOps Paradox there are a number of different insights on empathy in DevOps and what it means not just for the field itself, but also its place in a modern business. Let’s take a look at some DevOps experts thoughts on empathy and DevOps. "Empathy helps developers put the user at the center of what they do" Jeff Sussna (@jeffsussna) is an IT consultant and coach that helps organizations to design, build and deliver products quickly. "There's a lot of confusion and anxiety about [empathy’s] meaning, and a lot of people tend to misunderstand it. Sometimes people think empathy means wallowing in someone else's pain. In fact, there's actually a philosopher from Yale University who is now putting out the idea that empathy is actually bad, and that it's the cause of all of the world's problems and what we need instead is compassion. "From my perspective, that represents a misunderstanding of both empathy and compassion, but my favorite is when people say things like, "Sociopaths are really good at empathizing". My answer to that is, if you have a sociopath in your organization, you have a much bigger problem, and DevOps isn't going to solve it. At that point, you have an HR problem. What you need to distinguish between is emotional empathy and cognitive empathy, and I use cognitive empathy in the context of DevOps in a very simple way, which is the ability to think about things as if from another's perspective. "If you're a developer and you think, ‘What is the experience of deploying and running my application going to be?’ you're thinking about it from the perspective of the operations person. "If you're an operations person and you're thinking in terms of, ‘What is the experience going to be when you need to spin up a test server in a matter of hours in order to test a hotfix because all of your testing swim lanes are full of other things, and what does that mean for my process of provisioning servers?’ then you're thinking about things from the tester's point of view. "And so, to me, that's empathy, and that's empathizing, which is really at the heart of customer service. It's at the heart of design thinking, and it's at the heart of product development. What is it that our customers are trying to accomplish, what help do they need from us, and how can we help them?" Listen: Is DevOps really that different from Agile? No, says Viktor Farcic [Podcast] "As soon as you have empathy you can understand why you provide value" Damien Duportal (@DamienDuportal) is a Developer Advocate at Traefik, Containous’ cloud native edge router. "If you have a tool that helps you to share empathy, then you have a great foundation for starting the conversation. Even if this seems boring to engineers, at least they'll start talking and listening to each other. I mean, once they've stopped debating sterile tabs versus spaces or JavaScript versus Java—or whatever sterile debate it is—they'll have to focus on the value they're going to provide. So, this is really how I would sum up DevOps, which again is about how you bring empathy back and focus on the value creation and interaction side of IT. "Empathy is one of the most advanced bricks you can have for building human interaction. If we are able to achieve so many different things—with different people, different opinions, and different cultures—it's because we, as humans, are capable of having high levels of empathy. "As soon as you have empathy, you can understand why you provide value. If you don't, then what's the point of trying to create value? It will only be from your point of view, and there are over seven billion other people in the world. So, ultimately, we need empathy to understand what we are going to do with our tools." Read next: DevOps engineering and full-stack development – 2 sides of the same agile coin “Let’s not wait for culture to change: culture is in the rearview mirror” As CSO of PraxisFlow, Kevin Behr spends his time working with clients who seek to develop their DevOps process. His 25 years of experience have been driven by a passion for engaging with the complex problems that large IT organizations face, and how we can use DevOps to solve them. You can follow Kevin on Twitter at @kevinbehr. What do we mean when we talk about empathy in DevOps? We're saying that we understand what it feels like to do what you're doing and that I'll never do that to you again. So, let's build a system together that will allow us to never be there. DevOps to me has evolved into a lot of tools because we're humans, and humans love tools of all kinds. As a species, we've defined ourselves by our tools and technologies. And, as a species, we also talk about culture a lot, but, to my mind, culture is a rearview mirror. Culture is just all the things that we've done: our organizational disposition. The way to change culture is to do things differently. Let's not wait for culture, because culture is in the rearview mirror: it's the past. If you're in a transition, then what are you transitioning toward and what does that mean about how you need to act? The very interesting thing about DevOps is that while frequently, its mission is to create a change in the culture of an organization, this change requires far more than coordination: it also requires pure collaboration, and co-laboring. These can be particularly awkward to achieve given the likelihood that we haven't worked with the people in an organization before. And it can become intensely awkward, when those people may have already made villains out of each other because they couldn't get what they wanted. The goal of the DevOps process is to create a new culture, despite these challenges.... ....When you manage to introduce empathy to a team, the development and the operations people seem finally to come together. You suddenly hear someone in operations say, 'Oh, can we do that differently? When you threw that thing at me last time, it gave me a black eye and I had to stay up for four days straight!' And the developer is like, 'It did? How did it do that? Next time, if something happens, please call me, I want to come help.' That empathy of figuring out what went wrong, and working together, is what builds trust." “The CFO doesn’t give a shit about empathy” Chris Riley (@HoardingInfo) is a self-proclaimed bad coder turned editor of Sweetcode.io at fixate.io, a content marketing firm for those who sell to technical audiences. Through this, he's involved with DevOps, SecOps, big data, machine learning, and blockchain. He's a member of the DevOps Institute Board of Regents, a position he's held for over four years. “...The CFO doesn't give a shit about empathy, and the person with the money may not care about that at all. The HR department might, but that's the problem with selling anything. You have to speak their language, and the CFO is going to respond to money. Either you're saving us money, or you're making us more money, and I think DevOps is doing both, which is cool. I think what's nice about that explanation is the fact it doesn't seem insurmountable. It's kind of like how Pixar was structured. “After Steve Jobs started at Pixar, he structured all of the work environments where the idea was to create chance encounters among the employees, so that the graphic designer of one movie would talk to the application developer of another, even when they don't even have any real reason to interact with each other. The way they did it at Pixar was that, as everybody has to go to the bathroom, they put the bathrooms in a large communal area where these people are going to run into each other—that's what created that empathy. They understand what each other's job is. They're excited about each other's movies. They're excited about what they're working on, and they're aware of that in everything they do. It's a really good explanation.” What do you think? Can DevOps help promote empathy inside engineering teams and across wider businesses? Or is there anything else we should be doing?
Read more
  • 0
  • 0
  • 3685

article-image-puppet-announces-the-public-beta-of-project-nebula
Savia Lobo
10 Oct 2019
3 min read
Save for later

Puppet announces the public beta of Project Nebula

Savia Lobo
10 Oct 2019
3 min read
Today, Puppet announced the public beta of their Project Nebula at the Puppetize PDX, a two-day event (October 9-10) featuring user-focused DevOps and infrastructure delivery talks, and hands-on workshops. Project Nebula is a simplified workflow automation for the continuous deployment of cloud-native applications and infrastructure. It is designed for teams that are adopting cloud-native and serverless technologies and need an end-to-end workflow management system. Also Read: Puppet’s 2019 State of DevOps Report highlight security integration into DevOps practices result into higher business outcome Why Project Nebula? Puppet has worked closely with its private beta participants to understand their deployment workflows and pain points. On interviewing these participants they realized that they want to adopt cloud native technologies; however, they face multiple challenges in adopting containers, serverless infrastructure, microservices, and observability for even simple cloud-native applications. They also said that a major roadblock is a lack of simple automation today to easily compose multiple tools together for infrastructure provisioning, application deployment, and notifications into an end-to-end deployment. Another roadblock highlighted is the lack of a cohesive platform that multiple teams can use to share workflows and best practices and build them into their own deployments. “In-house efforts to build a deployment platform like this can take years, incur large maintenance and support costs, and often require specialized skill sets that many companies do not have today,” the company states. Project Nebula tries to help users in eliminating these roadblocks and gives teams a consistent, easy-to-use experience for deploying cloud-native apps in a safe, secure and continuous manner. Listen: Puppet’s VP of Ecosystem Engineering Nigel Kersten talks about key DevOps challenges [Podcast] Few features in Project Nebula With the focus on ease of use and improved productivity, Project Nebula also provides a single place to build, provision, and deploy cloud-native applications. Other notable features include: Built-in example Workflows: This will help users to get started with their deployments. Don’t start with a blank slate. Support for 20+: This is one of the most popular cloud-native deployment tools as configurable steps within your deployment, including Terraform, CloudFormation, Helm, Kubectl, Kustomize and more. Intuitive visualization: This provides a bird’s eye view of the entire deployment workflow. Easy to compose deployment workflows: One can easily compose deployment workflows that are checked into the source control repository and eliminate the process of writing messy, ad-hoc bash scripts. Know more about Puppet’s Project Nebula in detail, on its official website. Puppet launches Puppet Remediate, a vulnerability remediation solution for IT Ops Puppet announces updates in a bid to help organizations manage their “automation footprint” “This is John. He literally wrote the book on Puppet” – An Interview with John Arundel
Read more
  • 0
  • 0
  • 1998
article-image-chaos-engineering-company-gremlin-launches-scenarios-making-it-easier-to-tackle-downtime-issues
Richard Gall
26 Sep 2019
2 min read
Save for later

Chaos engineering company Gremlin launches Scenarios, making it easier to tackle downtime issues

Richard Gall
26 Sep 2019
2 min read
At the second ChaosConf in San Francisco, Gremlin CEO Kolton Andrus revealed the company's latest step in its war against downtime: 'Scenarios.' Scenarios makes it easy for engineering teams to simulate a common issues that lead to downtime. It's a natural and necessary progression for Gremlin that is seeing even the most forward thinking teams struggling to figure out how to implement chaos engineering in a way that's meaningful to their specific use case. "Since we released Gremlin Free back in February thousands of customers have signed up to get started with chaos engineering," said Andrus. "But many organisations are still struggling to decide which experiments to run in order to avoid downtime and outages." Scenarios, then, is a useful way into chaos engineering for teams that are reticent about taking their first steps. As Andrus notes, it makes it possible to inject failure "with a couple of clicks." What failure scenarios does Scenarios let engineering teams simulate? Scenarios lets Gremlin users simulate common issues that can cause outages. These include: Traffic spikes (think Black Friday site failures) Network failures Region evacuation This provides a great starting point for anyone that wants to stress test their software. Indeed, it's inevitable that these issues will arise at some point so taking advance steps to understand what the consequences could be will minimise their impact - and their likelihood. Why chaos engineering? Over the last couple of years plenty of people have been attempting to answer why chaos engineering? But in truth the reasons are clear: software - indeed, the internet as we know it - is becoming increasingly complex, a mesh of interdependent services and platforms. At the same time, the software being developed today is more critical than ever. For eCommerce sites downtime means money, but for those in IoT and embedded systems world (like self-driving cars, for example), it's sometimes a matter of life and death. This makes Gremlin's Scenarios an incredibly exciting an important prospect - it should end the speculation and debate about whether we should be doing chaos engineering, and instead help the world to simply start doing it. At ChaosConf Andrus said that Gremlin's mission is to build a more reliable internet. We should all hope they can deliver.
Read more
  • 0
  • 0
  • 3656

article-image-the-accelerate-state-of-devops-2019-report-key-findings-scaling-strategies-and-proposed-performance-productivity-models
Vincy Davis
28 Aug 2019
8 min read
Save for later

The Accelerate State of DevOps 2019 Report: Key findings, scaling strategies and proposed performance &amp; productivity models

Vincy Davis
28 Aug 2019
8 min read
The State of DevOps report is a widely referenced body of DevOps research which helps organizations achieve high organizational performance and productivity. The DORA (DevOps Research and Assessment) team have published the 2019 Accelerate State of DevOps Report which is an independent review of the practices and capabilities of DevOps. Teams across the globe can take advantage of the survey findings and identify specific capabilities to improve their software delivery performance. Key findings in DevOps 2019 report Increase in Elite performers The DevOps 2019 report uses a cluster analysis method to identify the software delivery performance of the participants involved in the survey. All the respondents are categorized into four distinct groups called as Elite, High, Medium and Low Performers. The report states, “In this approach, those in one group are statistically similar to each other and dissimilar from those in other groups, based on our performance behaviors of throughput and stability: deployment frequency, lead time, time to restore service, and change fail rate.” According to the survey, the proportion of elite performers have jumped to 20% from the 7% last year. The percentage of medium performers have also increased, leading to a drop in the percentage of low performers. Hence, the cluster analysis method depicts a continued shift in the industry, as organizations continue to transform their technology. Image Source: 2019 Accelerate State of DevOps Report Read Also: Does it make sense to talk about DevOps engineers or DevOps tools? Strategies for scaling DevOps in organizations The Accelerate State of DevOps 2019 report have framed out several strategies for scaling DevOps in an organization. The strategies are created based on commonly used approaches observed by the DORA team across the industry. Training Center (DOJO): Employees are taken out of their usual work routines to learn new tools or technologies. Next, they are required to implement the new learnt methods in their work and also inspire others to do so. Center of Excellence: This strategy uses all the available skills for consultation purpose. Proof of Concept but Stall: A central team is given the freedom to build in any possible way (often by breaking organizational norms). However, the effort stalls after the PoC. Proof of Concept as a Template: It begins with the PoC but Stall project which is then replicated in other groups using the same pattern. Proof of Concept as a Seed: This approach ensures that the PoC members stay active in the new groups indefinitely or just as long to ensure that the new practices are sustainable. Communities of Practice: Groups sharing the same common interests in tooling, language, or methodologies are encouraged within an organization to share knowledge and expertise with each other and across teams. Big Bang: When an entire organization is transformed as per DevOps methodologies at once. Bottom-up or Grassroots: Small teams are put together to transform resources and share their success throughout the organization in an informal way. Mashup: When an organization implements several of the above described approaches with partial execution or with insufficient resources. The distribution of DevOps transformation strategies as per performance profiles are shown below. Image Source: 2019 Accelerate State of DevOps Report Cloud computing usage is the driving force behind elite performers The DORA team uses the five essential characteristics of cloud computing, as defined by the National Institute of Standards and Technology (NIST) to understand cloud usage patterns by the categorized performers. The essential characteristics of cloud computing are given below. On-demand self-service: DevOps users can use the cloud on an on-demand self-service premise to use the computing resources whenever needed, without any human intervention. This is the least used characteristic by users with only 57% of the survey respondents agreeing  on using it. Broad network access: Such networks can be used through heterogeneous platforms such as mobile phones, tablets, laptops, and workstations. 60% of the survey respondents agreed on using the broad network access. Resource pooling: The provider resources in a cloud are pooled in a multi-tenant model such that the physical and virtual resources are dynamically assigned on-demand. This allows the  customer to specify a location at a higher level of abstraction such as country, state, or datacenter. Among all the survey respondents, 58% of them agreed on using the resource pooling characteristics. Rapid elasticity: The cloud capabilities are elastically provisioned to the extent that it can be released rapidly to scale outward or inward on demand. Hence, the cloud capabilities seem to be unlimited and can be appropriated at any point of time in any quantity. When compared to 2018, this characteristic saw a growth of +15% this year, with 58% of the survey respondents utilizing it. Measured service: This is the most used characteristic of cloud computing as 62% of the respondents agreed on using the cloud measured service. It allows the cloud systems to be automatically used to control, optimize, and report resource usage based on the type of services like storage, processing, bandwidth, and active user accounts. The DevOps 2019 survey found that only 29% of all the respondents agreed on utilizing all the essential cloud computing characteristics. The survey also revealed that the elite performers in the survey used the cloud characteristics 24% times more than the low performers. Importance of psychological safety to SDO performance The DevOps survey found that a culture of psychological safety where team members feel safe to take risks and be vulnerable in front of each other are able to better deliver software delivery performance, organizational performance, and high productivity. This enables an organization to come up with new products and features without impacting the existing users, hence leading to desirable levels of software delivery and operational performance (SDO performance). The report concludes that due to psychological safety elite performers can twice achieve or exceed their organizational performance goals. Size of an industry does not correspond to its success This year, the retail industry saw significantly better SDO performance, in terms of speed and stability of software delivery. However, the DevOps report states that, “We found no evidence that industry has an impact with the exception of retail, suggesting that organizations of all types and sizes, including highly regulated industries such as financial services and government, can achieve high levels of performance.” The report suggests that size is not a factor for poor performance of other industries and high levels of performance can be achieved by adopting DevOps practices. Read Also: Listen: Puppet’s VP of Ecosystem Engineering Nigel Kersten talks about key DevOps challenges [Podcast] Proposed steps to improve performance and productivity in DevOps The Accelerate State of DevOps Report of 2019 proposes two research models to improve DevOps performance and productivity. Performance model The DevOps report states, “A key goal in digital transformation is optimizing software delivery performance.” To improve software delivery and operational (SDO) performance and organizational performance, teams can first start with basic automation such as version control and automated testing, monitoring, clear change approval processes, and a healthy culture. The DevOps 2019 survey finds that low performers use more proprietary software than high and elite performers. Image Source: 2019 Accelerate State of DevOps Report Productivity model The report defines productivity as “the ability to get complex, time-consuming tasks completed with minimal distractions and interruptions.” Teams can use the useful and easy-to-use tools along with internal and external search to accelerate productivity. The DevOps survey reveals that the highest performing engineers are 1.5 times more likely to use the easy-to-use tools. An improved amount of productivity also helps employees have a better work/life balance and less burnout. Image Source: 2019 Accelerate State of DevOps Report DevOps teams can use the above two models to locate their goal, identify the dependent factors and increase their overall performance and productivity output. Demographic makeup of the DevOps 2019 survey This year, the DORA survey had almost 1,000² participants. Among all the responses, 26% responses came from employees working in very large companies (10,000+). Compared to last year, there has been a drop in response from employees working in companies with 500-1,999 employees. In contrast, more responses have been received from people working in company sizes of 100-499 employees. The majority of participants worked in the Development or Engineering department. The DevOps or Site Reliability Engineering (SRE) department came next, followed by Manager, IT Operations or Infrastructure, and others. Half of the participants in the 2019 research are from North America, followed by EU/ UK at 29%. This year also saw a fall in responses from Asia, which is only 9% when compared to the 18% last year. The percentage of women on teams have also reduced to 16% (median) from the 25% reported last year. Image Source: 2019 Accelerate State of DevOps Report Developers across the world love the Accelerate State of DevOps 2019 report and are thanking the DORA team for major takeaways like cloud being a key differentiator, how to be a elite performer in software delivery performance, and more. https://twitter.com/kniklas/status/1165681512538398720 https://twitter.com/nickj69/status/1164707063907225600 https://twitter.com/DawieO/status/1164703456675819522 https://twitter.com/kylekyle/status/1164692941559877632 https://twitter.com/tottiLFC/status/1164800298885402624 https://twitter.com/mirko_novakovic/status/1164606178615341060 Here’s a two minute summary video of the Accelerate State of DevOps 2019 report, published by Google Cloud. https://www.youtube.com/watch?v=8M3WibXvC84 Interested readers can check out the report to see a detailed comparison of tool usage, according to low, medium, high, and elite profiles. Also, you can read the full 2019 Accelerate State of DevOps Report for more information. 5 reasons poor communication can sink DevSecOps 7 crucial DevOps metrics that you need to track Introducing kdevops, a modern DevOps framework for Linux kernel development
Read more
  • 0
  • 0
  • 4186