Chapter 10. Caching, Proxies and Improved Performance
We have covered a great deal of what you'll need to build a web application: connecting to data sources, rendering templates, utilizing SSL/TLS, building APIs for single-page applications, and so on.
While the fundamentals are clear, you may find that putting an application built on these guidelines into production would lead to some quick problems, particularly under heavy load.
We've implemented some of the best security practices in the last chapter by addressing some of the most common security issues in web applications. Let's do the same here in this chapter, by applying the best practices against some of the biggest issues of performance and speed.
To do this, we'll look at some of the most common bottlenecks in the pipeline and see how we can reduce these to make our application as performant as possible in production.
Specifically, we'll be identifying those bottlenecks and then looking at reverse proxies and load balancing, implementing caching in our application, utilizing SPDY and HTTP/2, and seeing how managed cloud services can augment our speed initiatives by reducing the number of requests that reach our application.
By this chapter's end, we hope to produce tools that can help any Go application squeeze every bit of performance out of our environment.
In this chapter, we will cover the following topics:
- Identifying bottlenecks
- Implementing reverse proxies
- Implementing caching strategies
- Implementing HTTP/2
Identifying bottlenecks
To simplify things a little, there are two types of bottlenecks for your application: those caused by development and programming deficiencies, and those inherent to an underlying software or infrastructure limitation.
The answer to the former is simple: identify the poor design and fix it. Putting patches around bad code can hide security vulnerabilities or delay even bigger performance issues from being discovered in a timely manner.
Sometimes these issues are born from a lack of stress testing; code that is performant locally is not guaranteed to scale, and you will not know until you apply artificial load. A lack of this testing sometimes leads to surprise downtime in production.
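As a rough illustration of applying artificial load, the following sketch fires concurrent requests at a throwaway local server and counts the successful responses. The names hammer and runDemo, the worker counts, and the test server standing in for the real application are all assumptions made for this example; a dedicated load-testing tool would normally be used for anything serious.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"sync"
	"sync/atomic"
	"time"
)

// hammer sends workers*perWorker GET requests to url concurrently and
// returns how many came back with a 200 status, plus the elapsed time.
func hammer(url string, workers, perWorker int) (int64, time.Duration) {
	var ok int64
	var wg sync.WaitGroup
	start := time.Now()
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < perWorker; j++ {
				resp, err := http.Get(url)
				if err != nil {
					continue
				}
				// Drain and close the body so the connection can be reused.
				io.Copy(io.Discard, resp.Body)
				resp.Body.Close()
				if resp.StatusCode == http.StatusOK {
					atomic.AddInt64(&ok, 1)
				}
			}
		}()
	}
	wg.Wait()
	return ok, time.Since(start)
}

// runDemo starts a local test server (a stand-in for the real application)
// and applies 10 workers x 20 requests of load to it.
func runDemo() int64 {
	ts := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "ok")
	}))
	defer ts.Close()
	ok, _ := hammer(ts.URL, 10, 20)
	return ok
}

func main() {
	fmt.Println("successful requests:", runDemo())
}
```

Pointing hammer at a real endpoint and raising the worker count gives a first impression of where response times begin to degrade.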
However, ignoring bad code as a source of issues, let's take a look at some of the other frequent offenders:
- Disk I/O
- Database access
- High memory/CPU usage
- Lack of concurrency support
There are of course hundreds of offenders, such as network issues, garbage collection overhead in some applications, not compressing payloads/headers, non-database deadlocks, and so on.
High memory and CPU usage is most often the result rather than the cause, but a lot of the other causes are specific to certain languages or environments.
For our application, we could have a weak point at the database layer. Since we're doing no caching, every request will hit the database multiple times. ACID-compliant databases (such as MySQL/PostgreSQL) can struggle under loads that would not be a problem on the same hardware for less strict key/value stores and NoSQL solutions. The cost of database consistency contributes heavily to this, and it's one of the trade-offs of choosing a traditional relational database.
Implementing reverse proxies
As we know by now, unlike a lot of languages, Go comes with a complete and mature web server platform in net/http.
Of late, some other languages have shipped with small toy servers intended for local development, but they are not intended for production. In fact, many specifically warn against it. Some common ones are WEBrick for Ruby, Python's SimpleHTTPServer, and PHP's built-in server (php -S). Most of these suffer from concurrency issues that prevent them from being viable choices in production.
Go's net/http is different; it handles these issues with aplomb out of the box. Obviously, much of this depends on the underlying hardware, but in a pinch you could use it natively with success. Many sites are using net/http to serve non-trivial amounts of traffic.
But even strong underlying web servers have some inherent limitations:
- They lack failover or distributed options
- They have limited caching options upstream
- They cannot easily load balance the incoming traffic
- They cannot easily centralize logging
This is where a reverse proxy comes into play. A reverse proxy accepts all the incoming traffic on behalf of one or more servers and distributes it by applying the preceding (and other) options and benefits. Another example is URL rewriting, which is more applicable for underlying services that may not have built-in routing and URL rewriting.
There are two big advantages to putting a simple reverse proxy in front of your web server, such as one written in Go: caching options and the ability to serve static content without hitting the underlying application.
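To make the idea concrete in Go terms, here is a minimal sketch of a reverse proxy built on the standard library's httputil.NewSingleHostReverseProxy. The helper names newProxy and proxyDemo, the request path, and the two local test servers are assumptions for illustration only; this is not a substitute for a dedicated proxy.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/http/httputil"
	"net/url"
)

// newProxy returns a handler that forwards every request to target.
func newProxy(target string) (http.Handler, error) {
	u, err := url.Parse(target)
	if err != nil {
		return nil, err
	}
	return httputil.NewSingleHostReverseProxy(u), nil
}

// proxyDemo starts a backend and a proxy in front of it, requests a page
// through the proxy, and returns the body that the client received.
func proxyDemo() (string, error) {
	backend := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "hello from backend")
	}))
	defer backend.Close()

	proxy, err := newProxy(backend.URL)
	if err != nil {
		return "", err
	}
	frontend := httptest.NewServer(proxy)
	defer frontend.Close()

	// The client only ever talks to the frontend address.
	resp, err := http.Get(frontend.URL + "/page/hello-world")
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return string(body), err
}

func main() {
	body, err := proxyDemo()
	fmt.Println(body, err)
}
```

The client never learns the backend's address, which is what makes it possible to add caching, logging, or additional backends at the proxy layer later on.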
One of the most popular options for reverse proxying sites is Nginx (pronounced Engine-X). While Nginx is a web server itself, it gained acclaim early on for being lightweight with a focus on concurrency. It quickly became the frontend du jour for front line defense of a web application in front of an otherwise slower or heavier web server, such as Apache. The situation has changed a bit in recent years, as Apache has caught up in terms of concurrency options and utilization of alternative approaches to events and threading. The following is an example of a reverse proxy Nginx configuration:
    server {
        listen 80;
        root /var/;
        index index.html index.htm;
        large_client_header_buffers 4 16k;

        # Make site accessible from http://localhost/
        server_name localhost;

        location / {
            # Forward every request to the Go application
            proxy_pass http://localhost:8080;
            proxy_redirect off;
            # Pass the original host and client address to the application
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        }
    }
With this in place, make sure that your Go app is running on port 8080 and restart Nginx. Requests to http://localhost on port 80 will be served through Nginx as a reverse proxy to your application. You can check this by viewing the response headers, for example in the developer tools in your browser.
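On the application side, a proxied request arrives from the proxy's address, so the original client address has to be recovered from the forwarding headers. The following is a sketch of how a handler might do that; the helper names clientIP and demoClientIP are hypothetical, and the header names assume the proxy_set_header lines of a configuration like the one discussed in this section. These headers should only be trusted when they are set by your own proxy.

```go
package main

import (
	"fmt"
	"net"
	"net/http"
	"net/http/httptest"
	"strings"
)

// clientIP returns the best guess at the original client address for a
// request that may have passed through a reverse proxy.
func clientIP(r *http.Request) string {
	if ip := r.Header.Get("X-Real-IP"); ip != "" {
		return ip
	}
	if fwd := r.Header.Get("X-Forwarded-For"); fwd != "" {
		// The header is a comma-separated list; the first entry is the client.
		return strings.TrimSpace(strings.Split(fwd, ",")[0])
	}
	// No proxy headers: fall back to the address of the direct peer.
	host, _, err := net.SplitHostPort(r.RemoteAddr)
	if err != nil {
		return r.RemoteAddr
	}
	return host
}

// demoClientIP builds a fake request with the given headers and remote
// address, and reports what clientIP makes of it.
func demoClientIP(realIP, forwardedFor, remoteAddr string) string {
	r := httptest.NewRequest("GET", "/", nil)
	r.RemoteAddr = remoteAddr
	if realIP != "" {
		r.Header.Set("X-Real-IP", realIP)
	}
	if forwardedFor != "" {
		r.Header.Set("X-Forwarded-For", forwardedFor)
	}
	return clientIP(r)
}

func main() {
	fmt.Println(demoClientIP("203.0.113.7", "", "127.0.0.1:5000"))
	fmt.Println(demoClientIP("", "198.51.100.2, 10.0.0.1", "127.0.0.1:5000"))
	fmt.Println(demoClientIP("", "", "192.0.2.9:4000"))
}
```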
Remember that we wish to support TLS/SSL whenever possible, but providing a reverse proxy here is just a matter of changing the ports. Our application should run on another port, likely a nearby port for clarity, and then our reverse proxy would run on port 443.
As a reminder, any port is legal for HTTP or HTTPS. However, when a port is not specified, browsers automatically connect to port 443 for secure connections. It's as simple as modifying nginx.conf and our app's constant:
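    server {
        # To have Nginx terminate TLS itself, listen also needs the ssl
        # parameter, along with ssl_certificate and ssl_certificate_key.
        listen 443;

        location / {
            proxy_pass http://localhost:444;
        }
    }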
Let's see how to modify our application, as shown in the following code:
    const (
        DBHost  = "127.0.0.1"
        DBPort  = ":3306"
        DBUser  = "root"
        DBPass  = ""
        DBDbase = "cms"
        PORT    = ":444" // the port that Nginx proxies to
    )
This allows us to pass through SSL requests with a frontend proxy.
Tip
On many Linux distributions, you'll need sudo or root privileges to bind to ports below 1024.
Implementing caching strategies
There are a number of ways to decide when to create and when to expire the cache items, so we'll look at one of the easier and faster methods for doing so. But if you are interested in developing this further, you might consider other caching strategies; some of which can provide efficiencies for resource usage and performance.
Using Least Recently Used
One common tactic to maintain cache stability within allocated resources (disk space, memory) is the Least Recently Used (LRU) system for cache expiration. In this model, the cache management system uses information about the last cache access time (creation or update) to remove the oldest entry in the list.
This has a number of benefits for performance. First, if we assume that the most recently created/updated cache entries are the ones that are presently the most popular, we can remove entries that are not being accessed much sooner, in order to free up resources for the existing and new entries that might be accessed much more frequently.
This is a fair assumption, provided the resources allocated for caching are not inconsequential. If you have a large volume for file cache or a lot of memory for memcache, the oldest entries, in terms of last access, are quite likely not being utilized with great frequency.
There is a related and more granular strategy called Least Frequently Used that maintains strict statistics on the usage of the cache entries themselves. This not only removes the need for assumptions about cache data but also adds overhead for the statistics maintenance.
For our demonstrations here, we will be using LRU.
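The LRU idea can be sketched in a few lines of Go using a map for lookups and container/list to track recency. The type and method names here (LRU, NewLRU, Get, Set) are illustrative assumptions rather than part of the chapter's cache package. Note that this common formulation also counts a read as a use, and the sketch is not safe for concurrent access without a mutex.

```go
package main

import (
	"container/list"
	"fmt"
)

// entry is what each list element holds.
type entry struct {
	key   string
	value string
}

// LRU is a fixed-capacity cache that evicts the least recently used entry.
type LRU struct {
	capacity int
	order    *list.List               // front = most recently used
	items    map[string]*list.Element // key -> element in order
}

// NewLRU creates an empty cache that holds at most capacity entries.
func NewLRU(capacity int) *LRU {
	return &LRU{
		capacity: capacity,
		order:    list.New(),
		items:    make(map[string]*list.Element),
	}
}

// Get returns the value for key and marks the entry as recently used.
func (c *LRU) Get(key string) (string, bool) {
	el, ok := c.items[key]
	if !ok {
		return "", false
	}
	c.order.MoveToFront(el)
	return el.Value.(*entry).value, true
}

// Set adds or updates an entry, evicting the oldest one when over capacity.
func (c *LRU) Set(key, value string) {
	if el, ok := c.items[key]; ok {
		el.Value.(*entry).value = value
		c.order.MoveToFront(el)
		return
	}
	c.items[key] = c.order.PushFront(&entry{key: key, value: value})
	if c.order.Len() > c.capacity {
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(*entry).key)
	}
}

func main() {
	c := NewLRU(2)
	c.Set("a", "1")
	c.Set("b", "2")
	c.Get("a")      // "a" is now the most recently used entry
	c.Set("c", "3") // over capacity: "b" is evicted
	_, ok := c.Get("b")
	fmt.Println(ok) // prints false
}
```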
Caching by file
Our first approach is probably best described as a classical one for caching, but a method not without issues. We'll utilize the disk to create file-based caches for individual endpoints, both API and Web.
So what are the issues associated with caching in the filesystem? Well, previously in the chapter, we mentioned that disk can introduce its own bottleneck. Here, we're making a trade-off: we protect access to our database at the cost of potentially running into other issues with disk I/O.
This gets particularly complicated if our cache directory gets very big. At this point we end up introducing more file access issues.
Another downside is that we have to manage our cache ourselves, because the filesystem is not ephemeral and our available space is finite. We'll need to be able to expire cache files by hand. This introduces another round of maintenance and another point of failure.
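Expiring cache files by hand could look something like the following sketch, which removes files older than a given age from a cache directory. The function names purgeExpired and purgeDemo, the file names, and the one-hour threshold are assumptions for illustration, and the code relies on os.ReadDir and os.MkdirTemp from Go 1.16 or later.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// purgeExpired removes regular files in dir whose modification time is
// older than ttl and returns how many were removed.
func purgeExpired(dir string, ttl time.Duration) (int, error) {
	entries, err := os.ReadDir(dir)
	if err != nil {
		return 0, err
	}
	removed := 0
	for _, e := range entries {
		if e.IsDir() {
			continue
		}
		info, err := e.Info()
		if err != nil {
			continue // the file may have vanished between listing and stat
		}
		if time.Since(info.ModTime()) > ttl {
			if err := os.Remove(filepath.Join(dir, e.Name())); err == nil {
				removed++
			}
		}
	}
	return removed, nil
}

// purgeDemo creates a temporary cache directory holding one fresh and one
// stale file, purges it, and reports (removed, remaining).
func purgeDemo() (int, int) {
	dir, err := os.MkdirTemp("", "cache-demo")
	if err != nil {
		return -1, -1
	}
	defer os.RemoveAll(dir)

	fresh := filepath.Join(dir, "page_fresh")
	stale := filepath.Join(dir, "page_stale")
	os.WriteFile(fresh, []byte("new"), 0644)
	os.WriteFile(stale, []byte("old"), 0644)

	// Backdate the stale file so that it falls outside the TTL.
	old := time.Now().Add(-2 * time.Hour)
	os.Chtimes(stale, old, old)

	removed, _ := purgeExpired(dir, time.Hour)
	left, _ := os.ReadDir(dir)
	return removed, len(left)
}

func main() {
	removed, remaining := purgeDemo()
	fmt.Println("removed:", removed, "remaining:", remaining)
}
```

A function like this would typically be run on a timer or from a scheduled job, which is exactly the extra maintenance burden described above.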
All that said, it's still a useful exercise and can still be utilized if you're willing to take on some of the potential pitfalls:
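    package cache

    // Location is the directory in which cache files are stored.
    const Location = "/var/cache/"

    // CacheItem describes one cached entry on disk.
    type CacheItem struct {
        TTL int    // time-to-live, assumed here to be in seconds
        Key string // path of the cache file
    }

    // NewCache is exported so that it can be called from other packages.
    func NewCache(endpoint string, params ...string) CacheItem {
        return CacheItem{}
    }

    func (c CacheItem) Get() (bool, string) {
        return true, ""
    }

    func (c CacheItem) Set(data []byte) bool {
        return false
    }

    func (c CacheItem) Clear() bool {
        return false
    }
This sets the stage to do a few things, such as create unique keys based on an endpoint and query parameters, check for the existence of a cache file, and if it does not exist, get the requested data as normal.
In our application, we can implement this simply. Let's put a file caching layer in front of our /page endpoint as shown:
    func ServePage(w http.ResponseWriter, r *http.Request) {
        vars := mux.Vars(r)
        pageGUID := vars["guid"]
        thisPage := Page{}
        // Build the cache item for this endpoint and page.
        cached := cache.NewCache("page", pageGUID)
The preceding code creates a new CacheItem. We utilize the variadic params to generate a reference filename:
    func NewCache(endpoint string, params ...string) CacheItem {
        // Build a unique filename from the endpoint and its parameters.
        cacheName := endpoint + "_" + strings.Join(params, "_")
        // TTL is left at zero here; the caller sets it before calling Get.
        c := CacheItem{Key: Location + cacheName}
        return c
    }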
When we have a CacheItem object, we can check using the Get() method, which will return true if the cache is still valid, otherwise the method will return false. We utilize filesystem information to determine if a cache item is within its valid time-to-live:
    valid, cachedData := cached.Get()
    if valid {
        // Serve the cached output directly and skip the database and template work.
        fmt.Fprint(w, cachedData)
        return
    }
If we find an existing item via the Get() method, we'll check to make sure that it has been updated within the set TTL:
    func (c CacheItem) Get() (bool, string) {
        stats, err := os.Stat(c.Key)
        if err != nil {
            // No cache file (or it cannot be read): treat as a miss.
            return false, ""
        }
        // TTL is assumed to be a number of seconds.
        age := time.Since(stats.ModTime())
        if age > time.Duration(c.TTL)*time.Second {
            return false, ""
        }
        cache, err := ioutil.ReadFile(c.Key)
        if err != nil {
            return false, ""
        }
        return true, string(cache)
    }
If the cache file exists and is within the TTL, we'll return true along with the file's body. Otherwise, we will allow a passthrough to the page retrieval and generation. At the tail of this we can set the cache data:
    t, _ := template.ParseFiles("templates/blog.html")
    // Render once into a buffer so that the same bytes can be cached and sent.
    var buf bytes.Buffer
    t.Execute(&buf, thisPage)
    cached.Set(buf.Bytes())
    w.Write(buf.Bytes())
We then save this as:
    func (c CacheItem) Set(data []byte) bool {
        err := ioutil.WriteFile(c.Key, data, 0644)
        // Report whether the write succeeded.
        return err == nil
    }
This function effectively writes the value of our cache file.
We now have a working system that will take individual endpoints and innumerable query parameters and create a file-based cache library, ultimately preventing unnecessary queries to our database, if data has not been changed.
In practice we'd want to limit this to mostly read-based pages and avoid putting blind caching on any write or update endpoints, particularly on our API.
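One way to enforce that restriction is to put the method check into a small piece of middleware, so that only GET requests are ever cached. The sketch below is an assumption-laden illustration: cacheGETs and middlewareDemo are hypothetical names, the store is a plain in-memory map with no expiry, and httptest.NewRecorder is used as a convenient shortcut for capturing the response body. Cached hits also do not replay the original response headers.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
)

// cacheGETs wraps next and stores the body of successful GET responses in
// memory, keyed by URL. Every other method always reaches next.
func cacheGETs(next http.Handler) http.Handler {
	var mu sync.Mutex
	store := make(map[string][]byte)
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodGet {
			next.ServeHTTP(w, r) // writes and updates are never cached
			return
		}
		key := r.URL.String()
		mu.Lock()
		body, hit := store[key]
		mu.Unlock()
		if hit {
			w.Write(body)
			return
		}
		// Miss: capture the real response so that it can be stored.
		rec := httptest.NewRecorder()
		next.ServeHTTP(rec, r)
		if rec.Code == http.StatusOK {
			mu.Lock()
			store[key] = rec.Body.Bytes()
			mu.Unlock()
		}
		for k, v := range rec.Header() {
			w.Header()[k] = v
		}
		w.WriteHeader(rec.Code)
		w.Write(rec.Body.Bytes())
	})
}

// middlewareDemo sends two GETs and two POSTs through the middleware and
// reports how many of each actually reached the wrapped handler.
func middlewareDemo() (int, int) {
	var gets, posts int
	h := cacheGETs(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Method == http.MethodGet {
			gets++
		} else {
			posts++
		}
		fmt.Fprint(w, "page body")
	}))
	for i := 0; i < 2; i++ {
		h.ServeHTTP(httptest.NewRecorder(), httptest.NewRequest("GET", "/page/hello", nil))
		h.ServeHTTP(httptest.NewRecorder(), httptest.NewRequest("POST", "/api/pages", nil))
	}
	return gets, posts
}

func main() {
	gets, posts := middlewareDemo()
	fmt.Println("GET handler calls:", gets, "POST handler calls:", posts)
}
```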
Caching in memory
Just as filesystem caching became a lot more palatable because storage prices plummeted, we've seen a similar move in RAM, trailing just behind disk storage. The big advantage here is speed; caching in memory avoids disk I/O entirely and can be insanely fast.
Memcached (often written as memcache) evolved out of a need for light and super-fast caching at LiveJournal, a proto-social network from Brad Fitzpatrick. If that name feels familiar, it's because Brad now works at Google and is a serious contributor to the Go language itself.
As a drop-in replacement for our file caching system, Memcached will work similarly. The only major change is our key lookups, which will be going against working memory instead of doing file checks.
Note
To use memcache with Go, see the documentation at godoc.org/github.com/bradfitz/gomemcache/memcache from Brad Fitzpatrick, and install the package using the go get command.
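To show what changes when lookups go against working memory, here is a sketch of an in-process cache with per-item expiry. It is a stand-in for illustration only: it does not talk to Memcached and does not use the gomemcache API. The names MemCache, NewMemCache, and memDemo are assumptions, and the replaceable clock exists purely so that expiry can be demonstrated without sleeping.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// memItem is a cached value together with its expiry time.
type memItem struct {
	value   string
	expires time.Time
}

// MemCache is an in-process key/value cache with per-item expiry.
type MemCache struct {
	mu    sync.RWMutex
	items map[string]memItem
	now   func() time.Time // replaceable clock, handy for demonstrations
}

// NewMemCache returns an empty cache that uses the system clock.
func NewMemCache() *MemCache {
	return &MemCache{items: make(map[string]memItem), now: time.Now}
}

// Set stores value under key for the duration of ttl.
func (m *MemCache) Set(key, value string, ttl time.Duration) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.items[key] = memItem{value: value, expires: m.now().Add(ttl)}
}

// Get mirrors the file cache's Get: it reports validity, then the data.
func (m *MemCache) Get(key string) (bool, string) {
	m.mu.RLock()
	defer m.mu.RUnlock()
	it, ok := m.items[key]
	if !ok || m.now().After(it.expires) {
		return false, ""
	}
	return true, it.value
}

// memDemo caches a page for one minute, reads it back, then moves the
// clock forward two minutes and reads again.
func memDemo() (bool, bool) {
	c := NewMemCache()
	clock := time.Now()
	c.now = func() time.Time { return clock }

	c.Set("page_hello-world", "<h1>Hello</h1>", time.Minute)
	fresh, _ := c.Get("page_hello-world")

	clock = clock.Add(2 * time.Minute)
	stale, _ := c.Get("page_hello-world")
	return fresh, stale
}

func main() {
	fresh, stale := memDemo()
	fmt.Println("valid while fresh:", fresh, "valid after TTL:", stale)
}
```

Unlike Memcached, a cache like this lives inside a single process, so it is lost on restart and is not shared between application instances.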
Implementing HTTP/2
One of the more interesting, perhaps noble, initiatives that Google has invested in within the last five years has been a focus on making the Web faster. Through tools, such as PageSpeed, Google has sought to push the Web as a whole to be faster, leaner, and more user-friendly.
No doubt this initiative is not entirely altruistic. Google has built their business on extensive web search, and crawlers are always at the mercy of the speed of the pages they crawl. The faster the web pages, the faster and more comprehensive the crawling, which means less time and less infrastructure, and therefore less money. The bottom line here is that a faster web benefits Google as much as it does people creating and viewing web sites.
But this is mutually beneficial. If web sites are faster to comply with Google's preferences, everyone benefits with a faster Web.
This brings us to HTTP/2, a version of HTTP that replaces 1.1, which was introduced in 1999 and is largely the de facto method for most of the Web. HTTP/2 also envelops and implements a lot of SPDY, a makeshift protocol that Google developed and supported through Chrome.
HTTP/2 and SPDY introduce a host of optimizations including header compression and non-blocking and multiplexed request handling.
If you're using Go version 1.6 or later, net/http supports HTTP/2 out of the box when serving over TLS. If you're using version 1.5 or earlier, you can use the experimental package.
Note
To use HTTP/2 prior to Go version 1.6, install the golang.org/x/net/http2 package using go get; its documentation is at godoc.org/golang.org/x/net/http2.
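A quick way to confirm that HTTP/2 is actually being negotiated is to inspect the protocol version of a response. The sketch below does this against a local TLS test server; the function name negotiatedProto is hypothetical, and the EnableHTTP2 field on httptest.Server assumes Go 1.14 or later.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// negotiatedProto starts a local TLS test server with HTTP/2 enabled and
// returns the protocol version reported for a single request.
func negotiatedProto() (string, error) {
	ts := httptest.NewUnstartedServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		// r.Proto holds the protocol version as the server saw it.
		fmt.Fprint(w, r.Proto)
	}))
	ts.EnableHTTP2 = true // must be set before StartTLS
	ts.StartTLS()
	defer ts.Close()

	// ts.Client() trusts the test certificate and attempts HTTP/2.
	resp, err := ts.Client().Get(ts.URL)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	return resp.Proto, nil
}

func main() {
	proto, err := negotiatedProto()
	fmt.Println(proto, err)
}
```

In a real handler, logging r.Proto serves the same purpose and shows which clients are still falling back to HTTP/1.1.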
Summary
In this chapter, we focused on quick wins for increasing the overall performance for our application, by reducing impact on our underlying application's bottlenecks, namely our database.
We've implemented caching at the file level and described how to translate that into a memory-based caching system. We looked at SPDY and HTTP/2, which has now become a part of the underlying Go net/http package by default.
This in no way represents all the optimizations that we may need to produce highly performant code, but hits on some of the most common bottlenecks that can keep applications that work well in development from behaving similarly in production under heavy load.
This is where we end the book; hope you all enjoyed the ride!