Creating infrastructure around logs
Okay, let's do some arithmetic. Suppose that you have a website that is rather popular, though not (yet) on a world scale, with about 50,000 visits per day. This is a number that managers brag about during their meetups; they get it from some analytics software. It means almost nothing for your job, because what is a visit? Let's say that you run an e-commerce site that sells some nonseasonal stuff, for example, power tools. Your average visitor will look at one or two pages, with spikes into the low tens when actually choosing and buying something. Let it be three pages per visit on average. What is a page? For you, it is a series of HTTP responses: the main document and all the embedded objects. People notoriously underestimate the sheer size of modern web pages. It would be a safe bet to say that your pages include on average 100 objects (HTML documents, images, scripts, style sheets, and so on) amounting to over a megabyte in total...
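The arithmetic so far can be sketched in a few lines of Python. The input figures are the ones quoted above; everything else is an assumption made for illustration: the function and variable names are hypothetical, every embedded object is assumed to cost exactly one HTTP request (no browser caching, which a real site would benefit from), and "over a megabyte" is rounded down to 1 MB, so the traffic figure is a lower bound.

```python
# Back-of-the-envelope load estimate from the figures in the text.
VISITS_PER_DAY = 50_000    # from the analytics software
PAGES_PER_VISIT = 3        # assumed average from the text
OBJECTS_PER_PAGE = 100     # HTML, images, scripts, style sheets, ...
MB_PER_PAGE = 1            # "over a megabyte", so a lower bound

SECONDS_PER_DAY = 24 * 60 * 60


def daily_load(visits=VISITS_PER_DAY,
               pages_per_visit=PAGES_PER_VISIT,
               objects_per_page=OBJECTS_PER_PAGE,
               mb_per_page=MB_PER_PAGE):
    """Return rough daily totals, assuming one HTTP request per object."""
    pages = visits * pages_per_visit
    requests = pages * objects_per_page
    return {
        "pages_per_day": pages,
        "requests_per_day": requests,
        # A flat average; real traffic has daily peaks well above it.
        "avg_requests_per_second": requests / SECONDS_PER_DAY,
        "megabytes_per_day": pages * mb_per_page,
    }


load = daily_load()
for name, value in load.items():
    print(f"{name}: {value:,.0f}")
```

Under these assumptions, 50,000 visits turn into 150,000 pages and 15 million HTTP requests per day, which is the number that actually matters to whoever looks after the servers and their logs.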