Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Save more on your purchases now! discount-offer-chevron-icon
Savings automatically calculated. No voucher code required.
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Conferences
Free Learning
Arrow right icon
Clojure High Performance Programming, Second Edition
Clojure High Performance Programming, Second Edition

Clojure High Performance Programming, Second Edition: Become an expert at writing fast and high performant code in Clojure 1.7.0 , Second Edition

eBook
$20.98 $29.99
Paperback
$38.99
Subscription
Free Trial
Renews at $19.99p/m

What do you get with a Packt Subscription?

Free for first 7 days. $19.99 p/m after that. Cancel any time!
Product feature icon Unlimited ad-free access to the largest independent learning library in tech. Access this title and thousands more!
Product feature icon 50+ new titles added per month, including many first-to-market concepts and exclusive early access to books as they are being written.
Product feature icon Innovative learning tools, including AI book assistants, code context explainers, and text-to-speech.
Product feature icon Thousands of reference materials covering every tech concept you need to stay up to date.
Subscribe now
View plans & pricing
Table of content icon View table of contents Preview book icon Preview Book

Clojure High Performance Programming, Second Edition

Chapter 1. Performance by Design

Clojure is a safe, functional programming language that brings great power and simplicity to the user. Clojure is also dynamically and strongly typed, and has very good performance characteristics. Naturally, every activity performed on a computer has an associated cost. What constitutes acceptable performance varies from one use-case and workload to another. In today's world, performance is even the determining factor for several kinds of applications. We will discuss Clojure (which runs on the JVM (Java Virtual Machine)), and its runtime environment in the light of performance, which is the goal of this book.

The performance of Clojure applications depend on various factors. For a given application, understanding its use cases, design and implementation, algorithms, resource requirements and alignment with the hardware, and the underlying software capabilities is essential. In this chapter, we will study the basics of performance analysis, including the following:

  • Classifying the performance anticipations by the use cases types
  • Outlining the structured approach to analyze performance
  • A glossary of terms, commonly used to discuss performance aspects
  • The performance numbers that every programmer should know

Use case classification

The performance requirements and priority vary across the different kinds of use cases. We need to determine what constitutes acceptable performance for the various kinds of use cases. Hence, we classify them to identify their performance model. When it comes to details, there is no sure shot performance recipe for any kind of use case, but it certainly helps to study their general nature. Note that in real life, the use cases listed in this section may overlap with each other.

The user-facing software

The performance of user-facing applications is strongly linked to the user's anticipation. Having a difference of a good number of milliseconds may not be perceptible for the user but at the same time, a wait of more than a few seconds may not be taken kindly. One important element in normalizing anticipation is to engage the user by providing duration-based feedback. A good idea to deal with such a scenario would be to start the task asynchronously in the background, and poll it from the UI layer to generate a duration-based feedback for the user. Another way could be to incrementally render the results to the user to even out the anticipation.

Anticipation is not the only factor in user facing performance. Common techniques like staging or precomputation of data, and other general optimization techniques can go a long way to improve the user experience with respect to performance. Bear in mind that all kinds of user facing interfaces fall into this use case category—the Web, mobile web, GUI, command line, touch, voice-operated, gesture...you name it.

Computational and data-processing tasks

Non-trivial compute intensive tasks demand a proportional amount of computational resources. All of the CPU, cache, memory, efficiency and the parallelizability of the computation algorithms would be involved in determining the performance. When the computation is combined with distribution over a network or reading from/staging to disk, I/O bound factors come into play. This class of workloads can be further subclassified into more specific use cases.

A CPU bound computation

A CPU bound computation is limited by the CPU cycles spent on executing it. Arithmetic processing in a loop, small matrix multiplication, determining whether a number is a Mersenne prime, and so on, would be considered CPU bound jobs. If the algorithm complexity is linked to the number of iterations/operations N, such as O(N), O(N 2) and more, then the performance depends on how big N is, and how many CPU cycles each step takes. For parallelizable algorithms, performance of such tasks may be enhanced by assigning multiple CPU cores to the task. On virtual hardware, the performance may be impacted if the CPU cycles are available in bursts.

A memory bound task

A memory bound task is limited by the availability and bandwidth of the memory. Examples include large text processing, list processing, and more. For example, specifically in Clojure, the (reduce f (pmap g coll)) operation would be memory bound if coll is a large sequence of big maps, even though we parallelize the operation using pmap here. Note that higher CPU resources cannot help when memory is the bottleneck, and vice versa. Lack of availability of memory may force you to process smaller chunks of data at a time, even if you have enough CPU resources at your disposal. If the maximum speed of your memory is X and your algorithm on single the core accesses the memory at speed X/3, the multicore performance of your algorithm cannot exceed three times the current performance, no matter how many CPU cores you assign to it. The memory architecture (for example, SMP and NUMA) contributes to the memory bandwidth in multicore computers. Performance with respect to memory is also subject to page faults.

A cache bound task

A task is cache bound when its speed is constrained by the amount of cache available. When a task retrieves values from a small number of repeated memory locations, for example a small matrix multiplication, the values may be cached and fetched from there. Note that CPUs (typically) have multiple layers of cache, and the performance will be at its best when the processed data fits in the cache, but the processing will still happen, more slowly, when the data does not fit into the cache. It is possible to make the most of the cache using cache-oblivious algorithms. A higher number of concurrent cache/memory bound threads than CPU cores is likely to flush the instruction pipeline, as well as the cache at the time of context switch, likely leading to a severely degraded performance.

An input/output bound task

An input/output (I/O) bound task would go faster if the I/O subsystem, that it depends on, goes faster. Disk/storage and network are the most commonly used I/O subsystems in data processing, but it can be serial port, a USB-connected card reader, or any I/O device. An I/O bound task may consume very few CPU cycles. Depending on the speed of the device, connection pooling, data compression, asynchronous handling, application caching, and more, may help in performance. One notable aspect of I/O bound tasks is that performance is usually dependent on the time spent waiting for connection/seek, and the amount of serialization that we do, and hardly on the other resources.

In practice, many data processing workloads are usually a combination of CPU bound, memory bound, cache bound, and I/O bound tasks. The performance of such mixed workloads effectively depends on the even distribution of CPU, cache, memory, and I/O resources over the duration of the operation. A bottleneck situation arises only when one resource gets too busy to make way for another.

Online transaction processing

Online transaction processing (OLTP) systems process the business transactions on demand. They can sit behind systems such as a user-facing ATM machine, point-of-sale terminal, a network-connected ticket counter, ERP systems, and more. The OLTP systems are characterized by low latency, availability, and data integrity. They run day-to-day business transactions. Any interruption or outage is likely to have a direct and immediate impact on sales or service. Such systems are expected to be designed for resiliency rather than delayed recovery from failures. When the performance objective is unspecified, you may like to consider graceful degradation as a strategy.

It is a common mistake to ask the OLTP systems to answer analytical queries, something that they are not optimized for. It is desirable for an informed programmer to know the capability of the system, and suggest design changes as per the requirements.

Online analytical processing

Online analytical processing (OLAP) systems are designed to answer analytical queries in a short time. They typically get data from the OLTP operations, and their data model is optimized for querying. They basically provide for consolidation (roll-up), drill-down and slicing and dicing of data for analytical purposes. They often use specialized data stores that can optimize ad-hoc analytical queries on the fly. It is important for such databases to provide pivot-table like capability. Often, the OLAP cube is used to get fast access to the analytical data.

Feeding the OLTP data into the OLAP systems may entail workflows and multistage batch processing. The performance concern of such systems is to efficiently deal with large quantities of data while also dealing with inevitable failures and recovery.

Batch processing

Batch processing is automated execution of predefined jobs. These are typically bulk jobs that are executed during off-peak hours. Batch processing may involve one or more stages of job processing. Often batch processing is clubbed with workflow automation, where some workflow steps are executed offline. Many of the batch processing jobs work on staging of data, and on preparing data for the next stage of processing to pick up.

Batch jobs are generally optimized for the best utilization of the computing resources. Since there is little to moderate the demand to lower the latencies of some particular subtasks, these systems tend to optimize for throughput. A lot of batch jobs involve largely I/O processing and are often distributed over a cluster. Due to distribution, the data locality is preferred when processing the jobs; that is, the data and processing should be local in order to avoid network latency in reading/writing data.

A structured approach to the performance

In practice, the performance of non-trivial applications is rarely a function of coincidence or prediction. For many projects, performance is not an option (it is rather a necessity), which is why this is even more important today. Capacity planning, determining performance objectives, performance modeling, measurement, and monitoring are key.

Tuning a poorly designed system to perform is significantly harder, if not practically impossible, than having a system well-designed from the start. In order to meet a performance goal, performance objectives should be known before the application is designed. The performance objectives are stated in terms of latency, throughput, resource utilization, and workload. These terms are discussed in the following section in this chapter.

The resource cost can be identified in terms of application scenarios, such as browsing of products, adding products to shopping cart, checkout, and more. Creating workload profiles that represent users performing various operations is usually helpful.

Performance modeling is a reality check for whether the application design will support the performance objectives. It includes performance objectives, application scenarios, constraints, measurements (benchmark results), workload objectives and if available, the performance baseline. It is not a replacement for measurement and load testing, rather, the model is validated using these. The performance model may include the performance test cases to assert the performance characteristics of the application scenarios.

Deploying an application to production almost always needs some form of capacity planning. It has to take into account the performance objectives for today and for the foreseeable future. It requires an idea of the application architecture, and an understanding of how the external factors translate into the internal workload. It also requires informed expectations about the responsiveness and the level of service to be provided by the system. Often, capacity planning is done early in a project to mitigate the risk of provisioning delays.

The performance vocabulary

There are several technical terms that are heavily used in performance engineering. It is important to understand these, as they form the cornerstone of the performance-related discussions. Collectively, these terms form a performance vocabulary. The performance is usually measured in terms of several parameters, where every parameter has roles to play—such parameters are a part of the vocabulary.

Latency

Latency is the time taken by an individual unit of work to complete the task. It does not imply successful completion of a task. Latency is not collective, it is linked to a particular task. If two similar jobs—j1 and j2 took 3 ms and 5 ms respectively, their latencies would be treated as such. If j1 and j2 were dissimilar tasks, it would have made no difference. In many cases the average latency of similar jobs is used in the performance objectives, measurement, and monitoring results.

Latency is an important indicator of the health of a system. A high performance system often thrives on low latency. Higher than normal latency can be caused due to load or bottleneck. It helps to measure the latency distribution during a load test. For example, if more than 25 percent of similar jobs, under a similar load, have significantly higher latency than others, then it may be an indicator of a bottleneck scenario that is worth investigating.

When a task called j1 consists of smaller tasks called j2, j3, and j4, the latency of j1 is not necessarily the sum of the latencies of each of j2, j3, and j4. If any of the subtasks of j1 are concurrent with another, the latency of j1 will turn out to be less than the sum of the latencies of j2, j3, and j4. The I/O bound tasks are generally more prone to higher latency. In network systems, latency is commonly based on the round-trip to another host, including the latency from source to destination, and then back to source.

Throughput

Throughput is the number of successful tasks or operations performed in a unit of time. The top-level operations performed in a unit of time are usually of a similar kind, but with a potentially different from latencies. So, what does throughput tell us about the system? It is the rate at which the system is performing. When you perform load testing, you can determine the maximum rate at which a particular system can perform. However, this is not a guarantee of the conclusive, overall, and maximum rate of performance.

Throughput is one of the factors that determine the scalability of a system. The throughput of a higher level task depends on the capacity to spawn multiple such tasks in parallel, and also on the average latency of those tasks. The throughput should be measured during load testing and performance monitoring to determine the peak-measured throughput, and the maximum-sustained throughput. These factors contribute to the scale and performance of a system.

Bandwidth

Bandwidth is the raw data rate over a communication channel, measured in a certain number of bits per second. This includes not only the payload, but also all the overhead necessary to carry out the communication. Some examples are: Kbits/sec, Mbits/sec, and more. An uppercase B such as KB/sec denotes Bytes, as in kilobytes per second. Bandwidth is often compared to throughput. While bandwidth is the raw capacity, throughput for the same system is the successful task completion rate, which usually involves a round-trip. Note that throughput is for an operation that involves latency. To achieve maximum throughput for a given bandwidth, the communication/protocol overhead and operational latency should be minimal.

For storage systems (such as hard disks, solid-state drives, and more) the predominant way to measure performance is IOPS (Input-output per second), which is multiplied by the transfer size and represented as bytes per second, or further into MB/sec, GB/sec, and more. IOPS is usually derived for sequential and random workloads for read/write operations.

Mapping the throughput of a system to the bandwidth of another may lead to dealing with an impedance mismatch between the two. For example, an order processing system may perform the following tasks:

  • Transact with the database on disk
  • Post results over the network to an external system

Depending on the bandwidth of the disk sub-system, the bandwidth of the network, and the execution model of order processing, the throughput may depend not only on the bandwidth of the disk sub-system and network, but also on how loaded they currently are. Parallelism and pipelining are common ways to increase the throughput over a given bandwidth.

Baseline and benchmark

The performance baseline, or simply baseline, is the reference point, including measurements of well-characterized and understood performance parameters for a known configuration. The baseline is used to collect performance measurements for the same parameters that we may benchmark later for another configuration. For example, collecting "throughput distribution over 10 minutes at a load of 50 concurrent threads" is one such performance parameter that we can use for baseline and benchmarking. A baseline is recorded together with the hardware, network, OS and JVM configuration.

The performance benchmark, or simply benchmark, is the recording of the performance parameter measurements under various test conditions. A benchmark can be composed of a performance test suite. A benchmark may collect small to large amounts of data, and may take varying durations depending on the use-cases, scenarios, and environment characteristics.

A baseline is a result of the benchmark that was conducted at one point in time. However, a benchmark is independent of the baseline.

Profiling

Performance profiling , or simply profiling, is the analysis of the execution of a program at its runtime. A program can perform poorly for a variety of reasons. A profiler can analyze and find out the execution time of various parts of the program. It is possible to put statements in a program manually to print the execution time of the blocks of code, but it gets very cumbersome as you try to refine the code iteratively.

A profiler is of great assistance to the developer. Going by how profilers work, there are three major kinds—instrumenting, sampling, and event-based.

  • Event-based profilers: These profilers work only for selected language platforms, and provide a good balance between the overhead and results; Java supports event-based profiling via the JVMTI interface.
  • The instrumenting profilers: These profilers modify code at either compile time, or runtime to inject performance counters. They are intrusive by nature and add significant performance overhead. However, you can profile the regions of code very selectively using the instrumenting profilers.
  • The sampling profilers: These profilers pause the runtime and collect its state at "sampling intervals". By collecting enough samples, they get to know where the program is spending most of its time. For example, at a sampling interval of 1 millisecond, the profiler would have collected 1000 samples in a second. A sampling profiler also works for code that executes faster than the sampling interval (as in, the code may perform several iterations of work between the two sampling events), as the frequency of pausing and sampling is proportional to the overall execution time of any code.

Profiling is not meant only for measuring execution time. Capable profilers can provide a view of memory analysis, garbage collection, threads, and more. A combination of such tools is helpful to find memory leaks, garbage collection issues, and so on.

Performance optimization

Simply put, optimization is enhancing a program's resource consumption after a performance analysis. The symptoms of a poorly performing program are observed in terms of high latency, low throughput, unresponsiveness, instability, high memory consumption, high CPU consumption, and more. During the performance analysis, one may profile the program in order to identify the bottlenecks and tune the performance incrementally by observing the performance parameters.

Better and suitable algorithms are an all-around good way to optimize code. The CPU bound code can be optimized with computationally cheaper operations. The cache bound code can try using less memory lookups to keep a good hit ratio. The memory bound code can use an adaptive memory usage and conservative data representation to store in memory for optimization. The I/O bound code can attempt to serialize as little data as possible, and batching of operations will make the operation less chatty for better performance. Parallelism and distribution are other, overall good ways to increase performance.

Concurrency and parallelism

Most of the computer hardware and operating systems that we use today provide concurrency. On the x86 architecture, hardware support for concurrency can be traced as far back as the 80286 chip. Concurrency is the simultaneous execution of more than one process on the same computer. In older processors, concurrency was implemented using the context switch by the operating system kernel. When concurrent parts are executed in parallel by the hardware instead of merely the switching context, it is called parallelism. Parallelism is the property of the hardware, though the software stack must support it in order for you to leverage it in your programs. We must write your program in a concurrent way to exploit the parallelism features of the hardware.

While concurrency is a natural way to exploit hardware parallelism and speed up operations, it is worth bearing in mind that having significantly higher concurrency than the parallelism that your hardware can support is likely to schedule tasks to varying processor cores thereby, lowering the branch prediction and increasing cache misses.

At a low level, spawning the processes/threads, mutexes, semaphores, locking, shared memory, and interprocess communication are used for concurrency. The JVM has an excellent support for these concurrency primitives and interthread communication. Clojure has both—the low and higher level concurrency primitives that we will discuss in the concurrency chapter.

Resource utilization

Resource utilization is the measure of the server, network, and storage resources that is consumed by an application. Resources include CPU, memory, disk I/O, network I/O, and more. The application can be analyzed in terms of CPU bound, memory bound, cache bound, and I/O bound tasks. Resource utilization can be derived by means of benchmarking, by measuring the utilization at a given throughput.

Workload

Workload is the quantification of how much work is there in hand to be carried out by the application. It is measured in the total numbers of users, the concurrent active users, the transaction volume, the data volume, and more. Processing a workload should take in to account the load conditions, such as how much data the database currently holds, how filled up the message queues are, the backlog of I/O tasks after which the new load will be processed, and more.

The latency numbers that every programmer should know

Hardware and software have progressed over the years. Latencies for various operations put things in perspective. The latency numbers for the year 2015, reproduced with the permission of Aurojit Panda and Colin Scott of Berkeley University (http://www.eecs.berkeley.edu/~rcs/research/interactive_latency.html). Latency numbers that every programmer should know are as shown in the following table:

Operation

Time taken as of 2015

L1 cache reference

1ns (nano second)

Branch mispredict

3 ns

L2 cache reference

4 ns

Mutex lock/unlock

17 ns

Compress 1KB with Zippy

(Zippy/Snappy: http://code.google.com/p/snappy/)

2μs (1000 ns = 1μs: micro second)

Send 2000 bytes over the commodity network

200ns (that is, 0.2μs)

SSD random read

16 μs

Round-trip in the same datacenter

500 μs

Read 1,000,000 bytes sequentially from SSD

200 μs

Disk seek

4 ms (1000 μs = 1 ms)

Read 1,000,000 bytes sequentially from disk

2 ms

Packet roundtrip CA to Netherlands

150 ms

The preceding table shows the operations in a computer vis-a-vis the latency incurred due to the operation. When a CPU core processes some data in a CPU register, it may take a few CPU cycles (for reference, a 3 GHz CPU runs 3000 cycles per nanosecond), but the moment it has to fall back on L1 or L2 cache, the latency becomes thousands of times slower. The preceding table does not show main memory access latency, which is roughly 100 ns (it varies, based on the access pattern)—about 25 times slower than the L2 cache.

Summary

We learned about the basics of what it is like to think more deeply about performance. We saw the common performance vocabulary, and also the use cases by which performance aspects might vary. We concluded by looking at the performance numbers for the different hardware components, which is how performance benefits reach our applications. In the next chapter, we will dive into the performance aspects of the various Clojure abstractions.

Left arrow icon Right arrow icon

Description

Clojure treats code as data and has a macro system. It focuses on programming with immutable values and explicit progression-of-time constructs, which are intended to facilitate the development of more robust programs, particularly multithreaded ones. It is built with performance, pragmatism, and simplicity in mind. Like most general purpose languages, various Clojure features have different performance characteristics that one should know in order to write high performance code. This book shows you how to evaluate the performance implications of various Clojure abstractions, discover their underpinnings, and apply the right approach for optimum performance in real-world programs. It starts by helping you classify various use cases and the need for them with respect to performance and analysis of various performance aspects. You will also learn the performance vocabulary that experts use throughout the world and discover various Clojure data structures, abstractions, and their performance characteristics. Further, the book will guide you through enhancing performance by using Java interoperability and JVM-specific features from Clojure. It also highlights the importance of using the right concurrent data structure and Java concurrency abstractions. This book also sheds light on performance metrics for measuring, how to measure, and how to visualize and monitor the collected data. At the end of the book, you will learn to run a performance profiler, identify bottlenecks, tune performance, and refactor code to get a better performance.

Who is this book for?

This book is intended for intermediate Clojure developers who are looking to get a good grip on achieving optimum performance. Having a basic knowledge of Java would be helpful.

Product Details

Country selected
Publication date, Length, Edition, Language, ISBN-13
Publication date : Sep 29, 2015
Length: 198 pages
Edition : 2nd
Language : English
ISBN-13 : 9781785283642
Category :
Languages :

What do you get with a Packt Subscription?

Free for first 7 days. $19.99 p/m after that. Cancel any time!
Product feature icon Unlimited ad-free access to the largest independent learning library in tech. Access this title and thousands more!
Product feature icon 50+ new titles added per month, including many first-to-market concepts and exclusive early access to books as they are being written.
Product feature icon Innovative learning tools, including AI book assistants, code context explainers, and text-to-speech.
Product feature icon Thousands of reference materials covering every tech concept you need to stay up to date.
Subscribe now
View plans & pricing

Product Details

Publication date : Sep 29, 2015
Length: 198 pages
Edition : 2nd
Language : English
ISBN-13 : 9781785283642
Category :
Languages :

Packt Subscriptions

See our plans and pricing
Modal Close icon
$19.99 billed monthly
Feature tick icon Unlimited access to Packt's library of 7,000+ practical books and videos
Feature tick icon Constantly refreshed with 50+ new titles a month
Feature tick icon Exclusive Early access to books as they're written
Feature tick icon Solve problems while you work with advanced search and reference features
Feature tick icon Offline reading on the mobile app
Feature tick icon Simple pricing, no contract
$199.99 billed annually
Feature tick icon Unlimited access to Packt's library of 7,000+ practical books and videos
Feature tick icon Constantly refreshed with 50+ new titles a month
Feature tick icon Exclusive Early access to books as they're written
Feature tick icon Solve problems while you work with advanced search and reference features
Feature tick icon Offline reading on the mobile app
Feature tick icon Choose a DRM-free eBook or Video every month to keep
Feature tick icon PLUS own as many other DRM-free eBooks or Videos as you like for just $5 each
Feature tick icon Exclusive print discounts
$279.99 billed in 18 months
Feature tick icon Unlimited access to Packt's library of 7,000+ practical books and videos
Feature tick icon Constantly refreshed with 50+ new titles a month
Feature tick icon Exclusive Early access to books as they're written
Feature tick icon Solve problems while you work with advanced search and reference features
Feature tick icon Offline reading on the mobile app
Feature tick icon Choose a DRM-free eBook or Video every month to keep
Feature tick icon PLUS own as many other DRM-free eBooks or Videos as you like for just $5 each
Feature tick icon Exclusive print discounts

Frequently bought together


Stars icon
Total $ 142.97
Clojure Reactive Programming
$48.99
Clojure High Performance Programming, Second Edition
$38.99
Clojure for Data Science
$54.99
Total $ 142.97 Stars icon

Table of Contents

9 Chapters
1. Performance by Design Chevron down icon Chevron up icon
2. Clojure Abstractions Chevron down icon Chevron up icon
3. Leaning on Java Chevron down icon Chevron up icon
4. Host Performance Chevron down icon Chevron up icon
5. Concurrency Chevron down icon Chevron up icon
6. Measuring Performance Chevron down icon Chevron up icon
7. Performance Optimization Chevron down icon Chevron up icon
8. Application Performance Chevron down icon Chevron up icon
Index Chevron down icon Chevron up icon

Customer reviews

Rating distribution
Full star icon Full star icon Full star icon Full star icon Half star icon 4.4
(5 Ratings)
5 star 60%
4 star 20%
3 star 20%
2 star 0%
1 star 0%
Aaron B. Iba Nov 29, 2016
Full star icon Full star icon Full star icon Full star icon Full star icon 5
I've been programming in Clojure for 5 years and recently took on a very performance-critical project. I found this book to be extremely helpful and practical for writing high-performance Clojure code. As other reviewers have noted, the book covers a lot of different topics, but I just skipped the ones I was already familiar with and studied the stuff that was new to me -- and there was a lot of great new stuff. Overall I highly recommend this book and I hope the author comes out with new editions for new versions of Clojure.
Amazon Verified review Amazon
Lucas Medeiros Reis Oct 25, 2015
Full star icon Full star icon Full star icon Full star icon Full star icon 5
I appreciate technical books that present not only theory, but useful practical tips as well. A lot of books fall somewhere in the middle, presenting little or no theory, and techniques that do not fit in real world projects. This book rises far above this threshold. My main highlights are:Chapter one, "Performance by Design". Made me think differently about performance no matter which language I'll be using.Chapter four, "Host Performance". Learned a lot about the JVM.Chapter six, "Measuring Performance". Measuring is maybe the most important part of performance tuning, and this chapter has both theory and practical Clojure tips.
Amazon Verified review Amazon
Oleg Okun Nov 13, 2015
Full star icon Full star icon Full star icon Full star icon Full star icon 5
This book is for Clojure professionals thinking of practical applications of their code. It covers a broad range of topics, including use case classification according to performance requirements (CPU, memory, cache, input/output, online and offline data processing), performance characteristics (latency, bandwidth, throughput), data structures and how Clojure stores different types of objects, computational complexity of operations involving these types, abstractions and how to make decisions about which abstraction to use depending on performance use cases, Java Virtual Machine (JVM) on which Clojure runs (by default, Clojure may not produce optimized JVM bytecode!; hence, understanding of how Clojure interacts with Java is of importance), influence of host (hardware + JVM) characteristics on performance, concurrency and concurrency related data structures, parallelization support, measuring and optimizing performance (code profiling, performance statistics).
Amazon Verified review Amazon
Blake Watson Oct 25, 2015
Full star icon Full star icon Full star icon Full star icon Empty star icon 4
This is a big, big topic crammed into a smallish (160+, excluding TOC, index, etc.) page book, but for all that, if you use it right it can be very helpful. This is not a beginner's book: it describes the nitty-gritty of how Clojure deals with various situations, which are things you don't want to concern yourself with until you have to. Laziness is covered, of course, as is chunking, transducers, mutability, transients, looping/recursion, and so on, and that's just Chapter 2! (Chapter 1 is mostly nomenclature.)Now, frankly, if I'm writing a 160 page book on optimization, I would probably take all 160 pages to talk about the material in Chapter 2. These are the underpinnings of the high level aspects of Clojure, but Kumar goes on to spend a chapter on Java bytecode, hardware optimizations. It's a lot of material, and it's only touched on here, by-and-large.I didn't find that to be a bad thing, however. It's perhaps less useful than it might be if you had a giant book that had your exact problem in it, I suppose, but what I've done is to read a chapter through quickly, and then look at my code and see if I can apply what I learned there. This not only has the potential to make the code faster, but to give you a deeper understanding of what's going on under the covers.You want to know going in, though. There's not a lot of hand-holding. I'm still pondering over the section on Agents, which I'm pretty sure I've heard Rich Hickey say he never uses. I kind of wanted the "Measuring Performance" (chapter 6) up front since that's what we're trying to do. Maybe Chapter 7, too.Anyway, this approach with this material worked for me, and I liked the way it removed some of the mystery of Clojure for me, or at least help me resolve some of the mystery by pointing me in the right direction.
Amazon Verified review Amazon
Mr C M Fleming Dec 08, 2015
Full star icon Full star icon Full star icon Empty star icon Empty star icon 3
Overall, I thought this book was a good read. It provides a good overview of many essential concepts - throughput, bandwidth and latency, concurrency concepts, touches lightly on algorithmic complexity and also covers the essential statistical concepts required to understand Criterium's output and reason about performance. It's coverage of the Clojure data structures and especially the concurrency constructs is very detailed.However I came away feeling that there was a lack of actionable information in the book. As an experienced Clojure developer, I learned several things that I didn't know previously, but only as brief pointers to external information. The book spent a long time on topics which are undoubtedly interesting, but provided little guidance on what the information means for someone trying to improve the performance of their programs. For example, the discussion of modern processor and memory architecture was interesting and detailed, but how should I apply that knowledge to my projects, especially in a very high-level language such as Clojure? Subjects such as OS level profiling are mentioned, but these are highly specialised topics which have no place in a book this short.In the end I think this book treads an uneasy path - it's very detailed in places, but is too short to contain enough information to make many of the topics worth while. In the end, it tries to address an enormous topic in a short book, and I think it would have been better to have remained more focused on how Clojure developers specifically can improve their programs' performance. It will be useful to someone without a lot of experience in JVM or Clojure performance topics, but more experienced users will probably find it comes up short.
Amazon Verified review Amazon
Get free access to Packt library with over 7500+ books and video courses for 7 days!
Start Free Trial

FAQs

What is included in a Packt subscription? Chevron down icon Chevron up icon

A subscription provides you with full access to view all Packt and licnesed content online, this includes exclusive access to Early Access titles. Depending on the tier chosen you can also earn credits and discounts to use for owning content

How can I cancel my subscription? Chevron down icon Chevron up icon

To cancel your subscription with us simply go to the account page - found in the top right of the page or at https://subscription.packtpub.com/my-account/subscription - From here you will see the ‘cancel subscription’ button in the grey box with your subscription information in.

What are credits? Chevron down icon Chevron up icon

Credits can be earned from reading 40 section of any title within the payment cycle - a month starting from the day of subscription payment. You also earn a Credit every month if you subscribe to our annual or 18 month plans. Credits can be used to buy books DRM free, the same way that you would pay for a book. Your credits can be found in the subscription homepage - subscription.packtpub.com - clicking on ‘the my’ library dropdown and selecting ‘credits’.

What happens if an Early Access Course is cancelled? Chevron down icon Chevron up icon

Projects are rarely cancelled, but sometimes it's unavoidable. If an Early Access course is cancelled or excessively delayed, you can exchange your purchase for another course. For further details, please contact us here.

Where can I send feedback about an Early Access title? Chevron down icon Chevron up icon

If you have any feedback about the product you're reading, or Early Access in general, then please fill out a contact form here and we'll make sure the feedback gets to the right team. 

Can I download the code files for Early Access titles? Chevron down icon Chevron up icon

We try to ensure that all books in Early Access have code available to use, download, and fork on GitHub. This helps us be more agile in the development of the book, and helps keep the often changing code base of new versions and new technologies as up to date as possible. Unfortunately, however, there will be rare cases when it is not possible for us to have downloadable code samples available until publication.

When we publish the book, the code files will also be available to download from the Packt website.

How accurate is the publication date? Chevron down icon Chevron up icon

The publication date is as accurate as we can be at any point in the project. Unfortunately, delays can happen. Often those delays are out of our control, such as changes to the technology code base or delays in the tech release. We do our best to give you an accurate estimate of the publication date at any given time, and as more chapters are delivered, the more accurate the delivery date will become.

How will I know when new chapters are ready? Chevron down icon Chevron up icon

We'll let you know every time there has been an update to a course that you've bought in Early Access. You'll get an email to let you know there has been a new chapter, or a change to a previous chapter. The new chapters are automatically added to your account, so you can also check back there any time you're ready and download or read them online.

I am a Packt subscriber, do I get Early Access? Chevron down icon Chevron up icon

Yes, all Early Access content is fully available through your subscription. You will need to have a paid for or active trial subscription in order to access all titles.

How is Early Access delivered? Chevron down icon Chevron up icon

Early Access is currently only available as a PDF or through our online reader. As we make changes or add new chapters, the files in your Packt account will be updated so you can download them again or view them online immediately.

How do I buy Early Access content? Chevron down icon Chevron up icon

Early Access is a way of us getting our content to you quicker, but the method of buying the Early Access course is still the same. Just find the course you want to buy, go through the check-out steps, and you’ll get a confirmation email from us with information and a link to the relevant Early Access courses.

What is Early Access? Chevron down icon Chevron up icon

Keeping up to date with the latest technology is difficult; new versions, new frameworks, new techniques. This feature gives you a head-start to our content, as it's being created. With Early Access you'll receive each chapter as it's written, and get regular updates throughout the product's development, as well as the final course as soon as it's ready.We created Early Access as a means of giving you the information you need, as soon as it's available. As we go through the process of developing a course, 99% of it can be ready but we can't publish until that last 1% falls in to place. Early Access helps to unlock the potential of our content early, to help you start your learning when you need it most. You not only get access to every chapter as it's delivered, edited, and updated, but you'll also get the finalized, DRM-free product to download in any format you want when it's published. As a member of Packt, you'll also be eligible for our exclusive offers, including a free course every day, and discounts on new and popular titles.