When you mutate data, you change it. If data is immutable, you can't change it.
Immutability is native to some programming languages. But why is changing data bad, and how does one write software with data that never changes?
When you mutate data, you change it. If data is immutable, you can't change it.
Immutability is native to some programming languages. But why is changing data bad, and how does one write software with data that never changes?
When you first learned how to program, one of the earliest concepts that you were taught was the variable. A variable is something that's used to store a value, so of course it's important! When you create a variable and assign its initial value, it feels like you're creating new data. This is because you are creating new data.
Then, later on in your code, you assign a new value to your variable. It's a variable, so naturally it varies. This might also seem like you're creating new data, which you are; if you assign the value 5 to your variable, then that's a new value that didn't exist before. Here's the catch: you're also destroying data. What if your variable had a value of 2 before you assigned it 5?
This might not seem like a big deal. Isn't a variable supposed to store whatever value you assign to it? Yes, variables work perfectly fine that way. They do exactly what we ask them to do. So, the problem isn't strictly that variables can change their values. Rather, it's a human limitation—we can't reason our way through the huge number of variable changes that happen in our code. This leads to some scary bugs.
Imagine that you're trying to bake a loaf of bread. Nothing fancy, just a plain loaf. You don't even need to look at the recipe—it's just a handful of ingredients, and you've memorized the steps. Now suppose that one or two of your ingredients have been changed without your knowledge. You still follow the steps, and the end result still looks like bread, but then your family eats it. One family member says something tastes off, another doesn't seem to notice anything, while yet another heads straight for the washroom.
These are just some of many possible scenarios that could have played out, and the problems all stemmed from the ingredients used. In other words, because the ingredients were allowed to change, they did change. Consider the game Telephone. In this game, there is a line of people and a message is whispered to the first person in the line. The same message is whispered (allegedly) to the next person, and so on, until the message reaches the last person who then says the message out loud. The ending message is almost never the same as the starting message. Once again, the message changes because it's allowed to change.
When writing software, you're no different from the person making bread using incorrect ingredients or the Telephone game player relaying an incorrect message. The end result is a scary type of bug. If you mess up your bread, can you pinpoint exactly what went wrong? Can you identify the Telephone game players who broke the message? Your variables change because they can. You've written code to make the variables change, and sometimes everything works exactly as you want it to. But when something goes wrong, it's very difficult to figure out what went wrong.
If data isn't supposed to change, just how are we supposed to get anything done? How do we move the state of an application along from one state to the next if our data is immutable? The answer is that every operation that you perform on immutable data creates new immutable data. These are called persistent changes, because the original data is persisted. The new data that's created as a result of running the operation contains the changes. When we call an operation on this new data, it returns new data, and so on.
What are we supposed to do with the old data when we make a persistent change that results in new data? The answer is – it depends. Sometimes, you'll just replace the old data with the new data. Yes, the variable is changed, but it's replaced with an entirely new reference. This means that something that is still referencing the old data is never affected by your persistent changes.