Traditionally, computers have been good at one thing, and that is mathematical logic. They're amazing at processing mathematical operations at a rate many orders of magnitude faster than any human could ever be, or will ever be, able to. However, that in itself is a huge problem, as computers have been designed in such a way that they can't work with data if we can't express the algorithm in a set of mathematical operations that actually understands that data.
Therefore, tasks that humans find simple, such as understanding natural languages, visual, and auditory information, are practically impossible for computers to perform. Why? Well, let's take a look at the sentence I shot an elephant in my pyjamas.
What does that sentence mean? Well, if you were to think about it, you'd say that it means a person, clad in his pyjamas, is taking a photograph of the elephant. However, the sentence is ambiguous; we may assume questions such as, Is the elephant wearing the pyjamas?, and Is the human hunting the elephant? There are many different ways that we could interpret this.
However, if we take into account the fact that the person mentions that this is Tom, and that Tom is a photographer, then we know that pyjamas are usually associated with humans and that elephants and animals in general don't usually wear clothes. We can then understand the sentence the way it's meant to be understood.
The contextual resolution that went behind understanding the sentence is something that comes naturally to us humans. Natural language is something we're built to be great at understanding and it's quite literally encoded within our Forkhead box Protein P2 (FOXP2) gene; it's an innate ability of ours.
There's proof that natural language is encoded within our genes, even down to the way it's structured. Even if different languages were developed from scratch by different cultures in complete isolation from one another, they have the same, very basic, underlying structure, such as nouns, verbs, and adjectives.
But there's a problem, there's a (sometimes unclear) difference between knowledge and understanding. For example, when we ride a bike, we know how to ride a bike, but we don't necessarily understand how to ride a bike. All of the balancing, the gyroscopic movement, and tracking is a very complex algorithm that our brain runs on, without even realizing it, when we ride a bike. If we were to ask someone to write all the mathematical operations that go behind riding a bike, it would be next to impossible for them to do so, unless they're a physicist. You can find out more about this algorithm, the distinction between knowledge and understanding, how the human mind adapts, and more, with this video by SmarterEveryDay on YouTube: https://www.youtube.com/watch?v=MFzDaBzBlL0.
Similarly, we know how to understand natural language, but we don't completely understand the extremely complex algorithm that goes behind understanding it.
Since we don't understand that complex algorithm, we cannot express it mathematically and, hence, computers cannot understand natural language data, until we provide them the algorithms to do so.
Similar logic applies to visual data and auditory data, or practically any other kind of information that we, as humans, are naturally good at recognizing, but are simply unable to create algorithms for.
There are also some cases in which humans and computers can't work well with data. In a majority of the cases, this would be high-diversity tabular data with many features. A great example of this kind of data is fraud detection data, in which we have lots of features, location, price, category of purchase, and time of day, just to name a few. At the same time, however, there is a lot of diversity. Someone could buy a plane ticket once a year for a vacation, but it wouldn't be a fraudulent purchase as it was made by the owner of the card with a clear intention.
Because of the high diversity, high feature count, and the fact that it's better to be safe than sorry when it comes to this kind of fraud detection, there are numerous points at which a user could get frustrated while working with this system. A real-life example is when I was trying to order an iPhone on the launch day. As this was a very rushed ordeal, I tried to add my card to Apple Pay beforehand. Since I was trying to add my card to Apple Pay with a different verification method than the default, my card provider's algorithm thought someone was committing fraud and locked down my account. Fortunately, I still ended up getting it on launch day, using another card.
In other cases, these systems end up failing altogether, especially when we employ social engineering tricks, such as connecting with other humans on a personal level and psychologically tricking them into trusting us to get into people's accounts.