Simply put, AR is the real-time combination of digital data with real-world human sensory input, where the data appears to be attached (registered) to the physical space.
AR is most often associated with visual augmentation, where computer graphics are combined with actual world imagery. Using a mobile device, such as a smartphone or tablet, AR combines graphics with video. We refer to this as handheld video see-through. The following is an image of the Pokémon Go game that brought AR to the general public in 2016:
AR is not really new; it has been explored in research labs, the military, and other industries since the 1990s. Software toolkits for desktop PCs have been available as both open source and proprietary platforms since the late 1990s. The proliferation of smartphones and tablets has accelerated industrial and consumer interest in AR. And certainly, opportunities for handheld AR have not yet reached their full potential, with Apple only recently entering the fray with its release of ARKit for iOS in June 2017 and Google's release of ARCore SDK for Android in August 2017.
Much of today's interest and excitement for AR is moving toward wearable eyewear AR with optical see-through tracking. These sophisticated devices, such as Microsoft HoloLens and Metavision's Meta headsets, along with yet-to-be-revealed (as of this writing) devices from Magic Leap and others, use depth sensors to scan and model your environment and then register computer graphics to the real-world space. The following is a depiction of a HoloLens device used in a classroom:
However, AR doesn't necessarily need to be visual. Consider a blind person using computer-generated auditory feedback to help guide them around physical obstacles. Even for a sighted person, a system that augments the perception of real-world surroundings with auditory assistance can be very useful. Conversely, consider a deaf person using an AR device that listens to the sounds and words around them and displays them visually.
Also, consider tactile displays as augmented reality for touch. A simple example is the Apple Watch with a mapping app that taps you on the wrist with haptic vibrations to remind you it's time to turn at the next intersection. Bionics is another example of this. It's not hard to consider the current advances in prosthetics for amputees as AR for the body, augmenting the kinesthetic perception of body position and movement.
Then, there's the idea of augmenting spatial cognition and wayfinding. In 2004, researcher Udo Wachter built and wore a belt on his waist, lined with haptic vibrators (buzzers) attached every few inches. The buzzer facing north at any given moment would vibrate, letting him constantly know what direction he was facing. Udo's sense of direction improved dramatically over a period of weeks (https://www.wired.com/2007/04/esp/).
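To make the belt idea concrete, here is a minimal sketch of how software might choose which buzzer to fire from a compass heading. It is not a description of the actual device: the function name `north_buzzer`, the count of 12 evenly spaced buzzers, and the convention that buzzer 0 sits at the wearer's front with indices increasing clockwise are all assumptions made for illustration.

```python
def north_buzzer(heading_deg: float, num_buzzers: int = 12) -> int:
    """Return the index of the buzzer that currently points closest to north.

    Assumptions (hypothetical, not from the original device):
    - heading_deg is the compass heading the wearer faces (0 = north, 90 = east).
    - Buzzer 0 is at the wearer's front; indices increase clockwise around the waist.
    """
    # Relative to the wearer's front, north lies at the negated heading.
    relative_deg = (-heading_deg) % 360.0
    spacing_deg = 360.0 / num_buzzers
    # Pick the nearest buzzer and wrap around the belt.
    return round(relative_deg / spacing_deg) % num_buzzers


print(north_buzzer(0))    # facing north: the front buzzer, index 0
print(north_buzzer(90))   # facing east: north is on the left, index 9
print(north_buzzer(180))  # facing south: north is behind, index 6
```

Running this in a loop against a live compass reading would keep one buzzer active at all times, which is the constant background cue the experiment relied on.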
Can AR apply to smell or taste? I don't really know, but researchers have been exploring these possibilities as well.
"What is real? How do you define 'real'? If you're talking about what you can feel, what you can smell, and what you can taste and see, then 'real' is simply electrical signals interpreted by your brain." ~ Morpheus in The Matrix (1999)
OK, this may be getting weird and very science fictiony. (Have you read Ready Player One and Snow Crash?) But let's play along a little bit more before we get into the crux of this specific book.
According to the Merriam-Webster dictionary (https://www.merriam-webster.com), the word augment is defined as "to make greater, more numerous, larger, or more intense," and reality is defined as "the quality or state of being real." Take a moment to reflect on this. You will realize that augmented reality, at its core, is about taking what is real and making it greater, more intense, and more useful.
Apart from this literal definition, augmented reality is a technology and, more importantly, a new medium whose purpose is to improve human experiences, whether they be directed tasks, learning, communication, or entertainment. We use the word real a lot when talking about AR: real-world, real-time, realism, really cool!
As human flesh and blood, we experience the real world through our senses: eyes, ears, nose, tongue, and skin. Through the miracle of life and consciousness, our brains integrate these different types of input, giving us vivid living experiences. Using human ingenuity and invention, we have built increasingly powerful and intelligent machines (computers) that can also sense the real world, however humbly. These computers crunch data much faster and more reliably than us. AR is the technology where we allow machines to present to us a data-processed representation of the world to enhance our knowledge and understanding.
In this way, AR uses a lot of artificial intelligence (AI) technologies. One way AR crosses with AI is in the area of computer vision. Computer vision is seen as a part of AI because it utilizes techniques for pattern recognition and machine learning. AR uses computer vision to recognize targets in your field of view, whether through specific coded markers, natural feature tracking (NFT), or other techniques that recognize objects or text. Once your app recognizes a target and establishes its location and orientation in the real world, it can generate computer graphics that align with those real-world transforms, overlaid on top of the real-world imagery.
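The last step, aligning graphics with a recognized target's transform, can be sketched with plain 4x4 homogeneous matrices. This is only an illustration under stated assumptions: the helper names (`make_pose`, `compose`, `transform_point`) are hypothetical, the target pose is a made-up stand-in for what a tracking library would report, and rotation is limited to the vertical axis to keep the example short.

```python
import math
from typing import List, Sequence

Matrix = List[List[float]]


def make_pose(yaw_deg: float, tx: float, ty: float, tz: float) -> Matrix:
    """Build a 4x4 transform: rotate about the y (up) axis, then translate."""
    c = math.cos(math.radians(yaw_deg))
    s = math.sin(math.radians(yaw_deg))
    return [
        [c, 0.0, s, tx],
        [0.0, 1.0, 0.0, ty],
        [-s, 0.0, c, tz],
        [0.0, 0.0, 0.0, 1.0],
    ]


def compose(a: Matrix, b: Matrix) -> Matrix:
    """Matrix product a * b: b is applied first, then a."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transform_point(m: Matrix, p: Sequence[float]) -> List[float]:
    """Map a 3D point through a 4x4 transform."""
    x, y, z = p
    return [m[i][0] * x + m[i][1] * y + m[i][2] * z + m[i][3] for i in range(3)]


# Assumed tracker output: the target is 2 m in front of the camera, turned 90 degrees.
target_in_camera = make_pose(90.0, 0.0, 0.0, 2.0)
# Place the virtual model 10 cm above the target, in the target's own coordinates.
model_on_target = make_pose(0.0, 0.0, 0.1, 0.0)
# Chaining the two gives the model's pose relative to the camera.
model_in_camera = compose(target_in_camera, model_on_target)

# A model vertex 0.5 m along the model's local x axis, expressed in camera space.
vertex = transform_point(model_in_camera, (0.5, 0.0, 0.0))
print([round(v, 3) for v in vertex])
```

Because the model's pose is derived from the target's pose on every frame, the graphics follow the target as it or the camera moves, which is what makes them appear attached to the real object.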
However, augmented reality is not just the combining of computer data with human senses. There's more to it than that. In his acclaimed 1997 research report, A Survey of Augmented Reality (http://www.cs.unc.edu/~azuma/ARpresence.pdf), Ronald Azuma proposed that AR meet the following characteristics:
- Combines real and virtual
- Interactive in real time
- Registered in 3D
AR is experienced in real time, not pre-recorded. Cinematic special effects, for example, that combine real action with computer graphics do not count as AR.
Also, the computer-generated display must be registered to the real 3D world. 2D overlays do not count as AR. By this definition, various head-up displays, such as in Iron Man or even Google Glass, are not AR. In AR, the app is aware of its 3D surroundings and graphics are registered to that space. From the user's point of view, AR graphics could actually be real objects physically sharing the space around them.
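The difference between a 2D overlay and 3D registration can be illustrated with a simple pinhole camera projection. The following is a sketch, not how any particular AR SDK works: the `project` function, the focal length of 800 pixels, the 640x480 image center, and the rotation-free camera looking down the +z axis are all assumptions chosen to keep the arithmetic easy to follow.

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]


def project(point_world: Vec3, camera_pos: Vec3,
            focal_px: float = 800.0,
            center: Tuple[float, float] = (320.0, 240.0)) -> Tuple[float, float]:
    """Project a world point to pixel coordinates.

    Assumes a pinhole camera at camera_pos, looking down +z, with no rotation.
    """
    x = point_world[0] - camera_pos[0]
    y = point_world[1] - camera_pos[1]
    z = point_world[2] - camera_pos[2]
    if z <= 0:
        raise ValueError("point is behind the camera")
    u = center[0] + focal_px * x / z
    v = center[1] - focal_px * y / z  # image rows grow downward
    return (u, v)


anchor = (0.0, 0.0, 2.0)     # a virtual object fixed 2 m ahead in the world
hud_pixel = (320.0, 240.0)   # a 2D overlay always drawn at the same screen position

before = project(anchor, (0.0, 0.0, 0.0))
after = project(anchor, (0.5, 0.0, 0.0))  # the camera steps 0.5 m to the right

print(before)  # (320.0, 240.0)
print(after)   # (120.0, 240.0)
```

When the camera moves, the registered object shifts on screen exactly as a real object at that spot would, while the overlay at `hud_pixel` stays put. That screen-locked behavior is why a head-up display does not meet the third characteristic.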
Throughout this book, we will emphasize these three characteristics of AR. Later in this chapter, we will explore the technologies that enable this fantastic combination of real and virtual, real-time interactions, and registration in 3D.
As wonderful as this AR future may seem, before moving on, it would be remiss not to highlight the alternative, possibly dystopian future of augmented reality! If you haven't seen it yet, we strongly recommend watching the Hyper-Reality video produced by artist Keiichi Matsuda (https://vimeo.com/166807261). This depiction of an incredible, frightening, yet very plausible future infected with AR, as the artist explains, "presents a provocative and kaleidoscopic new vision of the future, where physical and virtual realities have merged, and the city is saturated in media." But let's not worry about that right now. A screenshot of the video is as follows: