Defining Augmented Reality
According to the Merriam-Webster dictionary, the word augment means "to make greater, more numerous, larger, or more intense," while reality is defined as "the quality or state of being real." Considering this, we can say that "augmented reality" is about using digital content to enhance the real world, adding information, understanding, and value to our experiences.
Augmented reality is most commonly associated with visual augmentation, where computer-generated graphics are combined with actual real-world visuals. When using a handheld mobile phone or tablet, for instance, AR combines graphics with the on-screen video (I call this video see-through AR). Using wearable AR glasses, graphics are directly added to your visual field (optical see-through AR).
But AR is not simply a computer graphics overlay. In his acclaimed 1997 research report, A Survey of Augmented Reality (http://www.cs.unc.edu/~azuma/ARpresence.pdf), Ronald Azuma proposed that AR must have the following characteristics:
- Combines the real and virtual: Virtual objects are perceived as if they were real objects sharing the physical space around you.
- Interactive in real time: AR is experienced in real time, not pre-recorded. For example, cinematic special effects that combine real action with computer graphics do not count as AR.
- Registered in 3D: The graphics must be registered to real-world 3D locations. For example, a heads-up display (HUD), where information is simply overlaid on the visual field, is not AR.
To register a virtual object in 3D, the AR device must be able to track its own location in 3D space and map the surrounding environment so that it can place objects in the scene. There are multiple technologies and techniques for position and orientation tracking (together referred to as pose tracking), as well as for environmental feature detection, including the following:
- Geolocation: GPS provides low-resolution tracking of your location on the Earth (GPS accuracy is measured in feet or meters). This is usually good enough for wayfinding in a city and identifying nearby businesses, for example, but not for more specific positioning.
- Image Tracking: Images from the device's camera can be matched against predefined or runtime-provided 2D images, such as QR code markers, game cards, or product packaging, to display AR graphics that track the image's pose (3D position and orientation) relative to the camera.
- Motion Tracking: Using the device's camera and other sensors (including an inertial measurement unit, or IMU), the device can compute its position and orientation in 3D and detect visually distinct features in the environment. Academically, you may see this referred to as Simultaneous Localization and Mapping (SLAM).
- Environmental Understanding: As features are detected in the environment, such as depth points at X-Y-Z locations, they can be clustered to identify horizontal and vertical planes, as well as other shapes in 3D. These can be used by your application for object placement and interaction with real-world objects (see the plane detection sketch after this list).
- Face and Object Tracking: Augmented selfie pictures use the camera to detect faces and fit a 3D mesh that can be used to add a face mask or other (often humorous) enhancements to your image. Likewise, objects of other shapes can be recognized and tracked, as may be required for industrial applications.
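To make the environmental understanding idea more concrete, here is a minimal sketch of how an AR Foundation application can listen for newly detected planes using the ARPlaneManager component and its planesChanged event. The PlaneLogger class name is ours, for illustration only, and the sketch assumes an ARPlaneManager already exists in your scene; we will build real scenes like this in later chapters.

```csharp
using UnityEngine;
using UnityEngine.XR.ARFoundation;

// Logs each new plane reported by AR Foundation's plane detection.
// Assumes an ARPlaneManager component exists in the scene and is
// assigned to the planeManager field in the Inspector.
public class PlaneLogger : MonoBehaviour
{
    [SerializeField] ARPlaneManager planeManager;

    void OnEnable()  => planeManager.planesChanged += OnPlanesChanged;
    void OnDisable() => planeManager.planesChanged -= OnPlanesChanged;

    void OnPlanesChanged(ARPlanesChangedEventArgs args)
    {
        foreach (ARPlane plane in args.added)
        {
            // plane.alignment reports horizontal or vertical;
            // plane.center is the plane's position in world space.
            Debug.Log($"Detected {plane.alignment} plane at {plane.center}");
        }
    }
}
```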
In this book, we will use many of these techniques in real projects with Unity's AR Foundation toolkit so that you can learn how to build a wide variety of AR applications. We will also learn about many other details and capabilities offered by Unity and AR software, all of which we'll use to improve the quality and realism of your graphics and to provide engaging interactive experiences for your users.
Like all technologies, AR can potentially be used for better or for worse. A striking depiction of a disturbing hypothetical future, where AR is ubiquitous and as consuming as today's mobile media technologies, can be found in the 2016 Hyper-Reality art video by Keiichi Matsuda (http://hyper-reality.co/). Hopefully, you can help build a better future!
In this book, we are using the Unity 3D game engine for development (https://unity.com/), as well as the AR Foundation toolkit package. AR Foundation provides a device-independent SDK on top of the device-specific system features provided by Google ARCore, Apple ARKit, Microsoft HoloLens, Magic Leap, and others. For further reading and to get a good introduction to mobile handheld augmented reality, check out the following links:
- ARCore Fundamental Concepts: https://developers.google.com/ar/discover/concepts.
- Introducing ARKit: https://developer.apple.com/augmented-reality/arkit/.
- Getting Started with AR Development in Unity: https://developers.google.com/ar/discover/concepts.
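As a small preview of that device-independent API, the following sketch (the ARAvailabilityCheck class name is ours, for illustration only) uses AR Foundation's ARSession.CheckAvailability method to ask whether the current device, via ARCore, ARKit, or another provider, supports AR at all:

```csharp
using System.Collections;
using UnityEngine;
using UnityEngine.XR.ARFoundation;

// Checks whether the device supports AR before enabling the AR session.
public class ARAvailabilityCheck : MonoBehaviour
{
    IEnumerator Start()
    {
        // Queries the platform-specific provider (ARCore, ARKit, and so on)
        // through AR Foundation's common API.
        yield return ARSession.CheckAvailability();

        if (ARSession.state == ARSessionState.Unsupported)
        {
            Debug.Log("AR is not supported on this device.");
        }
        else
        {
            Debug.Log("AR is supported on this device.");
        }
    }
}
```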
Let's start developing AR applications with Unity. First, you'll need to install Unity on your development computer.