Camera-based tracking, also known as optical tracking, uses cameras placed on or around the headset to determine position and orientation using computer vision algorithms. Camera-based 3D tracking systems require a direct line of sight without occlusions; otherwise they receive incorrect data. Optical tracking can be done either with or without markers. Tracking with markers involves targets with known patterns that serve as reference points; cameras constantly seek these markers and then use various algorithms (for example, the POSIT algorithm) to extract the position of the object. Markers can be visible, such as printed QR codes, but many use infrared (IR) light that can only be picked up by cameras. Active implementations feature markers with built-in IR LEDs that can turn on and off in sync with the camera, making it easier to block out other IR lights in the tracking area. Passive implementations use retroreflectors, which reflect the IR light back towards the source with little scattering. Markerless tracking does not require any pre-placed targets, instead using the natural features of the surrounding environment to determine position and orientation.
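The role of a known marker pattern as a reference can be illustrated with a simplified pinhole-camera sketch. This is not the POSIT algorithm: it only recovers distance, and it assumes the two markers lie parallel to the image plane. The function names, the focal length in pixels, and the principal point `(cx, cy)` are all hypothetical values chosen for illustration.

```python
import math

def project_point(point_cam, focal_px, cx, cy):
    """Project a 3D point (camera frame, metres) to pixels with a pinhole model."""
    x, y, z = point_cam
    if z <= 0:
        raise ValueError("point must be in front of the camera")
    return (focal_px * x / z + cx, focal_px * y / z + cy)

def distance_from_marker_pair(pixel_a, pixel_b, real_separation_m, focal_px):
    """Estimate the depth of two markers whose real separation is known.

    Assumption: the marker pair is parallel to the image plane, so
    pixel_separation = focal_px * real_separation_m / depth.
    """
    pixel_separation = math.dist(pixel_a, pixel_b)
    if pixel_separation == 0:
        raise ValueError("markers coincide in the image")
    return focal_px * real_separation_m / pixel_separation

# Two markers 0.10 m apart, placed 2 m in front of a hypothetical camera.
left = project_point((-0.05, 0.0, 2.0), focal_px=800.0, cx=320.0, cy=240.0)
right = project_point((0.05, 0.0, 2.0), focal_px=800.0, cx=320.0, cy=240.0)
estimated_depth = distance_from_marker_pair(left, right, 0.10, 800.0)
```

A full pose estimator would use more markers and solve for rotation as well as translation; the sketch only shows why the known marker geometry makes the problem solvable.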
=== Outside-in tracking ===
In this method, cameras are placed in stationary locations in the environment to track the position of markers on the tracked device, such as a head-mounted display or controllers. Having multiple cameras allows for different views of the same markers, and this overlap allows for accurate readings of the device position. This method is the most mature, having applications not only in VR but also in motion capture technology for film. However, this solution is space-limited, needing external sensors in constant view of the device.
Pros:
• More accurate readings, can be improved by adding more cameras
• Lower latency than inside-out tracking
Cons:
• Occlusion, cameras need direct line of sight or else tracking will not work
• Necessity of outside sensors means limited play space area
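How overlapping views from multiple stationary cameras yield a marker position can be sketched as a two-ray triangulation. The sketch below assumes each camera has already converted its marker detection into a ray (origin plus direction) in a shared world frame; it then takes the midpoint of the shortest segment joining the two rays. The helper names are hypothetical, and production systems typically use more cameras and a least-squares solve.

```python
def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _along(origin, direction, t):
    return tuple(o + t * d for o, d in zip(origin, direction))

def triangulate_midpoint(origin_a, dir_a, origin_b, dir_b):
    """Return the point midway between the closest points of two rays.

    Each ray is origin + t * direction, expressed in a shared world frame.
    """
    w = _sub(origin_a, origin_b)
    a = _dot(dir_a, dir_a)
    b = _dot(dir_a, dir_b)
    c = _dot(dir_b, dir_b)
    d = _dot(dir_a, w)
    e = _dot(dir_b, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; the marker cannot be triangulated")
    t = (b * e - c * d) / denom  # parameter along ray A
    s = (a * e - b * d) / denom  # parameter along ray B
    p_a = _along(origin_a, dir_a, t)
    p_b = _along(origin_b, dir_b, s)
    return tuple((x + y) / 2.0 for x, y in zip(p_a, p_b))

# Two hypothetical cameras 2 m apart, both seeing a marker at (1, 1, 3).
marker = triangulate_midpoint((0.0, 0.0, 0.0), (1.0, 1.0, 3.0),
                              (2.0, 0.0, 0.0), (-1.0, 1.0, 3.0))
```

With noisy detections the two rays no longer intersect exactly, which is why the midpoint is used here, and why adding cameras with different viewpoints tends to improve the estimate.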
=== Inside-out tracking ===
In this method, the camera is placed on the tracked device and looks outward to determine its location in the environment. Headsets that use this technology have multiple cameras facing different directions to get views of their entire surroundings. This method can work with or without markers. The Lighthouse system used by the HTC Vive is an example that uses active markers. Each external Lighthouse module contains IR LEDs as well as a laser array that sweeps in horizontal and vertical directions; sensors on the headset and controllers detect these sweeps and use the timings to determine position. Markerless tracking, such as on the Oculus Quest, does not require anything mounted in the outside environment. It uses cameras on the headset for a process called SLAM, or simultaneous localization and mapping, in which a 3D map of the environment is generated in real time. This technology allows high-end headsets like the Microsoft HoloLens to be self-contained, but it also opens the door for cheaper mobile headsets that do not need tethering to external computers or sensors.
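The way sweep timings can be turned into position information can be sketched as follows. This is a minimal illustration, not the actual Lighthouse protocol: it assumes a sync flash marks the start of each sweep, that the laser rotates at a constant rate (`rotor_hz`, a hypothetical parameter with an assumed default), and that angles passed to `ray_from_sweeps` are already measured relative to the module's forward axis.

```python
import math

def sweep_angle(t_sync, t_hit, rotor_hz=60.0):
    """Angle in radians the laser has rotated between the sync flash and the sensor hit.

    rotor_hz is an assumed constant rotation rate, not a documented value.
    """
    return 2.0 * math.pi * rotor_hz * (t_hit - t_sync)

def ray_from_sweeps(horizontal_angle, vertical_angle):
    """Unit direction from the module to the sensor.

    Assumption: both angles are measured from the module's forward (z) axis.
    """
    x = math.tan(horizontal_angle)
    y = math.tan(vertical_angle)
    norm = math.sqrt(x * x + y * y + 1.0)
    return (x / norm, y / norm, 1.0 / norm)

# A sensor hit 1/240 s after the sync flash corresponds to a quarter turn at 60 Hz.
quarter_turn = sweep_angle(0.0, 1.0 / 240.0)
# A sensor 45 degrees to the side of the forward axis and level with the module.
direction = ray_from_sweeps(math.pi / 4.0, 0.0)
```

A single sensor only yields a direction from the module. Recovering the full position and orientation of a headset would require several sensors whose layout on the device is known, which is omitted here.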
Pros:
• Enables larger play spaces, can expand to fit room
• Adaptable to new environments
Cons:
• More on-board processing required
• Latency can be higher

== Sensor fusion ==