How Does AR Work?
Updated: 18 May
Today, AR can be displayed on a variety of devices, including screens, glasses, mobile phones (iOS, Android, etc.), hands-free devices and head-mounted devices. AR relies on technologies such as depth tracking and SLAM (Simultaneous Localisation and Mapping).¹ It uses computer vision to understand the world around the user from the camera feed, and then renders digital content, including computer graphics, over real-world objects in real time.
The system detects features of the real world by applying hit testing and SLAM techniques, and uses the results to recognise the environment around the user.
Let’s take an example to understand how exactly AR works: a mobile AR game that scans a box. First, the phone’s camera captures the real world around the user. For a human, understanding the surroundings or recognising the box is easy, but for a computer it is not; a whole field of computer science, commonly known as computer vision, is dedicated to it. The raw data from the camera is processed using computer vision to recognise the box. Once the box is recognised, the game is triggered and renders the augmented 3D object, making sure that it precisely overlaps the box.
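To give a feel for how a virtual object can be made to line up with the box on screen, here is a minimal Python sketch of the last step: mapping a 3D point to a pixel. It assumes an ideal pinhole camera and that the box corner is already known in camera coordinates; the function name project_point and the intrinsic values (fx, fy, cx, cy) are illustrative assumptions, not something taken from a particular AR framework. Estimating the box position in the first place is the harder computer vision problem and is not shown.

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]
Pixel = Tuple[float, float]


def project_point(point_cam: Vec3, fx: float, fy: float, cx: float, cy: float) -> Pixel:
    """Map a 3D point in camera coordinates to pixel coordinates.

    Assumes an ideal pinhole camera: x to the right, y down, z forward
    (in metres), focal lengths fx, fy and principal point (cx, cy) in pixels.
    """
    x, y, z = point_cam
    if z <= 0:
        raise ValueError("point must be in front of the camera (z > 0)")
    # Perspective division: points further away land closer to the image centre.
    u = fx * x / z + cx
    v = fy * y / z + cy
    return (u, v)


if __name__ == "__main__":
    # Hypothetical intrinsics for a 640x480 image; real values come from calibration.
    corner = (0.25, -0.125, 0.5)  # one box corner, 0.5 m in front of the camera
    print(project_point(corner, fx=500.0, fy=500.0, cx=320.0, cy=240.0))  # (570.0, 115.0)
```

Projecting every corner of the virtual object this way, frame after frame, is what keeps the rendered object attached to the box as the phone moves.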
AR requires an understanding of the real world in terms of “semantics” and “geometry”. Semantics means what an object is, and geometry means where it is. The semantics of an image can be understood with the help of deep learning, which recognises the object irrespective of its 3D geometry. Computer vision helps recover the 3D structure from a 2D image: for a precise overlay, the 3D orientation and position of the box are estimated using computer vision. Augmented reality is mostly live, which means the whole process described above is repeated for every new frame. Most mobile phone cameras run at 30 frames per second, so the complete process needs to finish within about 33 milliseconds (1000 ms / 30) each time. Sometimes, because of intensive processing, there can be a delay of around 50 milliseconds, but our brain does not notice it. The challenge is to make this whole process fast and reliable.
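The per-frame time limit follows directly from the frame rate: one second (1000 ms) divided by the number of frames per second, which is roughly 33 ms at 30 frames per second. The short Python sketch below works this out; the helper names frame_budget_ms and fits_in_budget are made up for illustration.

```python
def frame_budget_ms(fps: float) -> float:
    """Return the time available per frame, in milliseconds."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


def fits_in_budget(processing_ms: float, fps: float) -> bool:
    """Check whether one frame's processing time fits in the frame budget."""
    return processing_ms <= frame_budget_ms(fps)


if __name__ == "__main__":
    print(round(frame_budget_ms(30), 1))  # 33.3
    print(fits_in_budget(25.0, 30))       # True: done before the next frame arrives
    print(fits_in_budget(50.0, 30))       # False: the result arrives more than a frame late
```

At 60 frames per second the same calculation leaves only about 16.7 ms per frame, which is why higher frame rates demand faster processing.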
Components of Augmented Reality
Let’s look at the different components that make this possible.
Camera and Sensors: To collect data about user interaction and the real world around the user, the system uses the live camera feed and data from sensors such as the gyroscope and accelerometer. The camera can be a special-purpose one, as in the case of HoloLens, or the ordinary camera of a smartphone.
Processing: Once real-world data has been collected, it needs to be processed in real time. For this, most AR devices act like small computers and require substantial processing capability. Like a computer, they need a CPU, GPU, RAM, flash memory, GPS, Wi-Fi, etc.
Projection: Once the data is processed, the 3D content needs to be rendered and projected in real time onto a surface for viewing. This is an area of active research across the globe.
Reflection: Reflection is used to align the image properly. Most devices use mirrors that help the human eye view and understand the rendered 3D content in correct alignment. Some devices use a double mirror, while others use an array of curved mirrors that reflect light to a camera and in turn to the human eye.
¹ R. Azuma, Y. Baillot, R. Behringer, S. Feiner, S. Julier and B. MacIntyre, “Recent advances in augmented reality,” IEEE Computer Graphics and Applications, vol. 21, no. 6, pp. 34–47, Nov.-Dec. 2001. DOI: 10.1109/38.963459