Apple’s Vision Pro is the company’s entry into what it calls “spatial computing.” The headset combines virtual reality with interactive augmented reality, layering digital content onto the user’s surroundings. The result is meant to be a seamless blend of the physical and virtual worlds, and Apple asserts that the experience surpasses any other AR, VR, or mixed-reality headset on the market today.
The Cupertino-based company describes the Vision Pro as “the most advanced personal electronics device ever” and says it drew on more than 5,000 patents to build the experience. During WWDC 2023, the company’s top executives, including CEO Tim Cook, along with several engineers, devoted a substantial portion of the keynote to the design and engineering behind the Vision Pro, making it a highlight of the event.
Scheduled for release in early 2024, the Apple Vision Pro will cost $3,499. Apple is using the lead-up to launch to attract developers to visionOS, the new platform built for this headset and for future Apple devices. Beyond the expected strengths in immersive video and gaming, the Vision Pro offers a range of features that promise an unusual experience. Based on Apple’s demonstrations, here are the standout features of the Apple Vision Pro.
Adjustable Crown for Immersive Control
At the core of the Apple Vision Pro is an immersive three-dimensional interface shown on a micro-OLED display for each eye. These exceptionally sharp displays offer lifelike visual quality, complemented by effects such as shadows that interact with real objects in the physical environment. These effects give virtual objects a natural presence and integrate them into the user’s surroundings.
The headset also lets users choose their level of immersion with a rotating crown reminiscent of the one on the Apple Watch. Turning the crown adjusts how much of the virtual setting surrounds the user, depending on how much focus or attention the task requires.
For instance, when presenting to colleagues at work, users can dial the immersion down. When watching movies or gaming, they can turn the dial the other way to surround themselves fully with virtual settings called “Environments.”
These Environments feature dynamic imagery captured volumetrically from real locations with high-resolution 3D cameras, creating a sense of physical presence within the simulated setting. The aim is to replace the user’s immediate environment with a calmer backdrop so that they can work, watch video, or play games without distractions.
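To make the idea of a dial-controlled immersion level concrete, here is a minimal sketch in Python. It is not Apple’s implementation: the function names, the 5% step per crown click, and the per-pixel linear blend are all assumptions chosen for illustration, and the real headset may well expand the Environment spatially rather than cross-fade it.

```python
from typing import Tuple

Pixel = Tuple[int, int, int]


def immersion_from_crown(current: float, detents: int, step: float = 0.05) -> float:
    """Return the new immersion level after turning the crown.

    `detents` is positive for one direction and negative for the other.
    The 0.05 step size is an illustrative assumption.
    """
    # Clamp to [0.0, 1.0]: 0.0 is pure passthrough, 1.0 is a full Environment.
    return min(1.0, max(0.0, current + detents * step))


def blend_pixel(passthrough: Pixel, environment: Pixel, immersion: float) -> Pixel:
    """Linearly mix a passthrough pixel with an Environment pixel."""
    return tuple(
        round((1.0 - immersion) * p + immersion * e)
        for p, e in zip(passthrough, environment)
    )


if __name__ == "__main__":
    level = immersion_from_crown(0.0, detents=10)  # ten clicks up from passthrough
    print(level, blend_pixel((100, 100, 100), (200, 0, 50), level))
```

In this model, a presentation might run at a low level so colleagues stay visible, while a movie would run at the maximum.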
Seamless Eye Tracking
An intriguing aspect of the Apple Vision Pro is that users control it with their eyes, voice, and hands. The eyes play the central role in navigating the interface, essentially serving as a mouse pointer. Using infrared cameras and a ring of LEDs that project invisible light patterns onto the user’s eyes, the Vision Pro determines where the user is looking, so no head movement is needed to point at something. The accuracy comes from a calibration performed when the headset is first used.
Beyond the hardware, Apple has put considerable work into the finer details that make this eye tracking possible. Sterling Crispin, a former neurotechnology prototyping researcher at Apple, described some of these details in a tweet.
The miniature cameras not only track but also anticipate tiny eye movements, which makes it possible to predict what the user is about to click. With the help of AI, Apple extensively studied physiological and emotional responses to refine the eye-tracking features.
Apple says that eye-tracking data is isolated in a separate background process and is inaccessible to third-party apps and websites. However, the company does not say explicitly whether this data could be used to train its models as real-world usage accumulates.
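The “eyes as a mouse pointer” idea can be illustrated with a simple gaze hit test: cast a ray from the eye along the gaze direction and find the nearest window it lands on. The sketch below is hypothetical and makes strong simplifying assumptions: windows are flat rectangles facing the user on planes of constant z, the gaze origin and direction are already known, and the `Window` type and `gaze_target` function are invented for this example rather than taken from visionOS.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class Window:
    name: str
    center: Vec3   # metres; the window lies in the plane z = center[2]
    width: float
    height: float


def gaze_target(origin: Vec3, direction: Vec3,
                windows: Sequence[Window]) -> Optional[Window]:
    """Return the nearest window hit by the gaze ray, or None."""
    ox, oy, oz = origin
    dx, dy, dz = direction
    if dz == 0:
        return None  # gaze runs parallel to every window plane
    best, best_t = None, float("inf")
    for w in windows:
        cx, cy, cz = w.center
        t = (cz - oz) / dz  # ray parameter where the gaze meets the plane
        if t <= 0 or t >= best_t:
            continue  # behind the user, or farther than the current best hit
        hx, hy = ox + t * dx, oy + t * dy
        if abs(hx - cx) <= w.width / 2 and abs(hy - cy) <= w.height / 2:
            best, best_t = w, t
    return best


if __name__ == "__main__":
    windows = [Window("browser", (0.0, 0.0, -2.0), 1.0, 0.6),
               Window("notes", (1.5, 0.0, -2.0), 0.8, 0.6)]
    hit = gaze_target((0.0, 0.0, 0.0), (0.75, 0.0, -1.0), windows)
    print(hit.name if hit else "nothing")
```

A real system would also need to smooth out natural eye jitter before treating a gaze point as a selection target.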
Intuitive Hand Gestures for Interface Navigation
The Vision Pro relies on the user’s hand movements to perform actions such as clicking, scrolling, and zooming, which removes the need for a bulky handheld controller. For example, tapping the index finger against the thumb triggers a click, while a flick of the finger scrolls. The headset also supports gestures for resizing or moving open windows and for zooming in or out on media. Apple says the aim is a seamless, natural interaction with virtual objects.
This capability relies on an extensive array of sensors that capture and transmit visual data. They include the main front-facing cameras as well as downward-facing and side-facing cameras, with separate cameras for the left and right sides of the headset. This arrangement lets users rest their hands in their lap instead of holding them in the air while interacting.
While the cameras determine where the hands are, the Vision Pro uses additional sensors to improve real-time 3D depth perception, even in dim environments. These include infrared illuminators that act as floodlights, a LiDAR scanner, and a TrueDepth camera.
Apple hasn’t disclosed exactly how hand gestures will work in games, which leaves open the question of how quickly and accurately the headset can interpret hand movements against complex backgrounds.
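As a rough illustration of how a finger-to-thumb tap could be turned into a click, the sketch below measures the distance between two tracked fingertip positions and applies hysteresis so that a hand hovering near the threshold does not produce a burst of clicks. Everything here is an assumption for illustration: the `PinchDetector` class, the millimetre thresholds, and the event names are hypothetical, and Apple has not published how its recognizer works.

```python
import math
from typing import Optional, Sequence


class PinchDetector:
    """Turns fingertip distances into click events, with hysteresis."""

    def __init__(self, press_mm: float = 15.0, release_mm: float = 25.0):
        # Illustrative thresholds: the pinch starts below press_mm and
        # ends only once the fingertips separate beyond release_mm.
        self.press_mm = press_mm
        self.release_mm = release_mm
        self.pinching = False

    def update(self, thumb_tip: Sequence[float],
               index_tip: Sequence[float]) -> Optional[str]:
        """Feed one frame of fingertip positions (in mm); return an event or None."""
        distance = math.dist(thumb_tip, index_tip)
        if not self.pinching and distance <= self.press_mm:
            self.pinching = True
            return "click_down"
        if self.pinching and distance >= self.release_mm:
            self.pinching = False
            return "click_up"
        return None


if __name__ == "__main__":
    detector = PinchDetector()
    # Thumb stays at the origin while the index finger closes in and pulls away.
    for gap in (40.0, 20.0, 10.0, 20.0, 30.0):
        print(gap, detector.update((0.0, 0.0, 0.0), (gap, 0.0, 0.0)))
```

Using two thresholds instead of one is a common way to keep a gesture stable when tracking data is noisy.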
Translucent Display for Engaging Interactions
Apple’s approach with the Vision Pro centers on delivering an immersive experience without isolating the user from their surroundings. Recognizing the importance of eye contact between people, the Vision Pro uses a seemingly transparent front that shows the user’s eyes when they aren’t immersed in a virtual environment. The effect comes from a digital representation displayed on an external OLED screen, a feature Apple calls “EyeSight.” It activates automatically when someone approaches the user.
Cameras inside the eyepiece capture the user’s actual eye movements during conversations so that they can be shown on the outer display, while a live feed relayed inside the headset lets the wearer keep visual contact with other people and their immediate surroundings.
The external display also gives others a clear indication of when the Vision Pro is in use. With EyeSight, the live view of the user’s eyes becomes less defined during augmented reality experiences and is replaced by animations during more immersive virtual reality sessions. The difference is especially noticeable from a distance. The headset also lets anyone who approaches or starts a conversation appear within the user’s virtual view.
Realistic Avatars in 3D FaceTime
Apple takes a new approach to FaceTime on the Vision Pro, placing participants in virtual tiles within the user’s surroundings. Each tile can be scaled and repositioned as the user prefers. FaceTime uses the Vision Pro’s spatial audio capabilities so that each participant’s voice comes from the position of their tile. Users can also share their screens from web browsers and applications.
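The effect of a voice seeming to come from its tile can be approximated, very roughly, with ordinary stereo panning. The sketch below uses constant-power panning as a stand-in: real spatial audio relies on head tracking and far more sophisticated filtering, so this is not how the Vision Pro does it. The function names and the coordinate convention (listener at the origin, facing the negative z direction) are assumptions for illustration.

```python
import math
from typing import Tuple


def tile_azimuth(tile_x: float, tile_z: float) -> float:
    """Angle of a tile relative to straight ahead, in radians.

    The listener sits at the origin facing -z; positive angles are to the right.
    """
    return math.atan2(tile_x, -tile_z)


def stereo_gains(azimuth: float) -> Tuple[float, float]:
    """Constant-power (left, right) gains for a source at the given azimuth."""
    # Sources behind the listener are clamped to the nearest side.
    clamped = max(-math.pi / 2, min(math.pi / 2, azimuth))
    pan = (clamped + math.pi / 2) / 2  # 0 = hard left, pi/2 = hard right
    return math.cos(pan), math.sin(pan)


if __name__ == "__main__":
    for name, x, z in (("left tile", -2.0, 0.0),
                       ("center tile", 0.0, -2.0),
                       ("right tile", 2.0, 0.0)):
        left, right = stereo_gains(tile_azimuth(x, z))
        print(f"{name}: L={left:.2f} R={right:.2f}")
```

Because the squares of the two gains always sum to one, the perceived loudness stays roughly constant as a tile is dragged from one side to the other.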
However, the most striking aspect of spatial FaceTime is not how other participants appear but how the user does. Apple wants to free users from holding a phone or another device during FaceTime calls while wearing the Vision Pro. To do so, the Vision Pro generates an “authentic representation” of the user, similar to how iPhones and iPads create Memojis using TrueDepth cameras.
Using neural networks, Apple builds a lifelike persona that can be displayed during FaceTime calls and that mirrors the user’s facial expressions, eye movements, and hand gestures as captured by the headset’s sensors. Participants on iPhones, iPads, and Macs see a 2D depiction of the user’s face, while other Vision Pro users see a 3D version of the persona.