What's missing from the article is that creating a model from a few pictures is not that hard (well, doing it well is, but hear me out).
The difficult part is animating it realistically with the sensors you have, in real time.
Extracting signal from eye-gaze cameras with a slightly wider field of view, in a way that allows realistic rather than uncanny-valley animation, is quite hard to do on the general public. People's faces are all different sizes and shapes, to the point that even getting accurate gaze vectors is hard, let alone smile and cheek position (those are done with different cameras, not just the eye-gaze ones).
https://www.youtube.com/live/ucRukZM0d1s?t=1h1m50s
https://zju3dv.github.io/freetimegs/
The videos can be played back in real-time, though they require multiple cameras to capture.
The previous beta ones were terrifying Frankenstein monsters. The new ones fooled my boss for 30 minutes.
There's still a bit of uncanny valley left, though. My persona's smile reminds me of the horrible expressions people like to make in Source Filmmaker.
It feels a bit like the original Segway’s over-engineered solution versus cheap Chinese hoverboards, then the scooters and e-bikes that took over afterwards.
Why would I be paying all this money for this realistic telepresence when my shitbox HP laptop from Walmart has a perfectly serviceable webcam?
Like, guize, c'mon. Virtual Desktop can do three. For $3.5k you gotta do better. I don't particularly need a virtual me in space as much as I need more screens that can do, like, actual work.
Even so, latency in Zoom kind of becomes an attribute of the medium and you learn to adapt. How does it feel with the Vision Pro, though? The article talks about a really convincing sense of being in the same place with someone; how does latency affect that? (And does it differ based on whether you're all physically in Silicon Valley or not?)
Just in time for Vision Pro to go big. Right?
1. The scanning is fast; it takes longer to set up a fingerprint on a MacBook Air. It's just turning the head from side to side, then up and down, smiling, and raising one's eyebrows.
2. I used the M5, and the processing time to generate the persona was quick. I didn't time it, but it felt like less than 10 seconds.
3. My cheeks tend to restrict smiling while wearing the headset. It works, but people who know me understood what I meant when I said my smile was hindered.
4. Despite the limited actions used for setup, it reproduces a far greater range of facial movements. For example, if I do the invisible-string trick, it captures my lips correctly (that's when you move the top lip in one direction and the lower lip in the opposite direction, as if pulled by a string).
5. I wasn't expecting this big of a jump in quality from the v1.
Perhaps the way their heads and eyes move with this weird "fluid" effect, and the overly blurred faces?
One-way latency on the Internet across fiber is about 4μs to 5μs per kilometer in my experience.
For example, SF to Paris is ~40ms one way (it used to be 60ms 15 years ago; latency and jitter have really improved).
Double those values for the round trip that lets you interject in a conversation.
Add Wi-Fi, which has terrible latency with a lot of jitter (1ms to 400ms of jitter is not uncommon). Wi-Fi 7 should reduce the jitter and latency in theory; we shall see improvements in the coming decade. 5G did improve cellular latency for me, so I don't doubt Wi-Fi will eventually deliver.
In other words, you need to be within 3,000 km (3 Mm) to get a chance at a 30ms round trip. And that's assuming peer-to-peer, without Wi-Fi or slow devices.
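A back-of-the-envelope version of that arithmetic (Python; the 5μs/km figure is from above, and the ~9,000 km SF-Paris fiber route is my assumption):

    # One-way fiber propagation: light in glass covers ~200,000 km/s,
    # i.e. roughly 5 microseconds per kilometer of route.
    US_PER_KM = 5.0

    def one_way_ms(route_km: float) -> float:
        # Propagation delay only; ignores routers, serialization, etc.
        return route_km * US_PER_KM / 1000.0

    def round_trip_ms(route_km: float) -> float:
        return 2 * one_way_ms(route_km)

    print(one_way_ms(9000))     # ~45 ms, in the ballpark of the ~40 ms above
    print(round_trip_ms(3000))  # 30 ms, the peer-to-peer budget above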
For a conference call, everybody connects to a central server acting as the relay. So now the latency budget is halved already.
But you've still got all the network latency including Wi-Fi latency on both ends. And you always need a small audio buffer so discrete network packets can be assembled into continuous audio without gaps.
So I wouldn't expect this latency to be any different from regular videoconferencing.
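To make the buffering point concrete, here's a toy de-jitter buffer in Python (a deliberate simplification with made-up packet framing; real stacks like WebRTC adapt the playout delay dynamically):

    class JitterBuffer:
        # Toy de-jitter buffer: hold packets for `depth` packet intervals
        # before playout so late arrivals can still be used, and substitute
        # silence for packets that never show up.
        def __init__(self, depth: int = 3, frame_bytes: int = 320):
            self.depth = depth             # added latency, in packet intervals
            self.frame_bytes = frame_bytes # payload size of one audio frame
            self.packets = {}              # sequence number -> audio payload
            self.next_seq = None           # next sequence number to play out
            self.started = False           # done pre-buffering?

        def push(self, seq, payload):
            # Called when a packet arrives from the network, in any order.
            if self.next_seq is None or (not self.started and seq < self.next_seq):
                self.next_seq = seq        # lock onto the earliest packet seen
            self.packets[seq] = payload

        def pop(self):
            # Called once per packet interval by the audio clock.
            if not self.started:
                if len(self.packets) < self.depth:
                    return b"\x00" * self.frame_bytes  # still filling: silence
                self.started = True
            payload = self.packets.pop(self.next_seq, b"\x00" * self.frame_bytes)
            self.next_seq += 1
            return payload

    jb = JitterBuffer(depth=2)
    jb.push(11, b"frame-11")
    jb.push(10, b"frame-10")   # arrived late and out of order
    print(jb.pop())            # b'frame-10': reordered correctly
    print(jb.pop())            # b'frame-11'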
To some degree, but not fully. When you adapt, your brain is still doing extra work to compensate, similar to how you don't «hear» jet engine noise after acclimating to an airplane, yet it still tires you out.
I had Zoom and Teams meetings daily during Covid, and personal FaceTime calls almost daily for a while. I still get «Zoom fatigue» if a call goes on for over an hour and I need to talk face to face during the call (i.e., no screen sharing, can't disable video and look at something else, etc.). I'm fine if I look at people's screen shares rather than their faces.
While I have my browser configured to prefer Dutch, with English second, I wish I could tell it/them not to translate anything that's already in one of those languages.
Once you're already in VR, it's nice to not have to break out for a meeting, and that's where Personas fit in.
It's not a killer app carrying the product; it's a necessary feature making sure there's no gap in the workflow.
web browsing without CAPTCHAs, Anubis, bot tests, etc. (a "human only" internet, maybe like Berners-Lee's "semantic web" idea [1][2])
Non “anonymized”:
non-jury court and arbitration appearances (with expansion of judges to clear backlogs [3])
medical checkups and social care (e.g., neurocognitive checkups for the elderly, social-services check-ins, especially for children, check-ins for depressed or isolated people needing off-work social interaction, etc.)
bureaucratic appointments (customer service by humans, DMV, building permits, licenses, etc.)
web browsing for routine tasks without logins (banks, email, etc)
[1] <https://www.newyorker.com/magazine/2025/10/06/tim-berners-le...> [2] <https://newtfire.org/courses/introDH/BrnrsLeeIntrnt-Lucas-Nw...> [3] <https://nysfocus.com/2025/05/30/uncap-justice-act-new-york-c...>
I prefer working in my VP and see a possible world where the VP lets my remote team collaborate as if we were in the office, from the comfort of the most ergonomic location in my house.
It solves this problem, and 0.0001% of people are dorks like me who try it and say "they did it" while the rest of the world keeps going to work as before.
All of the tech problems were solvable. People simply don't want to put a thing on their face, and I think that's unsolvable.
Thank you! Now I get it!
So it’s sort of a stopgap solution before AR glasses are small enough to do actual video calls without looking silly?
The Vision Pro’s overall productivity solution is inferior to existing, cheaper technology, and it has to be supplemented by a solution to a problem created by its own design.
Essentially you’re saying that after putting on a double headband device that wrecks my hair, gets me sweaty, strains my neck with weight, and fucks up my makeup, I now have to use a workaround fake avatar because the tech bros who made this product had to say “oh shit, if you have a headset on you can’t be on camera!”
For $3500 I can be in real reality and be surrounded by higher resolution professional monitors and just show my real self on camera instead.
Human-only Internet: why choose this implementation over something simpler? Surely there’s a simpler way to prove you’re human that doesn’t involve 3D avatar construction on a head-worn display that screws up your hair and makeup. [1] E.g., an Apple Watch-like device can verify you have a real pulse and oxygen in your blood.
Court: solution is already in place, which is showing up to a physical courtroom. Clearing backlogs can be done without a technological solution, it’s more of a process and staffing problem. Moving the judges from a court to a home office doesn’t magically make them clear cases faster.
Medical checkups: phone selfie camera
Bureaucratic appointments: solution in place, physical building, or many of these offer virtual appointments already over a webcam.
Web browsing without logins: passkeys, FaceID, fingerprint
[1] yet another male-designed tech bro product that never considered the concerns of the majority of the population.
To me it would be a shortcoming of the device if I couldn't show me and the thing I'm working on at the same time.
Why do we have 4K monitors when 1920x1080 is perfectly fine for 99.999% of use cases?
If you look at the world through this lens called "serviceability" you'll think everything is a solution looking for a problem.
I think overall it probably remains a niche category. I don't see it becoming as popular as smart watches or anything like that. I do hope that Apple continues to invest in it though as it is a really cool technology.
Some people frequently want to do that sort of work while away from their desk.
The “when you’re using the headset” part is the issue. Why are we using the headset? What are the benefits? Why am I making these tradeoffs like messing up my hair, putting a heavy device on my head, messing up my makeup, etc.
This is like saying “The Segway had advanced self-leveling to solve the problem of how to balance when you’re on an upright two wheel device”.
But why are you on an upright two wheel device? Why not just add a third wheel? Why not ride a bicycle? Why not ride a scooter?
The solution is really cool and technologically advanced but it doesn’t actually solve anything besides an artificially introduced problem.
A lot of people here work with text all day every day and we would rather work with text that looks like it came out of a laser printer than out of a fax machine.
There's regular latency due to distance, just like on a phone call if you're chatting with someone halfway across the world.
But on a normal connection, audio and the persona should always be in sync, the same way audio and video are over Zoom or FaceTime.
There shouldn't be any extra latency for the audio only.
Court: disagree in part. More judges are needed to address the severe backlogs, but as an example, NYS judges oppose expansion (see [3] from the previous post). A lot of calendar time is spent appearing before judges around a city (they're not all in one place) for motion hearings and the like, despite all documents being submitted electronically. There are also frequent reschedulings when one party can't appear physically. Some state judges allow teleconferencing, but a lot don't; appellate and federal courts rarely do.
Checkups and social services: some secure way of monitoring client interactions and outcomes is needed. In Los Angeles, the homeless-services agency has been criticized by a federal judge for incompetence [1], and more than half of the child prostitutes in a notorious corridor were found to be "missing" from the foster system [2]. Maybe headsets are not the best answer, but government agencies and social-service NGOs need to record evidence of their efforts for accountability.
[1] <https://www.latimes.com/california/story/2025-03-31/los-ange...> [2] <https://www.nytimes.com/2025/10/26/magazine/sex-trafficking-...>
As others said, resolution is not everything. DPI and panel quality matter a lot.
A good lower-resolution panel is better than a lower-quality larger panel. Uniformity, backlight color, color-rendering quality, DPI... all of them matter.
--
This comment has been written on a 28" 1440p monitor.
VR/AR headsets are useful for working on and demonstrating many things that we've had to compromise to fit into a 2D paradigm. Being able to be present with that 3D model has clear advantages over using, for example, a mouse with a 2D equivalent or a 3D projection.
Having to justify how the 3rd dimension is useful is probably a conversation where one party is not engaging in good faith.
The Segway analogy is also pretty poor considering how useful self-balancing mobility devices have proven to be, including those with only a single wheel.
Because it's not. Facial expressions and body language carry gigantic amounts of information.
So many misunderstandings arise when the channel is audio-only. E.g. if a majority of people in a meeting are uneasy with something, they can see it on each others' faces, realize they're not alone, and bring it up. When it's audio-only, everyone thinks they're the only one with concerns and so maybe it's not worth interrupting what they incorrectly assume to be the general consensus over audio.
On the other hand, video calls are worse and less comfortable than audio calls.
Why do we have video calls? Because a webcam costs $1-5 to put into a laptop and bandwidth is close enough to free.
Why do we have 4K monitors? Because they only cost a small amount more than 1080p monitors and make the image sharper with not a whole lot of downsides (you can even bump them down to 1080p if you have a difficult time driving the resolution). I paid $400 for my 4K 150Hz gaming monitor so going with 1080p high refresh rate VRR would have only saved me $200 or so.
Serviceability for purpose is a spectrum and the Vision Pro is at the wrong end of it.
For more than the price of three 4K OLED 144Hz monitors, you get to don a heavy headset that messes up your hair, makes you sweaty, screws up your makeup, and you get less resolution and workspace than the monitors. Your battery lasts an hour so it’s inferior to a laptop with an external portable monitor or two. It’s actually harder to fit into a backpack than a laptop plus portable monitors since it’s not flat.
Then you have to use some complicated proprietary technology [1] to make a 3D avatar of yourself to overcome the fact that you now have a giant headset on your head and look like an idiot if you were to go on camera.
You can’t do a bunch of PC stuff on it because it’s basically running iPadOS.
This is not the same as “why are we bothering with 4K?”
[1] What will you do if Apple starts charging money for this feature?
They could never cut the price down because of it. The knockoffs used much simpler ways to balance yourself, including just changing the form factor to something more conventional that doesn’t even need balance correction (scooters and e-bikes).
If I'm doing work at my desk and I get a Zoom call, there is a 0.00% chance I will go plug in my Vision Pro to answer it. I'm just going to open the app and turn on my webcam, spatial audio be damned.
wtf apple, indeed.
For some reason people then blame their old displays rather than Apple for this.
By most accounts the Vision Pro hasn’t even cracked a million sales. And that’s the best productivity-focused headset on the market.
You can say that this is a really amazing paradigm shift, but if it were, people would be lining up to buy it.
The first part is obvious. For the second part: if you're looking at slides and docs during the whole meeting, getting a super-high-fidelity view of all the other participants (probably) also looking at the slides doesn't help in any way.
I mean, Google Meet has a spotlight view exactly for this reason.
I often think about how stupid video call meetings are. Teams video calls are one of the few things that make every computer I own, including my M1 MBP, run the fans at full tilt. I've had my phone give me overheat warnings from showing the tile board of bored faces staring blankly at me. And yeah, honestly, it feels like a solution looking for a problem. I understand that it's not, and that some people are obsessed for various reasons (some more legitimate than others) with recreating the conference room vibe, but still.
And with monitors? This is a far more "spicy" take, but I think 1280x1024 is actually fine. Even 1024x768. Now, I have a 4K monitor at home, so don't get me wrong: I like my high DPI monitor.
But I think past 1024x768, the actual productivity gains from higher resolutions begin to dwindle rapidly. 1920x1080, especially on "small" displays (under 20 inches), can look pretty visually stunning. 4K is definitely nicer, but do we really need it?
I'm not trying to get existential with this, because what do we really "need"? But I think that, objectively, computing is divided into two very broad eras. The first era, ending around the mid 2000s, was marked by year-after-year innovation where 2-4 years brought new features that solved _real problems_, as in, features that gave users new qualitative capabilities. Think 24-bit color vs 8-bit color, or 64-bit vs 32-bit (or even 32-bit vs 16-bit). Having a webcam. Having 5+ hours of battery life on a laptop, with a real backlit AMLCD display. Having more than a few gigabytes of internal storage. Having a generic peripheral bus (USB/firewire). Having PCM audio. Having 3D hardware acceleration...
I'm not prepared to vigorously defend this thesis ;-) but it seems at about 2005-ish, the PC space had reached most of these "core qualitative features". After that, everything became better and faster, quantitatively superior versions of the same thing.
And sometimes yeah, it can feel both like it's all gone to waste on ludicrously inefficient software (Teams...), and sometimes, like modern computing did become a solution in search of a problem, in order to keep selling new hardware and software.
My take is: let me tether with USB-C, and reduce resolution or increase latency if I go over what the connection can handle. Use foveated rendering. All I want is more screens.
For now, I'm working with Virtual Desktop on my Quest 3. It's not ideal: pixel density at the edges sucks, and even in the center it's not quite good enough for text unless I enlarge my screens to the size of barn doors, but I get 3 very large screens out of my M1, and that makes me happy enough. It's also lighter than an AVP, which, after a test drive, I assume would make multi-hour sessions a literal pain in the neck.
Whatever the tradeoffs are, though, if Apple offered infinite screens with text readability, I'd gladly throw money at them for the privilege.
Tinfoil hat moment - I do wonder if the AVP devs got a visit from a bat-wielding gang of monitor engineers. Apple screens ain't cheap.
One person talks about a laptop, another about their big coding desktop monitor, a third about a TV they use. None of them agree on how much clarity 1080p gives because the only thing quoted is resolution. That invites the assumption that everyone is talking about the same sizes and viewing distances, which is almost never the case (before the conversation even gets to the age-old debate of how much clarity is enough).
I'm sure if you ask the original commenter, they don't mean 1080p looks great for reading books at 34", just as the GP wouldn't mean to compare screens of different sizes.
I sometimes connect the same 24" monitor (an ASUS VZ249Q) to my M1 MacBook via USB to DP (so no intermediate electronics), and the display quality feels inferior to what I get under KDE, for example.
The same monitor allows working for hours without eye fatigue when driven from my Linux machine. I have written countless lines of code and LaTeX on that panel. It rivals the comfort of my HP EliteDisplay.
> Why would I be paying all this money for this realistic telepresence when my shitbox HP laptop from Walmart has a perfectly serviceable webcam?
I gave a pretty straightforward answer for why this feature would exist in this product. Sometimes people on this forum ask legitimate questions.
It's pretty clear you weren't asking one; rather, you're seeking an opportunity merely to push some tired agenda, likely tied to some personal vendetta, and you're doing a pretty piss-poor job of it.
Idk man, I do like seeing multiple windows at once. Browser, terminal, ...
I edit video for a tech startup. High, high, high volume. I need 2-3 27"+ 1440p screens to really feel like I've got the desktop layout I need. I'm running an NLE (which ideally has 2 monitors on its own, but I can live with 1), Slack, several browser windows with HubSpot and Trello et al., system monitoring, maybe a DAW or Audacity, several drive/file windows open, a text editor for note-taking, a PDF/email window with notes for an edit, a terminal; the list goes on.
At home I can't live without my 3440x1440 ultrawide plus a 1440p second monitor for gaming and Discord, plus whatever else I've got going. It's ridiculous, but one monitor, especially 1080p, is so confining. I had this wonderful 900p Gateway I held on to until about 2 years ago. It was basically a TV screen, which was nice but became unnecessary once I got yet another free 1080p IPS monitor from someone doing spring cleaning. I couldn't go back. It was so cramped!
This is a bit extreme, but our livestream computer has 3 monitors, plus technically a 4th: a 70" TV we use for multiview out of OBS.
I need space lol
Still, if I were to have a long-distance relationship with a tolerant partner, or one of us traveled frequently or for long periods, I would be tempted to consider these so we could watch a show or movie and hang out despite the distance.
Tinfoil pt 2: Apple might be working on a related but more broadly applicable product. Most people old enough to have presbyopia would happily give up bifocals. <https://appleworld.today/2025/07/apple-glasses-could-look-li...>
> VR/AR headsets are useful for working on and demonstrating many things
What things?
> that we've had to compromise to fit into a 2D paradigm.
What compromises?
> Being able to be present with that 3D model has clear advantages over using, for example, a mouse with a 2D equivalent or a 3D projection.
What advantages?
I think if this were even a niche representation of the future we’d see specialized companies with 3D-oriented software like Autodesk jumping all over the Vision Pro specifically, but they seem to be nowhere to be found. All the key players in the industry besides Meta have basically bailed, including Microsoft and Google shutting down commercial/industrial solutions that had previously been touted as successful.
I have no vendetta here, I just think that full immersion VR was the wrong play for productivity and general computing. I think that the full immersion VR market is dying and that solutions like Meta Ray-Bans and VITURE glasses are way more palatable because they are way more “normal,” including the way they eschew these moonshot paradigm-shifting technologies that might actually work very well, but nobody asked for.
Nobody wants to be a 3D avatar and work inside a headset where your view of the outside world is desaturated by cameras because it’s cringe and weird.
As a side note I will also point out that if you use a Vision Pro with a MacBook to use the secondary screen functionality (required for writing code or running apps outside the App Store) you’re basically doing the exact same thing as VITURE glasses except you paid 10x more and your battery life sucks. And you can just join a standard conference call on your glasses and essentially look normal.
As a Mac user, I find this arguable. Much of the color correction comes from the fact that Macs contain ICC profiles for tons of monitors. OTOH, if the monitor already has accurate color rendering out of the box (e.g., Dell UltraSharp, HP EliteDisplay), Linux (esp. KDE) has very high display quality on HiDPI monitors, too.
Buried inside Apple's $3,499 Vision Pro VR headset is a feature that continually wows me, but you've probably never heard of it. The feature, called Personas, involves two or more users, all wearing Vision Pros, chatting with one another in real time but as virtual replicas.
Now out of beta, Personas are part of Apple's avatar system for the Vision Pro, creating replicas of yourself via a 3D photo scan.
Taking a scan of myself isn't a new thing. Some five years ago, I tried telepresence with 3D-scanned avatars on Nreal AR glasses with a company called Spatial. I've gotten peeks at Meta's realistic codec avatars. I explored cartoon avatar telepresence with Microsoft in HoloLens. And I've even scanned myself into all sorts of bizarre AI deepfakes using OpenAI's Sora phone app.
Still, no one is doing anything in VR or AR headsets or glasses as advanced as Apple's Vision Personas. And we haven't seen the beginning of how good things could get.
To learn more, I donned an M5 Vision Pro headset and jumped into a FaceTime for an exclusive chat with Apple's senior director of the Vision Products Group, Jeff Norris, and the senior director of product marketing, Steve Sinclair. The two showed up as Personas in my home office. We wandered in like ghosts when the meeting started, face to face, so to speak. After a few minutes, it felt like we were actually spending time together in person.
Apple doesn't discuss the future. But Norris and Sinclair did explain some of the very cool 3D tech that makes Personas seem so realistic. As we chatted, I imagined that similar scans could be done on places other than Vision Pro, like maybe your iPhone, which would be accessible to more people.
Apple's Personas seem uncanny outside the headset, but not inside. (Image: Apple)
It's hard to find another person who has a Vision Pro, but when I have, the eerie sense of someone ghost-walking into my home is like wizardry. Apple's VisionOS has evolved to allow collaboration between Personas, flexing virtual spaces out for up to five people to see and share virtual objects and apps together. Multiple people in the same room wearing Vision headsets can collaborate with Personas that can beam in remotely, as well.
I've dreamed of that Tony Stark-like, Star Wars-hologram telepresence idea for years now. It's basically here. It's just walled into very expensive hardware.
Smart glasses haven't been able to handle the load of avatars like this yet, although AR glasses from Snap and others may be trying soon. My question for Apple is: What technology is making Personas happen, and could it ever appear anywhere else?
In our meeting, Norris explains that Persona technology uses Gaussian splatting to create those surprisingly convincing 3D facial scans. Gaussian splatting is the key tech to many 3D applications right now, often applied to scanning objects or large-scale environments. Meta's Hyperscape Capture app on Quest can scan whole rooms into 3D-walkable spaces in VR, for example. It knits a 3D image or landscape from a series of 2D images using AI.
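For the technically curious, here is a rough sketch of the core data structure, going by the published 3D Gaussian splatting research rather than anything Apple has disclosed about its own implementation:

    from dataclasses import dataclass

    @dataclass
    class Gaussian3D:
        # One of the millions of soft, translucent blobs in a splat scene.
        mean: tuple[float, float, float]             # 3D center position
        rotation: tuple[float, float, float, float]  # orientation quaternion
        scale: tuple[float, float, float]            # per-axis extent
        color: tuple[float, float, float]            # base RGB (real systems use
                                                     # spherical harmonics so color
                                                     # varies with viewing angle)
        opacity: float                               # 0..1 alpha

    # Per frame, a renderer projects each Gaussian into the camera image,
    # sorts them roughly front to back, and alpha-blends ("splats") them.
    # Training nudges every field above until renders match the input photos.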
What makes Personas unique is the focus on scanning yourself instead of your environment. Using VisionOS 26, Norris showed me the key changes from the earlier Persona versions. The renders can now show greater detail at multiple angles and capture details like jewelry and eyelashes. Bodies and faces are scanned together, which makes the render feel more seamless.
"There's machine learning involved, but not many people really realize that it's a concert of networks that come together," says Norris. "We counted them up, it's over a dozen, but we actually reduced the number when we moved to this new version of Personas."
I mentioned the possibility of scanning rooms into Vision Pro down the road (apps like Scaniverse and Polycam already show off 3D scans in headsets). Norris says Apple is already applying Gaussian splatting to the spatial 3D conversions of photos, which now look weirdly immersive in headsets. So, what's next?
Vision Pro headsets can collaborate in the same space and fold in Personas from somewhere else at the same time. You just need to have one of those headsets to participate in the spatial experience. (Image: Apple/Screenshot by Joe Maldonado/CNET)
Even though the Persona scan is done via the Vision Pro headset, which requires me to hold it up while I turn my head, it's not a process that uses the Vision Pro's sensors extensively.
"We only need a handful of images when we are enrolling your Persona," Norris tells me. "That includes a few facial expressions to help our networks understand how your face moves when you're talking and smiling. And that's it."
I wonder whether an iPhone could eventually scan a Persona, which I'd find a lot easier than using the Vision Pro. Norris doesn't answer that directly.
"It's interesting to imagine different ways of accomplishing that," he responds. "But right now, we love that it's self-contained to the device and that all the processing happens on the device. None of these images have to go anywhere in order for that to happen."
Me in my VisionOS Persona during my first demo of the new version at WWDC earlier this year. (Image: Apple)
The single Persona I scan and bond to my Apple ID on Vision Pro feels like it's designed to act as a one-to-one mapping for my virtual self. It's the closest thing Apple has to a substitute for using a camera to broadcast my actual face, which can't be done since I'm wearing a headset.
AI companies are already scanning and generating virtual versions of people in increasing numbers of deepfakes, both intentional and unintentional. OpenAI's Sora app is the most prominent example now, and uses a similar type of face-scanning tech on iPhones to generate a "Cameo" of myself I can lend out to others.
I ask Norris where the line can be drawn going forward. He makes it clear that Apple wants to clearly and securely represent a person in real time, not as a reproduction.
"We have focused Personas on that authentic representation goal," he says. "We're trying to grant what I think is a fundamental human wish, which is: 'I wish you were here.' That begins by trying to be faithful to how we appear, and how we're moving, and how we're emoting as we speak."
Right now, Apple limits you to using one Persona scan at a time, which surprises me. I'd love a variety of Scott Stein avatars in different moods or simply with different clothes. While Apple doesn't explore identity transformation via scans, I do appreciate the options for realistic glasses, and I'd love to be able to add more accessories.
"People can reenroll or just put on a different shirt and enroll again," says Norris. "I totally understand why that would be something we'd want. But we're focusing on just the one at a time right now."
I tried using scanned avatars with Nreal AR glasses back in 2020, using an app by Spatial that could use phones and headsets together. Will Apple do that too? (Image: Spatial)
I'm already thinking about more options for Personas, not just for Apple's expensive headset, but for iPhones and other devices.
What if they could be personal stand-ins on our FaceTime calls? I can already call my wife on FaceTime from Vision Pro, and she can see my 2D Persona there. She laughs at it because it feels somewhat supernatural. If Apple has already opened the doorway this much with Animoji on FaceTime, why not Personas too?
Norris insists that Personas work better in the Vision headset, which I agree with. The renderings feel more convincing, somehow. When we place ourselves in environments that are already half-composed of virtual things, these 3D-scanned identities appear more natural. But physical distance and body expressions can also happen in space. Personas can leave their box and hover around as torsos, hands and faces.
"I can tell a joke and you're gonna get it because you're gonna see my body language, and see my facial expressions that you don't see on a two-dimensional screen," says Sinclair. "Here, we're in the room together, and it feels like we really are here."
As his Persona stands next to my cluttered desk in that virtual form, I realize he's right.
Apple is already receiving feedback about this for business uses. "We're hearing about it in healthcare as well," Norris says. "Doctors who create procedures and want to train other people. They don't have to travel around the country. They can just get on a FaceTime call with their Personas."
I still see a future where iPhones, iPads, laptops and headsets all collaborate, something companies like Microsoft and Qualcomm have pointed to as a bridge between headsets and flat-screen devices. Samsung and Google are discussing those types of connecting points with Android XR, too. Apple has ARKit on iPhones and iPads, so the possibilities already exist.
Norris says that Personas outside of a headset would be missing something right now. "To get the full appreciation of the experience, you kind of have to have both the sensing capabilities and the incredible display capabilities. They really have to kind of come together to create a magic moment like this."
As Apple moves toward an expected line of smart glasses in the future, and inevitably toward more advanced iPhones and iPads, that philosophy could evolve. Personas are the start of a fundamental change in how we handle collaboration and connection.
For the moment, however, you'll never experience it unless you're inside a Vision Pro. I look forward to a time when the entry ticket into this magic telepresence world is far more affordable and better distributed, so more people can come aboard.
Right now, my Persona is mainly by itself. I'd love it if I could get some company more often.