
What's the outcome?

The final output is a working proof of concept for an embodied AI presence that can direct attention within a shared room. Pointo combines a kiosk interface, a pointing behavior, and a small interaction library that makes the system readable without a long explanation.

Final demo moments from the prototype

Why build it?

We are at a moment where AI has genuinely moved beyond the screen. Spatial computing, ambient intelligence, AI in physical workflows: these aren’t speculative anymore. But the way we actually interact with these systems hasn’t shifted. It’s still chat boxes, voice assistants, screens. The interface layer hasn’t caught up with what the technology can now do. And that gap is most visible when the work you’re doing is physical: when you’re standing in a space, trying to navigate it, build something, or find where something is. Language consistently falls short. An AI can describe where something is, but the person always has to translate those words into physical reality themselves. The AI knows, but it cannot show. This thesis asks what becomes possible when it can.

Pointing has always been central to how we interact with computers. From the very beginning, it’s been the human doing the pointing. AI made it abstract. Endless capability, endless possibility, all of it locked inside a text box. The interface got smaller as the intelligence got larger.

Premise exploration

What's the approach?

The project moved through a few testing pivots before settling into a pointing-centered concept. Early exploration clarified what needed to be legible, what could stay abstract, and which parts of the behavior had to remain stable so the interaction felt intentional instead of random.

The SXL framing helped narrow the work around a single interaction loop: establish presence, indicate direction, and let the pointing gesture do the communicative work. From there, the testing loop was used to tighten the gesture vocabulary and remove anything that competed with the core spatial signal.

Testing pivot iterations
Iteration and correction during testing

How does it work?

Three architecture diagrams map the system from concept to implementation. The repeated pattern across those sketches is separation: sensing and control are kept distinct from the visible kiosk so the interface stays understandable even as the behavior underneath becomes more capable.

That split matters because the system is trying to do something subtle. It has to communicate where attention should go while still feeling like a single coherent presence in the room. The architecture therefore prioritizes a simple visible front end, a reliable interaction loop, and a compact set of behaviors that can be tested in sequence.
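As a rough illustration of that separation, here is a minimal Python sketch. It is not the project's actual code: the class names, the `KioskState` shape, and the "exit door" target are all hypothetical. The three phases mirror the interaction loop described earlier (establish presence, indicate direction, point). The controller side decides what happens, while the kiosk side only receives a small state object and displays it.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List


class Phase(Enum):
    PRESENCE = "presence"    # the system signals that it is attending
    DIRECTION = "direction"  # it orients toward the target
    POINT = "point"          # the pointing gesture carries the message


@dataclass(frozen=True)
class KioskState:
    """The only data the visible front end receives (hypothetical shape)."""
    phase: Phase
    label: str


class Controller:
    """Sensing and control side; it knows nothing about how the kiosk renders."""

    def __init__(self, publish: Callable[[KioskState], None]) -> None:
        self._publish = publish

    def run_loop(self, target_label: str) -> None:
        # Step through the behaviors in a fixed order so each can be tested in sequence.
        for phase in (Phase.PRESENCE, Phase.DIRECTION, Phase.POINT):
            self._publish(KioskState(phase=phase, label=target_label))


class Kiosk:
    """Visible front end; it only records what it is told to show."""

    def __init__(self) -> None:
        self.history: List[KioskState] = []

    def render(self, state: KioskState) -> None:
        self.history.append(state)


# Hypothetical usage: the kiosk is wired in as a plain callback.
kiosk = Kiosk()
Controller(publish=kiosk.render).run_loop("exit door")
```

Because the kiosk is passed in as a plain callback, the behavior underneath could grow more capable without the front end needing to change, which is the property the diagrams emphasize.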

What can it do?

The prototype can situate itself in context, load a kiosk-style interface, and express a small set of gestures that make its intent legible. The kiosk became the place where the interaction gets grounded: it gives the pointing behavior a visible center of gravity and turns the AI from a floating idea into something people can read in space.

Pointo kiosk interface
Kiosk interface preview
Gesture library 1
Gesture library 2

How is it built?

The build materials show the physical and interaction sketching that shaped the final behavior. Pointo was developed through low-fidelity diagrams, browser-based prototype work, and repeated refinement of the pointing behavior until the system felt stable enough to test in a shared setting.

PointoBot is built on the open-source Emoto v2 hardware foundation by Lucas, Gautam and Marisa. I adapted that platform for this thesis by replacing the phone mount with a laser pointer and shaping a custom gesture library for spatial communication in a shared room.
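To make the pointing step concrete, here is a minimal sketch of how a target position in the room could be converted into aiming angles for a laser pointer. It assumes a two-axis pan/tilt mount and a coordinate frame with x forward, y left, and z up, measured in metres. Neither assumption is confirmed by the Emoto v2 platform or by this project, and `aim_angles` is a hypothetical helper rather than part of any existing library.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]


def aim_angles(target: Vec3, origin: Vec3 = (0.0, 0.0, 0.0)) -> Tuple[float, float]:
    """Return (pan, tilt) in degrees needed to aim from origin at target.

    Assumed frame (hypothetical): x forward, y left, z up, in metres.
    Pan is rotation about the vertical axis; tilt is elevation above horizontal.
    """
    dx = target[0] - origin[0]
    dy = target[1] - origin[1]
    dz = target[2] - origin[2]
    # atan2 handles all quadrants, including targets behind the device.
    pan = math.degrees(math.atan2(dy, dx))
    # Elevation is measured against the horizontal distance to the target.
    tilt = math.degrees(math.atan2(dz, math.hypot(dx, dy)))
    return pan, tilt
```

Under these assumptions, a target one metre ahead and one metre to the left at the same height gives a pan of about 45 degrees and a tilt of 0. A real build would also need calibration between the mount and the room, which this sketch leaves out.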