March 17, 2026 — Artificial intelligence systems are gaining the ability to remember text, but a critical frontier remains: visual memory. Startup Memories.ai is building what it calls the “visual memory layer” to enable AI in wearables and robots to recall what they see, a capability its founders argue is essential for success in the physical world.
Nvidia Partnership Fuels Development
Memories.ai announced a collaboration with Nvidia at the chipmaker’s GTC conference. The partnership provides the startup with access to Nvidia’s Cosmos Reason 2, a reasoning vision language model, and the Nvidia Metropolis application framework for video search and summarization.
CEO Shawn Shen told TechCrunch the technology is foundational. “AI is already doing really well in the digital world, what about the physical world?” Shen said. “AI wearables, robotics need memories as well. Ultimately, you need AI to have visual memories. We believe in that future.”
The company’s approach addresses a gap in current AI memory development. While OpenAI, Google, and xAI have added memory features to their chatbots in recent years, those tools primarily handle text. Visual data, which is less structured and more complex to index, is crucial for devices that interact with the world through cameras and sensors.
From Meta Glasses to a New Venture
The idea for Memories.ai originated while Shen and co-founder Ben Zhou were building the AI system behind Meta’s Ray-Ban smart glasses. They realized a key problem: users needed a way to recall the video data the glasses recorded.
After finding no existing solutions, they spun out of Meta to launch Memories.ai in 2024. The startup has raised $16 million to date through seed funding led by Susa Ventures, with participation from Seedcamp, Fusion Fund, and Crane Venture Partners.
Building the visual memory layer required solving two core challenges. The first was creating infrastructure to embed and index video into a format that can be stored and recalled on demand. The second was collecting the right video data to train the company’s models.
Hardware for Data, Software for Memory
To gather training data, the company built a custom hardware device called LUCI, worn by data collectors to record video. Shen emphasized that Memories.ai does not plan to become a hardware company. It created LUCI because commercial video recorders prioritized high-definition formats that drained battery life, making them unsuitable for the startup’s needs.
On the software side, Memories.ai launched its large visual memory model (LVMM) in July 2025. Shen compared its function to a smaller, specialized version of Google’s multimodal Gemini Embedding 2 model. The company released a second-generation LVMM and has signed a partnership with Qualcomm to run its models on Qualcomm processors.
Memories.ai is already working with several major wearable companies, though Shen declined to name them. For now, the startup is focusing on the model and infrastructure layer, anticipating broader market adoption in the future. “We think the wearables and robotics market will come, but it’s probably just not now,” Shen said.
The development signals a growing focus on equipping AI with contextual, sensory memory. As Nvidia and other chipmakers advance hardware for edge AI, startups like Memories.ai are building the software layers to make those systems more useful and autonomous in real-world environments.
This article was produced with AI assistance and reviewed by our editorial team for accuracy and quality.
