San Francisco: Google is considering creating a bird’s-eye view of users’ lives using their phone data, photos and searches, and then applying AI technology, the media reported.
Titled “Project Ellmann,” the idea is to use large language models (LLMs) like its latest Gemini model to analyse search results, find patterns in a user’s photos, create a chatbot and “answer previously impossible questions,” reports CNBC.
“Project Ellmann is just one of many ways Google is proposing to create or improve its products with AI technology,” the report mentioned.
Google Photos has more than 1 billion users and 4 trillion photos and videos.
“Project Ellmann” could pull in context using biographies, previous moments and subsequent photos to describe a user’s photos more deeply than “just pixels with labels and metadata”.
“We can’t answer tough questions or tell good stories without a bird’s-eye view of your life,” according to the project.
“We trawl through your photos, looking at their tags and locations to identify a meaningful moment. When we step back and understand your life in its entirety, your overarching story becomes clear,” according to the project presentation.
“With ‘Ellmann Chat,’ imagine opening ChatGPT but it already knows everything about your life. What would you ask it?”
A Google spokesperson said that Google Photos has always used AI to help people search their photos and videos, and “we’re excited about the potential of LLMs to unlock even more helpful experiences”.
“This was an early internal exploration and, as always, should we decide to roll out new features, we would take the time needed to ensure they were helpful to people, and designed to protect users’ privacy and safety as our top priority,” the spokesperson added.
Google this week introduced Gemini, its most capable and general model yet with state-of-the-art performance across many leading benchmarks.
In the coming months, Gemini will be available in more of Google’s products and services like Search, Ads, Chrome and Duet AI.
According to Google, Gemini 1.0’s sophisticated multimodal reasoning capabilities can help make sense of complex written and visual information.
–IANS