Dnext

May 22, 2024 2:34am

Google had a plethora of AI announcements at Google I/O.

Gemini Advanced subscribers now have access to the newest model, Gemini 1.5 Pro, which has a 1-million-token context window. In practice the "context window" is the combined size of everything the model reads at once: your prompt, any earlier turns of the conversation, and any underlying "system message" that the creators of the system put in.
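To make the budget idea concrete, here is a minimal sketch of the arithmetic. It is not how Google counts tokens: the 4-characters-per-token estimate and the function names `estimate_tokens`, `fits_in_context`, and `remaining_tokens` are assumptions made up for illustration. Only the 1-million figure comes from the announcement.

```python
# Context window size quoted in the announcement.
CONTEXT_WINDOW_TOKENS = 1_000_000


def estimate_tokens(text: str) -> int:
    """Crude estimate: about 4 characters per token.

    This is an assumption for illustration, not a real tokenizer.
    """
    if not text:
        return 0
    return max(1, len(text) // 4)


def fits_in_context(system_message: str, prompt: str,
                    window: int = CONTEXT_WINDOW_TOKENS) -> bool:
    """Return True if the system message plus the prompt fit in the window."""
    used = estimate_tokens(system_message) + estimate_tokens(prompt)
    return used <= window


def remaining_tokens(system_message: str, prompt: str,
                     window: int = CONTEXT_WINDOW_TOKENS) -> int:
    """Tokens left in the window after the system message and prompt."""
    used = estimate_tokens(system_message) + estimate_tokens(prompt)
    return window - used


if __name__ == "__main__":
    system_message = "You are a helpful assistant."
    prompt = "Summarize all the announcements from my kid's school."
    print(fits_in_context(system_message, prompt))
    print(remaining_tokens(system_message, prompt))
```

The point is only that the system message eats into the same budget as your own text, so a long hidden system message leaves less room for what you paste in.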

Google demoed an "Ask Photos" feature where you can ask questions like "What's my license plate number?" and it searches all your photos, finds your license plate, and tells you the number. You can ask when your kid learned how to swim. You can also ask questions of your Gmail, such as "summarize all the announcements from my kid's school".

Google is working towards AI agents that carry out multiple steps for you instead of just answering one question. You tell it to complete a task, and it works out the steps and tries to carry out each one. For example: "Return these shoes for me." It figures out where the shoes came from, how much they cost, and how to contact customer support, and then it actually contacts the shoe seller.

Their lightweight model is called Gemini 1.5 Flash and is designed for speed and low cost. (The model meant to run on mobile phones is Gemini Nano.)

Project Astra is their attempt to create a real-time AI agent that uses the camera on your phone. You can ask it to explain what you're looking at, such as "What is this part of the speaker called?", or ask it to make up rhymes.

Google's response to OpenAI's Sora is a video generation model called Veo.

Google is rolling out an "AI Overview" in Google Search. (I've already seen it.) It uses what they call "multi-step reasoning". You should be able to ask Google Search "multi-step questions".

They're building AI into Android phones that can detect if you're potentially talking to a scammer.

They're releasing a new open model called Gemma 2, announced at a 27-billion-parameter size.

Google just took over the AI world (a full breakdown) - Matt Wolfe

#solidstatelife #ai #genai #llms #multimodal #edgecomputing #google