- Himanshu Ramchandani
Google's Project Astra beating GPT-4o - Business to AI
Business Objective to AI Objective, Multimodal, AI Community
Hi You!
Actionable Tips and Insights every Saturday about AI Leadership, building Data Teams/Products, and Intuition around AI technology.
Your AI path will be more valuable with this addition.
Happy AI.
Today's Content:
Project Astra by Google: Features
AI-assisted: the augmentation
Concept:
What is Multimodal AI?
How does Multimodal AI work?
AI Leadership: Business Objective to AI Objective
Newsletter Sponsor: Prompts Daily
AI Titans Skool Community: 74 seats to price hike
Project Astra by Google
It's an AI assistant.
Astra is built on Gemini models.
It can process multimodal information and understand the context.
Google added a disclaimer before the video saying that it contains no cuts (unlike OpenAI's GPT-4o demo).
The video shows how you can use your phone's camera to detect objects in the room.
Astra does it pretty well.
Let's build our intuition around some terminology.
AI-assisted
Companies are fighting an AI-assisted war over this.
The AI we see right now is a tool that helps us do our work better.
It draws on a huge knowledge base that we can use to answer our questions.
The term we use for AI-assisted work is AI augmentation, and the best person to follow on this topic is Tobias Zwingmann on LinkedIn.
I love his work.
Concept
What is Multimodal AI?
Multimodal AI uses data fusion techniques to integrate different data types and build a more complete and accurate understanding of the data.
It combines text, images, and audio clips to explain a concept.
Once the inputs are collected, the fusion model comes into play.
Its job is to combine the relevant information from the different modalities. Common building blocks include:
Transformer models
Attention mechanisms
Graph convolutional networks
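To build some intuition for the attention idea, here is a minimal sketch in plain Python. It is not how Astra or Gemini works internally; it only shows dot-product attention weighting three toy embeddings. The function names (`attention_fuse`, `softmax`, `dot`), the `query` vector, and all the numbers are made up for illustration. Real embeddings would come from trained encoders.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def attention_fuse(query, modality_embeddings):
    """Fuse per-modality embeddings into one vector with dot-product attention.

    query: hypothetical vector describing what the task cares about.
    modality_embeddings: dict of modality name -> embedding (same length as query).
    """
    names = list(modality_embeddings)
    scale = math.sqrt(len(query))
    scores = [dot(query, modality_embeddings[n]) / scale for n in names]
    weights = softmax(scores)
    # Weighted sum of the embeddings, one dimension at a time.
    fused = [
        sum(w * modality_embeddings[n][i] for w, n in zip(weights, names))
        for i in range(len(query))
    ]
    return fused, dict(zip(names, weights))

# Toy, made-up embeddings (assumption: all modalities share one vector size).
embeddings = {
    "text": [1.0, 0.0, 0.0, 0.0],
    "image": [0.0, 1.0, 0.0, 0.0],
    "audio": [0.0, 0.0, 1.0, 0.0],
}
query = [2.0, 0.0, 0.0, 0.0]  # this query lines up with the text embedding
fused, weights = attention_fuse(query, embeddings)
print(weights)
```

Because the query points in the same direction as the text embedding, text gets the largest weight, while image and audio share the rest equally. That is the whole trick: the model learns where to look.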
How does Multimodal AI work?
Follow the data through the four stages below:
1. The smartphone can gather (data collection)
Text input by typing
Voice commands via a microphone
Visual information through its camera
2. The data will be processed using (data processing)
NLP for text
Computer Vision for images
Speech Recognition for audio processing
3. The system will integrate all these types of data (data integration)
Sensor data with image
Synchronizing text with audio
Video with audio for the analysis of a scene
Text data for augmented reality applications
4. Analysis and decision-making are done using ML algorithms to perform (delivery)
Object recognition
Sentiment analysis
Real-time translation
It has versatile capabilities. Pretty cool, no?
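The four stages can be sketched as a tiny pipeline. This is only a toy under loud assumptions: the `Capture` class, the `process`, `integrate`, and `decide` functions, and the sample inputs are all hypothetical, and simple string handling stands in for the real NLP, speech recognition, and computer vision models.

```python
from dataclasses import dataclass

@dataclass
class Capture:
    """Stage 1 (data collection): raw inputs gathered by the phone."""
    typed_text: str
    audio_transcript: str  # stand-in for raw microphone audio
    camera_labels: list    # stand-in for raw camera frames

def process(capture):
    """Stage 2 (data processing): per-modality stubs for NLP, speech, and vision."""
    return {
        "text_tokens": capture.typed_text.lower().split(),
        "speech_tokens": capture.audio_transcript.lower().split(),
        "objects": [label.lower() for label in capture.camera_labels],
    }

def integrate(features):
    """Stage 3 (data integration): merge the modalities into one view of the scene."""
    words = features["text_tokens"] + features["speech_tokens"]
    mentioned = [obj for obj in features["objects"] if obj in words]
    return {"words": words, "objects": features["objects"], "mentioned_objects": mentioned}

def decide(scene):
    """Stage 4 (delivery): a toy rule standing in for an ML model."""
    if scene["mentioned_objects"]:
        return "I can see the " + scene["mentioned_objects"][0] + "."
    return "I could not find what you asked about."

capture = Capture(
    typed_text="where are my glasses",
    audio_transcript="I left them near the desk",
    camera_labels=["Glasses", "Desk", "Apple"],
)
scene = integrate(process(capture))
answer = decide(scene)
print(answer)
```

Each stage only depends on the output of the one before it, so you can swap a stub for a real model later without touching the rest.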
So you and I are going to use this in our products.
The first step is to clearly define our business objective and transform it into an AI objective.
AI Leadership
Business Objective to AI Objective
Translate the business objective into an ML objective, not the other way around. Some examples:
Traditional Machine Learning:
Business Objective | ML Objective |
---|---|
YouTube (increase user engagement) | maximize the time users spend watching videos |
Instagram (improve platform safety) | accurately predict whether a given piece of content is harmful |
Multimodal Generative AI: Business Objective into AI Objective
Business Objective | Multimodal AI Objective |
---|---|
Amazon (increase sales by improving product recommendations) | use visual (product images) and textual (product descriptions, customer reviews) data to provide highly personalized, ranked product recommendations |
Education Tech (improve learning outcomes by providing personalized study resources) | use textual (student notes, textbooks) and visual (diagrams, video lectures) data to create customized study plans and resources for each student's learning style and progress (a knowledge graph can be used) |
Once it is defined, we can move towards quality data collection.
Always follow a framework when solving AI problems.
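To make the product recommendation row from the table a bit more concrete, here is a small sketch of what "visual plus textual data, with ranking" could look like. It is an assumption-heavy toy and not how Amazon does it: the `rank_products` function, the `image_weight` parameter, and the two-dimensional embeddings are invented for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0

def rank_products(user, products, image_weight=0.5):
    """Rank products by a weighted blend of image and text similarity.

    image_weight is a hypothetical knob: 1.0 means images only, 0.0 means text only.
    """
    scored = []
    for name, item in products.items():
        score = (
            image_weight * cosine(user["image"], item["image"])
            + (1 - image_weight) * cosine(user["text"], item["text"])
        )
        scored.append((name, round(score, 3)))
    # Highest blended score first.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Made-up embeddings; real ones would come from image and text encoders.
user = {"image": [1.0, 0.0], "text": [0.0, 1.0]}
products = {
    "red sneakers": {"image": [1.0, 0.0], "text": [0.0, 1.0]},
    "blue sandals": {"image": [1.0, 0.0], "text": [1.0, 0.0]},
    "desk lamp": {"image": [0.0, 1.0], "text": [1.0, 0.0]},
}
ranking = rank_products(user, products)
print(ranking)
```

The business objective (increase sales) never appears in the code. What appears is the AI objective: a score you can compute, rank by, and later evaluate against sales data.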
Have you heard of Prompts Daily newsletter? I recently came across it and absolutely love it.
AI news, insights, tools and workflows. If you want to keep up with the business of AI, you need to be subscribed to the newsletter (it's free).
Read by executives from industry-leading companies like Google, Hubspot, Meta, and more.
Want to receive daily intel on the latest in business/AI?
AI Skool Community: 74 seats to price hike
75 seats remaining. Unlock it to avoid a price hike.