Making robots learn new skills, Any-Modality Augmented Language Model, Canva's MagicStudio, Analogical Prompting, wearable AI and more
Greetings, and welcome to this week's AI Brews, a concise roundup of the week's major developments in AI.
In today’s issue (Issue #34):
AI Pulse: Weekly News & Insights at a Glance
AI Toolbox: Product Picks of the Week
AI Skillset: Learn & Build
🗞️🗞️ AI Pulse: Weekly News & Insights at a Glance
🔥 News
Google DeepMind introduced RT-X: a generalist AI model to help advance how robots can learn new skills. To train it, DeepMind together with 33 academic labs developed Open X-Embodiment, a massive open dataset that compiles over 500 skills and 150,000 tasks from 22 robot types. It is the most comprehensive robotics dataset of its kind, released to accelerate the development of multi-robot models that could be trained to generalize across platforms, scenes, objects and tasks [Details].
Researchers from Meta AI present Any-Modality Augmented Language Model (AnyMAL), a unified model that understands multiple input modalities (vision, audio, motion sensor signals). When multiple modalities are interleaved and given as input, the model reasons over them jointly [Paper].
Perplexity announced pplx-api - an API designed to be one of the fastest ways to access open-source models including Mistral 7B, Llama2 13B, Code Llama 34B, and Llama2 70B [Details].
Researchers from Google DeepMind and Stanford University introduce a new prompting approach, Analogical Prompting, designed to automatically guide the reasoning process of large language models. Inspired by how humans recall relevant past experiences when tackling new problems, the approach makes LLMs self-generate relevant exemplars or knowledge in context, prior to problem solving [Paper].
Canva launched MagicStudio - a rich collection of generative AI tools for images, videos, presentations, animation and content. Runway’s Gen-2 will be accessible directly in Canva with its new Magic Media app [Details].
Google introduced Assistant with Bard, a personal assistant powered by generative AI. It combines Bard’s generative and reasoning capabilities with Assistant’s personalized help. You can interact with it through text, voice or images — and it can even help take actions for you. It will be available on Android and iOS [Details].
Reka released the first version (in private preview) of their multimodal assistant Yasa-1. In addition to being multimodal, Yasa features long context document processing, fast natively-optimized retrieval augmented generation, multilingual support (20 languages), search engine interface, and a code interpreter [Details].
Rewind AI is launching Rewind Pendant - a wearable that captures what you say and hear in the real world and then transcribes, encrypts, and stores it entirely locally on your phone [Details].
Stability AI launched an experimental version of Stable LM 3B - a compact language model designed to operate on handhelds and laptops. It outperforms previous versions in text production speed and scores higher on natural language processing benchmarks [Details].
Pixel 8 Pro runs Google’s generative AI models on-device [Details].
Google introduced 4 AI-powered photo and video features on Pixel 8 and Pixel 8 Pro that enable removing unwanted audio, expressions, objects, and more from videos and photos [Details].
Poe announced API v2, which now lets developers query any other bot on Poe for free and return images. Poe is also hosting an online hackathon for developers on October 7 [Details].
LinkedIn announced new AI-powered tools for learning, recruitment, marketing and sales [Details].
Microsoft made DALL-E 3 available for free in Bing Chat and on Bing.com/create [Details].
StableDiffusionXL is now available on Poe [Details].
Humane debuted its first AI device, the Ai Pin, a sensor-equipped wearable designed for intuitive interactions, at Coperni's Paris fashion show. A full reveal is set for November 9 [Details].
🔦 Weekly Spotlight
A 166-page report that analyzes GPT-4V(ision) capabilities across a variety of domains and tasks [Link].
A Twitter/X thread on the finalists of the AI in Motion hackathon. The winner is a Roomba powered by computer vision that tracks, stores, and remembers all your misplaced items [Link].
Tweet2Film by Runway: A collaborative film produced right inside this Twitter/X thread [Link].
localpilot: use GitHub Copilot locally on your MacBook with one click [Link].
AI Flow: an open source, user-friendly UI application that empowers you to seamlessly connect multiple AI models together, specifically leveraging the capabilities of ChatGPT [Link].
🔍 🛠️ AI Toolbox: Product Picks of the Week
Wois: An AI tool that provides insight into your communication patterns with speaking and behavioral analysis, generates audiovisual content and connects you to a global network of experts.
Induced: Creates virtual AI workers that can automate the execution of workflows on a browser in the cloud with human-like reasoning.
Formless: Collects form responses through natural conversations rather than traditional form structures.
📕 📚 AI Skillset: Learn & Build
Prompt engineering for Claude's long context window [Link].
A guide on generating AI videos with AnimateDiff in ComfyUI [Link].
Towards the AI Agent Ecosystem - A Technical Guide for Founders & Operators Building Agents [Link].
AI Brews is free, and your sharing it with a friend helps us grow. Thanks for your support and have a nice weekend! 🎉 Mariam.