Multimodal model for music understanding, generative fill for videos, Llama Impact Grants, Runway Academy, generative AI model for vector graphics, AI-powered object-aware editing, and more
Greetings and welcome to this week's AI Brews for a concise roundup of the week's major developments in AI.
In today’s issue (Issue #35):
AI Pulse: Weekly News & Insights at a Glance
AI Toolbox: Product Picks of the Week
AI Skillset: Learn & Build
🗞️🗞️ AI Pulse: Weekly News & Insights at a Glance
🔥 News
Researchers present LLark: A Multimodal Foundation Model for Music - an open-source instruction-tuned multimodal model for music understanding. LLark is trained entirely from open-source music data and models [Demo | Paper].
Researchers released LLaVA-1.5. LLaVA (Large Language and Vision Assistant) is an open-source large multimodal model that combines a vision encoder and Vicuna for general-purpose visual and language understanding. LLaVA-1.5 achieved SoTA on 11 benchmarks with only simple modifications to the original LLaVA, and completed training in ~1 day on a single 8-A100 node [Demo | Paper | GitHub].
Voice AI platform ElevenLabs released an AI Dubbing tool that lets users automatically translate the audio in a video into a different language while maintaining the original speaker’s voice [Link].
Meta AI introduced Stable Signature - a new method for watermarking images created by open source generative AI [Details].
Meta has opened Llama Impact Grants applications, which run until November 15. Proposals using Llama 2 to tackle education, environmental, and open innovation challenges may be awarded a $500K grant [Details].
At Adobe MAX, Adobe introduced the following [Details]:
Firefly Vector Model - a generative AI model for vector graphics. Text to Vector Graphic as a beta feature is available in Illustrator [Details].
Text to Template - a beta feature in Adobe Express, powered by the new Firefly Design Model.
Firefly Image 2 Model - an updated model that powers Firefly, Adobe's generative AI image tool. Available in beta on the Firefly web app.
Project Stardust - a generative AI-powered object-aware editing engine that lets you move or remove objects simply by clicking on them. For example, users can select people in a photograph, move them to a different place in the composition, and fill in the background where they were previously standing.
Project Fast Fill - lets users remove objects from a video or change backgrounds as if they were working with a still image, using a text prompt. Users only have to do this once and the edit will then propagate to the rest of the scene [Details].
Project Res Up - an experimental AI-powered upscaling tool that greatly improves the quality of low-resolution GIFs and video footage [Details].
Mistral’s paper introducing Mistral 7B - a 7-billion-parameter language model that outperforms Llama 2 13B across all evaluated benchmarks - is now on arXiv [Paper].
Replit AI made its basic AI-powered code completion and code assistance features available to all developers on the free plan [Details].
Vercel released v0 in beta - a generative user interface tool that generates React code based on shadcn/ui and Tailwind CSS [Details].
Replit AI released Replit Code v1.5 - an open source 3.3B parameter Causal Language Model, trained on 1T tokens, focused on Code Completion [Hugging Face].
Microsoft may debut its first AI chip in November to mitigate cost [Details]. OpenAI is also exploring developing its own AI chips [Details].
Google Cloud announced new AI-powered search capabilities that will help health-care workers quickly pull accurate clinical information from different types of medical records [Details].
Character.AI launched a new feature, Character Group Chat, in which users can interact with multiple AI Characters and humans in the same room [Details].
🔦 Weekly Spotlight
Decomposing Language Models Into Understandable Components by Anthropic Research [Link].
2023 Kaggle AI Report [Link].
State of AI Report 2023 by Nathan Benaich and Air Street Capital [Link].
slowllama: Finetune llama2-70b and codellama on MacBook Air without quantization [Link].
🔍 🛠️ AI Toolbox: Product Picks of the Week
Lipdub: record a video of yourself speaking and Lipdub will adapt your voice and lip movements to match the new language you’ve selected.
Uizard: AI-powered UI design tool for designing wireframes, mockups, and prototypes. Generates UI designs from text prompts, converts hand-drawn sketches into wireframes, and transforms screenshots into editable designs.
Moonvalley: text-to-video AI engine for creating high-definition, cinematic video and animations across multiple visual styles.
📕 📚 AI Skillset: Learn & Build
The new Runway Academy is now live, featuring tutorials, AMAs, and deep dives for Runway’s generative AI tools [Link].
Multimodality and Large Multimodal Models (LMMs) - a detailed post covering everything from the fundamentals of a multimodal system to active research areas [Link].
Fine-tuning Mistral 7B on your own data [Link].
AI Brews is free, and sharing it with a friend helps us grow. Thanks for your support and have a nice weekend! 🎉 Mariam.