Stable Artisan, ElevenLabs Music, Conversational AI Teams, AlphaFold 3, EMOPortraits, Retrieval Augmented Fine Tuning and more
Hi. Welcome to this week's AI Brews for a concise roundup of the week's major developments in AI.
In today’s issue (Issue #62):
AI Pulse: Weekly News & Insights at a Glance
AI Toolbox: Product Picks of the Week
🗞️🗞️ AI Pulse: Weekly News & Insights at a Glance
🔥 News
Google DeepMind and Isomorphic Labs introduced AlphaFold 3, a new AI model for predicting the biomolecular structures and interactions of proteins, DNA, RNA, small molecules, and more with unprecedented accuracy. Google has also made AlphaFold Server, a web service powered by AlphaFold 3, available for free for non-commercial research [Details].
DeepSeek released DeepSeek-V2, an open-source MoE model that specializes in math, code and reasoning. It comprises 236B total parameters, of which 21B are activated for each token. DeepSeek-V2 delivers impressive results on current major large model leaderboards [Details].
Refuel.ai released RefuelLLM-2 and RefuelLLM-2-small, open-source large language models purpose-built for data labeling, enrichment and cleaning. RefuelLLM-2 outperforms GPT-4-Turbo, Claude-3-Opus and Gemini-1.5-Pro across a benchmark of ~30 data labeling tasks. RefuelLLM-2 is built on the Mixtral-8x7B base model [Details | Playground].
A team of UC Berkeley researchers, known for Gorilla LLM, presented RAFT (Retrieval Augmented Fine Tuning), a new fine-tuning approach. RAFT sits in the middle ground between RAG (Retrieval Augmented Generation) and DSF (Domain-specific Supervised Fine-Tuning). It primes the LLM on domain knowledge and style (as DSF does) while also improving the quality of answers generated from the retrieved context. The RAFT model is trained on top of the base model Llama2-7B [Details].
Stability AI launched Stable Artisan - a bot that enables media generation on Discord, powered by Stability AI’s image and video models: Stable Diffusion 3, Stable Video Diffusion, and Stable Image Core [Details].
TikTok will automatically label AI-generated content created on platforms like DALL·E 3 [Details].
Apple unveiled the M4, its first chip designed specifically for AI from the ground up, built on a 3nm process and set to power the new iPad Pro and future Macs [Details].
Imperial College London researchers introduced the EMOPortraits model, which significantly improves the realism of intense and asymmetric expressions of head avatars. Dataset and code will be released by July 2024 [Details].
Microsoft and LinkedIn released the 2024 Work Trend Index on the state of AI at work. 78% of AI users are bringing their own AI tools to work (BYOAI), and this holds for employees in every age group, not just Gen Z. 52% of people who use AI at work are reluctant to admit to using it for their most important tasks [Details].
ElevenLabs previewed ElevenLabs Music - all the songs in the thread were generated from a single text prompt with no edits [Link].
PremAI released open-source Small Language Foundation Models, Prem 1B and Prem 1B chat that excel at RAG. Prem 1B is a transformer-based SLM, similar to TinyLlama, and uses Flash Attention. It supports context length up to 8192 tokens [Details].
IBM open-sourced its Granite code models for code generation tasks, trained on code written in 116 programming languages. The Granite code model family ranges in size from 3 to 34 billion parameters, in both base and instruction-following variants [Details].
OpenAI released the first draft of the Model Spec, a document that specifies desired behavior for the models in the OpenAI API and ChatGPT. Model behavior is the way that models respond to input from users—encompassing tone, personality, response length, and more [Details | Model Spec].
Microsoft is working on a new in-house large-scale AI language model called MAI-1 with ~500B parameters. The development of MAI-1 is being led by Mustafa Suleyman, the former Google AI leader who recently served as CEO of the AI startup Inflection before Microsoft hired most of its staff [Details].
OpenAI is readying a search product to rival Google and Perplexity. The feature would let ChatGPT users search the web and cite sources in its results [Details].
X launched Stories, delivering news summarized by Grok AI [Details].
The mysterious gpt2-chatbot model that showed up in the LMSYS arena a few days ago was suspected to be a testing preview of a new OpenAI model. This has now been confirmed, thanks to a 429 rate limit error message that exposes details from the underlying OpenAI API platform [Details].
🔦 Weekly Spotlight
Artificial Analysis LLM Performance Leaderboard - a leaderboard evaluating price, speed and quality across >100 serverless LLM API endpoints [Link]
AI Index: State of AI in 13 Charts - Stanford University’s report [Link].
A human-size robot balancing on a ball that acts as a spherical wheel can push wheelchairs as smoothly as a human assistant [Video].
How to train your first machine learning model and run it inside your iOS app via CoreML [Link].
Building Agentic RAG with LlamaIndex - new free short course on DeepLearning.ai [Link].
VLM-1 - visual API (early-preview) to extract rich and structured data (e.g. JSON) accurately from visual content like images, videos, PDFs etc. [Link].
AI engineers report burnout and rushed rollouts as ‘rat race’ to stay competitive hits tech industry [Link].
🔍 🛠️ AI Toolbox: Product Picks of the Week
Udio: AI music and song generation from prompts. The app now includes a new feature, Audio Inpainting - you can select a portion of a track to re-generate based on the surrounding context. This makes it easy to edit single vocal lines, correct errors etc.
ChatLabs: AI chat app to access 30 AI models with a single subscription. The new Split Screen mode lets you interact with two different AI models at the same time.
Synthflow: Create AI voice assistants to make outbound calls, answer inbound calls, and schedule appointments, without coding. Synthflow has now introduced Conversational AI Teams to automate customer interactions with teams of intelligent AI voice assistants that talk to customers & each other.
Supertone Shift: Real-time voice changer.
Paddleboat: Empowers sales teams to practice cold calling, handle objections and close more deals with realistic AI roleplays.
Last week’s issue
You can support my work via BuyMeaCoffee.
Thanks for reading and have a nice weekend! 🎉 Mariam.