Jamba hybrid SSM-Transformer model, empathic LLM, Databricks MoE model, animation driven by audio, Qwen1.5-MoE, generative AI nurses and more
Hi. Welcome to this week's AI Brews for a concise roundup of the week's major developments in AI.
In today’s issue (Issue #57):
AI Pulse: Weekly News & Insights at a Glance
AI Toolbox: Product Picks of the Week
🗞️🗞️ AI Pulse: Weekly News & Insights at a Glance
🔥 News
AI21 Labs introduced Jamba, a production-grade Mamba-based model. By enhancing Mamba Structured State Space model (SSM) technology with elements of the traditional Transformer architecture, Jamba compensates for the inherent limitations of a pure SSM model. Jamba optimizes for memory, throughput, and performance all at once. It outperforms or matches other state-of-the-art models in its size class. Jamba has been released with open weights under the Apache 2.0 license. It is available on Hugging Face and coming soon to the NVIDIA API catalog [Details].
Databricks introduced DBRX, an open, general-purpose LLM that uses a fine-grained mixture-of-experts (MoE) architecture with 132B total parameters, of which 36B are active on any input. Across a range of standard benchmarks, DBRX outperforms open LLMs like Mixtral, LLaMA2-70B and Grok-1. It surpasses GPT-3.5 and is competitive with Gemini 1.0 Pro. In addition to its strength as a general-purpose LLM, it is an especially capable code model, surpassing specialized models like CodeLLaMA-70B on programming. The model is also available on the Databricks API [Details].
Hume AI released Empathic Voice Interface (EVI), a first-of-its-kind conversational AI with emotional intelligence. EVI uses a new form of multimodal generative AI that integrates large language models (LLMs) with expression measures, which Hume refers to as an empathic large language model (eLLM). The eLLM enables EVI to adjust the words it uses and its tone of voice based on the context and the user’s emotional expressions [Demo | Details | wait list].
Tencent introduced AniPortrait, a novel framework for generating high-quality animation driven by audio and a reference portrait image. Code and model weights have been released [Paper | GitHub].
xAI announced Grok-1.5, an update to its AI chatbot Grok, with improved performance on coding and math-related tasks and a context length of 128,000 tokens. Grok-1.5 will soon be available to early testers. Earlier, Elon Musk had announced that all Premium subscribers on X will gain access to Grok this week, not just those on Premium+ as before [Details].
Qwen (Alibaba Cloud) released Qwen1.5-MoE-A2.7B, a small MoE model with only 2.7 billion activated parameters that nonetheless matches the performance of state-of-the-art 7B models like Mistral 7B and Qwen1.5-7B. Compared to Qwen1.5-7B, which contains 6.5 billion non-embedding parameters, it achieves a 75% decrease in training expenses and accelerates inference speed by a factor of 1.74 [Details].
Claude 3 models dominate the LMSYS Chatbot Arena Leaderboard. Claude 3 Opus tops the list, beating GPT-4 Turbo, while Claude 3 Sonnet outperforms older GPT-4 models and Claude 3 Haiku beats Mistral Large [Link].
Adobe introduced a Structure Reference feature for Firefly AI, along with GenStudio for brands. Structure Reference takes one image and generates new ones that may be completely different stylistically, but whose internal elements are arranged and sized similarly to the first image [Details].
Meta AI introduced OPT2I, a training-free text-to-image (T2I) optimization-by-prompting framework that provides refined prompts for a T2I model to improve prompt-image consistency. The framework starts from a user prompt and iteratively generates revised prompts with the goal of maximizing a consistency score. OPT2I can boost prompt-image consistency by up to 24.9% [Paper].
OpenAI has started testing usage-based GPT earnings by partnering with a small group of US builders [Details].
Adobe introduced Firefly Services and Custom Models. Firefly Services makes over 20 new generative and creative APIs available to developers. Custom Models allows businesses to fine-tune Firefly models based on their assets [Details].
Nvidia announced a collaboration with Hippocratic AI, a healthcare company offering generative AI nurses that range in specialty from “Colonoscopy Screening” to “Breast Cancer Care Manager” and work for $9 an hour [Details].
Worldcoin Foundation open-sourced the core components of its iris-scanning Orb’s software [Details].
Emad Mostaque resigned as CEO of Stability AI and from the company's Board of Directors to pursue decentralized AI [Details].
Stability AI released Stable Code Instruct 3B, an instruction-tuned Code LM based on Stable Code 3B. With natural language prompting, this model can handle a variety of tasks such as code generation, math and other software development related queries [Details].
Mistral AI released the Mistral-7B-v0.2 base model. This is the base model behind Mistral-7B-Instruct-v0.2, released in December 2023 [Details].
OpenAI shared new examples of Sora generations by visual artists, designers, creative directors and filmmakers [Details].
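For readers curious how an optimization-by-prompting loop like the one in the OPT2I item above fits together, here is a minimal Python sketch. It is not the paper's implementation: `optimize_prompt`, `toy_propose` and `toy_score` are hypothetical names, and the two toy functions are stand-ins for the LLM that rewrites prompts and for the step that generates an image and measures its consistency with the user prompt.

```python
from typing import Callable, List, Tuple

History = List[Tuple[str, float]]  # (prompt, consistency score) pairs tried so far


def optimize_prompt(
    user_prompt: str,
    propose: Callable[[str, History, int], List[str]],
    score: Callable[[str, str], float],
    iterations: int = 5,
    candidates_per_round: int = 3,
) -> Tuple[str, float]:
    """Iteratively revise a prompt and keep the highest-scoring version."""
    # Start from the user's own prompt as the baseline.
    history: History = [(user_prompt, score(user_prompt, user_prompt))]
    for _ in range(iterations):
        # Ask the proposer for revisions, showing it everything tried so far.
        for candidate in propose(user_prompt, history, candidates_per_round):
            history.append((candidate, score(user_prompt, candidate)))
    return max(history, key=lambda item: item[1])


def toy_propose(user_prompt: str, history: History, n: int) -> List[str]:
    # Hypothetical stand-in for an LLM: extend the best prompt so far.
    best = max(history, key=lambda item: item[1])[0]
    extras = ["in sharp focus", "centered in the frame",
              "with every object clearly visible"]
    return [f"{best}, {extra}" for extra in extras[:n] if extra not in best]


def toy_score(user_prompt: str, candidate: str) -> float:
    # Hypothetical stand-in for "generate an image from the candidate and
    # score it against the user prompt"; rewards distinct words, capped at 1.
    return min(1.0, len(set(candidate.split())) / 20)


user_prompt = "a red cube on a blue sphere"
best_prompt, best_score = optimize_prompt(user_prompt, toy_propose, toy_score)
print(best_prompt, best_score)
```

In a real setup, `propose` would call a language model with the scored history as context, and `score` would run the T2I model and an image-text consistency metric. The loop itself needs no training, which is the "training-free" part of the description.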
🔦 Weekly Spotlight
A little guide to building Large Language Models in 2024 by Thomas Wolf, co-founder of Hugging Face [YouTube].
Winners from Mistral AI hackathon [Link].
Hackers can read private AI-assistant chats even though they’re encrypted [Link]
Google AI Hackathon - Build a creative app that uses Google’s Generative AI tools. Deadline: 3 May 2024 [Link].
Devika Agentic AI Software Engineer - an open-source alternative to Devin by Cognition AI [Link].
Here’s why AI search engines really can’t kill Google [Link].
Pricing and Packaging Your B2B or Prosumer Generative AI Feature by a16z [Link].
Ratchet: A web-first, cross-platform ML developer toolkit [Link].
Towards 1-bit Machine Learning Models [Link].
🔍 🛠️ AI Toolbox: Product Picks of the Week
Martin: A personal AI long-term memory you can text, email, or talk to. Integrated with calendar, email, search, and more
Suno AI v3: Make full, two-minute songs in seconds. v3 is now available to all users, and Suno describes it as its first model capable of producing radio-quality music.
Bezi AI: Generate 3D models using text prompts
Airtable AI: Use AI to analyze, organize, and connect the workflows and information you have in Airtable.
Delfiny AI: AI-powered digital marketing assistants for personalized meetings that provide advice and insights on your digital advertising strategies.
You can support my work via BuyMeaCoffee.
Thanks for reading and have a nice weekend! 🎉 Mariam.