Phi-2, Mixtral, GPT-4-powered humanoid, Fully transparent open-source LLMs, Open AI Grants, AI website builder for solopreneurs, and more
Greetings and welcome to this week's AI Brews for a concise roundup of the week's major developments in AI.
In today’s issue (Issue #44):
AI Pulse: Weekly News & Insights at a Glance
AI Toolbox: Product Picks of the Week
AI Skillset: Learn & Build
🗞️🗞️ AI Pulse: Weekly News & Insights at a Glance
🔥 News
Microsoft Research released Phi-2 , a 2.7 billion-parameter language model. Phi-2 surpasses larger models like 7B Mistral and 13B Llama-2 in benchmarks, and outperforms 25x larger Llama-2-70B model on muti-step reasoning tasks, i.e., coding and math. Phi-2 matches or outperforms the recently-announced Google Gemini Nano 2 [Details | Hugging Face].
University of Tokyo researchers have built Alter3, a humanoid robot powered by GPT-4 that is capable of generating spontaneous motion. It can adopt various poses, such as a 'selfie' stance or 'pretending to be a ghost,' and generate sequences of actions over time without explicit programming for each body part.[Details | Paper] .
Mistral AI released Mixtral 8x7B, a high-quality sparse mixture of experts model (SMoE) with open weights. Licensed under Apache 2.0. Mixtral outperforms Llama 2 70B on most benchmarks with 6x faster inference and matches or outperforms GPT3.5 on most standard benchmarks. It supports a context length of 32k tokens [Details].
Mistral AI announced La plateforme, an early developer platform in beta, for access to Mistral models via API. [Details].
Deci released DeciLM-7B under Apache 2.0 that surpasses its competitors in the 7 billion-parameter class, including the previous frontrunner, Mistral 7B [Details].
Researchers from Indiana University have developed a biocomputing system consisting of living human brain cells that learnt to recognise the voice of one individual from hundreds of sound clips [Details].
Resemble AI released Resemble Enhance, an open-source speech enhancement model that transforms noisy audio into noteworthy speech [Details | Hugging Face].
Stability AI introduced Stability AI Membership. Professional or Enterprise membership allows the use of all of the Stability AI Core Models commercially [Details].
Google DeepMind introduced Imagen 2, text-to-image diffusion model for delivering photorealistic outputs, rendering text, realistic hands and human faces Imagen 2 on Vertex AI is now generally available [Details].
LLM360, a framework for fully transparent open-source LLMs launched in a collaboration between Petuum, MBZUAI, and Cerebras. LLM360 goes beyond model weights and includes releasing all of the intermediate checkpoints (up to 360!) collected during training, all of the training data (and its mapping to checkpoints), all collected metrics (e.g., loss, gradient norm, evaluation results), and all source code for preprocessing data and model training. The first two models released under LLM360 are Amber and CrystalCoder. Amber is a 7B English LLM and CrystalCoder is a 7B code & text LLM that combines the best of StarCoder & Llama [Details |Paper].
Mozilla announced Solo, an AI website builder for solopreneurs [Details].
Google has made Gemini Pro available for developers via the Gemini API. The free tier includes 60 free queries per minute [Details].
OpenAI announced Superalignment Fast Grants in partnership with Eric Schmidt: a $10M grants program to support technical research towards ensuring superhuman AI systems are aligned and safe. No prior experience working on alignment is required [Details].
OpenAI Startup Fund announced the opening of applications for Converge 2: the second cohort of their six-week program for engineers, designers, researchers, and product builders using AI [Details].
Stability AI released Stable Zero123, a model based on Zero123 for 3D object generation from single images. Stable Zero123 produces notably improved results compared to the previous state-of-the-art, Zero123-XL [Details].
Anthropic announced that users can now call Claude in Google Sheets with the Claude for Sheets extension [Details].
ByteDance introduced StemGen, a music generation model that can listen and respond to musical context [Details].
Together AI & Cartesia AI, released Mamba-3B-SlimPJ, a Mamba model with 3B parameters trained on 600B tokens on the SlimPajama dataset, under the Apache 2 license. Mamba-3B-SlimPJ matches the quality of very strong Transformers (BTLM-3B-8K), with 17% fewer training FLOPs [Details].
OpenAI has re-enabled chatgpt plus subscriptions [Link].
Tesla unveiled its latest humanoid robot, Optimus Gen 2, that is 30% faster, 10 kg lighter, and has sensors on all fingers [Details].
Together AI introduced StripedHyena 7B — an open source model using an architecture that goes beyond Transformers achieving faster performance and longer context. This release includes StripedHyena-Hessian-7B (SH 7B), a base model, & StripedHyena-Nous-7B (SH-N 7B), a chat model [Details].
Google’s AI-assisted NotebookLM note-taking app is now open to users in the US [Details].
Anyscale announced the introduction of JSON mode and function calling capabilities on Anyscale Endpoints, significantly enhancing the usability of open models. Currently available in preview for the
Mistral-7B
model [Details].Together AI made Mixtral available with over 100 tokens per second for $0.0006/1K tokens through their platform; Together claimed this as the fastest performance at the lowest price [Details].
Runway announced a new long-term research around ‘general world models’ that build an internal representation of an environment, and use it to simulate future events within that environment [Details].
European Union officials have reached a provisional deal on the world's first comprehensive laws to regulate the use of artificial intelligence [Details].
Google’s Duet AI for Developers, the suite of AI-powered assistance tools for code completion and generation announced earlier this year, is now generally available and and will soon use the Gemini model [Details].
a16z announced the recipients of the second batch of a16z Open Source AI Grant [Details].
🔦 Weekly Spotlight
Channel 1 demos AI-powered newscast with hyperrealistic AI anchors [Link].
VGen: an open-source video synthesis codebase by the Alibaba Group, featuring state-of-the-art video generative models, including the latest image-to-video model I2VGen-XL [Link].
DeciLM-7B vs Mistral 7B on Chain of Thought Tasks [Link].
A Remake of the Google Gemini demo using GPT-4 (along with code) [Link].
Promptbase by Microsoft: an evolving collection of resources, best practices, and example scripts for eliciting the best performance from foundation models like GPT-4 [GitHub].
🔍 🛠️ AI Toolbox: Product Picks of the Week
Scade: A no-code platform with over 1,500 AIs. Scade.pro Flow enables connecting and cascading AI models via drag and drop interface.
Audiobox Maker by Meta AI: An audio story creator based on Audiobox, Meta’s new foundation research model for audio generation.
📕 📚 AI Skillset: Learn & Build
How I Used Google Forms, AI, and Apps Script Automation to Analyze 1,700 Survey Responses [Link].
Run LLMs locally - 5 Must-Know Frameworks (Ollama, GPT4All, PrivateGPT , llama.cpp and LangChain) [Link].
🔥 ☕︎ You can support my work via Buy Me a Coffee. Thanks for reading and have a nice weekend! 🎉 Mariam.
Between Phi-2 and Mixtral-7B, it seems like we're entering an era of (relatively) small language models.