Apple Open Source AI Models, Expressive AI Avatars, phi-3-mini, Snowflake Arctic, Firefly Image 3 and More
Hi. Welcome to this week's AI Brews for a concise roundup of the week's major developments in AI.
In today’s issue (Issue #60):
AI Pulse: Weekly News & Insights at a Glance
AI Toolbox: Product Picks of the Week
🗞️🗞️ AI Pulse: Weekly News & Insights at a Glance
🔥 News
Microsoft introduced the Phi-3 family of models. phi-3-mini, a 3.8 billion parameter language model trained on 3.3 trillion tokens, has overall performance competitive with Mixtral 8x7B and GPT-3.5 despite being small enough to be deployed on a phone. phi-3-mini is available in HuggingChat. Phi-3 Small (7B parameters) and Phi-3 Medium (14B parameters) will be released soon [Details | Report | Hugging Face].
Snowflake released Snowflake Arctic, a truly open, enterprise-focused LLM (Apache 2.0 license) that uses a Dense-MoE Hybrid transformer architecture: 480B total parameters, with 17B active parameters and 128 experts [Details].
Apple released OpenELM, a family of open language models (270 million - 3 billion parameters), designed to run on-device. OpenELM outperforms comparable-sized existing LLMs pretrained on publicly available datasets. Apple also released CoreNet, a library for training deep neural networks [Paper | Hugging Face | GitHub].
SenseTime launched its latest large model, SenseNova 5.0. It is a Mixture of Experts model trained on over 10TB tokens, with a 200k context window and multimodal capabilities. According to SenseTime's chairman, the model is better than OpenAI's GPT-4 in the majority of general usage scenarios, especially in enterprise applications and Chinese-language scenarios. SenseTime's Sora-like text-to-video service will also debut soon [Details + this].
Synthesia launched Expressive Avatars, a new family of fully generative AI digital humans powered by its new EXPRESS-1 model, which are able to ‘understand’ what they're saying. The avatars can detect the sentiment of a script and perform the subtle nuances of human communication [Details].
MIT CSAIL and MyShell AI researchers introduced OpenVoice V2, a text-to-speech model that can clone any voice and speak in many languages. OpenVoice V2 is fully open-sourced, and allows free commercial use [Details].
Adobe introduced the Firefly Image 3 Foundation Model, which delivers higher-quality image generations with more variety, better understanding of prompts, and creative control through Structure Reference and Style Reference. The beta is available in Photoshop and in the Firefly web application [Details].
The Ray-Ban Meta Smart Glasses now have multimodal AI. The glasses take a picture when given the voice command “Hey Meta, look and...”, the AI communes with the cloud, and an answer arrives in your ears [Details].
OpenAI announced new enterprise-grade features for API customers that include more security features and controls, updates to the Assistants API, and tools to better manage costs [Details].
Meta Llama 3 models have been downloaded over 1.2 million times, with developers sharing over 600 derivative models on Hugging Face since the release last week. Llama 3 70B Instruct is tied for first on English-only evals on the LMSYS Chatbot Arena Leaderboard and sits sixth overall, making it the highest-ranked openly available model, just behind closed proprietary models [Details].
Chip startup Groq’s breakthrough AI chip achieves a blistering 800 tokens per second on Meta’s Llama 3 [Details].
Augment, a GitHub Copilot rival, launches out of stealth [Details].
🔦 Weekly Spotlight
The Economic Impact of Generative AI - a report by Andrew McAfee, Google [Link].
Friend, an Open Source AI wearable recording device on Kickstarter [Link].
RAGFlow: an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding [Link].
The 5 winning projects of ‘Build with Claude Developer Contest’ [Link].
LangWatch: an open-source LLM monitoring & analytics platform [Link].
Prompt Guide 101: Gemini for Google Workspace Prompt Guide [Link].
Cohere Toolkit: a collection of prebuilt components enabling users to quickly build and deploy RAG applications [Link].
Evolutionary Model Merging For All [Link].
🔍 🛠️ AI Toolbox: Product Picks of the Week
Assista: Integrates productivity tools like Notion, Google, Slack, etc. into a single interface, allowing you to manage tasks through simple voice or text commands.
Aragon.ai: AI headshot generator that produces professional headshots in minutes.
Vidnoz: Generate AI Videos using 600+ AI avatars, 470+ realistic AI voices, 800+ templates.
You can support my work via BuyMeaCoffee.
Thanks for reading and have a nice weekend! 🎉 Mariam.