Multilingual Expressive and Streaming Speech Translation by Meta, Real-time Text-to-Image generation, llamafile, Starling-7B, Mathematical Olympiad for AI Models, and More
Greetings and welcome to this week's AI Brews for a concise roundup of the week's major developments in AI.
In today’s issue (Issue #42):
AI Pulse: Weekly News & Insights at a Glance
AI Toolbox: Product Picks of the Week
AI Skillset: Learn & Build
🗞️🗞️ AI Pulse: Weekly News & Insights at a Glance
🔥 News
Meta AI introduced a suite of AI language translation models that preserve expression and improve streaming [Details | GitHub]:
SeamlessExpressive enables the transfer of tones, emotional expression and vocal styles in speech translation. You can try a demo of SeamlessExpressive using your own voice as an input here.
SeamlessStreaming, a new model that enables streaming speech-to-speech and speech-to-text translations with <2 seconds of latency and nearly the same accuracy as an offline model. In contrast to conventional systems, which translate when the speaker has finished their sentence, SeamlessStreaming translates while the speaker is still talking. It intelligently decides when it has enough context to output the next translated segment.
SeamlessM4T v2, a foundational multilingual & multitask model for both speech & text. It's the successor to SeamlessM4T, demonstrating performance improvements across ASR, speech-to-speech, speech-to-text & text-to-speech tasks.
Seamless, a model that merges capabilities from SeamlessExpressive, SeamlessStreaming and SeamlessM4T v2 into one.
Stability AI released SDXL Turbo: a real-time Text-to-Image generation model. SDXL Turbo is based on a new distillation technology, which enables the model to synthesize image outputs in a single step and generate real-time text-to-image outputs while maintaining high sampling fidelity [Details].
Mozilla’s innovation group and Justine Tunney released llamafile that lets you distribute and run LLMs with a single file. llamafiles can run on six OSes (macOS, Windows, Linux, FreeBSD, OpenBSD, and NetBSD) and on multiple CPU architectures [Details].
Perplexity released two new PPLX models: pplx-7b-online and pplx-70b-online.
Google DeepMind presented GNoME (Graph Networks for Materials Exploration): an AI tool that discovered 2.2 million new crystal structures, with 380,000 being highly stable and promising for breakthroughs in superconductors, supercomputers, and advanced batteries for electric vehicles [Details].
Amazon introduced two new Amazon Titan multimodal foundation models (FMs): Amazon Titan Image Generator (preview) and Amazon Titan Multimodal Embeddings. All images generated by Amazon Titan contain an invisible watermark [Details].
Researchers present Animatable Gaussians, a new avatar representation method that can create lifelike human avatars from multi-view RGB videos [Details].
Pika Labs released a major product upgrade of their generative AI video tool, Pika 1.0, which includes a new AI model capable of generating and editing videos in diverse styles such as 3D animation, anime, cartoon and cinematic using text, image or existing video [Details].
Eleven Labs announced a grant program offering 11M text characters of content per month for the first 3 months to solopreneurs and startups [Details].
Researchers from UC Berkeley introduced Starling-7B, an open large language model trained using Reinforcement Learning from AI Feedback (RLAIF). It utilizes the GPT-4 labeled ranking dataset, Nectar, and a new reward training pipeline. Starling-7B outperforms every model to date on MT-Bench except for OpenAI’s GPT-4 and GPT-4 Turbo [Details].
XTX Markets is launching a new $10mn challenge fund, the Artificial Intelligence Mathematical Olympiad Prize (AI-MO Prize). The grand prize of $5mn will be awarded to the first publicly-shared AI model to enter an AI-MO approved competition and perform at a standard equivalent to a gold medal in the International Mathematical Olympiad (IMO) [Details].
Microsoft Research evaluated GPT-4 for processing radiology reports, focusing on tasks like disease classification and findings summarization. The study found GPT-4 has a sufficient level of radiology knowledge, with only occasional errors in complex contexts that require nuanced domain knowledge. The radiology report summaries generated by GPT-4 were found to be comparable to, and in some cases even preferred over, those written by experienced radiologists [Details].
AWS announced Amazon Q, a new generative AI–powered assistant for businesses. It enables employees to query and obtain answers from various content repositories, summarize reports, write articles, perform tasks, and more, all within their company's integrated content systems. Amazon Q offers over 40 built-in connectors to popular enterprise systems [Details].
18 countries, including the US and Britain, signed a detailed international agreement on how to keep artificial intelligence safe from rogue actors, pushing for companies to create AI systems that are ‘secure by design’ [Details].
🔦 Weekly Spotlight
Interview: Sam Altman on being fired and rehired by OpenAI [Link].
Open-source version of MonkeyIslandAmsterdam.com, an image- and text-based adventure game using GPTs in ChatGPT, by Pieter Levels [Link].
🔍 🛠️ AI Toolbox: Product Picks of the Week
Instant Avatar by HeyGen: This updated version generates a custom avatar in 5 minutes from 2 minutes of talking footage. The basic version is available for free.
Sider 4.0: Group AI chat that aims to reduce hallucination by comparing answers from different AIs, such as ChatGPT, GPT-4.0, Claude, and Bard, in a group.
Outset.ai: AI conducts real-time, voice-to-voice interviews for user research and then synthesizes the results into trends, counts, summaries, and key quotes.
📕 📚 AI Skillset: Learn & Build
Generative AI for beginners: Learn the fundamentals of building Generative AI applications with this 12-lesson comprehensive course by Microsoft [Link].
llamafile is the new best way to run a LLM on your own computer [Link].
Building and Evaluating Advanced RAG Applications: A free short course on DeepLearning.AI by Jerry Liu (co-founder LlamaIndex) and Anupam Datta (co-founder TruEra) [Link].
AI Brews is free, and your sharing it with a friend helps us grow. Thanks for your support and have a nice weekend! 🎉 Mariam.