MiniMax-M2, Music 2.0 and Hailuo 2.3, gpt-oss-safeguard, Odyssey-2 interactive video model, fastest real-time conversation model, Granite 4.0 Nano, GitHub Agent HQ, Mistral AI Studio and more
Oct 31, 2025
Hey there! Welcome back to AI Brews - a concise roundup of this week's major developments in AI.
In today’s issue (Issue #114):
AI Pulse: Weekly News at a Glance
Weekly Spotlight: Noteworthy Reads and Open-source Projects
AI Toolbox: Product Picks of the Week
🗞️🗞️ Weekly News at a Glance
MiniMax:
MiniMax-M2 - an open-source, compact, fast, and cost-effective MoE model (230 billion total parameters with 10 billion active parameters) built for elite performance in coding and agentic tasks. According to benchmarks from Artificial Analysis, MiniMax-M2's composite score ranks #1 among open-source models globally [Details].
MiniMax Music 2.0 music generation model - creates full 5-minute compositions with lifelike vocals across multiple genres, supports duets and a cappella, and offers precise control in musical expression [Details].
MiniMax Speech 2.6 model featuring latency under 250ms for real-time conversations, Smart Text Normalization for URLs, emails, dates, numbers, etc., full voice clone + fluent LoRA, and support for 40+ languages [Details].
Hailuo 2.3 video generation model with improvements in movement like flips and dancing sequences, expression & physical realism - 4 free videos daily [Details].
MiniMax Agent, built on MiniMax-M2, is now publicly available and is free for a limited time.
Odyssey introduced Odyssey-2, a new interactive video model that generates AI video instantly and lets you interact with it. You experience it much like a language model: you type and the video responds in the moment. Odyssey-2 begins streaming video instantly, producing a new frame of video every 50 milliseconds. You can try it here [Details].
OpenAI:
gpt-oss-safeguard (research preview): open-weight reasoning model that lets developers use their own custom policies to classify content. The model interprets those policies to classify messages, responses, and conversations and outperforms gpt-5-thinking and the gpt-oss open models on multi-policy accuracy. Available in two sizes: gpt-oss-safeguard-120b and gpt-oss-safeguard-20b [Details].
Sora 2 Character Cameos: Character cameos let you build a reusable “character” from a short video of an object or pet. Once created, you can tag that character to appear in future generations just like you can with a personal cameo. Depending on the permissions you choose, others can tag it too [Details].
Aardvark: an autonomous agent that can help developers and security teams discover and fix security vulnerabilities at scale. Aardvark is now available in private beta [Details].
Cartesia launched Sonic-3, a state-of-the-art real-time conversation model built on State Space Models (SSMs) instead of transformers. It achieves 90ms model latency and 190ms end-to-end latency, which Cartesia claims makes it the fastest on the market. It delivers natural speech with emotional range, including laughter, and supports 42 languages [Details].
IBM released Granite 4.0 Nano, a family of tiny models (350M to 1.5B parameters) designed for edge devices that outperform similarly-sized competitors on benchmarks for knowledge, coding, and agentic tasks like tool calling. The Apache 2.0 licensed models feature both a new hybrid-SSM architecture and traditional transformer variants [Details].
Cursor released their first coding model, Composer, and a new interface for working with many agents in parallel. Composer is a frontier model that is 4x faster than similarly intelligent models. It’s built for low-latency agentic coding in Cursor, completing most turns in under 30 seconds [Details].
Cognition introduced SWE-1.5, a frontier-size model optimized for software engineering that achieves near-SOTA coding performance and is 6x faster than Haiku 4.5 and 13x faster than Sonnet 4.5. SWE-1.5 is now available in Windsurf [Details].
Anthropic:
Expanding Claude for Financial Services with an Excel add-in, additional connectors to real-time market data and portfolio analytics, and new pre-built Agent Skills, like building discounted cash flow models and initiating coverage reports [Details].
Claude for Excel is available in beta as a research preview through a waitlist for 1,000 Max, Team and Enterprise plan customers. It lets users work directly with Claude in a sidebar in Microsoft Excel, where Claude can read, analyze, modify, and create new Excel workbooks.
GitHub is giving developers access to third-party AI coding agents with the launch of a new ‘Agent HQ’. Instead of just using GitHub Copilot, developers will get to try OpenAI’s Codex, Anthropic’s Claude, Google’s Jules, xAI, and Cognition’s Devin within GitHub in the coming months [Details].
Google upgraded NotebookLM with Gemini’s full 1 million token context window across all plans, 6x longer conversation memory, and a 50% boost in response quality. It now automatically explores your sources from multiple angles, going beyond your initial prompt to synthesize findings into a more nuanced response. Plus, you can customize chat to adopt a specific goal, voice or role, ranging from a PhD student analyzing sources to a creative storyteller exploring ideas [Details].
Meta and Hugging Face have partnered to launch OpenEnv, a shared and open community hub for agentic environments. Agentic environments define everything an agent needs to perform a task (tools, APIs, credentials, execution context), while providing clarity, safety, and sandboxed control [Details].
Mistral launched Mistral AI Studio, a production platform that helps enterprises deploy AI systems reliably through three pillars: Observability for tracking and evaluation, Agent Runtime for durable workflow execution, and AI Registry for versioning and governance [Details].
Microsoft 365 Copilot now enables you to build apps and workflows. App Builder and Workflows are now available in the Agent Store for customers in the Frontier program [Details].
LangChain launched LangSmith Agent Builder in private preview. It provides a no-code agent-building experience, complete with memory and guided prompt creation [Details].
Agent Lightning: a Microsoft training framework that enables optimization of AI agents built with any agent framework, or even without an agent framework, requiring almost no code modifications.
BrowserOS: The first open-source browser with built-in AI agents. Unlike Chrome or Safari, BrowserOS prioritizes privacy and automation, letting you automate repetitive tasks using just natural language.
Sentra by Dodo Payments: An AI agent that integrates billing and payments by handling SDKs, APIs, and adapters, and plugs into your tech stack (Auth, DB, etc.).
Dropbox Dash: AI teammate that surfaces the content and context you need to stay focused and on track.
TLDW (Too Long; Didn’t Watch): TLDW (open-source) turns long-form YouTube videos into a structured learning workspace. Paste a URL and the app generates highlight reels, timestamped AI answers, and a place to capture your own notes so you can absorb an hour-long video in minutes.
Pomelli: Google Labs experiment to generate on-brand content for your business.
Thanks for reading and have a nice weekend! 🎉 Mariam.
This piece really made me think about the future of human-AI interaction. Your roundup is excellent, as always. Could you elaborate a bit more on how the 'agentic tasks' of MiniMax-M2 relate to the interactive video capabilities of Odyssey-2? I find that particularly fascinating for policy discussions.