AlphaEvolve, Psyche, Windsurf SWE-1, HunyuanCustom, GenSpark's Download Agent, Step1X-3D, Meta 3D AssetGen 2.0, HealthBench, ElevenLabs' Soundboard, Manus Image Generation, Higgsfield Ads and more

May 16, 2025

Hey there! Welcome back to AI Brews - a concise roundup of this week's major developments in AI.

In today’s issue (Issue #102):

AI Pulse: Weekly News at a Glance
Weekly Spotlight: Noteworthy Reads and Open-source Projects
AI Toolbox: Product Picks of the Week

🗞️🗞️ Weekly News at a Glance

Google DeepMind introduced AlphaEvolve, a Gemini-powered coding agent for algorithm discovery. It iteratively improves the best algorithms found and recombines ideas from different solutions to find even better ones. Applied to matrix multiplication, a fundamental problem in computer science, AlphaEvolve identified multiple new algorithms. It was also applied to over 50 open problems in analysis, geometry, combinatorics and number theory. In 75% of cases, it rediscovered the best solution known so far. In 20% of cases, it improved upon the previously best known solutions, thus yielding new discoveries [Details].
Windsurf launched the SWE-1 family of models, optimized for the entire software engineering process, not just the task of coding. SWE-1 performs at approximately Claude 3.5 Sonnet levels of tool-call reasoning while being cheaper to serve. It outperforms all non-frontier models and open-weight alternatives. It will be available to all paid users for a promotional period at 0 credits per user prompt [Details].
Tencent introduced HunyuanCustom, a multi-modal customized video generation framework that emphasizes subject consistency while supporting image, audio, video, and text conditions [Details].
ElevenLabs launched SB-1 Infinite Soundboard, a custom soundboard, drum machine, and endless ambient noise generator all in one. Describe the sound effects you want to hear, and SB-1 will generate them using the Text-to-SFX model [Details].
Nous Research introduced Psyche, an open infrastructure that democratizes AI development by decentralizing training across underutilized hardware. Building on DisTrO and its predecessor DeMo, Psyche reduces data transfer by several orders of magnitude, making distributed training practical [Details].
Stability AI open-sourced Stable Audio Open Small, a 341 million parameter text-to-audio model optimized to run entirely on Arm CPUs. Designed for quickly generating short audio samples, it can produce up to 11 seconds of audio on a smartphone in less than 8 seconds [Details].
Genspark launched Download Agent & AI Drive for batch file downloading and auto organization [Video].
Qwen released quantized models of Qwen3. You can deploy Qwen3 via Ollama, LM Studio, SGLang, or vLLM and choose from multiple formats including GGUF, AWQ, and GPTQ [Details].
StepFun released Step1X-3D, a fully open-source 3D generation framework. Benchmark results demonstrate state-of-the-art performance that exceeds existing open-source methods, while also achieving competitive quality with proprietary solutions [Details].
Meta introduced Meta 3D AssetGen 2.0, a new foundation model for 3D content creation. Unlike AssetGen 1.0, AssetGen 2.0 uses 3D diffusion for geometry estimation and is trained on a large corpus of 3D assets. For texture generation, it also introduces new methods for improved view consistency, texture in-painting, and increased texture resolution. It’s being internally used for 3D worlds creation, and will be rolled out later this year to Horizon creators [Details].
ChatGPT now lets you export your deep research reports as well-formatted PDFs. You can also connect GitHub repositories directly to Deep Research. Google’s Gemini Advanced also now connects with GitHub [Details].
OpenAI introduced HealthBench, a new benchmark designed to better measure capabilities of AI systems for health, built in partnership with 262 physicians who have practiced in 60 countries. HealthBench includes 5,000 realistic health conversations, each with a custom physician-created rubric to grade model responses. o3 outperforms other models, including Claude 3.7 Sonnet and Gemini 2.5 Pro (Mar 2025), in overall score [Details].
Sakana AI released the Continuous Thought Machine (CTM), an AI model that uniquely uses the synchronization of neuron activity as its core reasoning mechanism, inspired by biological neural networks. Unlike traditional artificial neural networks, the CTM uses timing information at the neuron level that allows for more complex neural behavior and decision-making processes. This innovation enables the model to “think” through problems step-by-step, making its reasoning process interpretable and human-like [Details].
Manus AI added a new image generation capability to its AI agent. It understands your intent, plans a solution, and knows how to effectively use image generation along with other tools to accomplish your task. Manus AI is available to everyone without a waitlist, giving each user one free task per day (worth 300 credits) and a one-time bonus of 1,000 credits [Video].
OpenAI launched the Safety Evaluations Hub, which provides access to safety evaluation results for OpenAI’s models [Details].
Tongyi Lab of Alibaba Group released Wan2.1-VACE, an all-in-one video creation and editing model in 1.3B and 14B sizes under the Apache-2.0 license. It excels in Text-to-Video, Image-to-Video, Video Editing, Text-to-Image, and Video-to-Audio [Details | GitHub].
ByteDance released DeerFlow, an open-source multi-agent framework for deep research, combining language models with tools like web search, crawling, and Python execution [Details].
Google launched the AI Futures Fund, which offers AI startups early access to DeepMind's AI models, along with resources, expertise and equity funding [Details].
GPT-4.1 is now available directly in ChatGPT, a faster alternative to OpenAI o3 & o4-mini for everyday coding needs [Details].
LangChain launched Open Agent Platform, an open-source no-code agent building platform. These agents can be connected to a wide range of tools, RAG servers, and other agents through an Agent Supervisor [Details].

🔦 🔍 Weekly Spotlight

AG-UI: The Agent-User Interaction Protocol: AG-UI is an open, lightweight, event-based protocol that standardizes how AI agents connect to front-end applications.
Pipedream MCP server: Add 2,500+ APIs with 10,000+ tools to your agent
FireCrawl Templates: Ready to use Firecrawl examples
NanoVLM by Hugging Face: the simplest repository for training/finetuning a small sized Vision-Language Model with a lightweight implementation in pure PyTorch
Klavis AI (YC X25): Open Source MCP integration for AI applications
Vision Language Models (Better, Faster, Stronger)
Vibe Check: Gemini 2.5 Pro and Gemini 2.5 Flash - Why Google might quietly win the race to be AI’s top backend provider
Reverse Engineering PowerPoint's XML to Build a Slide Generator
Reinforcement fine-tuning use cases - by OpenAI

🔍 🛠️ Product Picks of the Week

Higgsfield Ads: Upload a product photo, pick one of 40+ templates and turn it into a studio ad
Flowise 3.0: Build AI Agents visually. Open source agentic systems development platform. Flowise provides modular building blocks to build any agentic system, from simple compositional workflows to autonomous agents
Scria AI: Free Deep Research tool powered by multiple AI models including Claude 3.7 Sonnet.
GenSpark AI Sheets: Auto-finds companies, people, products, etc., then analyzes and visualizes the data following your prompt.
CodeRabbit for VS Code: Free AI Code Reviews in Cursor, Windsurf etc.
Fellou: The world's first agentic browser.
Little Language Lessons by Google Labs: A collection of bite-sized learning experiments built with Gemini.
