Open-Weight alternative to GPT-4o Realtime, Athene-V2, Stripe Agent Toolkit, Qwen2.5-Coder-32B, Prompt Canvas and Promptim, Vidu-1.5, MagicQuill, OpenCoder and more
Hunyuan-Large, AI model for open-world games, X-Portrait 2 for realistic character animations, FLUX1.1 [pro] Ultra and Raw, Magentic-One, Hume AI App, action model for GUI agents and More
Recraft V3, new best open-source compact language models, Wonder Animation, X to Voice, Meta's MarDini model, GitHub Spark and more
Claude's computer use, Mochi 1 & Allegro open-source video models, Aya Expanse, Stable Diffusion 3.5, HUGS by Hugging Face, Meta Spirit LM, Act-One, Haiper 2.0, Multimodal Embed 3, Playground v3 & More
Open Multimodal Native Model, BeaGo, Mistral advanced Edge Models, Suno Scenes, Supercomplete, Movie Gen, Dash for Business, F5-TTS, Interactive Meeting Avatar and more
FLUX1.1 [pro], Canvas, Realtime API from OpenAI, open-sourcing of Reverb, Digital Twin Catalog, Copilot Vision, Depth Pro, new Whisper model, Pikaffects and more
Molmo, Meta's Vision Models, Next-Token Prediction Multimodal model, AlphaChip, Hundred Film Fund, HuggingChat macOS, Updated models from OpenAI and Google and more
Qwen 2.5, Seed-Music, StoryMaker, Jina Embeddings V3, Multimodal RAG, Luma Labs and Runway APIs, CogVideoX image-to-video generation model and More
OpenAI's new reasoning model, Empathic Voice Interface 2, Covers by Suno, Pixtral Multimodal model, DataGemma, Notes to Podcast and more
Replit Agent, world’s top open-source model, new real-time audio conversational model, AlphaProteo, style vs substance, fully open-source mixture-of-expert (MoE) language model and more
Ultra-long context, Qwen2-VL outperforms GPT-4o, new open weights Text to Video model, Eagle multimodal large language model, fastest AI inference and more
Jamba 1.5, Ideogram 2.0, Phi-3.5-MoE, Transfusion, Dream Machine 1.5, Mistral-NeMo-Minitron 8B, fine-tuning for GPT-4o and more
1T open-source LLM, Llama 3.1 405B, Mistral Large 2, Stable Video 4D, Outfit Anyone, SearchGPT, Llama Guard 3 and more
Mistral NeMo, GPT-4o mini, AI-powered platform to create controlled videos, SmolLM, Anthology Fund and more
Multilingual speech recognition model with emotion recognition, LivePortrait, Lynx open source hallucination detection model, EchoMimic, In-browser speech recognition and more
Real-time speech-to-speech model, Magic Insert, CriticGPT, Meta 3D Gen, Multimodal Canvas, InternLM 2.5, AI Voice Isolator, llama-agents and more
Claude 3.5 and Artifacts, Florence-2 and Meta's models, Expressive talking and singing characters, Video to Sound Effects app, DeepSeek-Coder-V2, video-to-audio and more
Dream Machine, Apple Intelligence, AI to understand animal communication, Mixture of Agents, Real-time Expressive Generative Humans, Skybox AI new model and more
Qwen2, Kling video model, Text to Sound Effects, No Language Left Behind model, video gaming AI assistant, Audio uploads and more
Netflix of AI, Perplexity Pages, AI agent platform for financial analysis, Low-latency voice model, AI Prize Fight, Codestral, K2 and MAP-Neo models, AutoCoder, HuggingChat Tools and more
Copilot+ PCs, Phi-3 models, open Multimodal outperforms GPT-4V, Cohere's multilingual Aya 23 model and more
GPT-4o, Google I/O updates, interact with tables and charts in real-time, ZeroGPU, Gemini API dev competition, Chinese-to-image generation and more
Stable Artisan, ElevenLabs Music, Conversational AI Teams, AlphaFold 3, EMOPortraits, Retrieval Augmented Fine Tuning and more
Hyper-SD, Llama 3 with 1M+ context length, new Robot that folds clothes & cooks, Qwen1.5-110B, Vidu AI and more
Apple Open Source AI Models, Expressive AI Avatars, phi-3-mini, Snowflake Arctic, Firefly Image 3 and More
Llama 3, Lifelike Audio-Driven Talking Faces, Reka Core, Stable Assistant with Stable Diffusion 3, Meta's real-time image generation, Driving with Natural Language, multi-bot chat and More
Foundation Model for Efficient Enterprise Search, fully open-source Text-to-speech model, Native Audio understanding in Gemini 1.5 Pro, AI film competition, Physical AI model, Mixtral 8×22B & More
Jamba hybrid SSM-Transformer Model, empathic LLM, Databricks MoE model, Animation driven by Audio, Qwen1.5-MoE, generative AI nurses and more
SceneScript, Automating the generation of foundation models, 01 Light, Stable Video 3D, AnimateDiff-Lightning, foundation models for self-driving and humanoid robots, NVIDIA NIM and more
Emu Video Edit, General game-playing AI agent, fully autonomous AI software engineer, DeepSeek-VL, Robotics Foundation Model, and more
Claude 3 Opus, Train a 70b language model at home, Firewall for AI, Fast 3D Object Generation from Single Images, multimodal foundation model for any-to-any search tasks, and more
Mistral Large, vocal expressive avatar videos, Generative virtual worlds, Reliable text rendering and Magic Prompt, DJ Mode, AI-powered film making and more
Meta's V-JEPA vision models, OpenAI's Sora video model, Gemini 1.5 Pro with 1 million tokens context, Reka Flash, Largest text-to-speech AI model and more
Ultra 1.0, new multilingual model, open-source conversational and empathic AI Voice Assistant, InteractiveVideo and more
Truly Open Models, Code Llama 70B, Amazon AI Hackathon, AI Grant, world’s greenest 7B model and more
Fix to ‘lazy’ GPT-4, commercially permissive OSS LLaVA models, new multimodal model for digital agents, Google's new video model and more
Screenshots to Code Dataset, Multi Motion Brush in Gen-2, Open-source AGI, AI system that solves complex geometry problems, AI in drug discovery and more
GPT Store, text-to-3d in under 10 seconds, DeepSeekMoE 16B, jailbreaking advanced LLMs, LLaVA-ϕ, Microsoft's open-source agent framework, and more
Open source AI voice cloning, Meta's full-bodied photorealistic avatars from audio, Mobile-ALOHA and more