Discussion about this post

Neural Foundry:

Seedance 1.5 pro's spatial sound coordination with visuals is notable, since most audio-video models still treat audio as an afterthought or a separate generation step. The ability to generate sound effects that spatially correspond to on-screen action (not just temporal sync) suggests they're modeling acoustic properties alongside visual composition. SAM Audio's span prompting is clever too: it lets you point at a moment to isolate specific sounds rather than trying to describe them textually. The convergence toward native multimodal generation (Wan2.6, Seedance), where audio and visuals aren't bolted together post hoc, will probably shift how people think about content creation workflows.

Daniel Nest:

Exa AI's People Search: What a perfectly non-creepy concept, nothing to see here!
