Code Llama + Multilingual multimodal translation model by Meta, AI gives voice to a stroke survivor, Flythroughs by Luma AI, fine-tuning for GPT-3.5 Turbo and more
Greetings and welcome to this week's AI Brews for a concise roundup of the week's major developments in AI.
In today’s issue (Issue #28):
AI Pulse: Weekly News & Insights at a Glance
AI Toolbox: Product Picks of the Week
AI Skillset: Learn & Build
🗞️🗞️ AI Pulse: Weekly News & Insights at a Glance
🔥 News
Meta AI released Code Llama, a large language model for coding that is built on top of Llama 2. Code Llama outperformed state-of-the-art publicly available LLMs on code tasks. It is free for research and commercial use. You can try it on Fireworks AI and Perplexity Labs [Details].
Meta AI released SeamlessM4T (Massive Multilingual Multimodal Machine Translation) - the first all-in-one, multilingual multimodal translation model. SeamlessM4T can perform multiple tasks across speech and text: speech-to-text, speech-to-speech, text-to-speech, text-to-text translation, and speech recognition. It supports 100 languages for input (speech + text), 100 languages for text output and 35 languages (plus English) for speech output [Details | Demo | Hugging Face | GitHub].
Researchers from UC San Francisco and UC Berkeley have developed new brain-computer interface (BCI) technology that enables a stroke survivor to speak with facial expressions for the first time in 18 years via a digital avatar. It is the first time that either speech or facial expressions have been synthesized from brain signals [Details].
Hugging Face released IDEFICS, an open-access 80-billion-parameter multimodal model that accepts sequences of images and texts as input and generates coherent text as output. It is a reproduction of Flamingo (developed by DeepMind) and is comparable in performance with the original closed-source model across various image-text understanding benchmarks. IDEFICS is built solely on publicly available data and models (LLaMA v1 and OpenCLIP) [Details].
Allen Institute for AI has released Dolma, the largest open dataset of 3 trillion tokens from a diverse mix of web content, academic publications, code, books, and encyclopedic materials [Hugging Face Hub].
OpenAI is now letting developers fine-tune GPT-3.5 Turbo. Fine-tuning for GPT-4 is coming this fall. Early tests have shown that fine-tuned GPT-3.5 Turbo can match or exceed GPT-4 on certain narrow tasks [Details | Guide].
ElevenLabs released Eleven Multilingual v2 - a new foundational AI speech model for nearly 30 languages. ElevenLabs is now out of beta [Details].
Hugging Face announced SafeCoder - a code assistant solution built for the enterprise [Details].
Midjourney released ‘Vary Region’, an ‘inpainting’ feature to regenerate specific parts of an upscaled image [Details].
Stability AI is collaborating with Nvidia to improve the speed and efficiency of Stable Diffusion XL by integrating NVIDIA TensorRT, a high-performance optimization framework [Details | Hugging Face].
OpenAI partners with Scale to provide support for enterprises fine-tuning models [Details].
YouTube is collaborating with Universal Music Group to launch Music AI Incubator [Details].
IBM has built a new, state-of-the-art generative AI code model to transform legacy COBOL programs to enterprise Java [Details].
A US federal judge ruled that a piece of art created by AI is not open to copyright protection [Details].
ElevenLabs has teamed up with the open-access video platform ScienceCast, allowing users to generate instant narrated summaries of scientific papers [Details].
Google announced a number of security-related enhancements to Google Workspace products, including Gmail and Drive, some of which will take advantage of AI to automate certain tasks [Details].
ChatGPT custom instructions are now live in the EU and UK [Link].
HuggingChat now supports Amazon SageMaker deployment, which allows organizations to build ChatGPT-like experiences fully within AWS [GitHub].
Meta AI presents Shepherd - a language model specifically tuned to critique model responses & suggest refinements. It goes beyond the capabilities of untuned models to identify diverse errors & suggest improvements [Paper].
Adobe Express adds generative AI features powered by Adobe Firefly to its free plan, enabling generation of images and text effects using text prompts [Link].
Project Jupyter released Jupyter AI - generative artificial intelligence in Jupyter notebooks. Users can generate code, ask questions about their local files, and generate entire notebooks from natural language prompts [Link].
Nvidia released the code for Neuralangelo, which can turn regular videos into highly detailed 3D models of both objects and large-scale indoor/outdoor scenes [GitHub].
🔦 Weekly Spotlight
Jailbreaking wrist watch into a real-life second brain [Link].
I Made Stable Diffusion XL Smarter by Finetuning it on Bad AI-Generated Images [Link].
DoctorGPT: an open-source LLM that can pass the US Medical Licensing Exam. It works offline and is cross-platform [Link].
Llama-2-7B-32K-Instruct — and fine-tuning for Llama-2 models with Together API [Link].
An MIT-licensed JS starter kit by a16z, for building and customizing your own version of AI Town - a virtual town where AI characters live, chat and socialize [Link].
🔍 🛠️ AI Toolbox: Product Picks of the Week
Flythroughs by Luma AI: an iPhone app to show off your space with AI-generated cinematic videos that look like professional drone captures. Flythroughs is built on Luma's 3D NeRF AI and a brand new path generation model.
YouLearn: an AI tutor for understanding lectures. Upload YouTube clips, Google Docs, PDFs, MP4s, Google Slides, etc. to get a high-level summary and insightful notes. Generates MCQs, study cards, illustrations, mind maps, and supplementary materials.
Kombai: a new model trained to understand and code UI designs like humans. Prompt it with design files to get high-quality UI code in one click per component. Kombai was developed with feedback from over 500 developers during private research preview.
📕 📚 AI Skillset: Learn & Build
GitHub’s design experts share 10 tips and lessons for designing magical user experiences for AI applications and AI coding tools [Link].
AI 101 for Teachers: a free, foundational online learning series for educators on AI and its transformative potential in education [Link].
Large Language Models with Semantic Search - a new short course on DeepLearning.AI in collaboration with Cohere [Link].
Thanks for reading and have a nice weekend! 🎉 Mariam.