Visual Captions, Text-to-image Diffusion models on mobile, Orca by Microsoft, Forefront Chat and More.
Greetings and welcome to this week's AI Brews - your thoughtfully curated guide to AI products, learning resources and a concise roundup of the week's impactful news. Our goal? To provide a balanced selection in the rapidly evolving AI landscape, keeping you well-informed without the information overload. We value your feedback - don't hesitate to reply to this email with suggestions on how we can make this better for you. Thanks!
In today’s issue:
AI Pulse: News, Insights and Social Spotlight of the Week
AI Toolbox: Product Picks of the Week
AI Skillset: Learn & Build
🗞️ AI Pulse: News, Insights and Social Spotlight of the Week
🔥 News & Insights
Researchers from Snap present SnapFusion, a new approach that, for the first time, unlocks running text-to-image diffusion models on mobile devices in less than 2 seconds [Paper].
Stability AI adds a new feature, Uncrop, to its generative AI tool, Clipdrop. It creates AI-generated backgrounds to automatically expand any image, using Stable Diffusion XL as a foundation model. It’s free to try in the Clipdrop web app, with no need to log in [Details].
Google has updated Bard with a new technique, implicit code execution. This lets Bard run code in the background when it sees math-related prompts, making its answers to word problems and math calculations about 30% more accurate. Bard can now also directly export any table it creates to Google Sheets [Details].
Microsoft develops Orca, a 13-billion-parameter model that outperforms other small open-source models and at times equals or outperforms ChatGPT, though it lags behind GPT-4 [Paper].
Google presents and open-sources Visual Captions, a system that uses spoken words to add real-time images to video chats [Details].
AlphaDev, Google DeepMind’s AI, discovers small sorting algorithms from scratch that outperform human benchmarks. These algorithms have been added to the LLVM libc++ standard sorting library. This is the first time an algorithm designed by AI has been added to this library. AlphaDev also discovered a new hashing algorithm, now released as open source [Details | Paper].
Adobe opens its Firefly generative AI model to enterprise customers, allowing them to customize the model with their own branded assets [Details].
Apple announced a number of AI features without mentioning ‘AI’ [Details].
HuggingChat, the open-source alternative to ChatGPT by Hugging Face, adds a web search feature [Link].
Tafi, the owner of Daz 3D, announces the launch of a text-to-3D character engine that will allow users to create high-quality custom 3D characters using simple text prompts. Tafi is using a massive 3D dataset derived from its proprietary Genesis character platform [Details].
Runway’s much-awaited Gen-2 for text-to-video is now available with a free trial [Details].
Europe wants platforms to label AI-generated content to fight disinformation [Details].
Google presents SQuId, a 600M-parameter regression model that uses the SQuId dataset and cross-locale learning to evaluate speech synthesis quality in multiple languages and describe how natural the speech sounds [Details].
Together released the v1 versions of the RedPajama-INCITE family of models, which allow commercial use. RedPajama-INCITE-7B-Instruct is the highest-scoring open model on HELM benchmarks, outperforming Falcon-7B. RedPajama is a project to create leading open-source models; in April it reproduced the LLaMA training dataset of over 1.2 trillion tokens [Details].
WordPress launches Jetpack AI Assistant for generating blog posts, detailed pages, structured lists and comprehensive tables from within the WordPress editor [Details].
Google Research presents StyleDrop: a method for generation of images from text prompts in any style described by a single reference image. StyleDrop is powered by Muse, a text-to-image generative vision transformer [Details | Paper].
Why AI Will Save the World by Marc Andreessen [Link].
🔍 🛠️ AI Toolbox: Product Picks of the Week
Forefront Chat
Forefront Chat is a web-based tool that lets you chat with multiple AI models (GPT-4, GPT-3.5, Claude Instant and Claude) from a single multi-tab window. Chats are auto-organized into subject-based folders, and internet access can be toggled for each model. You can also upload and chat with PDF files, transcribe audio from MP3 or YouTube and generate images via prompts.
Parsio
Parsio is an AI-powered tool to automate data extraction from various documents including PDFs, web pages and emails. The GPT-powered parser enables using natural language to extract data from human-written emails, CVs and unstructured documents. The parsed data can be exported to Google Sheets, downloaded in multiple formats and is also available via their API.
OpenChat
An open-source tool to create custom ChatGPT-like bots trained on your own data from PDFs, websites, etc. The custom chatbot can be embedded as a chat widget on your website. While the free demo is limited to 10 questions per minute and the first 15 pages of a website/doc, you have the option to self-host. GitHub.
📕 AI Skillset: Learn & Build
GPT best practices: A guide by OpenAI that shares strategies and tactics for getting better results from GPTs [Link].
LLM Bootcamp: A series of free lectures by The Full Stack on building and deploying LLM apps [Link].
Generative AI learning path by Google Cloud [Link].
Midjourney prompt share: Stylized Vector Avatars/Icons [Twitter Link].
Thank you for reading AI Brews! If you have any thoughts, questions or just want to say hello, please don't hesitate to hit reply.
Mariam