100K context windows, Stable Animation SDK, Airtable AI, Google's massive AI updates, Transformers Agent and more
Greetings, and welcome to this week's AI Brews, your thoughtfully curated guide to the most promising AI products and learning resources, plus a concise roundup of the week's most impactful news. The aim is to provide a balanced selection in the rapidly changing AI landscape, so that readers stay informed without feeling overwhelmed. Please let us know how we can further improve your experience and save you time.
Without further ado, let's dive in.
Thanks for reading AI Brews! Subscribe for free to receive new posts and support my work.
In today’s issue:
AI Pulse: News, Insights and Social Spotlight of the Week
AI Toolbox: Product Picks of the Week
AI Skillset: Learn & Build
🗞️ AI Pulse: News, Insights and Social Spotlight of the Week
🔥 News & Insights
Anthropic has increased the context window of its AI chatbot, Claude, to 100K tokens (around 75,000 words or 6 hours of audio). In comparison, the maximum for OpenAI’s GPT-4 is 32K tokens. Beyond reading long texts, Claude can also retrieve and synthesize information from multiple documents, outperforming vector search approaches for complex questions [Details].
Stability AI released the Stable Animation SDK for artists and developers to create animations from text alone, from text plus an initial image, or from text plus an input video [Details].
Google made a number of announcements at its annual I/O conference:
Introduced PaLM 2, a new language model with improved multilingual (trained in 100+ languages), reasoning and coding capabilities [PaLM 2 technical report]. Available in four sizes, from smallest to largest: Gecko, Otter, Bison and Unicorn. Gecko can work on mobile devices and is fast enough for great interactive applications on-device, even when offline.
Update to Google’s medical LLM, Med-PaLM 2, which has been fine-tuned on medical knowledge, to include multimodal capabilities. This enables it to synthesize information from medical imaging like plain films and mammograms. Med-PaLM 2 was the first large language model to perform at ‘expert’ level on U.S. Medical Licensing Exam-style questions.
Updates to Bard - Google’s chatbot:
Powered by PaLM 2 with advanced math and reasoning skills and coding capabilities.
More visual in both its responses and prompts. Google Lens is now integrated with Bard.
Integrated with Google Docs, Drive, Gmail, Maps and others.
Extensions for Bard: these include extensions for Google’s own apps, such as Gmail and Docs, as well as third-party extensions from Adobe, Kayak, OpenTable, ZipRecruiter, Instacart, Wolfram and Khan Academy.
Bard is now available in 180 countries.
Update to Google Search featuring AI-generated text from various web sources at the top of the search results. Users can ask follow-up questions for detailed information. This Search Generative Experience (SGE) will be accessible via a new ‘Search Labs’ program.
Magic Editor in Google Photos to make complex edits without pro-level editing skills
Immersive View for routes in Google Maps. Immersive View uses computer vision and AI to fuse billions of Street View and aerial images together to create a rich digital model of the world.
Three new foundation models are available in Vertex AI:
Codey: text-to-code foundation model that supports 20+ coding languages
Imagen: text-to-image foundation model for creating studio-grade images
Chirp: speech-to-text foundation model that supports 100+ languages
Duet AI for Google Workspace: generative AI features in Docs, Gmail, Sheets, Slides, Meet and Chat.
Duet AI for Google Cloud: assistive AI features for developers including contextual code completion, code generation, code review assistance, and a Chat Assistant for natural language queries on development or cloud-related topics.
Duet AI for AppSheet: to create intelligent business applications, connect data, and build workflows into Google Workspace via natural language without any coding.
Studio Bot: coding companion for Android development
Embeddings APIs for text and images for development of applications based on semantic understanding of text or images.
Reinforcement Learning from Human Feedback (RLHF) as a managed service in Vertex AI - the end-to-end machine learning platform
Project Gameface: a new open-source hands-free gaming mouse that enables users to control a computer's cursor using their head movement and facial gestures
MusicLM, for creating music from text, is now available in AI Test Kitchen on the web, Android or iOS.
Project Tailwind: AI-powered notebook tool that efficiently organizes and summarizes user notes, while also allowing users to ask questions in natural language about the content of their notes.
Upcoming model Gemini: created from the ground up to be multimodal, and still in training.
Meta announced generative AI features for advertisers to help them create alternative copies, background generation through text prompts and image cropping for Facebook or Instagram ads [Details].
IBM announced at its Think 2023 conference:
Watsonx: a new platform for foundation models and generative AI, offering a studio, data store, and governance toolkit [Details]
Watson Code Assistant: generative AI for code recommendations for developers. Organizations will be able to tune the underlying foundation model and customize it with their own standards [Demo].
Airtable is launching Airtable AI enabling users to use AI in their Airtable workflows and apps without coding. For example, product teams can use AI components to auto-categorize customer feedback by sentiment and product area, then craft responses to address concerns efficiently [Details].
Salesforce announced an update to Tableau that integrates generative AI for data analytics. Tableau GPT allows users to interact conversationally with their data. Tableau Pulse, driven by Tableau GPT, surfaces insights in both natural language and visual format [Details].
Hugging Face released Transformers Agent - a natural language API on top of transformers [Details].
MosaicML released a new model series called MPT (MosaicML Pretrained Transformer) to provide a commercially-usable, open-source model that in many ways surpasses LLaMA-7B. MPT-7B is trained from scratch on 1T tokens of text and code. MosaicML also released three fine-tuned models: MPT-7B-Instruct, MPT-7B-Chat, and MPT-7B-StoryWriter-65k+, the last of which uses a context length of 65k tokens! [Details].
Meta has announced a new open-source AI model, ImageBind, capable of binding data from six modalities at once, without the need for explicit supervision. The model learns a single embedding, or shared representation space, not just for text, image/video, and audio, but also for depth, thermal data and data from inertial measurement units (IMUs), which calculate motion and position [Demo | Details].
The first models in the RedPajama-INCITE family, at 3B and 7B parameters and including base, instruction-tuned and chat variants, have been released. The 3B model is the strongest in its class, and its small size makes it extremely fast and accessible. RedPajama is a project to create leading open-source models; a few weeks ago it reproduced the LLaMA training dataset of over 1.2 trillion tokens [Details].
Anthropic has used a method called 'constitutional AI' to train its chatbot, Claude. The method allows the chatbot to learn from a set of rules inspired by sources like the UN's human rights principles. Unlike traditional methods that depend heavily on human moderators to refine responses, constitutional AI enables the chatbot to manage most of the learning process itself, using these rules to guide its responses towards being more respectful and safe [Details].
Midjourney reopens free trials after month-long pause [Details].
OpenAI published research on using GPT-4 to automatically write explanations for the behavior of neurons in large language models [Details].
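As a rough sanity check on the context-window figures in the Anthropic item above, here is a minimal sketch that converts token counts to approximate word counts. It assumes about 0.75 English words per token, which is a common rule of thumb and not an exact figure; the real ratio varies by tokenizer and by text. The names `WORDS_PER_TOKEN` and `approx_words` are made up for this illustration.

```python
# Assumption: roughly 0.75 English words per token (a rule of thumb only).
WORDS_PER_TOKEN = 0.75


def approx_words(tokens: int, words_per_token: float = WORDS_PER_TOKEN) -> int:
    """Estimate how many English words fit in a given number of tokens."""
    return round(tokens * words_per_token)


if __name__ == "__main__":
    # Context window sizes as reported in this issue.
    for name, tokens in [("Claude", 100_000), ("GPT-4", 32_000)]:
        print(f"{name}: {tokens:,} tokens is roughly {approx_words(tokens):,} words")
```

Under that assumption, 100K tokens works out to the 75,000 words quoted above, and a 32K window to about 24,000 words.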
🔦 Social Spotlight
Teach-O-Matic, an AI YouTuber that creates how-to videos about anything [Link].
Research data for jobs most likely to be impacted by generative AI [Link].
🔍 🛠️ AI Toolbox: Product Picks of the Week
Mindstone is an AI-powered learning tool to help maximize the learning experience from online content without feeling overwhelmed. We can save content from various sources (videos, podcasts and written content), take notes, highlight content and also build learning habits with smart prompts. The AI generates summaries, extracts key insights and answers questions. Additionally, it automatically tags content for easy organization and access. Available on web, iOS and Android.
Catbird is a text-to-image generative AI tool to generate and compare outputs from up to six AI models simultaneously, with over 15 models to choose from. It features AI-assisted prompt improvement, transforming even simple keywords into creative and detailed prompts. The tool is free, requires no sign-up, and has no usage limits. Over 20 million images have already been generated using Catbird!
📕 AI Skillset: Learn & Build
A quick Deforum tutorial for creating AI animations.
Tutorials on vector embeddings and vector databases by Pinecone.
Transformers Agent - AutoGPT with Hugging Face Models.
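For readers new to vector embeddings, the topic of the Pinecone tutorials listed above, here is a minimal sketch of the core idea: items are represented as vectors, and semantic similarity is measured by the angle between them. The three-dimensional vectors below are invented toy values, not output from any real embedding model, and `cosine_similarity` and `nearest` are hypothetical helper names for this illustration. A vector database performs the same kind of lookup at scale with approximate indexes.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy "embeddings" with made-up values, chosen by hand for illustration only.
EMBEDDINGS = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "car": [0.1, 0.0, 0.9],
}


def nearest(word, embeddings):
    """Return the other word whose vector is most similar to `word`'s vector."""
    query = embeddings[word]
    candidates = (w for w in embeddings if w != word)
    return max(candidates, key=lambda w: cosine_similarity(query, embeddings[w]))


if __name__ == "__main__":
    print(nearest("cat", EMBEDDINGS))
```

A brute-force scan like this is fine for a handful of vectors; it is shown here only because it makes the similarity computation explicit.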
Thank you for reading AI Brews! If you have any thoughts, questions or just want to say hello, please don't hesitate to hit reply. Mariam