AI_Distilled 38: Latest in AI: Sora, Gemini 1.5, and More

Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out – sign up today!

👋 Hello,

“People say AI is overhyped, but I think it's not hyped enough. The next generation who will use this in the next few years will have a much higher bar on what technology can do for them. So how you build it for that generation, how you build it for that future will be really interesting to see.”

-Puneet Chandok, Microsoft India and South Asia president

Speaking at a panel discussion on AI at the Mumbai Tech Week, Chandok believes AI is not hyped enough considering its potential for disruptive transformation. He encourages more training on AI to realize its full potential.

Welcome back to a new issue of AI Distilled - your one-stop destination for all things AI, ML, NLP, and Gen AI. Let’s get started with the latest news and developments across the AI sector:

OpenAI unveils Sora, an AI model generating videos from text

Google's latest conversational AI model Gemini 1.5 has a million-token context window

New AI news reader app tackles clickbait headlines, provides summaries

Slack is rolling out new AI features for enterprise users including thread summaries

LangChain announced raising $25 million to launch new platform for building LLM apps

AI helps improve medical imaging to benefit patients globally

Researchers develop AI model that determines a person's sex from brain scans

We’ve also curated the latest GPT and LLM resources, tutorials, and secret knowledge:

Giving AI Models a Better Memory: How Google DeepMind Expanded Context Windows

Advanced Techniques For More Relevant AI Responses

Reinforcement Learning Explained

Bridging the Gap Between AI and App Development

Finally, don’t forget to check-out our hands-on tips and strategies from the AI community for you to use on your own projects:

Creating Custom Models Without the Hassle of Data Collection

Code Your Own AI Coding Buddy

Evaluating Code Quality with AI Assistants

Easily Deploy Language Models Locally

Looking for some inspiration? Here are some GitHub repositories to get your projects going!

gptscript-ai/gptscript

karpathy/minbpe

AAAI-DISIM-UnivAQ/DALI

QwenLM/Qwen

Writer’s Credit: Special shout-out to Vidhu Jain for her valuable contribution to this week’s issue.

Cheers,

Kartikey Pandey

Editor-in-Chief, Packt

⚡ TechWave: AI/GPT News & Analysis

OpenAI unveiled Sora, an AI model generating videos from text at up to a minute in length. Sora demonstrates an understanding of language and the physical world and photorealism across styles, though human subjects appear game-like.

Google's latest conversational AI model Gemini 1.5 analyzes more information than before, thanks to a million-token context window. This allows for summarizing the Apollo 11 mission transcript or analyzing a 44-minute silent film in full. Early results show the system maintains performance as context grows into the millions.

ai-distilled-38-latest-in-ai-sora-gemini-15-and-more-img-0

Bulletin, a new AI-powered news reader app, tackles clickbait headlines and provides summaries of news articles with customizable news sources.

Slack is rolling out new AI features for enterprise users including thread summaries, channel recaps, and answering workplace questions. The tools provide highlights from missed messages and help catch up.

LangChain announced raising $25 million to launch their new platform LangSmith for building and monitoring LLM apps. LangSmith allows developers to accelerate workflows across development, testing, deployment, and monitoring. It has already seen significant adoption with over 70,000 signups and 5000 monthly active companies.

ai-distilled-38-latest-in-ai-sora-gemini-15-and-more-img-1

Courtesy: Bulletin/Shihab Mehboob

AI is helping improve medical imaging to benefit patients globally. ML can quickly analyze large datasets to find issues doctors may miss and flag urgent cases. Cloud solutions also enable sharing scans and remote expert assistance anywhere. Companies are applying these methods to speed diagnoses, reduce wait times, and bring ultrasounds directly to homes. Researchers have also developed an AI model that can determine a person's sex from brain scans with over 90% accuracy. The model analyzed dynamic MRI scans and identified the default mode, striatum, and limbic networks as key in distinguishing male and female brains. This breakthrough furthers our understanding of brain organization and could help address sex-specific health issues.

🔮 Expert Insights from Packt Community

Generative AI with LangChain - By Dr. Ben Auffarth

ChatGPT and the GPT models by OpenAI have brought about a revolution not only in how we write and research but also in how we can process information.

This book discusses the functioning, capabilities, and limitations of LLMs underlying chat systems, including ChatGPT and Bard. It also demonstrates, in a series of practical examples, how to use the LangChain framework to build production-ready and responsive LLM applications for tasks ranging from customer support to software development assistance and data analysis

Key Takeaways

Explore the expansive utility of LLMs in real-world applications.

Guidance on fine-tuning, prompt engineering, and best practices.

Learn how to use the LangChain framework to build production-ready LLM applications.

By the end of this book, you'll be equipped with the practical knowledge and skills to leverage the transformative power of generative AI with confidence and creativity.

🌟 Secret Knowledge: AI/LLM Resources

🌀 Giving AI Models a Better Memory: How Google DeepMind Expanded Context Windows: Google DeepMind's latest AI model Gemini 1.5 has significantly improved how much information it can process at once, thanks to advances in "long context windows." The team discovered their model could understand over 1 million pieces of information in a single sitting, far surpassing earlier limits. This opens up new possibilities for tasks like summarizing lengthy documents, analyzing large codebases, and even comprehending full movies. Developers are excited to explore creative uses of this expanded recall.

🌀 Advanced Techniques For More Relevant AI Responses: This article discusses how to improve AI conversation models like RAG by enhancing how information is stored, found and used. Methods covered include indexing sentences individually while keeping their surrounding context, combining keyword search with semantic search, and re-scoring results based on the question. The author demonstrates implementing these "advanced RAG" techniques in Python using tools like LlamaIndex and Weaviate. With these optimizations, AI systems can provide more helpful responses by accessing knowledge in a targeted manner.

🌀 Reinforcement Learning Explained: This article breaks down the key concepts of reinforcement learning in an easy-to-understand way. It covers states, actions, rewards, and how agents interact with environments to learn policies. RL agents try different strategies to maximize long-term rewards through trial and error. Episodes provide a framework to evaluate policies. Deterministic policies pick set actions while stochastic policies use probabilities. Whether you're new to RL or a veteran, this primer is worth a read to get acquainted with the basics.

🌀 Bridging the Gap Between AI and App Development: As AI becomes more advanced, developers need easier ways to integrate cutting-edge features into their work. However, directly using AI code frameworks can be challenging and limit scalability. The solution? AI gateways. By handling tasks like routing, caching, and monitoring behind the scenes, gateways act as a bridge between complex AI systems and traditional development workflows. They streamline the integration process while ensuring high performance. Are gateways the future of intelligent applications?

Partnering with Notion

Ever tried Notion? It's a workspace that helps you do things better and faster.

You get AI for notes and teamwork, easy drag-and-drop for content, and cool new features to help manage projects and share knowledge.

Give it a try!

🔛 Masterclass: AI/LLM Tutorials

🌀 Creating Custom Models Without the Hassle of Data Collection: Tired of spending big bucks to use proprietary AI APIs or going through the tedious process of collecting your training data? This page shows how you can train customized models more efficiently. By using an open-source LLM to generate synthetic annotations for a small sample of your data, you can then fine-tune a smaller model tailored exactly to your needs. The process takes just a few steps and allows you to analyze large datasets for a fraction of the cost. Best of all, you avoid sending sensitive data to third parties.

🌀 Code Your Own AI Coding Buddy: This guide shows you how to build an AI assistant that lives right on your computer. Using tools like HuggingFace and Streamlit, you can create a chatbot trained on Code Llama. Simply ask it questions and it will respond with examples in languages like Python, Java, and C++. Better yet, the models are free and open-source. This is a neural net sidekick to help automate repetitive tasks and speed up your workflow.

🌀 Evaluating Code Quality with AI Assistants: This article explores using AI to improve code quality by testing Python scripts with SonarQube and getting feedback from LLMs. The author ran tests on ChatGPT and open-source models like Code Llama to see if they could identify issues flagged by SonarQube. While the models struggled to pinpoint errors solely from descriptions, some provided insightful summaries. Continued development of coding-focused LLMs may help automate part of the review process.

🌀 Easily Deploy Language Models Locally: With a simple four-step process, you can get powerful language models like ChatGPT running on your hardware. First, choose a model from HuggingFace and quantize it for faster performance. Then build an Ollama image to serve the model. For a slick interface, deploy a ChatGPT-style React app talking to Ollama via Docker. The whole setup only takes around 15 minutes. Now you've got a custom language assistant without internet dependence.

🚀 HackHub: Trending AI Tools

🌀 gptscript-ai/gptscript: Open source NLP tool that allows developers to automate tasks by writing scripts in plain English.

🌀 karpathy/minbpe: Minimal and clean Python code for the byte pair encoding algorithm commonly used in NLP and language model tokenization.

🌀 AAAI-DISIM-UnivAQ/DALI: Framework allowing developers to build multi-agent systems in Prolog for applications like robotics, event processing, and more.

🌀 QwenLM/Qwen: Open source code, models, and documentation for the Qwen series of LLMs, including Qwen, Qwen-Chat, and their various sizes.