The latest trends in language models and generative AI
As we saw in the previous chapters, LLMs set the basis for extremely powerful applications. Starting with LLMs, over the last few months we have witnessed explosive progress in generative models, from multimodality to newly released frameworks that enable multi-agent applications. In the next sections, we will look at some examples of these new releases.
GPT-4V(ision)
GPT-4V(ision) is a large multimodal model (LMM) developed by OpenAI and officially released in September 2023. It lets users instruct GPT-4 to analyze images that they provide as input. This integration of image analysis into LLMs represents a significant advancement in AI research and development. The model's multimodality relies on a technique called image tokenization, which converts images into sequences of tokens that the same model can process alongside text. This allows the model to handle different types of data, such as text and images, and...