Large language models, the AI algorithms that power chatbots like ChatGPT, are powerful because they are trained on enormous amounts of publicly available data from the internet. While they can summarize, create, predict, translate and synthesize text and other content, they can only draw on the data they were trained on, which is frozen at a specific point in time.
That’s why businesses are looking to methods like retrieval-augmented generation, or RAG, and fine-tuning to bridge the gap between the general knowledge these LLMs have and the up-to-date, specific knowledge that makes them useful for enterprises. Here’s what to know about these techniques, and how they work:

Vector database: A database designed to store a massive amount of data as “vectors,” which are numerical representations of the raw data.
Depending on the amount and type of data (from images and text to tables), each vector can contain tens to thousands of dimensions, and vectors are stored grouped by similarity. This format, which differs from a traditional database of columns and rows, lets AI models quickly search for contextually similar vectors and identify the context of a user’s question.
Traditional databases require exact text-matching searches, whereas vector databases can search for similarities in the data based on a user’s query, finding instances of “coffee” that likely relate to “beverage,” for example.
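The article includes no code, but a toy sketch can make the idea concrete. In the snippet below, the document names and four-dimensional vectors are hypothetical stand-ins (real systems use learned embedding models with hundreds or thousands of dimensions and approximate nearest-neighbor indexes); it simply ranks stored vectors by cosine similarity to a query:

```python
# Minimal sketch of vector similarity search, with made-up embeddings.
import numpy as np

# Hypothetical 4-dimensional embeddings; production embeddings are
# produced by a model and have far more dimensions.
documents = {
    "coffee":   np.array([0.9, 0.8, 0.1, 0.0]),
    "beverage": np.array([0.8, 0.9, 0.2, 0.1]),
    "laptop":   np.array([0.1, 0.0, 0.9, 0.8]),
}

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Score how closely two vectors point the same way (1.0 = identical direction)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

query = documents["coffee"]
# Rank every stored vector by similarity to the query; "beverage" scores
# far above "laptop" even though the words share no matching text.
for name, vec in sorted(documents.items(),
                        key=lambda kv: cosine_similarity(query, kv[1]),
                        reverse=True):
    print(f"{name}: {cosine_similarity(query, vec):.3f}")
```

This is why a vector database can surface “beverage” for a “coffee” query: closeness is measured in the vector space, not by matching characters.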
Retrieval-augmented generation (RAG): A method used by developers to connect large language models with external data sources, such as a business’s private information, so that the model can provide more personalized, accurate and relevant responses. The term originated from a 2020 paper by Meta Platforms AI researchers. The RAG technique retrieves information relevant to a user’s question, often from a vector database, and supplies it to the model alongside the prompt, so answers can draw on data the model never saw in training.
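As a rough illustration of that flow (again with hypothetical data, and with a placeholder standing in for whatever LLM API a real system would call), a RAG pipeline might look like this:

```python
# Minimal sketch of the RAG flow: retrieve relevant context, then
# augment the prompt before generation. All data here is made up.
import numpy as np

knowledge_base = {
    "Our cafe sells coffee and tea.": np.array([0.9, 0.8, 0.1, 0.0]),
    "Laptops are on sale this week.": np.array([0.1, 0.0, 0.9, 0.8]),
}

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def retrieve(query_vector, k=1):
    """Step 1: pull the k most similar snippets from the external store."""
    ranked = sorted(knowledge_base.items(),
                    key=lambda kv: cosine_similarity(query_vector, kv[1]),
                    reverse=True)
    return [text for text, _ in ranked[:k]]

def generate(prompt):
    # Hypothetical placeholder for a call to an LLM; a real system would
    # send the augmented prompt to a model API here.
    return f"[model response to: {prompt!r}]"

# Step 2: embed the user's question (a hand-made toy vector here),
# retrieve matching context, and prepend it to the prompt.
question = "What drinks do you sell?"
question_vector = np.array([0.85, 0.9, 0.15, 0.05])
context = "\n".join(retrieve(question_vector))
print(generate(f"Context:\n{context}\n\nQuestion: {question}"))
```

The key point is that the model’s answer is grounded in the retrieved snippet about coffee and tea, information that never had to be part of its training data.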