IMPLEMENTING RETRIEVAL AUGMENTED GENERATION IN COGINITI

Matthew Mullins
March 15, 2024

Coginiti has implemented Retrieval Augmented Generation (RAG) for Coginiti Team and Enterprise customers to improve the quality of AI Assistant interactions. Retrieval Augmented Generation is a way to enhance the quality of responses from a large language model by supplying it with relevant domain knowledge that was not part of its original training data. What follows are some of the technical details about what Retrieval Augmented Generation is, how it works, and how we went about implementing it in Coginiti.

COGINITI'S AI ASSISTANT

Coginiti's AI Assistant today is a direct integration with the model services of a number of large language models, combined with some proprietary prompting. Coginiti ships with support for the public APIs from OpenAI and Anthropic along with cloud model services such as AWS Bedrock and Azure. Coginiti's strategy is to support customers' model choices the same way we support their choices of data platform.

When a user submits a question to Coginiti's AI Assistant, we enrich the session with some default prompts to add context to the user's inquiry. These prompts include the name of the connected data platform, such as AWS Redshift or Databricks, which helps the connected language model generate the appropriate platform syntax when generating SQL. Coginiti also provides a map of the connected schema, including table names, column names, and primary key relationships.
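To make that concrete, here is a minimal sketch of what this kind of context enrichment could look like. The function, structure, and wording are illustrative only, not Coginiti's actual prompt format:

```python
# Minimal sketch of assembling default prompt context (illustrative only;
# the structure and wording are not Coginiti's actual prompts).

def build_system_context(platform: str, schema_map: dict[str, dict]) -> str:
    """Combine the target platform and a schema map into prompt context."""
    lines = [
        f"Generate SQL using {platform} syntax.",
        "The connected database exposes the following schema:",
    ]
    for table, info in schema_map.items():
        columns = ", ".join(info["columns"])
        lines.append(f"- {table}({columns}); primary key: {info['primary_key']}")
    return "\n".join(lines)

# Toy schema for illustration
print(build_system_context(
    "AWS Redshift",
    {
        "orders": {"columns": ["order_id", "customer_id", "order_date"],
                   "primary_key": "order_id"},
        "customers": {"columns": ["customer_id", "name", "region"],
                      "primary_key": "customer_id"},
    },
))
```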
These prompts are included to increase the accuracy of the model's responses. We test these prompts and various database schemas across each of our supported model services for consistent results, though language models are nondeterministic, so performance can vary. Coginiti also enables users to create static custom prompts so they can tailor the interaction with the model to their own needs. (Yes, you can make it talk like a pirate.)

Using Coginiti's AI Assistant gives consistently good results for text-to-SQL generation, error explanations, performance recommendations, and general data guidance. However, we wanted the AI Assistant to be able to handle more complex questions about the business, and to do that, the language model needs more exposure to information about the business. One way to approach this would be to fine-tune a model on information from the business, but that is time consuming, expensive, and somewhat brittle. (If you want help with that, though, reach out.) Retrieval Augmented Generation presents a less expensive and more flexible way to get contextually relevant information to the model during the user interaction.

WHAT IS RETRIEVAL AUGMENTED GENERATION

Retrieval Augmented Generation could just as well have been called Search Augmented Generation, because it involves searching a collection of data for relevant results and then injecting those results into the context of the user's chat session. The Coginiti product was well positioned for this kind of implementation because it ships with an embedded repository of relevant data: our Analytics Catalog. The Analytics Catalog stores all your data and analytics assets, from critical transformation code and cleansing routines to query logic, along with relevant metadata in the form of documentation, comments, and tags. (Hint: if you are not enhancing your catalog assets with metadata today, you will want to start so you can fully leverage the RAG capabilities.)

Coginiti could easily have enabled keyword search across catalog assets, since it already ships with a search engine, but keyword search is limiting. Traditional keyword search is like using a flashlight in the dark to find something specific: you might find things that are labeled correctly but miss related items. In contrast, semantic search uses a broader beam that illuminates not just the exact words but also associated concepts and meanings, improving search accuracy by capturing the meaning of the query. For example, an analyst searching for "holiday sales trends" with keyword search would receive a mix of all queries mentioning "holiday," "sales," or "trends," which could include irrelevant information. With semantic search, the system understands the context of "holiday sales trends" and recognizes the analyst's likely intent to analyze customer purchasing behavior during holiday seasons.

Semantic search algorithms create a vector space composed of embeddings of every entry in a context corpus; in our case, the Analytics Catalog. That's a complicated way of saying that embeddings are a way of translating words or phrases into a language that computers can understand better. Just as each word in a dictionary has a definition, in the world of AI each word or phrase gets a unique numeric 'code' that captures its meaning based on how it's used in the real world. When a user submits a question, the question is embedded into the same vector space.
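As an illustration of the general technique, the sketch below embeds a question and scores its similarity to a few catalog-style entries. It uses the open-source sentence-transformers library and a small public model (which also appears in our benchmark table below); this is not Coginiti's embedding service:

```python
# Minimal embedding-and-similarity sketch (illustrative, not Coginiti's
# embedding service).  pip install sentence-transformers
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # small open embedding model

catalog_entries = [
    "Total sales by region for the December holiday period",
    "Cleansing routine that deduplicates customer records",
    "Transformation building the daily inventory snapshot",
]
question = "holiday sales trends"

# Embed the corpus and the question into the same vector space
corpus_embeddings = model.encode(catalog_entries)
question_embedding = model.encode(question)

# Cosine similarity: entries closer in the vector space score higher
scores = util.cos_sim(question_embedding, corpus_embeddings)[0].tolist()
for entry, score in sorted(zip(catalog_entries, scores), key=lambda p: -p[1]):
    print(f"{score:.3f}  {entry}")
```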
The most similar entries will be closer together in the vector space. Thus, the closest embedding (or the 'k' closest embeddings) is returned to the user, providing the best results for their question. (For a deep read on embeddings, see Vicki Boykis on What are Embeddings.)

COGINITI AI ASSISTANT + RAG

Coginiti's architecture presents a number of design constraints when it comes to implementing new services. Coginiti is deployed software, so any of its services needs to be small enough to run containerized on a single server yet able to scale out to serve thousands of users. Many of Coginiti's customers deploy into highly secure environments, limiting our ability to call outside services. We performed our initial testing and proof of concept using OpenAI's embedding API, but as a practical matter Coginiti cannot rely on such services because of their public nature. We needed an embeddings model that could run in a container and be relatively fast using CPUs rather than GPUs for compute. We ended up testing six model families:

| Model name | Token limit | Model size (GB) | Embedding dimensions (output vector size) | Time to embed 1024 tokens (s) | Time to process Analytics Catalog (s) | MTEB leaderboard rank (Retrieval) |
|---|---|---|---|---|---|---|
| all-MiniLM-L6-v2 | 256 | 0.09 | 384 | 0.16 | ~6.5 | 51 |
| UAE-Large-V1 | 512 | 1.34 (0.33 quantized) | 1024 | 1.08 | 161 | 3 |
| bge-large-en-v1.5 | 512 | 1.34 | 1024 | 1.09 | ~200 | 4 |
| bge-base-en-v1.5 | 512 | 0.44 | 768 | 0.34 | 49.9 | 6 |
| bge-small-en-v1.5 | 512 | 0.11 | 384 | 0.24 | 34 | 10 |
| gte-large | 512 | 0.67 | 1024 | 1.08 | 155 | 7 |
| gte-base | 512 | 0.22 | 768 | 0.24 | 54 | 13 |
| gte-small | 512 | 0.07 | 384 | 0.20 | 37 | 20 |
| gte-tiny | 512 | 0.05 | 384 | 0.12 | 14.7 | 41 |
| udever-bloom-7b1 | 2048 | 28.8 | 4096 | N/A (couldn't run on CPU) | N/A | 21 |
| voyage-lite-01-instruct | 4096 | N/A (cloud service) | 1024 | N/A | N/A | 1 |

To create our embeddings, Coginiti selected the BGE-M3 embedding model series, a state-of-the-art multi-lingual and cross-lingual model. The multi-lingual and cross-lingual support is important given that our embeddings consist not only of specialized SQL but also of metadata that might be written in a variety of languages. The model's performance, thanks to its small size, acceptable token limit, and speed, was also a critical factor. This embedding model runs as a standalone service within the Coginiti stack, so no data is sent outside the application stack, and it gives us a consistent embeddings tool across all of our clients.

The embeddings for each catalog asset also need to be stored in a way that lets Coginiti easily perform a vector similarity search. There are a number of databases purpose-built as vector stores, Pinecone and Weaviate being two of the leading candidates, but using a standalone vector store would have meant adding yet another service to our stack, increasing management complexity and growing the resource demands of running the product. Fortunately, Coginiti uses Postgres as a backend storage layer, and the available pgvector extension enables storing embeddings and performing vector similarity search directly within Postgres. Our testing showed that pgvector was more than adequate for our vector storage and search needs. As of this writing, pgvector is available with the managed Postgres service from every major cloud provider, and it is easily installable for self-managed deployments.
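As a sketch of what that similarity search can look like in practice, the snippet below stores embeddings and queries them with pgvector from Python. The table name, vector size, and connection string are hypothetical, not our internal schema:

```python
# Hypothetical pgvector usage (illustrative schema, not Coginiti's internals).
# pip install "psycopg[binary]" pgvector numpy
import numpy as np
import psycopg
from pgvector.psycopg import register_vector

with psycopg.connect("dbname=catalog") as conn:
    conn.execute("CREATE EXTENSION IF NOT EXISTS vector")
    register_vector(conn)  # teach the driver about the vector type
    conn.execute("""
        CREATE TABLE IF NOT EXISTS catalog_assets (
            id bigserial PRIMARY KEY,
            name text NOT NULL,
            embedding vector(384)  -- must match the embedding model's output size
        )
    """)

    # Stand-in for a real embedding produced by the embedding service
    question_embedding = np.random.rand(384).astype(np.float32)

    # "<=>" is pgvector's cosine-distance operator; closest entries first
    rows = conn.execute(
        "SELECT name FROM catalog_assets ORDER BY embedding <=> %s LIMIT 5",
        (question_embedding,),
    ).fetchall()
```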
CONCLUSION

We are excited to release our implementation of retrieval augmented generation for Coginiti Team and Enterprise customers! Being able to combine domain-specific customer data with the existing power of generative large language models is a powerful capability. We look forward to learning with our customers how it improves their workflows and accuracy when working with Coginiti's AI Assistant. If you would like to give it a try or see a demo, reach out to schedule a call.