



IMPLEMENTING RETRIEVAL AUGMENTED GENERATION IN COGINITI

Matthew Mullins
March 15, 2024

Coginiti has implemented Retrieval Augmented Generation (RAG) for Coginiti Team
and Enterprise customers to improve the quality of AI Assistant interactions.
Retrieval Augmented Generation is a way to enhance the quality of responses from
a large language model by supplying it with relevant domain knowledge that was
not part of its original training data. What follows are some of the technical
details about what Retrieval Augmented Generation is, how it works, and how we
went about implementing it in Coginiti. 


COGINITI’S AI ASSISTANT

Coginiti’s AI Assistant today is a direct integration with the model services
behind a number of large language models, combined with some proprietary
prompting. Coginiti ships with support for the public APIs from OpenAI and
Anthropic, along with cloud model services such as AWS Bedrock and Azure.
Coginiti’s strategy is to support customer model choices the same way we
support their choices around data platforms.

When a user submits a question to Coginiti’s AI Assistant, we enrich the context
of the session with some default prompts. These prompts include the name of the
connected data platform, such as AWS Redshift or Databricks, which helps the
model generate syntax appropriate to that platform when writing SQL. Coginiti
also provides a map of the connected schema, including table names, column
names, and primary key relationships. Together, these prompts increase the
accuracy of the model’s responses.
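As a rough illustration, this enrichment step amounts to folding the platform name and a schema map into a system prompt before the user’s question reaches the model. The function and field names below are hypothetical, not Coginiti’s actual implementation:

```python
# Sketch of context enrichment: the connected platform name and a schema map
# become part of the system prompt sent alongside the user's question.
# All names here are illustrative, not Coginiti's actual code.

def build_context_prompt(platform: str, schema: dict[str, list[str]]) -> str:
    lines = [f"You are generating SQL for {platform}. Use its dialect and syntax."]
    lines.append("The connected schema is:")
    for table, columns in schema.items():
        lines.append(f"  - {table}({', '.join(columns)})")
    return "\n".join(lines)

prompt = build_context_prompt(
    "AWS Redshift",
    {"orders": ["order_id", "customer_id", "order_date", "total"],
     "customers": ["customer_id", "name", "region"]},
)
print(prompt)
```

A prompt like this is what lets the model pick, say, Redshift’s `DATE_TRUNC` semantics rather than another platform’s, and reference real table and column names instead of guessing.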



We test these prompts against various database schemas across each of our
supported model services for consistent results, though language models are
nondeterministic, so performance can vary. Coginiti also lets users create
static custom prompts, so they can tailor the interaction to the model for
their own needs. (Yes, you can make it talk like a pirate.)

Coginiti’s AI Assistant gives consistently good results for text-to-SQL
generation, error explanations, performance recommendations, and general data
guidance. However, we wanted it to handle more complex questions about the
business, and for that the language model needs more exposure to information
about the business. One approach would be to fine-tune a model on information
from the business, but fine-tuning is time consuming, expensive, and somewhat
brittle. (If you want help with that, though, reach out.) Retrieval Augmented
Generation is a less expensive and more flexible way to get contextually
relevant information to the model during the user interaction.


WHAT IS RETRIEVAL AUGMENTED GENERATION

Retrieval Augmented Generation could just as well have been called Search
Augmented Generation, because it involves searching a collection of data for
relevant results, then injecting those results into the context of the user’s
chat session. The Coginiti product was well positioned for this kind of
implementation because it ships with an embedded repository of relevant data:
our Analytics Catalog. The Analytics Catalog stores all your data and analytics
assets, from critical transformation code and cleansing routines to query
logic, along with relevant metadata in the form of documentation, comments, and
tags. (Hint: if you are not enhancing your catalog assets with metadata today,
start now so you can fully leverage the RAG capabilities.)



Coginiti could easily have enabled keyword search across catalog assets, since
it already ships with a search engine, but keyword search is limiting.
Traditional keyword search is like using a flashlight in the dark to find
something specific: you might find items that are labeled correctly but miss
related ones. Semantic search casts a broader beam, illuminating not just the
exact words but also associated concepts and meanings, which improves search
accuracy by capturing the meaning of the query. For example, an analyst
searching for “holiday sales trends” with keyword search would receive a mix of
all queries mentioning “holiday,” “sales,” or “trends,” which could include
irrelevant results. With semantic search, the system understands the context of
“holiday sales trends” and recognizes the analyst’s likely intent: analyzing
customer purchasing behavior during holiday seasons.

Semantic search algorithms create a vector space composed of embeddings of
every entry in a context corpus, in our case the Analytics Catalog. That’s a
complicated way of saying that embeddings translate words or phrases into a
language that computers can work with. Just as each word in a dictionary has a
definition, in the world of AI each word or phrase gets a unique numeric ‘code’
that captures its meaning based on how it’s used in the real world. When a user
submits a question, the question is embedded into the same vector space.
Semantically similar entries sit closer together in that space, so the closest
embedding (or the ‘k’ closest embeddings) to the question is returned,
providing the best results for the question. (For a deep read on embeddings,
see Vicki Boykis on What are Embeddings.)
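The retrieval step above can be sketched in a few lines. The three-dimensional vectors below are made up purely to show the mechanics; a real embedding model outputs hundreds or thousands of dimensions, and the corpus entries and query vector here are hypothetical:

```python
import numpy as np

# Toy corpus "embeddings" standing in for embedded catalog assets.
# Real embeddings have hundreds of dimensions; these are illustrative only.
corpus = {
    "holiday sales query":     np.array([0.9, 0.1, 0.0]),
    "customer churn model":    np.array([0.1, 0.9, 0.1]),
    "seasonal revenue trends": np.array([0.8, 0.2, 0.1]),
}

def top_k(query_vec: np.ndarray, k: int = 2) -> list[str]:
    """Return the k corpus entries closest to the query by cosine similarity."""
    def cosine(a: np.ndarray, b: np.ndarray) -> float:
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    ranked = sorted(corpus, key=lambda name: cosine(query_vec, corpus[name]),
                    reverse=True)
    return ranked[:k]

# A question about "holiday sales trends", embedded into the same space,
# lands near the sales-related assets rather than the churn model.
query = np.array([0.85, 0.15, 0.05])
print(top_k(query))
```

The k results returned here are what gets injected into the chat context before the question is sent to the language model.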


COGINITI AI ASSISTANT + RAG

Coginiti’s architecture imposes a number of design constraints on new services.
Coginiti is deployed software, so any of its services must be small enough to
run containerized on a single server yet able to scale out to serve thousands
of users. Many of Coginiti’s customers deploy into highly secure environments,
limiting our ability to call outside services. We performed our initial testing
and proof of concept using OpenAI’s embedding API, but as a practical matter
Coginiti cannot rely on such public services in production. We needed an
embeddings model that could run in a container and be reasonably fast on CPUs
rather than GPUs. We ended up testing six model families:

| Model name | Token limit | Model size (GB) | Embedding dimensions (output vector size) | Time to embed 1024 tokens (s) | Time to process Analytics Catalog (s) | MTEB leaderboard rank (Retrieval) |
|---|---|---|---|---|---|---|
| all-MiniLM-L6-v2 | 256 | 0.09 | 384 | 0.16 | ~6.5 | 51 |
| UAE-Large-V1 | 512 | 1.34 (0.33 quantized) | 1024 | 1.08 | 161 | 3 |
| bge-large-en-v1.5 | 512 | 1.34 | 1024 | 1.09 | ~200 | 4 |
| bge-base-en-v1.5 | 512 | 0.44 | 768 | 0.34 | 49.9 | 6 |
| bge-small-en-v1.5 | 512 | 0.11 | 384 | 0.24 | 34 | 10 |
| gte-large | 512 | 0.67 | 1024 | 1.08 | 155 | 7 |
| gte-base | 512 | 0.22 | 768 | 0.24 | 54 | 13 |
| gte-small | 512 | 0.07 | 384 | 0.20 | 37 | 20 |
| gte-tiny | 512 | 0.05 | 384 | 0.12 | 14.7 | 41 |
| udever-bloom-7b1 | 2048 | 28.8 | 4096 | N/A (couldn’t run it on CPU) | N/A | 21 |
| voyage-lite-01-instruct | 4096 | N/A (cloud service) | 1024 | N/A | N/A | 1 |

To create our embeddings, Coginiti selected the BGE-M3 embedding model series,
a state-of-the-art multi-lingual and cross-lingual model. Multi-lingual and
cross-lingual support is important because our embeddings consist of
specialized SQL alongside metadata that might be written in a variety of
languages. The model’s small size, acceptable token limit, and speed were also
critical factors. This embedding model runs as a standalone service within the
Coginiti stack, so no data is sent outside the application stack, and it gives
us a consistent embeddings tool across all of our clients.

The embeddings for each catalog asset also need to be stored in a way that lets
Coginiti easily perform vector similarity search. There are a number of
databases purpose-built for vector storage, Pinecone and Weaviate being two of
the leading candidates. Using a standalone vector store, however, would have
meant adding yet another service to our stack, increasing management complexity
and the resources required to run the product. Fortunately, Coginiti uses
Postgres as a backend storage layer, and its pg_vector extension enables
storing embeddings and performing vector similarity search directly within
Postgres. Our testing showed that pg_vector was more than adequate for our
vector storage and search needs. As of this writing, pg_vector is available
with the managed Postgres services from every major cloud provider, and it is
easily installable for self-managed users.
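As a sketch of what storage and search with pg_vector look like (the table and column names are illustrative, not Coginiti’s actual schema; the vector dimension must match the embedding model’s output, and the literal in the query is truncated for readability):

```sql
-- Enable the extension and store one embedding per catalog asset.
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE catalog_asset_embedding (
    asset_id  bigint PRIMARY KEY,
    embedding vector(1024)   -- dimension matches the embedding model's output
);

-- Retrieve the 5 assets nearest to a query embedding.
-- <=> is pg_vector's cosine distance operator (smaller means more similar).
SELECT asset_id
FROM catalog_asset_embedding
ORDER BY embedding <=> '[0.12, -0.03, ...]'::vector
LIMIT 5;
```

Because this runs inside the same Postgres instance the product already depends on, the similarity search adds no new service to deploy or secure.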




CONCLUSION

We are excited to release our implementation of retrieval augmented generation
for Coginiti Team and Enterprise customers! Combining domain-specific customer
data with the existing power of generative large language models is a powerful
combination. We look forward to learning with our customers how it improves
their workflows and accuracy when working with Coginiti’s AI Assistant. If you
would like to try it or see a demo, reach out to schedule a call.
