Accurate AI Responses with Peaka's RAG Pipeline and Upstash Vector
Problem: Generative AI Hallucinations and Accuracy
The era of artificial intelligence (AI) is here, presenting vast opportunities for harnessing data for business purposes. However, these prospects are not without their hurdles, including hallucinations, accuracy concerns, and potential security issues.
AI hallucinations can be amusing when you are chit-chatting with your favorite AI bot. But if you want to build generative AI applications for your specific business needs, you want to minimize the risk of AI hallucination. To achieve this, you need to feed domain- or product-specific information (context) into the model, which will be used to generate responses relevant to your needs.
Solution: RAG
One powerful technique that has emerged to address these challenges and improve the accuracy of AI-generated text is Retrieval-Augmented Generation (RAG), which combines data retrieval with text generation. RAG supplies the model with the context it needs and tells it what to look for, which improves precision in data retrieval. By establishing clear limits on which data to use and which to ignore, RAG helps the AI generate precise, accurate, and contextually relevant responses, mitigating the hallucination and security risks.
Building the optimal data stack for RAG
The effectiveness of RAG is directly linked to the efficiency of data retrieval. The more accurate and relevant the retrieved data, the better the context provided to the AI model, leading to more precise and contextually appropriate responses.
The contextual information that you will feed into the AI model comes from a number of database indexes.
- Full-text indexes lend themselves to semantic searches, where the context of a query is important, or fuzzy searches, where the query lacks precision due to spelling mistakes or variations in expressions.
- Vector databases are good at spotting data points with similar attributes based on vector distance.
- Graph databases, on the other hand, are perfect for capturing relationships between different entities and conducting network analysis.
All these databases have relative advantages and disadvantages compared to others. You need to use them in various combinations, mixing and matching to create the optimum data stack for your needs. How much of the context comes from which type of database depends on the actual domain. Finding the optimum balance between different databases is key to creating the best possible context and generating the most relevant result.
However, this is easier said than done. Pulling in data from different platforms and tools can turn into a multistep operation, which users are not keen on. Users want data retrieval to be a single-step process that takes fuzziness and semantic relationships into account.
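To make the idea of "vector distance" mentioned above a little more concrete, here is a minimal sketch of a similarity search over embeddings using cosine similarity. It is an in-memory illustration only: the `Embedded` type and the `topK` helper are hypothetical names introduced for this example, and a real vector database would run the same kind of comparison at scale using its own indexes.

```typescript
// A stored item with its embedding (hypothetical shape, for illustration only).
interface Embedded {
  id: string;
  vector: number[];
}

// Cosine similarity between two equal-length vectors.
// Returns a value in [-1, 1]; higher means the vectors point in more similar directions.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length || a.length === 0) {
    throw new Error("Vectors must be non-empty and of equal length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    throw new Error("Cosine similarity is undefined for a zero vector");
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank all items by similarity to the query vector and return the ids of the k best matches.
function topK(query: number[], items: Embedded[], k: number): string[] {
  return items
    .map((item) => ({ id: item.id, score: cosineSimilarity(query, item.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((x) => x.id);
}
```

In a RAG setting, the query vector would be the embedding of the user's question, and the returned ids would point at the documents to place in the model's context.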
Single-step context creation with Peaka
This is where Peaka comes in, as it offers an efficient data retrieval process to guide AI models. Establishing a semantic layer over various data sources, Peaka dynamically builds the optimum data stack required for the context. It even goes beyond accessing databases, connecting to SaaS tools and retrieving SaaS data without having to move it to a database.
By taking care of data retrieval and context preparation in a single SQL query, Peaka eliminates the need for multi-step operations and development that RAG requires. It provides AI models with precise contextual information so they can generate accurate and hallucination-free results for product-specific tasks.
In this blog post, we will build a RAG pipeline for a movie recommendation chatbot using Peaka. We will add PostgreSQL and Upstash Vector as connections to Peaka and use Peaka’s Node Client to query data for the RAG pipeline.
The basic architecture will look like below:
TL;DR
If you want to skip the implementation details, check out the finished code on GitHub. Follow the instructions in the README to run the project on your local machine.
If you want to try the demo yourself, we have deployed the project on Vercel. You can open it by clicking this link.
Prerequisites
You will need the following accounts for this project:
Tech Stack
Technology | Description |
---|---|
https://www.peaka.com/ | A zero-ETL data integration platform with single-step context generation capability |
https://upstash.com/docs/vector/overall/getstarted | The serverless vector database which will be used for storing vector embeddings. |
https://openai.com/ | An artificial intelligence research lab focused on developing advanced AI technologies. |
https://vercel.com/templates/ai | Library for building AI-powered streaming text and chat UIs. |
https://www.postgresql.org/ | A powerful, open source object-relational database system with over 35 years of active development that has earned it a strong reputation for reliability, feature robustness, and performance. |
https://docs.nlkit.com/nlux/ | NLUX is an open-source JavaScript library for creating elegant and performant conversational user interfaces. |
https://nextjs.org/ | The React Framework for the Web. Nextjs will be used for building the chatbot app. |
Create Peaka Project and API Key
Once you log in to Peaka, you need to create your project and connect the sample data sets. Check out the Peaka Documentation for creating your project and follow the detailed instructions by clicking here. Enter the project you created and click the Connect sample data sets button on the screen, as shown in the image below:
In the sample data set, both the PostgreSQL and Upstash Vector data sources are already added. We will use these data sources for our demo app.
After you create your project, set up connections, and create your catalogs in Peaka, you need to generate a Peaka API Key to use in our project.
Check out Peaka Documentation for creating your API Key and follow the detailed instructions by clicking here. Copy and save your Peaka API Key.
Create a Next.js Application
To create a new project, first use your terminal to navigate to the directory in which you want to create it. Then, run the following commands:
- Use Tailwind CSS (for UI design)
- Use "App Router"
After this step, go to your project's folder with the cd command and install the necessary libraries.
From now on, when you are in your project's directory, the command npm run dev is sufficient to run the project on localhost:3000.
If you need further clarification, you can refer to the README or the Next.js documentation.
Create .env file
Now create a file called .env in your project, and add it to the .gitignore file if you plan to push this project to GitHub. In this file, we will store our API keys.
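As a small sketch of how the keys stored in .env could be read safely at startup, the helper below fails fast when a variable is missing instead of letting an undefined key surface later as a confusing API error. The helper name `requireEnv` and the variable names `PEAKA_API_KEY` and `OPENAI_API_KEY` are assumptions made for this example; use whatever names you put in your own .env file.

```typescript
// Reads a required variable from an environment-like object and fails fast if it is absent or blank.
// The env object is passed in explicitly so the helper is easy to test.
function requireEnv(name: string, env: Record<string, string | undefined>): string {
  const value = env[name];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Example usage inside a Next.js server-side module
// (PEAKA_API_KEY and OPENAI_API_KEY are assumed names, not ones mandated by either service):
//
//   const peakaApiKey = requireEnv("PEAKA_API_KEY", process.env);
//   const openAiApiKey = requireEnv("OPENAI_API_KEY", process.env);
```

Next.js loads .env files into `process.env` on the server, so no extra loader should be needed for code that runs in API routes.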
Create Peaka Service
Create a service folder under the root folder of your project and create a peaka.service.ts file inside this folder. We will need to implement two methods in this service class. The first method is getRentedMoviesOfUser, which will fetch all the movies rented by a customer. It will query the PostgreSQL database with the following SQL query.
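As a rough sketch of what such a lookup could look like, the snippet below builds a query that joins customers to the films they have rented. Everything schema-related here is an assumption: the catalog, schema, table, and column names are placeholders modelled on a typical DVD-rental layout, not the actual names in Peaka's sample data set, and `SqlRunner` is a hypothetical stand-in for whatever client object executes the query.

```typescript
// Hypothetical stand-in for a SQL client; the real Peaka client's API may differ.
interface SqlRunner {
  query(sql: string): Promise<Record<string, unknown>[]>;
}

// Placeholder table identifiers (assumed, not taken from the sample data set).
const TABLES = {
  customer: '"postgres"."public"."customer"',
  rental: '"postgres"."public"."rental"',
  inventory: '"postgres"."public"."inventory"',
  film: '"postgres"."public"."film"',
};

// Escapes single quotes so a user-supplied value cannot break out of a SQL string literal.
function escapeSqlLiteral(value: string): string {
  return value.replace(/'/g, "''");
}

// Builds the query that lists every distinct film rented by the customer with the given email.
function buildRentedMoviesQuery(email: string): string {
  return [
    "SELECT DISTINCT f.film_id, f.title",
    `FROM ${TABLES.customer} c`,
    `JOIN ${TABLES.rental} r ON r.customer_id = c.customer_id`,
    `JOIN ${TABLES.inventory} i ON i.inventory_id = r.inventory_id`,
    `JOIN ${TABLES.film} f ON f.film_id = i.film_id`,
    `WHERE c.email = '${escapeSqlLiteral(email)}'`,
  ].join("\n");
}

// Runs the query through the (hypothetical) client and returns the raw rows.
async function getRentedMoviesOfUser(
  runner: SqlRunner,
  email: string
): Promise<Record<string, unknown>[]> {
  return runner.query(buildRentedMoviesQuery(email));
}
```

Keeping the query construction in its own function makes it easy to unit test without a live connection. If the client you use supports parameterized queries, prefer those over manual escaping.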
Then we need to implement the getMovieRecommendationFromVectorDatabase method, which will query the Upstash Vector index and join the results with the PostgreSQL database to get all of the metadata of the movies. The query will be like this:
Finally, peaka.service.ts will look like this:
Create Chat Prompts
Create a config folder in the root directory of the project and create a config.ts file under this folder. Here we will define the system prompt and the OpenAI parameters for our chatbot using lodash templates, and export them like below:
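To illustrate the idea of a templated system prompt, here is a minimal, dependency-free sketch. It is not lodash itself: `renderTemplate` is a small stand-in that only handles the `<%= name %>` interpolation syntax that lodash templates also use, and the prompt wording and the placeholder names `rentedMovies` and `recommendedMovies` are assumptions made for this example.

```typescript
// Replaces every <%= key %> placeholder with the matching entry from values.
// Throws if a placeholder has no value, so a half-filled prompt is never sent to the LLM.
function renderTemplate(template: string, values: Record<string, string>): string {
  return template.replace(/<%=\s*(\w+)\s*%>/g, (_match: string, key: string) => {
    if (!Object.prototype.hasOwnProperty.call(values, key)) {
      throw new Error(`Missing template value: ${key}`);
    }
    return values[key];
  });
}

// Hypothetical system prompt; the wording and placeholder names are illustrative only.
const SYSTEM_PROMPT_TEMPLATE = [
  "You are a movie recommendation assistant.",
  "Recommend only from this list of candidate movies: <%= recommendedMovies %>.",
  "The user has already rented these movies, so do not recommend them: <%= rentedMovies %>.",
].join("\n");

// Fills the system prompt with the context gathered by the retrieval step.
function buildSystemPrompt(recommendedMovies: string[], rentedMovies: string[]): string {
  return renderTemplate(SYSTEM_PROMPT_TEMPLATE, {
    recommendedMovies: recommendedMovies.join(", "),
    rentedMovies: rentedMovies.join(", "),
  });
}
```

Keeping the prompt text in one config file means the wording can be tuned without touching the API route.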
Implement Chatbot API
In the chatbot, we should first create a POST endpoint. The input of this endpoint is the user's message, and the output is the message generated by the LLM running on OpenAI.
First, we will create route.ts under app/api/chat. This file will expose the POST endpoint at the /api/chat path.
We will use Vercel's AI SDK for response streaming and the open-source LangChain library to interact with the LLM.
The code is straightforward and follows this algorithm:
- Get the prompt from the request body
- Get all movies rented by the user, using the user's email
- Get recommended movies by querying Upstash Vector
- Ask OpenAI to extract SearchCriteria from the query
- Filter the recommended movies by the SearchCriteria rating only
- If a recommended movie is already rented, remove it from the recommended movie list
- Finally, feed the recommended movies and the already rented movies to the LLM and stream the LLM's response to the frontend
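The two filtering steps in the list above can be sketched as a single pure function. The `Movie` and `SearchCriteria` shapes shown here are assumptions made for this example (the real types in the project may carry more fields), and `selectRecommendations` is a hypothetical helper name.

```typescript
// Assumed minimal shapes; the project's real types may differ.
interface Movie {
  id: number;
  title: string;
  rating: string;
}

interface SearchCriteria {
  // Rating extracted from the user's prompt by the LLM, if the prompt mentioned one.
  rating?: string;
}

// Keeps only recommended movies that match the requested rating (when one was given)
// and that the user has not rented before.
function selectRecommendations(
  recommended: Movie[],
  rented: Movie[],
  criteria: SearchCriteria
): Movie[] {
  const rentedIds = new Set(rented.map((movie) => movie.id));
  return recommended.filter((movie) => {
    const matchesRating = criteria.rating === undefined || movie.rating === criteria.rating;
    return matchesRating && !rentedIds.has(movie.id);
  });
}
```

Doing this filtering in code, rather than asking the LLM to do it, keeps the context passed to the model small and makes the rule easy to test.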
The implementation of the api/chat route should look like this:
Implement Chatbot UI
We have the POST endpoint ready. Now we need the UI for our chatbot. For the UI, we will use NLUX. We chose NLUX because it provides easy integration with the Vercel AI SDK.
Let's open the page.tsx file to build our chat window. The following code will implement a very basic chatbot UI for this demo. We will use the AiChat component from NLUX and implement the ChatAdapter interface in order to communicate with the backend. Then, we provide conversationOptions to our AiChat component, which will supply built-in chat prompts for demo purposes.
After we finish our implementation, the UI should look like this:
Let’s try one of our sample prompts and see what our movie recommendation bot recommends:
As you can see, our chatbot recommended a movie according to our search criteria by combining movie data from PostgreSQL and Upstash Vector with Peaka’s query engine. Our simple RAG pipeline feeds the LLM with the names of movies that do not exist in OpenAI's training data.
Conclusion
In this tutorial, we’ve demonstrated how you can build a RAG pipeline using Peaka and Upstash Vector. By leveraging Peaka’s unified query engine and Upstash Vector's efficient similarity search capabilities, we provided the necessary data to the LLM's context window based on the user's query to ensure accurate answers and avoid hallucinations.
If you have any questions or comments, feel free to reach out to me on GitHub.