Use LangChain.js and Azure OpenAI to create a QA RAG web application. In this quick read you will learn how to leverage Node.js to build AI-powered applications.
Demo: Mpesa for Business Setup QA RAG Application
In this tutorial we are going to build a Question-Answering RAG chat web app. We use Node.js on the backend and HTML, CSS, and JavaScript on the front end, and we incorporate LangChain.js + Azure OpenAI + a MongoDB vector store (MongoDB Atlas Search index). Get a quick look below.
Note: The documents and illustrations shared here are for demo purposes only; Mpesa is not affiliated with Microsoft or its products. The content demonstrated here should be used for educational purposes only. Additionally, all views shared here are solely mine.
What you will need:
- An active Azure subscription: get Azure for Students for free, or get started with Azure with 12 months of free services.
- VS Code
- Basic knowledge of JavaScript (not a must)
- Access to Azure OpenAI, click here if you don't have access.
- A MongoDB Atlas account (you can also use the Azure Cosmos DB vector store)
Setting Up the Project
To build this project, you will have to fork this repository and clone it. GitHub repository link: https://github.com/tiprock-network/azure-qa-rag-mpesa . Follow the steps highlighted in the README.md to set up the project under Setting Up the Node.js Application.
Create Resources that you Need
To do this, you will need Azure CLI or Azure Developer CLI installed on your computer. Go ahead and follow the steps indicated in the README.md to create the Azure resources under Azure Resources Set Up with Azure CLI.
You might want to log in to Azure CLI differently, using a device code. Instead of a plain az login, you can do
az login --use-device-code
Or, if you prefer Azure Developer CLI, execute this command instead
azd auth login --use-device-code
Remember to update the .env file with the values you used to name your Azure OpenAI instance and model deployments, as well as the API keys you obtained while creating your resources.
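As an illustration, the .env file might look like the following. The variable names match those referenced in the code snippets later in this article, but every value shown here is a placeholder, not a real one.

```
# Placeholder values - replace with your own resource names, deployments, and keys
AZURE_OPENAI_API_INSTANCE_NAME=your-azure-openai-instance
AZURE_OPENAI_API_DEPLOYMENT_NAME=your-chat-model-deployment
AZURE_OPENAI_API_DEPLOYMENT_EMBEDDING_NAME=your-embedding-model-deployment
AZURE_OPENAI_API_VERSION=2024-06-01
BASE_URL=/api/v1
```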
Setting Up MongoDB
After logging in to your MongoDB account, get the URI connection string to your database and add it to the .env file, along with your database name and the vector store collection name you specified while creating your index for vector search.
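For reference, a MongoDB Atlas vector search index definition compatible with the field and index names used later in this article (embedding as the vector field, myrag_index as the index name) could look like the following. Treat the values as assumptions to adapt: numDimensions must match your embedding model's output size (1536 for text-embedding-ada-002), and the similarity metric is your choice.

```
{
  "fields": [
    {
      "type": "vector",
      "path": "embedding",
      "numDimensions": 1536,
      "similarity": "cosine"
    }
  ]
}
```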
Running the Project
To run this Node.js project, start it with the following command.
npm run dev
The Vector Store
The vector store used in this project is MongoDB Atlas, where the embeddings are stored. From the embeddings model instance we created in Azure AI Foundry, we are able to create embeddings that can be stored in a vector store. The following code shows our embeddings model instance.
//create a new embedding model instance
const azOpenEmbedding = new AzureOpenAIEmbeddings({
    azureADTokenProvider,
    azureOpenAIApiInstanceName: process.env.AZURE_OPENAI_API_INSTANCE_NAME,
    azureOpenAIApiEmbeddingsDeploymentName: process.env.AZURE_OPENAI_API_DEPLOYMENT_EMBEDDING_NAME,
    azureOpenAIApiVersion: process.env.AZURE_OPENAI_API_VERSION,
    azureOpenAIBasePath: "https://eastus2.api.cognitive.microsoft.com/openai/deployments"
});
The code in uploadDoc.js offers a simple way to create embeddings and store them in MongoDB. In this approach, the text from the documents is loaded using the PDFLoader from the LangChain community package. The following code demonstrates how the embeddings are stored in the vector store.
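The repo's returnSplittedContent helper isn't shown here; conceptually, splitting boils down to cutting the loaded text into overlapping chunks so that context isn't lost at chunk boundaries. Here is a minimal, self-contained sketch of that idea; the chunk size and overlap values are illustrative, not the repo's actual settings.

```javascript
// Simplified stand-in for a text splitter: break text into fixed-size
// chunks with overlap so context carries across chunk boundaries.
// chunkSize/chunkOverlap values here are illustrative only.
function splitText(text, chunkSize = 1000, chunkOverlap = 200) {
    const chunks = [];
    let start = 0;
    while (start < text.length) {
        chunks.push(text.slice(start, start + chunkSize));
        // stop once the current chunk reaches the end of the text
        if (start + chunkSize >= text.length) break;
        // step forward, keeping chunkOverlap characters of context
        start += chunkSize - chunkOverlap;
    }
    return chunks;
}
```

Each chunk produced this way is later embedded individually, which is why overlap matters: a sentence cut in half at a boundary still appears whole in one of the two neighboring chunks.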
// Call the function and handle the result with await
const storeToCosmosVectorStore = async () => {
    try {
        const documents = await returnSplittedContent()

        //create a store instance
        const store = await MongoDBAtlasVectorSearch.fromDocuments(
            documents,
            azOpenEmbedding,
            {
                collection: vectorCollection,
                indexName: "myrag_index",
                textKey: "text",
                embeddingKey: "embedding",
            }
        )

        if (!store) {
            console.log('Something went wrong while creating or getting the store!')
            return false
        }

        console.log('Done creating/getting and uploading to store.')
        return true
    } catch (e) {
        console.log(`This error occurred: ${e}`)
        return false
    }
}
In this setup, Question Answering (QA) is achieved by integrating Azure OpenAI’s GPT-4o with MongoDB Vector Search through LangChain.js. The system processes user queries via an LLM (Large Language Model), which retrieves relevant information from a vectorized database, ensuring contextual and accurate responses. Azure OpenAI Embeddings convert text into dense vector representations, enabling semantic search within MongoDB. The LangChain RunnableSequence structures the retrieval and response generation workflow, while the StringOutputParser ensures proper text formatting. The key pieces of the code are the AzureChatOpenAI instantiation, the MongoDB connection setup, and the API endpoint that handles QA queries using vector search and embeddings. The code snippets below explain the major parts of the code.
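To build intuition for what the vector search is doing server-side, here is a simplified, self-contained sketch of similarity ranking. In the real application MongoDB Atlas performs this search through the index, so this is illustrative only; the document shape ({ text, embedding }) mirrors the textKey/embeddingKey fields used above.

```javascript
// Illustrative only: rank stored documents by cosine similarity to a
// query embedding. MongoDB Atlas does this server-side via the index.
function cosineSimilarity(a, b) {
    let dot = 0, normA = 0, normB = 0;
    for (let i = 0; i < a.length; i++) {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k documents whose embeddings are closest to the query's.
function topK(queryEmbedding, docs, k = 3) {
    return docs
        .map((d) => ({ ...d, score: cosineSimilarity(queryEmbedding, d.embedding) }))
        .sort((x, y) => y.score - x.score)
        .slice(0, k);
}
```

The retrieved top-k chunks are what get stuffed into the prompt as context, which is what makes the LLM's answers grounded in the uploaded documents.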
Azure AI Chat Completion Model
This is the model used in this RAG implementation for chat completion. Below is a code snippet for it.
const llm = new AzureChatOpenAI({
    azTokenProvider,
    azureOpenAIApiInstanceName: process.env.AZURE_OPENAI_API_INSTANCE_NAME,
    azureOpenAIApiDeploymentName: process.env.AZURE_OPENAI_API_DEPLOYMENT_NAME,
    azureOpenAIApiVersion: process.env.AZURE_OPENAI_API_VERSION
})
Using a Runnable Sequence to Stream Chat Output
This shows how a runnable sequence can be used to stream a response in the format defined by the output parser added to the chain.
//Stream response
app.post(`${process.env.BASE_URL}/az-openai/runnable-sequence/stream/chat`, async (req, res) => {
    //check for a human message
    const { chatMsg } = req.body
    if (!chatMsg) return res.status(400).json({
        message: 'Hey, you didn\'t send anything.'
    })

    //put the code in an error handler
    try {
        //create a prompt template
        const prompt = ChatPromptTemplate.fromMessages(
            [
                ["system", `You are a French-to-English translator that detects if a message isn't in French. If it's not, you respond, "This is not French." Otherwise, you translate it to English.`],
                ["human", `${chatMsg}`]
            ]
        )

        //runnable chain - outPutParser is the StringOutputParser created earlier
        const chain = RunnableSequence.from([prompt, llm, outPutParser])

        //stream the chain result - the prompt is already fully formed, so no input variables are needed
        let result_stream = await chain.stream({})

        //set response headers
        res.setHeader('Content-Type', 'application/json')
        res.setHeader('Transfer-Encoding', 'chunked')

        //create a readable stream
        const readable = Readable.from(result_stream)
        res.status(200).write(`{"message": "Successful translation.", "response": "`)
        readable.on('data', (chunk) => {
            //escape the chunk so the surrounding JSON stays valid
            res.write(JSON.stringify(String(chunk)).slice(1, -1))
        })
        readable.on('end', () => {
            //close the JSON response properly
            res.write('" }')
            res.end()
        })
        readable.on('error', (err) => {
            console.error('Stream error:', err)
            if (!res.headersSent) res.status(500).json({ message: 'Translation failed.', error: err.message })
            else res.end()
        })
    } catch (e) {
        //deliver a 500 error response, unless streaming has already begun
        if (!res.headersSent) return res.status(500).json({
            message: 'Failed to send request.',
            error: e
        })
        res.end()
    }
})
To run the front end of the code, go to your BASE_URL with the given port. This lets you run the chatbot above and achieve similar results. The chatbot front end is plain HTML, CSS, and JS, where JavaScript mainly uses the Fetch API to get a response. Thanks for reading. I hope you play around with the code and learn some new things.
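On the front end, consuming the chunked response incrementally might look like the following helper; the endpoint path and request body shape in the usage comment are assumptions based on the server snippet above, not code from the repo.

```javascript
// Hypothetical client-side helper: reads a fetch Response body chunk by
// chunk and calls onChunk with each decoded piece of text as it arrives,
// so the UI can render the answer while it is still streaming.
async function readChunkedBody(body, onChunk) {
    const reader = body.getReader();
    const decoder = new TextDecoder();
    let full = '';
    while (true) {
        const { done, value } = await reader.read();
        if (done) break;
        const text = decoder.decode(value, { stream: true });
        full += text;
        onChunk(text);
    }
    return full;
}

// Usage sketch (endpoint path and field name assumed from the server code):
// const res = await fetch('/az-openai/runnable-sequence/stream/chat', {
//     method: 'POST',
//     headers: { 'Content-Type': 'application/json' },
//     body: JSON.stringify({ chatMsg: 'Bonjour' }),
// });
// await readChunkedBody(res.body, (t) => appendToChatWindow(t));
```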
Additional Reads
Updated Mar 10, 2025
Version 1.0
theophilusO
Educator Developer Blog