When people first discover ChatGPT, one of the first things they do, myself included, is ask it questions. Since the model is trained on a fixed dataset, it often gives incomplete answers that do not take recent events into account. Its responses are nonetheless generally thorough and well-formulated. It could answer many more questions, and far more accurately, if it were provided with the knowledge needed to respond.
And that is exactly what I want to share with you today: how to build your own text assistant that uses an existing knowledge base to answer your questions.
For this tutorial, I will be using the OpenAI API. OpenAI currently does not use data received through their API to train their models (see this article). However, because a service's terms of use can change, I advise against sending overly sensitive data through the OpenAI API, or through any other LLM that is not self-hosted.
With that out of the way, let us get started with the tutorial.
In this article, I want to show you a simple and quick way to build this assistant, and for that we will need two main components:
- a mechanism to retrieve relevant elements from your knowledge base;
- ChatGPT with the right prompts.

For this project, you need a way to retrieve elements from the knowledge base in text format. The purpose of this mechanism is to determine, from the sentence the user sends to your assistant, which elements of your knowledge base can answer it, and to gather them together.
Since ChatGPT models limit the number of tokens you can send, and therefore the number of characters (1 token is roughly equivalent to 1 syllable), you need to retrieve only the elements that are useful for answering the user's request.
You will therefore need a service that searches through your various data sources for the relevant information. For instance, you can use an indexing engine like ElasticSearch, Apache Solr, Lucene, or another solution to query and retrieve information. You can also use cloud services such as Azure Cognitive Search or AWS CloudSearch.
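Before wiring up a full search engine, you can prototype this retrieval step in memory. Here is a minimal sketch that scores documents by keyword overlap with the user's question; in a real deployment this job would be delegated to ElasticSearch or a similar engine, and the function name here is purely illustrative:

```typescript
// Score each document by how many significant words of the question
// it contains, then return the top-k matches.
// A crude in-memory stand-in for a real search engine.
function retrieve(question: string, documents: string[], k: number): string[] {
  const words = question
    .toLowerCase()
    .split(/\W+/)
    .filter((w) => w.length > 2); // drop short stop-words like "me", "to"
  return documents
    .map((doc) => {
      const lower = doc.toLowerCase();
      const score = words.filter((w) => lower.includes(w)).length;
      return { doc, score };
    })
    .filter((d) => d.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map((d) => d.doc);
}
```

The same interface (question in, list of relevant texts out) can later be backed by whichever engine you choose.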
Now that you have retrieved the data, you need to send it to ChatGPT along with the user's request.
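Because of the token limit mentioned earlier, it is worth trimming the retrieved passages before sending them. A minimal sketch, using a rough characters-per-token ratio (the value 4 is a common rule of thumb for English text, not an exact figure; for exact counts you would use a tokenizer):

```typescript
// Rough token estimate: ~4 characters per token for English text.
// An approximation only; exact counts require the model's tokenizer.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Keep adding retrieved passages, in order, until the token budget is spent.
function fitToBudget(passages: string[], maxTokens: number): string[] {
  const kept: string[] = [];
  let used = 0;
  for (const p of passages) {
    const cost = estimateTokens(p);
    if (used + cost > maxTokens) break;
    kept.push(p);
    used += cost;
  }
  return kept;
}
```

Since the passages arrive sorted by relevance, cutting from the end discards the least useful ones first.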
OpenAI provides libraries in Python and JavaScript for communicating with ChatGPT. Community-developed libraries exist in other languages as well.
These libraries are very easy to use: just install one, generate an API key from your OpenAI account, and you are good to go.
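For the Node.js version, setup comes down to two commands (the package name is the official one; the key value is of course a placeholder for your own):

```shell
# Install the official OpenAI SDK for Node.js
npm install openai

# Make your API key available to the application
export OPENAI_API_KEY="sk-..."
```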
Here is an integration example in TypeScript:
```typescript
// Method of a service class whose constructor stores an OpenAI client
// instance in `this.openAi`.
async getAnswer(question: string, postsContents: string[]): Promise<string> {
  // Concatenate the retrieved passages into a single block of text.
  const posts = postsContents.join("\n");
  const response = await this.openAi.chat.completions.create({
    model: "gpt-3.5-turbo-16k",
    messages: [
      { role: "system", content: "You will receive a markdown text preceded by 'Text:'" },
      { role: "system", content: "You will receive a question preceded by 'Question:'" },
      { role: "system", content: "Summarize the text I am going to give you" },
      { role: "system", content: "Answer the question using the text I provided" },
      { role: "system", content: "Your response should not include the question you are answering" },
      { role: "user", content: `Text: ${posts}\n\nQuestion: ${question}` },
    ],
  });
  return response.choices[0].message.content || '';
}
```
As you can see, the API is very straightforward to use. The response returned by OpenAI is somewhat complex, but the documentation makes it easy to navigate.
You can adapt the system prompts to your needs to format ChatGPT's response and vary the outputs. You can even let the user add their own. In this example, which is intentionally simple, I left some parameters at their default values, such as the number of output tokens, temperature, etc. I will let you explore the OpenAI SDK documentation to tailor your assistant to your needs.
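As an illustration, here is what a tuned request object could look like; `temperature` and `max_tokens` are standard parameters of the chat completions endpoint, and the values chosen here are arbitrary:

```typescript
// Hypothetical tuning of the request shown in the example above.
const request = {
  model: "gpt-3.5-turbo-16k",
  temperature: 0.2, // lower values make answers more deterministic
  max_tokens: 500,  // cap the length of the generated reply
  messages: [
    { role: "user" as const, content: "Text: ...\n\nQuestion: ..." },
  ],
};
```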
Integrating ChatGPT into an application is very straightforward thanks to the SDK, which is easy to use and quick to integrate for an experienced developer. All you need is an API key, and you can build whatever you want. The pricing is also reasonable, especially if you stick with GPT-3.5, which remains highly capable at $0.002 per 1,000 tokens (approximately 300-400 words). You can also use GPT-4, which produces even better responses but is about ten times more expensive.
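Using the rate quoted above, a quick back-of-the-envelope estimate shows what a request costs; note that rates change, so always check the current pricing page before relying on these numbers:

```typescript
// Cost estimate at the $0.002 per 1,000 tokens rate quoted above (GPT-3.5).
function estimateCostUsd(tokens: number, pricePer1kTokens = 0.002): number {
  return (tokens / 1000) * pricePer1kTokens;
}
// A request filling the full 16k-token context at this rate
// costs a little over three cents.
```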
I encourage you to look into the model pricing if you want more details.
To go further and give you some leads to continue, there is also the possibility of using embeddings, which simplify the search through your documents by transforming them into numeric vectors that can be compared for semantic similarity.
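With embeddings, relevance becomes a geometric comparison between vectors. A minimal sketch of the standard similarity computation; the toy vectors in the usage note stand in for what an embeddings endpoint would actually return:

```typescript
// Cosine similarity between two embedding vectors:
// 1.0 means identical direction, 0 means orthogonal (unrelated).
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

To retrieve documents, you would embed the question, compute this score against each document's vector, and keep the highest-scoring ones.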
A pillar of Lamalo, Yohann combines technical expertise with a talent for teaching. An architect at heart and a skilled developer, he brings his energy and know-how to the scale-up Lamalo, and readily shares his knowledge.