How to Chat with Scanned Documents using ChatGPT

May 19, 2023

Large language models (LLMs) can help with tasks such as text summarization, translation, and question answering. OpenAI’s ChatGPT is one of the state-of-the-art models and is available through a web interface and an API.

ChatGPT can be useful for document management. One use case is answering questions about a document, with the document’s text supplied as background knowledge.

In the previous article, we built a web application that scans documents and runs OCR with Dynamic Web TWAIN and Tesseract.js. In this article, we are going to add the ability to chat with the scanned documents.

Online demo

Use LangChain + ChatGPT to Chat with Scanned Documents

LangChain is a library that simplifies building applications on top of LLMs. We can use it to let ChatGPT answer questions, with the text of the scanned documents providing extra knowledge. Here are the steps to use it in a web application:

Install LangChain

npm install langchain

Create a Vector Store from the Text of Scanned Documents

A Vector Store holds the embedding vectors of text chunks and, when queried, returns the top-K chunks that are most similar to the query. It can be used to provide related background knowledge for a chat.
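To make the idea of a top-K query more concrete, here is a minimal sketch in plain JavaScript. It is not LangChain's code: the function names `cosineSimilarity` and `topK` and the tiny three-dimensional vectors are made up for illustration, and real embeddings have far more dimensions.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // avoid division by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k stored entries whose vectors are most similar to the query vector.
function topK(entries, queryVector, k) {
  return entries
    .map((entry) => ({ ...entry, score: cosineSimilarity(entry.vector, queryVector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Toy "embeddings" (hypothetical values, chosen by hand for this example).
const entries = [
  { text: "Release notes for version 18.2", vector: [0.9, 0.1, 0.0] },
  { text: "Snooker championship final report", vector: [0.0, 0.2, 0.9] },
  { text: "Installation guide", vector: [0.6, 0.6, 0.1] },
];

const results = topK(entries, [1.0, 0.0, 0.0], 2);
console.log(results.map((r) => r.text));
```

In the real application, the vectors come from the embeddings model and the query vector is the embedding of the user's question.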

Create an OpenAI embeddings model.

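// Create the embeddings model
// (import path is an assumption based on the LangChain JS releases of May 2023 and may differ in other versions)
import { OpenAIEmbeddings } from "langchain/embeddings/openai";
const embeddings = new OpenAIEmbeddings({openAIApiKey: apikey});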

Use RecursiveCharacterTextSplitter to split the text into smaller chunks (documents).

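import { RecursiveCharacterTextSplitter } from "langchain/text_splitter"; // path may vary by LangChain version
const splitter = new RecursiveCharacterTextSplitter({
  chunkSize: 4000,   // maximum number of characters per chunk
  chunkOverlap: 200, // characters shared between neighboring chunks
});
const docs = await splitter.createDocuments([text]);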
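The roles of chunkSize and chunkOverlap can be illustrated with a simplified sketch. The function `splitWithOverlap` below is hypothetical and only cuts fixed-size character windows. The real RecursiveCharacterTextSplitter additionally tries to break at separators such as paragraph and line breaks, so its output will differ.

```javascript
// Simplified sketch: fixed-size character windows with overlap.
// Not LangChain's algorithm, which also prefers to break at separators.
function splitWithOverlap(text, chunkSize, chunkOverlap) {
  if (chunkOverlap >= chunkSize) {
    throw new Error("chunkOverlap must be smaller than chunkSize");
  }
  const chunks = [];
  const step = chunkSize - chunkOverlap; // how far each window advances
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // this window reached the end
  }
  return chunks;
}

const sample = "abcdefghijklmnopqrstuvwxyz";
const chunks = splitWithOverlap(sample, 10, 3);
console.log(chunks);
// [ 'abcdefghij', 'hijklmnopq', 'opqrstuvwx', 'vwxyz' ]
```

Each chunk starts with the last three characters of the previous one. The overlap reduces the chance that a sentence relevant to a question is cut in half at a chunk boundary.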

Create a Vector Store using the split documents and the embeddings model.

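import { MemoryVectorStore } from "langchain/vectorstores/memory"; // path may vary by LangChain version
const store = await MemoryVectorStore.fromDocuments(docs, embeddings);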

Start a Chain to Answer Questions over Scanned Documents

Create a new OpenAI model.

import { OpenAI } from "langchain/llms/openai"; // path may vary by LangChain version
const model = new OpenAI({openAIApiKey: apikey, temperature: 0 });

Create a new QA chain.

import { loadQARefineChain } from "langchain/chains";
const chain = loadQARefineChain(model);

Select the documents related to the question from the Vector Store.
```
const relevantDocs = await store.similaritySearch(question);
```

Call the chain to answer the question over the selected documents.

// Call the chain
const res = await chain.call({
  input_documents: relevantDocs,
  question,
});
// The answer text is expected in res.output_text
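A "refine" QA chain drafts an answer from the first document and then revisits that answer once for each remaining document. The sketch below illustrates this strategy under stated assumptions: `refineAnswer` and `fakeLLM` are hypothetical names, the prompt wording is invented for this example and is not the wording LangChain uses, and the stand-in model only returns numbered strings instead of calling OpenAI.

```javascript
// Sketch of the "refine" strategy. `llm` is a hypothetical async function
// (prompt: string) => Promise<string> that stands in for a real model call.
async function refineAnswer(llm, documents, question) {
  if (documents.length === 0) {
    throw new Error("At least one document is required");
  }
  // Step 1: draft an answer from the first document only.
  let answer = await llm(
    `Context:\n${documents[0]}\n\nQuestion: ${question}\nAnswer:`
  );
  // Step 2: show the model each remaining document together with the current answer.
  for (const doc of documents.slice(1)) {
    answer = await llm(
      `Question: ${question}\nExisting answer: ${answer}\n` +
        `Additional context:\n${doc}\n` +
        `Refine the existing answer if the context helps; otherwise repeat it.`
    );
  }
  return answer;
}

// Stand-in model for demonstration: records prompts and returns a numbered reply.
const prompts = [];
async function fakeLLM(prompt) {
  prompts.push(prompt);
  return `answer #${prompts.length}`;
}

refineAnswer(fakeLLM, ["chunk A", "chunk B", "chunk C"], "When was it released?")
  .then((finalAnswer) => console.log(finalAnswer, `(${prompts.length} model calls)`));
// prints: answer #3 (3 model calls)
```

With this strategy the model is called once per selected document, one after another. That costs extra time, but each prompt only has to hold one chunk plus the current answer.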

Test

Next, let’s check how well it works, using two example documents.

Example 1 - Release Notes of Dynamic Web TWAIN

Suppose we would like to know when version 18.2 of Dynamic Web TWAIN was released.

Without the document as background knowledge, ChatGPT produces the following answer:

Without Release Notes

With the document as background knowledge, it answers the question correctly:

With Release Notes

Example 2 - BBC News about World Snooker Championship

Suppose we would like to know who won the World Snooker Championship 2023.

Without the news article as background knowledge, ChatGPT produces the following answer:

Without News

With the news article as background knowledge, it answers the question correctly:

With News

Drawbacks

ChatGPT sometimes makes up statements that are not true, so its answers should be verified against the scanned document.

Source Code

Check out the source code of the demo to try it yourself:

https://github.com/tony-xlh/Chat-with-Scanned-Documents/
