A chain for performing question-answering tasks with a retrieval component.
| Name | Description |
| ------------------------------ | ----------------------------- |
| ConversationalRetrievalQAChain | Final node to return response |


## Building a Simple RAG Pipeline

This example demonstrates a simple Retrieval-Augmented Generation (RAG) pipeline using a PDF document, OpenAI embeddings, an in-memory vector store, and a Conversational Retrieval QA Chain.

### Required Nodes

- PDF File Loader
- Recursive Character Text Splitter
- OpenAI Embeddings
- In-Memory Vector Store
- OpenAI Chat Model
- Conversational Retrieval QA Chain
- Buffer Memory (Recommended)

### Node Connections

```text
Recursive Character Text Splitter → PDF File Loader

PDF File Loader → In-Memory Vector Store

OpenAI Embeddings → In-Memory Vector Store

In-Memory Vector Store (Retriever) → Conversational Retrieval QA Chain

OpenAI Chat Model → Conversational Retrieval QA Chain

Buffer Memory → Conversational Retrieval QA Chain
```

### Explanation

The PDF File Loader loads the document content, while the Recursive Character Text Splitter breaks it into smaller chunks so that individual passages can later be retrieved and ranked by relevance.
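The recursive splitting strategy can be sketched in plain Python. This is a minimal illustration of the idea, not Flowise's or LangChain's actual implementation; the `split_text` helper and its default separator list are assumptions made for the sketch:

```python
def split_text(text, chunk_size=200, separators=("\n\n", "\n", " ")):
    """Split on the coarsest separator first (paragraphs, then lines,
    then words), recursing with finer separators for oversized pieces."""
    if len(text) <= chunk_size:
        return [text]
    for i, sep in enumerate(separators):
        if sep not in text:
            continue
        chunks, current = [], ""
        for piece in text.split(sep):
            candidate = current + sep + piece if current else piece
            if len(candidate) <= chunk_size:
                current = candidate
                continue
            if current:
                chunks.append(current)
            if len(piece) > chunk_size:
                # Piece is still too large: recurse with finer separators.
                chunks.extend(split_text(piece, chunk_size, separators[i + 1:]))
                current = ""
            else:
                current = piece
        if current:
            chunks.append(current)
        return chunks
    # No separator applies: fall back to fixed-size slices.
    return [text[j:j + chunk_size] for j in range(0, len(text), chunk_size)]
```

Keeping chunks under a fixed size while splitting on natural boundaries is what lets the retriever return coherent passages instead of arbitrary byte ranges.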

The OpenAI Embeddings node converts document chunks into vector representations. These embeddings are then stored in the In-Memory Vector Store together with the document chunks.

When a user asks a question, the Conversational Retrieval QA Chain retrieves the most relevant document chunks from the vector store and sends them to the chat model to generate a grounded response.
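The embed-store-retrieve loop described above can be sketched as a toy in-memory store. `InMemoryVectorStore`, `toy_embed`, and `VOCAB` are hypothetical stand-ins (the toy embedding is a bag-of-words counter, not the OpenAI Embeddings node) used only to show the data flow:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

class InMemoryVectorStore:
    def __init__(self, embed):
        self.embed = embed      # callable: str -> list[float]
        self.entries = []       # (vector, chunk) pairs

    def add(self, chunks):
        for chunk in chunks:
            self.entries.append((self.embed(chunk), chunk))

    def retrieve(self, question, k=2):
        # Embed the question and return the k most similar chunks.
        qv = self.embed(question)
        ranked = sorted(self.entries, key=lambda e: cosine(qv, e[0]), reverse=True)
        return [chunk for _, chunk in ranked[:k]]

# Toy stand-in for an embedding model: word counts over a tiny vocabulary.
VOCAB = ["cats", "dogs", "sleep", "play"]

def toy_embed(text):
    words = [w.strip(".?!,").lower() for w in text.split()]
    return [float(words.count(w)) for w in VOCAB]

store = InMemoryVectorStore(toy_embed)
store.add(["Cats sleep a lot.", "Dogs like to play fetch."])
print(store.retrieve("Why do dogs play?", k=1))  # ['Dogs like to play fetch.']
```

The retrieved chunks are what the chain interpolates into the prompt as context, which is why the retriever connection (not the embedding node) must feed the QA chain.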

### Common Beginner Mistakes

- Connecting the embedding model directly to the Conversational Retrieval QA Chain
- Forgetting to connect the vector store retriever to the Conversational Retrieval QA Chain
- Confusing embeddings with chat models
- Forgetting to include required prompt variables such as `{context}` and `{question}`
- Connecting incorrect node output types together
- Forgetting to add a Memory node, which prevents the chain from remembering previous interactions

### Recommended Prompt Rules

To reduce hallucinations and unsupported responses, consider adding rules such as:

- Use only the provided context to answer
- Do not use external knowledge
- If the information is not available in the provided context, clearly state that it was not found
- Avoid unsupported assumptions
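Rules like these are typically written into the chain's prompt template alongside the `{context}` and `{question}` variables. A minimal sketch, with illustrative wording (the template text below is an assumption, not a Flowise default):

```python
# Hypothetical prompt template applying the rules above; {context} and
# {question} are the variables the chain fills in at query time.
PROMPT_TEMPLATE = """Answer the question using only the provided context.
Do not use external knowledge or make unsupported assumptions.
If the answer is not in the context, say "I could not find that in the document."

Context:
{context}

Question:
{question}

Answer:"""

prompt = PROMPT_TEMPLATE.format(
    context="The warranty period is 12 months.",
    question="How long is the warranty?",
)
print(prompt)
```

If either placeholder is missing or misspelled, the chain cannot inject the retrieved chunks or the user's question, which is the prompt-variable mistake listed earlier.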