RAG Agent - Byne Documentation

Introduction

In this guide, we’ll walk through the process of building an Agentic RAG (Retrieval-Augmented Generation) system using Bynesoft’s API. Unlike traditional RAG pipelines, we’ll be creating a RAG Agent that can intelligently interact with a knowledge base. This guide is also available as a Colab workbook.

What is Agentic RAG?

Agentic RAG combines the power of RAG with the flexibility and intelligence of AI agents. In a traditional RAG system, retrieval and generation run as a fixed sequence: fetch the top matches for the query, then generate a response from them. With Agentic RAG, we introduce an intelligent agent that can:

Interpret user queries more effectively
Decide how to best search the knowledge base
Combine information from multiple sources
Ask follow-up questions when needed
Provide more contextually relevant and accurate responses
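To make the idea concrete, here is a toy sketch of the retrieve-and-refine loop that an agent adds on top of plain retrieval. It is not how Bynesoft implements its agents; the names `agentic_retrieve`, `search`, and `rewrite` are hypothetical, and both callables are supplied by the caller.

```python
def agentic_retrieve(question, search, rewrite, max_steps=3):
    """Toy agent loop: search, and if nothing comes back, rewrite the query and retry.

    `search` and `rewrite` are hypothetical caller-supplied callables:
    search(query) -> list of hits, rewrite(query) -> new query string.
    """
    query = question
    tried = []
    for _ in range(max_steps):
        tried.append(query)
        hits = search(query)
        if hits:
            # The agent decides it has enough material and stops searching.
            return {"queries": tried, "sources": hits}
        # No hits: reformulate the query before the next attempt.
        query = rewrite(query)
    return {"queries": tried, "sources": []}
```

A plain RAG pipeline corresponds to calling `search` exactly once; the loop is what lets an agent refine its query based on what the first search returned.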

Why Choose RAG Agents over Simple RAG?

RAG Agents offer several advantages over simple RAG pipelines:

Improved context understanding: Agents can maintain context over multiple interactions, leading to more coherent and relevant responses.
Dynamic query refinement: Agents can reformulate queries based on initial results, improving retrieval accuracy.
Multi-step reasoning: Complex queries that require information from multiple documents can be handled more effectively.
Customizable behavior: Agents can be fine-tuned to follow specific guidelines or exhibit particular traits, making them more adaptable to various use cases.
Interactive clarification: When information is ambiguous or insufficient, agents can ask users for clarification, leading to better results.

By building a RAG Agent, we’re creating a more intelligent and flexible system that can handle a wider range of tasks and provide more valuable insights from your knowledge base.

Prerequisites

A Bynesoft API key
A Python environment with the requests library installed

Creating a Knowledge Base

First, we need to set up our authentication headers using our Bynesoft API key.

import requests

KEY = 'API_KEY'  # replace with your Bynesoft API key
headers = {
    'X-API-Key': KEY
}
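If you would rather not paste the key into the script, one option is to read it from an environment variable. The sketch below assumes a variable named `BYNE_API_KEY`; that name and the `build_headers` helper are illustrative choices, not part of the Bynesoft API.

```python
import os


def build_headers(env_var: str = "BYNE_API_KEY") -> dict:
    """Build the authentication headers from an environment variable.

    The variable name is an assumption made for this example.
    """
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable to your API key")
    return {"X-API-Key": key}
```

With the variable set, `headers = build_headers()` produces the same dictionary as the snippet above.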

Now, let’s create a knowledge base:

knowledge_base_obj = requests.post(
    "https://app.docs.bynesoft.com/api/knowledge-base/",
    headers=headers,
    json={
        "name": "myKB_1",
        "type": "query",
        "paragraphs": {"chunkSize": 400, "chunkOverlap": 200},
    },
).json()

knowledge_base_id = knowledge_base_obj["id"]

This creates a knowledge base with a chunk size of 400 tokens and an overlap of 200 tokens. The response will include the knowledge base ID, which we’ll use in subsequent steps.
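To see what these two settings mean in practice, the sketch below splits a token list with a sliding window. It only illustrates the general technique: the tokenizer and the exact splitting logic the service uses are not described in this guide, and `chunk_tokens` is a hypothetical helper.

```python
def chunk_tokens(tokens, chunk_size=400, chunk_overlap=200):
    """Split a token list into windows of chunk_size that share chunk_overlap tokens."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last window already reaches the end of the document
    return chunks
```

With a chunk size of 400 and an overlap of 200, the window advances 200 tokens at a time, so each chunk repeats the second half of the previous one. A larger overlap keeps more context across chunk boundaries at the cost of storing more chunks.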

Ingesting Documents

To add documents to our knowledge base, we’ll follow these steps:

Obtain an upload link
Upload the document
Create a job for processing
Trigger the processing

Step 1: Obtain an Upload Link

document_name = "Critical-AI-Challenges-30105-Nvidia-Supermicro.pdf"
document_path = "Critical-AI-Challenges-30105-Nvidia-Supermicro.pdf"
link = requests.get(
    "https://app.docs.bynesoft.com/api/connectors/local/s3-upload-links",
    headers=headers,
    params={"kb": knowledge_base_id, "fileName": [document_name]},
)
link_uri = link.json()[document_name]

Step 2: Upload the Document

mime_headers = {"Content-Type": "application/pdf"}
# Open the file in a context manager so the handle is closed after the upload
with open(document_path, "rb") as f:
    upload = requests.put(link_uri, data=f, headers=mime_headers)
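The upload above hard-codes the PDF content type. If you upload other file types, a small helper based on Python's standard `mimetypes` module can pick the header value from the file name. Whether the upload links accept content types other than PDF is not stated in this guide, so treat this as an assumption; `content_type_for` is a hypothetical helper.

```python
import mimetypes


def content_type_for(path: str, default: str = "application/octet-stream") -> str:
    """Guess a Content-Type value from the file extension, with a generic fallback."""
    guessed, _encoding = mimetypes.guess_type(path)
    return guessed or default
```

For example, `{"Content-Type": content_type_for(document_path)}` could stand in for the fixed `mime_headers` dictionary.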

Step 3: Create a Job for Processing

job_id = requests.post(
    f"https://app.docs.bynesoft.com/api/knowledge-base/{knowledge_base_id}/jobs",
    headers=headers,
).json()

Step 4: Trigger the Processing

import json

processing = requests.put(
    f"https://app.docs.bynesoft.com/api/knowledge-base/{knowledge_base_id}/jobs/{job_id}",
    headers=headers,
    data=json.dumps(
        [
            {
                "fileName": document_name,
                "lastModified": "Wed Jul 3 2024",
                "connector": "local",
            }
        ]
    ),
)
trigger_status = requests.post(
    f"https://app.docs.bynesoft.com/api/knowledge-base/{knowledge_base_id}/jobs/{job_id}/trigger",
    headers=headers,
)
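The four steps above can be collected into one function so that each document is ingested with a single call. This is a sketch that reuses only the endpoints shown in this guide. The `ingest_document` name and its parameters are illustrative, `http` is assumed to be the `requests` module or a `requests.Session`, and, as in the steps above, the parsed body of the job-creation response is used directly as the job ID.

```python
import json

BASE_URL = "https://app.docs.bynesoft.com/api"


def ingest_document(http, headers, kb_id, path, file_name, last_modified,
                    content_type="application/pdf"):
    """Upload one local file and trigger its processing; returns the job ID."""
    # Step 1: obtain an upload link for the file.
    link = http.get(
        f"{BASE_URL}/connectors/local/s3-upload-links",
        headers=headers,
        params={"kb": kb_id, "fileName": [file_name]},
    )
    link.raise_for_status()
    link_uri = link.json()[file_name]

    # Step 2: upload the file contents to that link.
    with open(path, "rb") as fh:
        http.put(link_uri, data=fh,
                 headers={"Content-Type": content_type}).raise_for_status()

    # Step 3: create a processing job (response body used as the ID, as above).
    jobs_url = f"{BASE_URL}/knowledge-base/{kb_id}/jobs"
    job = http.post(jobs_url, headers=headers)
    job.raise_for_status()
    job_id = job.json()

    # Step 4: attach the file to the job, then trigger processing.
    payload = [{"fileName": file_name, "lastModified": last_modified,
                "connector": "local"}]
    http.put(f"{jobs_url}/{job_id}", headers=headers,
             data=json.dumps(payload)).raise_for_status()
    http.post(f"{jobs_url}/{job_id}/trigger", headers=headers).raise_for_status()
    return job_id
```

Calling `raise_for_status()` after each request stops the sequence at the first failed step instead of passing an error response on to the next one.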

Creating a RAG Agent

To implement our Agentic RAG system, we’ll create an agent with an execution layer and a prompt template. This agent will be responsible for intelligently querying our knowledge base and providing contextual responses.

Step 1: Create an Execution Layer

exe_layer = requests.post(
    "https://app.docs.bynesoft.com/api/users/execution-layers",
    headers=headers,
    data=json.dumps(
        {
            "name": "kb_exec",
            "description": "Searches a knowledge base",
            "sourceFabric": {
                "name": "knowledgeBase",
                "config": {
                    "kbId": knowledge_base_id,
                    "searchOptions": {
                        "hybrid": {"enabled": "false", "keywords": []},
                        "cosineSimilarityScoreThreshold": 0.8,
                        "fileIdsFilter": [],
                    },
                },
            },
        }
    ),
)
exe_layer_id = exe_layer.json()["id"]
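The `cosineSimilarityScoreThreshold` value of 0.8 presumably sets the minimum similarity a chunk must reach to be returned. The sketch below shows how such a threshold works in general, using plain Python lists as embedding vectors. How the service computes and applies the score internally is not documented here, and both function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def filter_by_threshold(query_vec, chunks, threshold=0.8):
    """Keep (score, text) pairs whose similarity to the query meets the threshold.

    `chunks` is a list of (text, vector) pairs; results are sorted best first.
    """
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in chunks]
    kept = [pair for pair in scored if pair[0] >= threshold]
    return sorted(kept, key=lambda pair: pair[0], reverse=True)
```

A higher threshold returns fewer but closer matches; lowering it trades precision for recall.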

Step 2: Create a Prompt Template

prompt = "You are a helpful assistant. Your job is to search files using the function kb_exec and help the user based on the data you find."
template = requests.post(
    "https://app.docs.bynesoft.com/api/ask/templates",
    headers=headers,
    data=json.dumps({"name": "kb_prompt", "type": "llm", "template": prompt}),
)
template_id = template.json()["id"]

Step 3: Create the RAG Agent

agent = requests.post(
    "https://app.docs.bynesoft.com/api/users/agents",
    headers=headers,
    data=json.dumps(
        {
            "name": "kb_agent",
            "templateId": template_id,
            "description": "Searches a knowledge base",
            "exeLayers": [exe_layer_id],
        }
    ),
)
agent_id = agent.json()["id"]

Querying the Knowledge Base with the RAG Agent

Now that we have our RAG Agent set up, we can query our knowledge base:

agent_resp = requests.post(
    f"https://app.docs.bynesoft.com/api/ask/agents/{agent_id}/query",
    headers=headers,
    data=json.dumps({}),
    params={"q": "What is the rationale for running LLMs on-prem?"},
)
response = agent_resp.json()
print(response)

This will return a response from the RAG Agent based on the information in your knowledge base. The agent will intelligently search for relevant information, combine insights from multiple sources if necessary, and provide a comprehensive answer to the query.
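For repeated questions, the query call can be wrapped in a small helper that also checks the HTTP status. This is a sketch built on the endpoint used above. The `ask_agent` name and the `timeout` default are illustrative choices, `http` is assumed to be the `requests` module or a `requests.Session`, and the response is returned as parsed JSON because its schema is not described in this guide.

```python
import json

BASE_URL = "https://app.docs.bynesoft.com/api"


def ask_agent(http, headers, agent_id, question, timeout=60):
    """Send one question to the agent and return the parsed JSON response."""
    resp = http.post(
        f"{BASE_URL}/ask/agents/{agent_id}/query",
        headers=headers,
        data=json.dumps({}),
        params={"q": question},
        timeout=timeout,  # avoid waiting indefinitely on a stalled request
    )
    resp.raise_for_status()  # raise on 4xx/5xx instead of parsing an error body
    return resp.json()
```

Usage would look like `ask_agent(requests, headers, agent_id, "What is the rationale for running LLMs on-prem?")`.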

Conclusion

In this guide, we’ve walked through the process of creating an Agentic RAG system using Bynesoft’s API. We’ve covered:

Creating a knowledge base
Ingesting documents
Creating a RAG Agent with an execution layer and prompt template
Querying the knowledge base using the intelligent agent

This Agentic RAG setup lets you build more powerful and flexible applications than traditional RAG pipelines allow. The RAG Agent can handle complex queries, maintain context over multiple interactions, and provide more accurate and contextually relevant responses. You can expand on this foundation to create sophisticated applications such as:

Intelligent chatbots that can engage in multi-turn conversations about your documents
Advanced document analysis tools that can draw insights from multiple sources
Automated research assistants that can handle complex, multi-step queries
Interactive knowledge exploration systems that can guide users through large document collections

By leveraging the power of Agentic RAG, you’re creating a more dynamic and adaptable system that can provide greater value from your knowledge base.
