Advanced RAG: Routing

Let's search with intelligence


Introduction:

Previously, we learned about some “Query Translation” techniques, which answer the question: “How do we break a query down and get the gist of it?” In this article, we’re going to work on the data that is stored in our database.

Problem statement:

Suppose we have a huge database that holds information about JavaScript, Node.js, Python, Ruby, Rust, and everything else about modern web development. Now, let’s play a quick Q&A.

  • When I search for something in there, does it search the whole database?

    -Yes, if nothing narrows the search down.

  • Does it help if I greatly improve the quality of my query before searching?

    -Not really. I have improved the query, but I have not done anything to make the search over the database more efficient.

🤔
Then what should we do?

An approach:

Let’s think of an easy approach. First, we will tag our data chunks according to their domains. Then, when we search the database, we will search only the targeted domains. This way, fewer chunks are scanned per query, so the cost of each search goes down, right?

This is the basic idea behind “Routing”. Easy, right?
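To make the idea concrete, here is a minimal Python sketch of searching domain-tagged chunks. Everything in it is an assumption for illustration: the `Chunk` class, the sample texts, and the word-overlap matching stand in for a real vector store and its metadata filter.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Chunk:
    domain: str  # the tag we attach when storing the chunk
    text: str


# Hypothetical, hand-tagged chunks standing in for the database.
CHUNKS = [
    Chunk("javascript", "Use npm run build to bundle a JavaScript app."),
    Chunk("python", "Use pip install to add a Python package."),
    Chunk("rust", "Use cargo build to compile a Rust crate."),
    Chunk("python", "A Python virtual environment isolates dependencies."),
]


def search(query: str, domain: Optional[str] = None) -> Tuple[List[Chunk], int]:
    """Return the matching chunks and how many chunks were scanned."""
    # Routing step: keep only the chunks tagged with the targeted domain.
    candidates = [c for c in CHUNKS if domain is None or c.domain == domain]
    # Toy matching: a chunk is a hit if it shares any word with the query.
    words = set(query.lower().split())
    hits = [c for c in candidates if words & set(c.text.lower().split())]
    return hits, len(candidates)


hits_all, scanned_all = search("install python package")
hits_routed, scanned_routed = search("install python package", domain="python")
print(scanned_all, scanned_routed)
```

Both calls find the same chunks here, but the routed call scans only the chunks tagged `python` instead of the whole list. That is where the saving comes from.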

Routing:

Definition:

Routing is a technique where the system intelligently decides which retrieval strategy, knowledge source, or processing path to use based on the characteristics of the incoming query.

Think of it like a smart receptionist:

  • Medical questions → Route to the medical expert

  • Legal questions → Route to the legal department

  • Technical questions → Route to the engineering team

  • Simple questions → Route to the general information desk

How Routing Works:

Usually, Routing works in these 4 steps:

  1. Query Analysis and Classification:

    The router examines the incoming query to determine:

    • Query Type: Question, comparison, how-to, definition, etc.

    • Domain: Technical, business, medical, legal, etc.

    • Complexity: Simple factual vs complex analytical

    • Intent: Information seeking, problem-solving, decision-making

  2. Route Decision:

    Based on the analysis, the router decides:

    • Which retrieval method to use (vector search, keyword search, hybrid)

    • Which knowledge source to query (general docs, technical docs, specific databases)

    • Which processing strategy to apply (direct retrieval, decomposition, HyDE, parallel)

    • Which model/prompt to use for generation

  3. Execution:

    The query is sent down the chosen path with the appropriate configuration.

  4. Response:

    Results are formatted according to the route's specifications.
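The four steps above can be sketched in Python roughly like this. Treat it as a toy under stated assumptions, not a reference implementation: the keyword lists, the routing table, the `analyze`, `decide_route`, and `run` names, and the stand-in retrievers are all hypothetical. A real router would more likely use an LLM or a trained classifier for step 1 and real search backends for step 3.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class QueryAnalysis:
    query_type: str
    domain: str
    complexity: str
    intent: str


@dataclass(frozen=True)
class Route:
    retrieval_method: str  # "keyword", "vector", or "hybrid"
    knowledge_source: str
    response_format: str   # "numbered" or "plain"


# Hypothetical keyword lists used to guess the domain.
DOMAIN_KEYWORDS: Dict[str, set] = {
    "web development": {"react", "javascript", "node.js", "deploy"},
    "legal": {"contract", "liability"},
}

# Hypothetical routing table keyed by (domain, query type).
ROUTES: Dict[Tuple[str, str], Route] = {
    ("web development", "how-to"): Route("hybrid", "technical docs", "numbered"),
    ("legal", "definition"): Route("keyword", "legal database", "plain"),
}
DEFAULT_ROUTE = Route("vector", "general docs", "plain")


def analyze(query: str) -> QueryAnalysis:
    """Step 1: classify the query with crude string rules."""
    text = query.lower().strip().rstrip("?.!")
    words = set(text.split())
    if text.startswith(("how do", "how to", "how can")):
        query_type, intent = "how-to", "problem-solving"
    elif text.startswith(("what is", "what are", "define")):
        query_type, intent = "definition", "information seeking"
    else:
        query_type, intent = "question", "information seeking"
    domain = next((d for d, keys in DOMAIN_KEYWORDS.items() if words & keys), "general")
    complexity = "simple" if query_type == "definition" else "medium"
    return QueryAnalysis(query_type, domain, complexity, intent)


def decide_route(analysis: QueryAnalysis) -> Route:
    """Step 2: look up the route, falling back to a general one."""
    return ROUTES.get((analysis.domain, analysis.query_type), DEFAULT_ROUTE)


# Stand-in retrievers; each would call a real search backend.
def keyword_search(query: str, source: str) -> List[str]:
    return [f"keyword hit in {source}"]


def vector_search(query: str, source: str) -> List[str]:
    return [f"vector hit in {source}"]


def hybrid_search(query: str, source: str) -> List[str]:
    return keyword_search(query, source) + vector_search(query, source)


RETRIEVERS: Dict[str, Callable[[str, str], List[str]]] = {
    "keyword": keyword_search,
    "vector": vector_search,
    "hybrid": hybrid_search,
}


def run(query: str) -> str:
    route = decide_route(analyze(query))
    # Step 3: send the query down the chosen path.
    results = RETRIEVERS[route.retrieval_method](query, route.knowledge_source)
    # Step 4: format the results as the route specifies.
    if route.response_format == "numbered":
        return "\n".join(f"{i}. {r}" for i, r in enumerate(results, start=1))
    return " ".join(results)
```

Keeping the routing table as plain data means a new domain or query type becomes one more entry rather than another branch in the code, and anything the table does not cover falls through to the general route.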

A Procedural Example:

Let’s simulate this process with a question: “How do I deploy a React app?”

  1. Query Analysis:

    • Type: How-to/Procedural

    • Domain: Web Development

    • Complexity: Medium

    • Intent: Problem-solving

  2. Routing Decision:

    • Knowledge Source: Development documentation

    • Method: Keyword + semantic search

    • Strategy: Step-by-step retrieval

    • Response Format: Numbered instructions

  3. Execution:

    The router goes to the chunks where the development documentation is stored, runs the chosen retrieval techniques (keyword + semantic search), and gets the data.

  4. Response:

    Return the extracted data as numbered instructions.
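Here is the same walkthrough as a tiny Python trace. It is only a sketch: the `SOURCES` dictionary, the hard-coded rule in `route`, and the step texts are placeholders invented for illustration, not real deployment documentation.

```python
# Hypothetical knowledge sources; the step text is placeholder content.
SOURCES = {
    "development documentation": [
        "Build the React app for production.",
        "Upload the build output to a hosting service.",
        "Point the domain at the deployed site.",
    ],
    "general docs": ["React is a JavaScript library for building user interfaces."],
}


def route(query: str) -> dict:
    """Steps 1-2: analyze the query and pick a route (hard-coded for this sketch)."""
    q = query.lower()
    if q.startswith("how do i") and "react" in q:
        # How-to question about web development: use the development docs.
        return {"source": "development documentation", "format": "numbered"}
    return {"source": "general docs", "format": "plain"}


def answer(query: str) -> str:
    decision = route(query)
    docs = SOURCES[decision["source"]]        # Step 3: search only the chosen source
    if decision["format"] == "numbered":      # Step 4: format as the route specifies
        return "\n".join(f"{i}. {d}" for i, d in enumerate(docs, 1))
    return " ".join(docs)


print(answer("How do I deploy a React app?"))
```

The how-to question never touches the general docs, and its answer comes back as numbered steps because the chosen route asked for that format.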

Types of Routing:

Two types of Routing are widely used in the industry:

  1. Logical Routing

  2. Semantic Routing

We will learn about them in detail in later articles.

Conclusion:

Routing is essential when we need to handle a large amount of data. For smaller applications, though, it mostly adds complexity without much benefit.

Tour with GenAI

Part 3 of 12

This series explores how LLMs like ChatGPT go beyond chat, diving into automation, from sending requests to getting intelligent responses. Learn how real-world LLM-powered systems are built behind the scenes.
