Microsoft Dev Blogs

Semantic Router using Azure AI Search

thumbnail

Semantic Router using Azure AI Search

TL;DR

This blog post explores the use of a semantic router using Azure AI Search to handle query routing in AI systems, focusing on a banking chatbot use case. The approach aims to direct user queries to the most relevant nodes, such as specialized AI models, tools, or data sources.

Introduction

Query routing in chatbot systems involves directing user queries to the most appropriate response or action based on the detected intent. Modern systems utilize LLMs for intent classification, but accuracy can be a challenge as the number of routing options increases. This post introduces a routing system to address these challenges efficiently and at scale.

Possible Solution – Semantic Router using Azure AI Search

By leveraging Azure AI Search with hybrid search and re-ranking, along with phi-3-mini-4k-instruct for intention detection and relevance checking, a semantic router can be implemented. Node utterances are indexed in Azure AI Search, and the system executes semantic search based on query intentions.

Advantages of this Approach

  • Accuracy: This approach may offer improved accuracy compared to few-shot prompting, as it can handle a larger number of routes efficiently.
  • Comparison to Fine-tuned SLMs: The extensibility of this approach sets it apart from fine-tuning smaller language models, making it a promising solution for routing.

Results and Summary

  • Using GPT-3.5 with Azure AI Search proved to be the most accurate, achieving 100% accuracy for both utterance and node matching.
  • GPT-4o showed strong performance with 96.67% utterance accuracy, 100% node accuracy, and a fast execution time of 1.11 seconds on average.
  • GPT-4o-mini demonstrated good results with 93.33% utterance accuracy, 96.67% node accuracy, and an average execution time of 1.17 seconds – slightly slower than GPT-4o.

In conclusion, the semantic router using Azure AI Search offers a promising solution for efficient and accurate query routing in AI systems, particularly in complex use cases like banking chatbots.