Skip to main content
Cloud / Azure / Products / Model Router - Automatic LLM routing

Model Router - Automatic LLM routing

Model Router routes each prompt in real time to the best LLM, optimizing cost, latency and quality from a single deployment.

ai-machine-learning
Pricing Model Pay-per-use, billed at the selected model's input rate
Availability East US 2 and Sweden Central; Data Zone Standard for EU data residency
Data Sovereignty Data Zone Standard available for EU data residency
Reliability N/A (the underlying models' SLAs apply) SLA

What is Model Router?

Model Router is a trained language model in Microsoft Foundry that routes each prompt in real time to the most suitable large language model (LLM). You deploy Model Router like any other Foundry model and get a single deployment that bundles several LLMs behind one unified chat interface. The routing decision is based on attributes such as complexity, required reasoning and task type. Your application code does not need to change.

Model Router solves a concrete problem in running AI applications: if you send every request to the same powerful model, you also pay the highest price for trivial tasks. Model Router uses smaller and cheaper models when they are sufficient and falls back to larger or reasoning models when the task requires it. This lowers cost and latency while keeping quality comparable. Dozens of underlying models from multiple providers are currently available, including the GPT-5 series as well as Claude, Grok, DeepSeek and Llama models.

Core features

  • Real-time routing from one deployment: Model Router analyzes each prompt at runtime and selects the right model without storing your prompts. You manage one deployment instead of many individual model deployments.
  • Three routing modes plus model selection: Balanced (default) picks the most cost-effective model within a narrow quality band of roughly 1 to 2 percent. Cost widens the band to roughly 5 to 6 percent for maximum savings. Quality selects the highest-rated model and ignores cost. Model subset lets you define which models are eligible for routing at all.
  • Automatic failover and prompt caching: If a model has a transient issue, Model Router transparently redirects the request to the next most appropriate model. Failover is enabled by default. Prompt caching is used automatically when the selected model supports it.
  • Vision, tools and governance: Model Router accepts image input for vision-enabled chats but makes the routing decision based on text only. Audio input is not processed. Agentic scenarios with tools in the Foundry Agent Service are supported, and Azure Policy centrally controls which models may be included in a deployment.

Typical use cases

Cost optimization at high volume: Applications with many simple requests and occasional complex tasks benefit from Cost or Balanced mode. Trivial requests are routed to cheap models, so the budget stays reserved for the genuinely demanding tasks.

Unified interface for mixed workloads: Teams that want to cover different task types through one API, from short classifications to multi-step reasoning, get a single chat interface that selects the appropriate model for each request.

Higher availability through failover: Applications that need stable response times use the built-in automatic failover. If a model is temporarily unavailable, the next most suitable model takes over without the application having to implement that logic.

Benefits

  • Lower cost and reduced latency at comparable quality, because smaller models are used when they are sufficient.
  • Less operational overhead through a single deployment instead of many individual model deployments.
  • More control over cost, compliance and performance through routing modes and model subset, combined with Azure Policy governance.

Integration with innFactory

As a Microsoft Solutions Partner, innFactory supports you with the adoption and operation of this service.

Typical Use Cases

Cost optimization for high request volumes without quality loss
Unified chat interface for mixed tasks of varying complexity
Automatic failover between models for higher availability
Agentic scenarios with tool support in the Foundry Agent Service

Frequently Asked Questions

What is Model Router?

Model Router is a trained language model in Microsoft Foundry that analyzes each prompt in real time and routes it to the most suitable large language model. You deploy it like any other Foundry model and get a single deployment that bundles several LLMs behind one interface. Your application code stays unchanged.

When should I use Model Router?

Model Router is a good fit when your application handles tasks of varying complexity and you do not want to pay for an expensive model on every request. Simple requests go to smaller, cheaper models, while complex reasoning tasks go to more capable ones. It also helps with higher availability through automatic failover and supports agentic scenarios in the Foundry Agent Service.

How much does Model Router cost?

Usage is billed on a pay-per-use basis: you pay for input prompts at the rate of the underlying model that was selected, as listed on the pricing page. There is no separate routing fee. You can monitor your deployment costs in the Azure portal. Prompt caching reduces costs further when the selected model supports it.

Is Model Router available in the EU and how is data handled?

Model Router is available in the East US 2 and Sweden Central regions and supports the Global Standard and Data Zone Standard deployment types. With Data Zone Standard, requests stay within data zone boundaries, which enables EU data residency. Model Router does not store your prompts and only routes to models that are compatible with your access and data zone boundaries.

Microsoft Solutions Partner

innFactory is a Microsoft Solutions Partner. We provide expert consulting, implementation, and managed services for Azure.

Microsoft Solutions Partner Microsoft Data & AI

Similar Products from Other Clouds

Other cloud providers offer comparable services in this category. As a multi-cloud partner, we help you choose the right solution.

74 comparable products found across other clouds.

Ready to start with Model Router - Automatic LLM routing?

Our certified Azure experts help you with architecture, integration, and optimization.

Schedule Consultation