Intermediate · 12 min read · Module 11, Lesson 4

Claude on Microsoft Azure Foundry

Run Claude through Azure — setup and deployment

What Is Azure AI Foundry?

Azure AI Foundry (formerly Azure AI Studio) is Microsoft's unified platform for building, deploying, and managing AI applications on Azure. It provides a single hub where you can discover, test, and consume foundation models from multiple providers — including Anthropic's Claude family.

When you access Claude through Azure AI Foundry you are not calling the Anthropic API directly. Instead, every request is routed through Microsoft Azure infrastructure. This means billing, identity management, networking, and compliance all stay inside your existing Azure organization.

Think of Azure AI Foundry as a "model catalog + deployment engine" within Azure. You pick the model you want, deploy it to an endpoint, and Azure handles everything else — infrastructure, scaling, and enterprise security.


Why Use Claude on Azure Foundry?

Reason | Details
Azure-native billing | Claude usage appears on your consolidated Azure invoice — no separate Anthropic billing account required.
Entra ID (Azure AD) | Access is governed by Microsoft Entra ID roles and policies you already manage — no separate API keys needed.
Compliance | Azure compliance certifications (SOC 2, HIPAA, ISO 27001, FedRAMP) apply to model endpoints automatically.
Private networking | Reach the API through Azure Private Endpoints and VNet integration — traffic never leaves the Microsoft backbone.
Unified tooling | Use the same Azure CLI, Azure Portal, Bicep/ARM templates, and Terraform provider you use for everything else.
Enterprise support | A single Microsoft Unified Support contract covers both infrastructure and model access.
Content safety | Azure AI Content Safety filters can be layered on top of Claude responses for additional moderation.

If your organization is already invested in Azure, Foundry is often the path of least resistance for adopting Claude at scale.


Subscription Requirements

Before you can use Claude on Azure AI Foundry, you need:

  1. An active Azure subscription — Pay-As-You-Go, Enterprise Agreement, or CSP all work.
  2. Sufficient quota — Claude models require quota allocation in your target region.
  3. Resource provider registration — The Microsoft.MachineLearningServices resource provider must be registered in your subscription.
  4. Contributor or higher role — You need at least Contributor access on the resource group where you will create the Foundry resource.

To register the resource provider via CLI:

Terminal
az provider register --namespace Microsoft.MachineLearningServices
az provider show --namespace Microsoft.MachineLearningServices --query "registrationState"

Note: Registration can take a few minutes. Wait until the state shows Registered.
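
If you prefer to script this check, the registration state can also be polled with the Azure SDK for Python. Below is a minimal sketch, assuming the azure-mgmt-resource and azure-identity packages are installed, you are already logged in (for example via az login), and your subscription ID is available in an AZURE_SUBSCRIPTION_ID environment variable.

Python
import os
import time

from azure.identity import DefaultAzureCredential
from azure.mgmt.resource import ResourceManagementClient

# Assumes AZURE_SUBSCRIPTION_ID is set and an Azure login is available
client = ResourceManagementClient(
    DefaultAzureCredential(),
    os.environ["AZURE_SUBSCRIPTION_ID"],
)

# Kick off registration, then poll until the provider reports "Registered"
client.providers.register("Microsoft.MachineLearningServices")

while True:
    state = client.providers.get("Microsoft.MachineLearningServices").registration_state
    print(f"registrationState: {state}")
    if state == "Registered":
        break
    time.sleep(15)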


Creating an Azure AI Foundry Resource

Step 1 — Create a Resource Group (if needed)

Terminal
az group create \
  --name rg-ai-foundry \
  --location eastus2

Step 2 — Create the AI Foundry Hub

You can create a Foundry hub through the Azure Portal or the CLI.

Portal method:

  1. Open the Azure Portal and search for Azure AI Foundry.
  2. Click + Create and select Hub.
  3. Choose your subscription, resource group, and region.
  4. Give it a name (e.g. hub-ai-production).
  5. Configure networking (public or private endpoint).
  6. Click Review + Create.

CLI method:

Terminal
az ml workspace create \
  --name hub-ai-production \
  --resource-group rg-ai-foundry \
  --kind hub \
  --location eastus2

Step 3 — Create a Project Inside the Hub

A project is a workspace within the hub where you deploy models and manage endpoints.

Terminal
az ml workspace create \
  --name project-claude \
  --resource-group rg-ai-foundry \
  --kind project \
  --hub-id /subscriptions/<sub-id>/resourceGroups/rg-ai-foundry/providers/Microsoft.MachineLearningServices/workspaces/hub-ai-production \
  --location eastus2

Provisioning Claude Deployments

Once your hub and project exist, you can deploy Claude models from the Model Catalog.

Using the Portal

  1. Open your AI Foundry project in the Portal.
  2. Navigate to Model Catalog in the left menu.
  3. Filter by provider → Anthropic.
  4. Select the model you want (e.g. Claude Sonnet 4).
  5. Click Deploy.
  6. Choose a deployment name (e.g. claude-sonnet-4-prod).
  7. Select the deployment type:
    • Serverless API (Pay-as-you-go) — recommended for most workloads.
    • Managed compute — for dedicated throughput needs.
  8. Accept the Anthropic usage terms.
  9. Click Deploy.

Using the CLI

Terminal
az ml serverless-endpoint create \
  --resource-group rg-ai-foundry \
  --workspace-name project-claude \
  --name claude-sonnet-4-prod \
  --model-id azureml://registries/azureml-anthropic/models/Claude-sonnet-4

Important: Not all Claude models are available in every Azure region. Check the Model Catalog for availability in your target region.
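
The same deployment can also be scripted with the azure-ai-ml Python SDK, which is useful in CI/CD pipelines. This is a sketch under the assumptions used throughout this lesson (the rg-ai-foundry resource group, the project-claude project, and a subscription ID in AZURE_SUBSCRIPTION_ID); verify the exact entity fields against the current azure-ai-ml documentation.

Python
import os

from azure.ai.ml import MLClient
from azure.ai.ml.entities import ServerlessEndpoint
from azure.identity import DefaultAzureCredential

# Point the client at the project created earlier
ml_client = MLClient(
    DefaultAzureCredential(),
    subscription_id=os.environ["AZURE_SUBSCRIPTION_ID"],
    resource_group_name="rg-ai-foundry",
    workspace_name="project-claude",
)

# Create a serverless (pay-as-you-go) endpoint for the Claude model
endpoint = ServerlessEndpoint(
    name="claude-sonnet-4-prod",
    model_id="azureml://registries/azureml-anthropic/models/Claude-sonnet-4",
)

created = ml_client.serverless_endpoints.begin_create_or_update(endpoint).result()
print(created.scoring_uri)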


Available Model IDs on Azure Foundry

Model identifiers on Azure AI Foundry differ from both the direct Anthropic API and other cloud providers:

Anthropic Model | Azure Foundry Model ID
claude-opus-4-20250514 | Claude-opus-4
claude-sonnet-4-20250514 | Claude-sonnet-4
claude-3-5-haiku-20241022 | Claude-3.5-haiku

Important: Always use the Azure Foundry model ID when deploying or referencing models. The direct Anthropic model names will not work. The deployment name you choose during provisioning is what you use in API calls.
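
One way to avoid mixing up the two naming schemes is to keep a single mapping in application code and fail loudly when a direct Anthropic name slips through. The helper below is purely illustrative; the mapping simply mirrors the table above.

Python
# Maps direct Anthropic model names to their Azure AI Foundry catalog IDs (from the table above)
ANTHROPIC_TO_FOUNDRY = {
    "claude-opus-4-20250514": "Claude-opus-4",
    "claude-sonnet-4-20250514": "Claude-sonnet-4",
    "claude-3-5-haiku-20241022": "Claude-3.5-haiku",
}


def foundry_model_id(anthropic_name: str) -> str:
    """Translate a direct Anthropic model name to its Azure Foundry catalog ID."""
    try:
        return ANTHROPIC_TO_FOUNDRY[anthropic_name]
    except KeyError:
        raise ValueError(f"No Azure Foundry mapping for model: {anthropic_name}")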


Endpoint Differences from the Direct API

When using Claude through Azure AI Foundry, several key differences apply compared to the direct Anthropic API:

Aspect | Direct Anthropic API | Azure AI Foundry
Base URL | https://api.anthropic.com | https://<deployment-name>.<region>.models.ai.azure.com
Auth header | x-api-key: sk-ant-... | Authorization: Bearer <token> or api-key: <key>
API path | /v1/messages | /v1/messages (same)
API version | anthropic-version header | api-version query parameter
Billing | Anthropic account | Azure subscription
Rate limits | Per Anthropic tier | Per Azure quota allocation
Content filtering | Anthropic safety | Azure AI Content Safety + Anthropic safety

The request and response body format is largely the same — Azure AI Foundry uses the Messages API format, so your existing message structures, system prompts, and tool definitions carry over.
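
To make the differences concrete, here is a minimal REST call in Python using the requests library. The api-version value shown is an assumption for illustration only; check the Azure documentation for the version your deployment expects.

Python
import requests

ENDPOINT = "https://claude-sonnet-4-prod.eastus2.models.ai.azure.com"
API_KEY = "<your-api-key>"

response = requests.post(
    f"{ENDPOINT}/v1/messages",
    # Azure uses an api-version query parameter instead of the anthropic-version header
    params={"api-version": "2024-05-01-preview"},  # assumed value for illustration
    # Azure uses api-key (or a Bearer token) instead of x-api-key
    headers={"api-key": API_KEY, "Content-Type": "application/json"},
    # The body stays in the familiar Messages API format
    json={
        "max_tokens": 1024,
        "messages": [{"role": "user", "content": "Hello from Azure!"}],
    },
    timeout=60,
)
response.raise_for_status()
print(response.json())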


Authentication

Azure AI Foundry supports two authentication methods:

Method 1 — API Key (Simple)

Each serverless deployment generates a primary and secondary key.

Terminal
# Retrieve your deployment keys
az ml serverless-endpoint get-credentials \
  --resource-group rg-ai-foundry \
  --workspace-name project-claude \
  --name claude-sonnet-4-prod
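
The same keys can be fetched programmatically with azure-ai-ml, which is convenient in provisioning scripts. A sketch, assuming the same resource group, project, and deployment names used above:

Python
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

ml_client = MLClient(
    DefaultAzureCredential(),
    subscription_id="<sub-id>",
    resource_group_name="rg-ai-foundry",
    workspace_name="project-claude",
)

# Retrieve the primary/secondary keys for the serverless deployment
keys = ml_client.serverless_endpoints.get_keys("claude-sonnet-4-prod")
print(keys.primary_key)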

Use the key in your requests:

Terminal
curl -X POST "https://claude-sonnet-4-prod.eastus2.models.ai.azure.com/v1/messages" \
  -H "api-key: <your-api-key>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-prod",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'

Method 2 — Microsoft Entra ID (Recommended for Production)

Use managed identities or service principals for keyless authentication:

Terminal
# Get a token using Azure CLI
TOKEN=$(az account get-access-token \
  --resource https://ml.azure.com \
  --query accessToken -o tsv)

curl -X POST "https://claude-sonnet-4-prod.eastus2.models.ai.azure.com/v1/messages" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-prod",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "Explain quantum computing in simple terms."}
    ]
  }'

Best practice: Always use Entra ID authentication in production. Reserve API keys for local development and quick prototyping.


SDK Configuration

Using the Anthropic SDK

The official Anthropic SDK does not natively support Azure AI Foundry the way it supports Bedrock or Vertex AI. Instead, you use the Azure AI Inference SDK or call the REST API directly.

Using the Azure AI Inference SDK — Python

Terminal
pip install azure-ai-inference azure-identity
Python
from azure.ai.inference import ChatCompletionsClient
from azure.core.credentials import AzureKeyCredential

endpoint = "https://claude-sonnet-4-prod.eastus2.models.ai.azure.com"
api_key = "<your-api-key>"

client = ChatCompletionsClient(
    endpoint=endpoint,
    credential=AzureKeyCredential(api_key),
)

response = client.complete(
    messages=[
        {"role": "user", "content": "What is the capital of France?"}
    ],
    max_tokens=1024,
)

print(response.choices[0].message.content)

Using Entra ID with the SDK

Python
from azure.ai.inference import ChatCompletionsClient
from azure.identity import DefaultAzureCredential

endpoint = "https://claude-sonnet-4-prod.eastus2.models.ai.azure.com"

client = ChatCompletionsClient(
    endpoint=endpoint,
    credential=DefaultAzureCredential(),
)

response = client.complete(
    messages=[
        {"role": "user", "content": "Explain machine learning briefly."}
    ],
    max_tokens=1024,
)

print(response.choices[0].message.content)
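
Streaming works through the same client by passing stream=True and iterating over the partial updates. A minimal sketch using the client configured above:

Python
# Stream tokens as they are generated instead of waiting for the full reply
stream = client.complete(
    stream=True,
    messages=[{"role": "user", "content": "Write a haiku about the cloud."}],
    max_tokens=256,
)

for update in stream:
    if update.choices and update.choices[0].delta.content:
        print(update.choices[0].delta.content, end="", flush=True)
print()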

Using the Azure AI Inference SDK — TypeScript

Terminal
npm install @azure-rest/ai-inference @azure/core-auth
TypeScript
import ModelClient from "@azure-rest/ai-inference";
import { AzureKeyCredential } from "@azure/core-auth";

const endpoint = "https://claude-sonnet-4-prod.eastus2.models.ai.azure.com";
const apiKey = "<your-api-key>";

const client = ModelClient(endpoint, new AzureKeyCredential(apiKey));

async function main() {
  const response = await client.path("/chat/completions").post({
    body: {
      messages: [
        { role: "user", content: "What is the capital of France?" }
      ],
      max_tokens: 1024,
    },
  });

  if (response.status === "200") {
    console.log(response.body.choices[0].message.content);
  }
}

main();

Pricing on Azure Foundry

Azure AI Foundry uses pay-as-you-go pricing for serverless Claude deployments. You are billed per token — both input and output tokens are metered.

Model | Input (per 1M tokens) | Output (per 1M tokens)
Claude Opus 4 | ~$15.00 | ~$75.00
Claude Sonnet 4 | ~$3.00 | ~$15.00
Claude 3.5 Haiku | ~$0.80 | ~$4.00

Note: Pricing may vary slightly from direct Anthropic pricing. Azure may add a small markup. Always check the Azure Pricing Calculator for current rates in your region.
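
For a rough sense of scale, here is a back-of-envelope estimate in Python using the approximate Claude Sonnet 4 rates from the table above. Treat the workload numbers and rates as illustrative assumptions, not quoted prices.

Python
# Approximate pay-as-you-go rates from the table above (USD per 1M tokens)
INPUT_PER_M = 3.00
OUTPUT_PER_M = 15.00

# Example workload: 10,000 requests/day, ~1,500 input and ~500 output tokens each
requests_per_day = 10_000
input_tokens = requests_per_day * 1_500
output_tokens = requests_per_day * 500

daily_cost = (input_tokens / 1_000_000) * INPUT_PER_M + (output_tokens / 1_000_000) * OUTPUT_PER_M
print(f"Estimated daily cost: ${daily_cost:.2f}")  # about $120.00/day at these rates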

Key pricing considerations:

  • No minimum commitment for serverless deployments.
  • Azure Reserved Capacity may offer discounts for high-volume usage.
  • Data transfer within the same region is free; cross-region traffic incurs standard Azure networking charges.
  • Billing appears as a line item under your Azure AI services consumption.

Beta Features and Limitations

Some Anthropic features may not be available immediately on Azure AI Foundry:

Feature | Status on Azure Foundry
Messages API | Fully supported
Streaming | Fully supported
Vision (image inputs) | Supported
Tool use / Function calling | Supported
Extended thinking | May have delayed availability
Prompt caching | May not be supported
Batch API | Not available
Computer use | Not available
PDF input | Check availability

Tip: Azure AI Foundry typically lags a few weeks behind the direct Anthropic API for new features. Always check the Azure documentation for the latest feature matrix.

Content Safety Filters

Azure adds an additional content safety layer on top of Claude's built-in safety:

  • Requests and responses pass through Azure AI Content Safety.
  • You can configure filter severity levels (low, medium, high) per category.
  • Categories include: hate, sexual, violence, and self-harm.
  • Filtered content returns a content_filter response instead of model output.
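
In application code, it is worth checking for the filtered case explicitly rather than assuming model text is always present. The sketch below assumes the filter surfaces as a content_filter finish reason on the choice, as it does for other Azure-hosted models; verify the exact behavior against the Azure documentation.

Python
from azure.ai.inference import ChatCompletionsClient
from azure.core.credentials import AzureKeyCredential

client = ChatCompletionsClient(
    endpoint="https://claude-sonnet-4-prod.eastus2.models.ai.azure.com",
    credential=AzureKeyCredential("<your-api-key>"),
)

response = client.complete(
    messages=[{"role": "user", "content": "Summarize today's meeting notes."}],
    max_tokens=1024,
)

choice = response.choices[0]
# Assumed behavior: filtered output is signaled via the finish reason instead of model text
if choice.finish_reason == "content_filter":
    print("The response was blocked by Azure AI Content Safety.")
else:
    print(choice.message.content)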

Complete Example — Building a Chat Application

Here is a full working example that ties everything together:

Python
import os

from azure.ai.inference import ChatCompletionsClient
from azure.ai.inference.models import SystemMessage, UserMessage, AssistantMessage
from azure.identity import DefaultAzureCredential

# Configuration
ENDPOINT = os.environ.get(
    "AZURE_CLAUDE_ENDPOINT",
    "https://claude-sonnet-4-prod.eastus2.models.ai.azure.com",
)

# Use Entra ID in production, API key for development
if os.environ.get("AZURE_CLAUDE_API_KEY"):
    from azure.core.credentials import AzureKeyCredential
    credential = AzureKeyCredential(os.environ["AZURE_CLAUDE_API_KEY"])
else:
    credential = DefaultAzureCredential()

client = ChatCompletionsClient(
    endpoint=ENDPOINT,
    credential=credential,
)


def chat(user_message: str, history: list = None) -> str:
    """Send a message and return Claude's response."""
    if history is None:
        history = []

    messages = [
        SystemMessage(content="You are a helpful assistant for Azure developers."),
        *history,
        UserMessage(content=user_message),
    ]

    response = client.complete(
        messages=messages,
        max_tokens=2048,
        temperature=0.7,
    )

    assistant_reply = response.choices[0].message.content
    history.append(UserMessage(content=user_message))
    history.append(AssistantMessage(content=assistant_reply))
    return assistant_reply


# Usage
if __name__ == "__main__":
    conversation_history = []
    while True:
        user_input = input("You: ")
        if user_input.lower() in ("exit", "quit"):
            break
        reply = chat(user_input, conversation_history)
        print(f"Claude: {reply}")

Troubleshooting Common Issues

Issue | Cause | Solution
401 Unauthorized | Invalid or expired key/token | Regenerate API key or refresh Entra ID token
404 Not Found | Wrong endpoint URL | Verify deployment name and region in the URL
429 Too Many Requests | Rate limit exceeded | Request quota increase in Azure Portal
ModelNotFound | Model not deployed | Deploy the model from Model Catalog first
Content filtered | Azure Content Safety triggered | Adjust filter levels or modify prompt
Region unavailable | Model not available in chosen region | Deploy in a supported region (e.g. eastus2, westus)
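
For the 429 case in the table above, a small retry loop with exponential backoff usually absorbs transient throttling without a quota increase. A minimal sketch around the ChatCompletionsClient; the backoff parameters are arbitrary assumptions.

Python
import time

from azure.core.exceptions import HttpResponseError


def complete_with_retry(client, messages, max_retries: int = 5):
    """Call client.complete, backing off exponentially on 429 throttling errors."""
    for attempt in range(max_retries):
        try:
            return client.complete(messages=messages, max_tokens=1024)
        except HttpResponseError as err:
            if err.status_code == 429 and attempt < max_retries - 1:
                wait = 2 ** attempt  # 1s, 2s, 4s, ...
                print(f"Throttled (429); retrying in {wait}s")
                time.sleep(wait)
            else:
                raise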

Summary

Azure AI Foundry gives enterprise Azure customers a seamless way to adopt Claude without leaving their existing cloud ecosystem. The key steps are:

  1. Create an Azure AI Foundry hub and project.
  2. Deploy the Claude model you need from the Model Catalog.
  3. Authenticate using Entra ID (production) or API keys (development).
  4. Call the endpoint using the Azure AI Inference SDK or REST API.
  5. Monitor usage through Azure Cost Management and the Foundry dashboard.

By keeping everything inside Azure, you get unified billing, centralized identity management, compliance inheritance, and enterprise-grade networking — all while using the same Claude models available on the direct API.