☁️ Deploying Claude Apps
Deploy your Claude-powered app to Vercel, AWS, or any cloud provider
You have built your Claude-powered app locally and it works great. Now it is time to ship it to the world. This lesson covers everything you need to know about deploying Claude applications to production, from choosing a platform to handling environment variables, health checks, and graceful shutdown.
Deployment Options Overview
There are many ways to host a Claude-powered app. Each platform has trade-offs in terms of cost, complexity, and control.
| Platform | Best For | Pricing Model | Cold Start |
|---|---|---|---|
| Vercel | Next.js apps, serverless APIs | Per-invocation | Fast |
| AWS Lambda | Event-driven, high-scale | Per-invocation | Medium |
| Railway | Full-stack apps, databases | Usage-based | None |
| Fly.io | Global edge, long-running | Usage-based | None |
| Google Cloud Run | Container-based, autoscaling | Per-request | Medium |
| Render | Simple deployments, static + API | Fixed + usage | None |
How to Choose
- Vercel is the easiest if you are using Next.js. Zero-config deployment with automatic HTTPS, preview URLs, and edge functions.
- AWS Lambda gives you the most control and scales to virtually unlimited concurrency, but setup is more complex.
- Railway and Fly.io are great middle-ground options that run containers or processes without the overhead of AWS.
- Docker containers work on any of these platforms and give you portability.
Next.js API Routes with Claude
The most common deployment pattern for Claude apps is a Next.js API route. This keeps your API key on the server and exposes a clean endpoint for your frontend.
App Router API Route (Route Handler)
// app/api/chat/route.ts
import Anthropic from "@anthropic-ai/sdk";
import { NextRequest, NextResponse } from "next/server";

const client = new Anthropic({
apiKey: process.env.ANTHROPIC_API_KEY,
});
export async function POST(request: NextRequest) {
try {
const { message, conversationHistory } = await request.json();
if (!message || typeof message !== "string") {
return NextResponse.json(
{ error: "Message is required" },
{ status: 400 }
);
}
const messages = [
...(conversationHistory || []),
{ role: "user" as const, content: message },
];
const response = await client.messages.create({
model: "claude-sonnet-4-20250514",
max_tokens: 1024,
messages,
});
const assistantMessage =
response.content[0].type === "text" ? response.content[0].text : "";
return NextResponse.json({
response: assistantMessage,
usage: response.usage,
});
} catch (error: unknown) {
console.error("Claude API error:", error);
const message =
error instanceof Error ? error.message : "Internal server error";
return NextResponse.json({ error: message }, { status: 500 });
}
}
Streaming API Route
For longer responses, streaming gives users immediate feedback instead of waiting for the full response.
// app/api/chat/stream/route.ts
import Anthropic from "@anthropic-ai/sdk";
import { NextRequest } from "next/server";

const client = new Anthropic({
apiKey: process.env.ANTHROPIC_API_KEY,
});
export async function POST(request: NextRequest) {
const { message } = await request.json();
const stream = await client.messages.stream({
model: "claude-sonnet-4-20250514",
max_tokens: 1024,
messages: [{ role: "user", content: message }],
});
const encoder = new TextEncoder();
const readable = new ReadableStream({
async start(controller) {
for await (const event of stream) {
if (
event.type === "content_block_delta" &&
event.delta.type === "text_delta"
) {
controller.enqueue(
encoder.encode(`data: ${JSON.stringify({ text: event.delta.text })}\n\n`)
);
}
}
controller.enqueue(encoder.encode("data: [DONE]\n\n"));
controller.close();
},
});
return new Response(readable, {
headers: {
"Content-Type": "text/event-stream",
"Cache-Control": "no-cache",
Connection: "keep-alive",
},
});
}
Environment Variables in Production
Your ANTHROPIC_API_KEY must never be committed to source control. Every platform has its own way of setting environment variables securely.
Setting Environment Variables
| Platform | Command / UI |
|---|---|
| Vercel | Dashboard > Project > Settings > Environment Variables |
| AWS Lambda | AWS Console > Lambda > Configuration > Environment Variables |
| Railway | Dashboard > Project > Variables |
| Fly.io | fly secrets set ANTHROPIC_API_KEY=sk-ant-... |
| Docker | docker run -e ANTHROPIC_API_KEY=sk-ant-... myapp |
Best Practices for Secrets
# .env.local (for local development only - NEVER commit this)
ANTHROPIC_API_KEY=sk-ant-api03-xxxxx
ANTHROPIC_MODEL=claude-sonnet-4-20250514
MAX_TOKENS=1024
RATE_LIMIT_PER_MINUTE=60
# .gitignore - always exclude env files
.env
.env.local
.env.production
.env*.local
Validating Environment Variables at Startup
// lib/env.ts
function getRequiredEnv(name: string): string {
const value = process.env[name];
if (!value) {
throw new Error(
`Missing required environment variable: ${name}`
);
}
return value;
}
export const config = {
anthropicApiKey: getRequiredEnv("ANTHROPIC_API_KEY"),
model: process.env.ANTHROPIC_MODEL || "claude-sonnet-4-20250514",
maxTokens: parseInt(process.env.MAX_TOKENS || "1024", 10),
rateLimitPerMinute: parseInt(
process.env.RATE_LIMIT_PER_MINUTE || "60",
10
),
};
Vercel Deployment (Step by Step)
Vercel is the most popular platform for deploying Next.js applications. Here is a complete walkthrough.
Step 1: Prepare Your Project
# Make sure your project builds locally
npm run build
# Verify no secrets are in your codebase
grep -r "sk-ant" --include="*.ts" --include="*.tsx" --include="*.js" .
Step 2: Push to GitHub
git init
git add .
git commit -m "Initial commit"
git remote add origin https://github.com/yourname/claude-app.git
git push -u origin main
Step 3: Connect to Vercel
- Go to vercel.com and sign in with GitHub.
- Click "Add New Project".
- Select your repository.
- Vercel auto-detects Next.js and configures build settings.
- Before deploying, add your environment variables:
ANTHROPIC_API_KEY=<your API key>
- Select which environments it applies to (Production, Preview, Development).
- Click "Deploy".
Step 4: Configure Serverless Function Timeout
Claude responses can take several seconds. Increase the timeout for your API routes.
// vercel.json
{
"functions": {
"app/api/**/*.ts": {
"maxDuration": 30
}
}
}
Step 5: Set Up Preview Deployments
Every pull request gets its own preview URL automatically. Make sure your environment variables are set for the Preview environment too, so testers can try Claude features on preview branches.
AWS Lambda Deployment
For teams already on AWS, Lambda provides serverless Claude endpoints.
Lambda Handler
// handler.ts
import Anthropic from "@anthropic-ai/sdk";
import type {
APIGatewayProxyEvent,
APIGatewayProxyResult,
} from "aws-lambda";
const client = new Anthropic({
apiKey: process.env.ANTHROPIC_API_KEY,
});
export async function handler(
event: APIGatewayProxyEvent
): Promise<APIGatewayProxyResult> {
if (event.httpMethod !== "POST") {
return { statusCode: 405, body: "Method not allowed" };
}
try {
const { message } = JSON.parse(event.body || "{}");
const response = await client.messages.create({
model: "claude-sonnet-4-20250514",
max_tokens: 1024,
messages: [{ role: "user", content: message }],
});
const text =
response.content[0].type === "text"
? response.content[0].text
: "";
return {
statusCode: 200,
headers: {
"Content-Type": "application/json",
"Access-Control-Allow-Origin": "*",
},
body: JSON.stringify({ response: text }),
};
} catch (error: unknown) {
console.error("Lambda error:", error);
return {
statusCode: 500,
body: JSON.stringify({ error: "Internal server error" }),
};
}
}
Deploying with the Serverless Framework
# serverless.yml
service: claude-api
provider:
name: aws
runtime: nodejs20.x
timeout: 30
environment:
ANTHROPIC_API_KEY: ${ssm:/claude/api-key}
functions:
chat:
handler: handler.handler
events:
- http:
path: /chat
method: post
cors: true
# Deploy to AWS
npx serverless deploy --stage production
Docker Containerization
Docker lets you package your Claude app into a portable container that runs anywhere.
Dockerfile for a Next.js Claude App
# Dockerfile
FROM node:20-alpine AS base
# Install dependencies
FROM base AS deps
WORKDIR /app
COPY package.json package-lock.json ./
# Full install: the build stage needs dev dependencies too
RUN npm ci
# Build the application
FROM base AS builder
WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules
COPY . .
RUN npm run build
# Production image
FROM base AS runner
WORKDIR /app
ENV NODE_ENV=production
RUN addgroup --system --gid 1001 nodejs
RUN adduser --system --uid 1001 nextjs
COPY --from=builder /app/public ./public
COPY --from=builder /app/.next/standalone ./
COPY --from=builder /app/.next/static ./.next/static
USER nextjs
EXPOSE 3000
ENV PORT=3000
ENV HOSTNAME="0.0.0.0"
CMD ["node", "server.js"]
Building and Running
# Build the image
docker build -t claude-app .
# Run with environment variables
docker run -p 3000:3000 \
-e ANTHROPIC_API_KEY=sk-ant-api03-xxxxx \
claude-app
# Or use an env file
docker run -p 3000:3000 --env-file .env.production claude-app
Docker Compose for Development
# docker-compose.yml
version: "3.8"
services:
app:
build: .
ports:
- "3000:3000"
environment:
- ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
- NODE_ENV=production
restart: unless-stopped
healthcheck:
# node:20-alpine ships BusyBox wget, not curl
test: ["CMD", "wget", "-qO-", "http://localhost:3000/api/health"]
interval: 30s
timeout: 10s
retries: 3
CORS and API Proxy Patterns
When your frontend and backend are on different domains, you need to handle Cross-Origin Resource Sharing (CORS).
CORS Middleware for API Routes
// middleware.ts
import { NextRequest, NextResponse } from "next/server";

const ALLOWED_ORIGINS = [
"https://yourapp.com",
"https://www.yourapp.com",
process.env.NODE_ENV === "development" ? "http://localhost:3000" : "",
].filter(Boolean);
export function middleware(request: NextRequest) {
const origin = request.headers.get("origin") || "";
if (request.method === "OPTIONS") {
return new NextResponse(null, {
status: 204,
headers: {
"Access-Control-Allow-Origin": ALLOWED_ORIGINS.includes(origin)
? origin
: "",
"Access-Control-Allow-Methods": "POST, OPTIONS",
"Access-Control-Allow-Headers": "Content-Type, Authorization",
"Access-Control-Max-Age": "86400",
},
});
}
const response = NextResponse.next();
if (ALLOWED_ORIGINS.includes(origin)) {
response.headers.set("Access-Control-Allow-Origin", origin);
}
return response;
}
export const config = {
matcher: "/api/:path*",
};
API Proxy Pattern
Instead of calling the Anthropic API directly from the client (which would expose your key), proxy all requests through your own API.
Browser --> Your API (/api/chat) --> Anthropic API
                |
                +-- API key stays on server
                +-- Rate limiting
                +-- Request validation
                +-- Usage logging
This is the only safe architecture for web applications. Never call the Anthropic API directly from client-side JavaScript.
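On the browser side, all traffic goes to your own endpoint. A minimal client call might look like the sketch below; it assumes the `/api/chat` route and the `{ response, usage }` payload shape from the handler shown earlier.

```typescript
// Client-side call to your own proxy route -- the Anthropic key never
// leaves the server. "/api/chat" and the response shape are assumed to
// match the route handler from earlier in this lesson.
async function sendMessage(message: string): Promise<string> {
  const res = await fetch("/api/chat", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ message }),
  });
  if (!res.ok) {
    const body = (await res.json()) as { error?: string };
    throw new Error(body.error ?? `Request failed with status ${res.status}`);
  }
  const data = (await res.json()) as { response: string };
  return data.response;
}
```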
Health Checks
Every production service needs a health check endpoint that monitoring tools and load balancers can poll.
// app/api/health/route.ts
import { NextResponse } from "next/server";

export async function GET() {
const health = {
status: "ok",
timestamp: new Date().toISOString(),
version: process.env.APP_VERSION || "unknown",
uptime: process.uptime(),
checks: {
anthropicKey: !!process.env.ANTHROPIC_API_KEY,
nodeEnv: process.env.NODE_ENV,
},
};
const isHealthy = health.checks.anthropicKey;
return NextResponse.json(health, {
status: isHealthy ? 200 : 503,
});
}
Graceful Shutdown
When your server receives a shutdown signal (during deployments or scaling), finish any in-progress Claude requests before exiting.
// server.ts (custom Node.js server)
import { createServer } from "node:http";
import app from "./app"; // your request handler (e.g. an Express app)

const server = createServer(app);
let isShuttingDown = false;
const activeRequests = new Set<string>();
function generateRequestId(): string {
return Math.random().toString(36).substring(2, 15);
}
// Reject new work once shutdown has started; track active requests
server.on("request", (req, res) => {
if (isShuttingDown) {
res.statusCode = 503;
res.setHeader("Connection", "close");
res.end("Shutting down");
return;
}
const id = generateRequestId();
activeRequests.add(id);
res.on("finish", () => activeRequests.delete(id));
});
// Graceful shutdown handler
function shutdown(signal: string) {
console.log(`Received ${signal}. Starting graceful shutdown...`);
isShuttingDown = true;
// Stop accepting new connections
server.close(() => {
console.log("Server closed. Exiting.");
process.exit(0);
});
// Force exit after 30 seconds
setTimeout(() => {
console.error(
`Forcing exit. ${activeRequests.size} requests still pending.`
);
process.exit(1);
}, 30_000);
}
process.on("SIGTERM", () => shutdown("SIGTERM"));
process.on("SIGINT", () => shutdown("SIGINT"));
server.listen(Number(process.env.PORT) || 3000);
Deployment Checklist
Before deploying your Claude application to production, walk through this checklist.
Security
- API key is in environment variables, not in code
- .env files are in .gitignore
- No secrets in build logs or client-side bundles
- CORS is configured to allow only your domains
- Input validation on all API routes
- Rate limiting is in place
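To make the rate-limiting item concrete, here is a minimal fixed-window limiter sketch. It assumes a single long-running process (state is in memory); serverless or multi-instance deployments need a shared store such as Redis. The limit of 60 per minute mirrors the RATE_LIMIT_PER_MINUTE example from the environment variables section.

```typescript
// Minimal in-memory rate limiter (fixed window per client).
// Sketch only: works for one process; use Redis or similar when
// running multiple instances or serverless functions.
const WINDOW_MS = 60_000;
const LIMIT = 60;

const hits = new Map<string, { count: number; windowStart: number }>();

function isRateLimited(clientId: string, now = Date.now()): boolean {
  const entry = hits.get(clientId);
  if (!entry || now - entry.windowStart >= WINDOW_MS) {
    // First request in a fresh window
    hits.set(clientId, { count: 1, windowStart: now });
    return false;
  }
  entry.count += 1;
  return entry.count > LIMIT;
}
```

In an API route you would call `isRateLimited` with the client IP (or a user ID) before invoking Claude, and return a 429 when it comes back true.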
Performance
- Streaming is enabled for long responses
- Serverless function timeout is set to at least 30 seconds
- Response caching for repeated identical queries
- Connection pooling if using a database
- Bundle size is optimized (no server-only code in client bundle)
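The response-caching item can be as simple as an in-memory map with a TTL, keyed by the exact prompt. This is a sketch under the same single-process assumption as the rate limiter above; multi-instance deployments need a shared cache.

```typescript
// In-memory response cache for repeated identical queries (sketch).
// Entries expire after TTL_MS so stale answers are not served forever.
const TTL_MS = 5 * 60_000;

const cache = new Map<string, { value: string; expiresAt: number }>();

function getCached(prompt: string, now = Date.now()): string | undefined {
  const entry = cache.get(prompt);
  if (!entry) return undefined;
  if (now > entry.expiresAt) {
    cache.delete(prompt); // evict expired entry
    return undefined;
  }
  return entry.value;
}

function setCached(prompt: string, value: string, now = Date.now()): void {
  cache.set(prompt, { value, expiresAt: now + TTL_MS });
}
```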
Reliability
- Health check endpoint is set up
- Graceful shutdown handles in-flight requests
- Error responses use proper HTTP status codes
- Logging is configured (structured JSON logs)
- Retry logic for transient Anthropic API failures
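One way to implement the retry item is a small exponential-backoff wrapper around any Claude call. This is illustrative: the official SDK also retries transient errors itself (configurable via the client's maxRetries option), so a custom wrapper is only needed when you want your own policy.

```typescript
// Retry with exponential backoff for transient failures such as
// rate limits or network errors (sketch; delays are illustrative).
async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 500
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      if (attempt < maxAttempts - 1) {
        // Backoff doubles each attempt: 500ms, 1000ms, 2000ms, ...
        const delay = baseDelayMs * 2 ** attempt;
        await new Promise((resolve) => setTimeout(resolve, delay));
      }
    }
  }
  throw lastError;
}
```

Usage: `withRetry(() => client.messages.create({ ... }))`.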
Monitoring
- Request latency is tracked
- Error rates are monitored
- Token usage is logged per request
- Alerts are set for error rate spikes
- Cost tracking is in place
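For the token-usage item, one structured JSON log line per request is usually enough to drive dashboards and cost tracking. The field names below are illustrative; the token counts come from the `usage` object on each Messages API response.

```typescript
// Structured per-request usage log (sketch). Emit one JSON line per
// Claude call and ship it to your log aggregator.
interface UsageLog {
  requestId: string;
  model: string;
  inputTokens: number;  // from response.usage.input_tokens
  outputTokens: number; // from response.usage.output_tokens
  latencyMs: number;
}

function logUsage(entry: UsageLog): string {
  const line = JSON.stringify({ event: "claude_usage", ...entry });
  console.log(line);
  return line;
}
```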
Testing
- API routes are tested with sample requests
- Error cases are tested (invalid input, missing API key, rate limits)
- Preview deployments are verified before merging to production
- Load testing for expected traffic levels
Key takeaway: Deploying a Claude app is fundamentally the same as deploying any API-driven application. Keep your API key on the server, use streaming for better UX, handle errors gracefully, and monitor your usage and costs.