🛡️ Error Handling & Best Practices

Rate limits, retries, error codes, and production-ready patterns

Production applications built on the Claude API need robust error handling. This lesson covers the common HTTP status codes, retries with exponential backoff, rate limits, input validation, model fallbacks, and circuit breakers.

Common HTTP Status Codes

Status Code	Meaning	How to Handle
400	Bad Request	Fix the request format or parameters; do not retry unchanged
401	Unauthorized	Verify your API key
403	Forbidden	Your API key lacks permission for this resource
404	Not Found	Check the model name and endpoint URL
429	Rate Limited	Wait (honor the retry-after header if present), then retry
500	Internal Error	Retry after a delay
529	Overloaded	The API is temporarily overloaded; retry with backoff
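The retry guidance in the table can be condensed into a small predicate. This is a minimal sketch: `isRetryableStatus` is a hypothetical helper name, not part of the SDK, and treating a missing status as retryable (a network failure) is an assumption.

```javascript
// Hypothetical helper: decide whether a failed request is worth retrying.
function isRetryableStatus(status) {
  // No HTTP status usually means the request never reached the server.
  if (status === undefined || status === null) return true;
  // 429 (rate limited) is the one 4xx code that should be retried.
  if (status === 429) return true;
  // 500, 529, and other 5xx codes are server-side and often transient.
  return status >= 500;
}

console.log(isRetryableStatus(400)); // false
console.log(isRetryableStatus(429)); // true
console.log(isRetryableStatus(529)); // true
```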

Retry with Exponential Backoff

JavaScript
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

async function callWithRetry(messages, maxRetries = 3) {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      const response = await client.messages.create({
        model: "claude-sonnet-4-20250514",
        max_tokens: 1024,
        messages,
      });
      return response;
    } catch (error) {
      // Don't retry client errors (any 4xx except 429).
      // Errors without a status, such as network failures, fall through and are retried.
      if (error.status >= 400 && error.status < 500 && error.status !== 429) {
        throw error;
      }

      if (attempt === maxRetries) {
        throw error;
      }

      // Exponential backoff: 1s, 2s, 4s
      const delay = Math.pow(2, attempt) * 1000;
      console.log(
        `Attempt ${attempt + 1} failed. Retrying in ${delay}ms...`
      );
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
}
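When many clients fail at the same moment, identical backoff delays make them all retry together. A common refinement is to add random jitter and cap the delay. The sketch below uses "full jitter"; `backoffDelay`, its parameters, and the default values are illustrative assumptions, not part of the SDK.

```javascript
// Hypothetical helper: exponential backoff with full jitter and a cap.
// `random` is injectable so the function can be tested deterministically.
function backoffDelay(attempt, baseMs = 1000, capMs = 30000, random = Math.random) {
  const exponential = Math.min(capMs, baseMs * 2 ** attempt);
  // Full jitter: pick a delay uniformly in [0, exponential).
  return Math.floor(random() * exponential);
}

// With random fixed at 0.5, each delay is half the capped exponential value.
for (let attempt = 0; attempt < 4; attempt++) {
  console.log(backoffDelay(attempt, 1000, 30000, () => 0.5)); // 500, 1000, 2000, 4000
}
```

In `callWithRetry`, a helper like this could replace the fixed `Math.pow(2, attempt) * 1000` delay.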

Python Retry Pattern

Python
import anthropic
import time

client = anthropic.Anthropic()

def call_with_retry(messages, max_retries=3):
    for attempt in range(max_retries + 1):
        try:
            response = client.messages.create(
                model="claude-sonnet-4-20250514",
                max_tokens=1024,
                messages=messages,
            )
            return response
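        except anthropic.RateLimitError:
            # 429: back off and retry
            if attempt == max_retries:
                raise
            delay = 2 ** attempt  # 1s, 2s, 4s
            print(f"Rate limited. Retrying in {delay}s...")
            time.sleep(delay)
        except anthropic.APIStatusError as e:
            # Retry server errors (5xx, including 529); re-raise other client errors
            if e.status_code < 500 or attempt == max_retries:
                raise
            delay = 2 ** attempt
            print(f"Server error {e.status_code}. Retrying in {delay}s...")
            time.sleep(delay)
        except anthropic.APIConnectionError:
            # Network failure: there is no HTTP status, but a retry often succeeds
            if attempt == max_retries:
                raise
            time.sleep(2 ** attempt)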

Handling Rate Limits

JavaScript
async function handleRateLimit(messages) {
  try {
    return await client.messages.create({
      model: "claude-sonnet-4-20250514",
      max_tokens: 1024,
      messages,
    });
  } catch (error) {
    if (error.status === 429) {
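      // Use the retry-after header (in seconds) if available.
      // Depending on the SDK version, headers may be a plain object or a Headers instance.
      const headers = error.headers;
      const retryAfter =
        typeof headers?.get === "function"
          ? headers.get("retry-after")
          : headers?.["retry-after"];
      const seconds = Number.parseInt(retryAfter ?? "", 10);
      const waitTime = Number.isNaN(seconds) ? 60000 : seconds * 1000;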

      console.log(`Rate limited. Waiting ${waitTime / 1000}s...`);
      await new Promise((resolve) => setTimeout(resolve, waitTime));

      // Retry once; a second failure propagates to the caller
      return await client.messages.create({
        model: "claude-sonnet-4-20250514",
        max_tokens: 1024,
        messages,
      });
    }
    throw error;
  }
}
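The HTTP specification allows `Retry-After` to carry either a number of seconds or an HTTP date. Whether the Claude API ever sends the date form is not stated in this lesson, so handling it here is a defensive assumption. `parseRetryAfterMs` is a hypothetical helper, and the 60-second fallback simply mirrors the default used above.

```javascript
// Hypothetical helper: convert a Retry-After header value to milliseconds.
// Returns `fallbackMs` when the header is missing or cannot be parsed.
function parseRetryAfterMs(value, fallbackMs = 60000, nowMs = Date.now()) {
  if (value === undefined || value === null || value === "") return fallbackMs;
  const text = String(value).trim();
  // Form 1: delay in seconds, e.g. "30"
  if (/^\d+$/.test(text)) return Number(text) * 1000;
  // Form 2: HTTP date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
  const dateMs = Date.parse(text);
  if (!Number.isNaN(dateMs)) return Math.max(0, dateMs - nowMs);
  return fallbackMs;
}

console.log(parseRetryAfterMs("30")); // 30000
console.log(parseRetryAfterMs(undefined)); // 60000
```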

Input Validation Before Sending

JavaScript
function validateRequest(messages, maxTokens) {
  if (!messages || messages.length === 0) {
    throw new Error("Messages are required");
  }

  for (let i = 0; i < messages.length; i++) {
    const role = messages[i].role;
    if (role !== "user" && role !== "assistant") {
      throw new Error(`Invalid role: ${role}`);
    }
  }

  if (messages[0].role !== "user") {
    throw new Error("First message must be from user");
  }

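  // The real upper bound depends on the model; adjust 8192 to the limit
  // of the model you use.
  if (!Number.isInteger(maxTokens) || maxTokens < 1 || maxTokens > 8192) {
    throw new Error("max_tokens must be an integer between 1 and 8192");
  }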

  return true;
}

Model Fallback Pattern

JavaScript
async function callWithFallback(messages) {
  const models = [
    "claude-sonnet-4-20250514",
    "claude-3-5-haiku-20241022",
  ];

  for (const model of models) {
    try {
      return await client.messages.create({
        model,
        max_tokens: 1024,
        messages,
      });
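    } catch (error) {
      // Fall back only on rate limits, server errors, or network failures
      // (no status). Other 4xx errors, such as 400 or 401, would most
      // likely fail on the fallback model too.
      const status = error.status;
      if (status !== undefined && status < 500 && status !== 429) {
        throw error;
      }
      console.warn(`Model ${model} failed: ${error.message}`);
    }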
  }

  throw new Error("All models failed");
}

Circuit Breaker Pattern

JavaScript
class CircuitBreaker {
  constructor(threshold = 5, resetTimeMs = 60000) {
    this.failures = 0;
    this.threshold = threshold;
    this.resetTimeMs = resetTimeMs;
    this.state = "CLOSED"; // CLOSED, OPEN, HALF_OPEN
    this.lastFailureTime = null;
  }

  async call(fn) {
    if (this.state === "OPEN") {
      if (Date.now() - this.lastFailureTime > this.resetTimeMs) {
        this.state = "HALF_OPEN";
      } else {
        throw new Error("Circuit breaker is OPEN");
      }
    }

    try {
      const result = await fn();
      this.onSuccess();
      return result;
    } catch (error) {
      this.onFailure();
      throw error;
    }
  }

  onSuccess() {
    this.failures = 0;
    this.state = "CLOSED";
  }

  onFailure() {
    this.failures++;
    this.lastFailureTime = Date.now();
    if (this.failures >= this.threshold) {
      this.state = "OPEN";
    }
  }
}

Production Best Practices Checklist

Set timeouts: never wait forever
Use circuit breakers: stop sending requests temporarily when errors spike
Log every request: for monitoring and debugging
Set budgets: configure spending limits in the Anthropic Console
Use fallback models: if Sonnet fails, try Haiku
Validate inputs: before sending any request
Handle all status codes: each one needs different treatment
Monitor token usage: track costs in real time
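The first checklist item, timeouts, can be sketched with `Promise.race`. `withTimeout` is a hypothetical helper and the durations are illustrative. The official SDK also accepts a `timeout` option on the client, which is usually the simpler choice when it fits.

```javascript
// Hypothetical helper: reject if `promise` does not settle within `ms`.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`Timed out after ${ms}ms`)), ms);
  });
  // Whichever settles first wins; always clear the timer afterwards.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Usage with a stand-in for an API call that takes 200ms.
const slowCall = () =>
  new Promise((resolve) => setTimeout(() => resolve("done"), 200));

withTimeout(slowCall(), 50).catch((error) => console.log(error.message)); // Timed out after 50ms
withTimeout(slowCall(), 1000).then((result) => console.log(result)); // done
```

Note that this only stops waiting for the result; it does not cancel the underlying request.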

Summary

Handle each error code differently
Use exponential backoff for retries
Log every request and error
Validate inputs before sending
Use fallback models for reliability
Implement circuit breakers for production systems

Next: We'll learn about batch processing to save 50% on costs.
