💰How AI Pricing Works
Understand tokens, pricing tiers, and how to estimate and reduce your AI costs
How AI Pricing Works
When you use Claude via the API, you pay based on the amount of text you send and receive. Understanding how pricing works helps you build cost-effective applications.
What Are Tokens?
Tokens are the basic unit of text that AI models process. A token is not exactly one word — it's more like a "chunk" of text.
Rules of Thumb:
1 token ≈ 4 charactersin English1 token ≈ ¾ of a wordin English100 tokens ≈ 75 words1,000 tokens ≈ 1.5 pages of text- Non-Latin scripts (Arabic, Chinese, etc.) use more tokens per word (roughly 1.5x-2x)
Examples:
| Text | Approximate Tokens |
|---|---|
| "Hello" | 1 |
| "Hello, how are you today?" | 7 |
| A short email | 50-100 |
| A full page of text | 500-700 |
| A full book (80,000 words) | ~100,000 |
Input Tokens vs Output Tokens
Every API call involves two types of tokens:
- Input tokens — what you send to Claude (your question, context, system instructions)
- Output tokens — what Claude sends back to you (the response)
Important: Output tokens are always more expensive than input tokens (typically 3-5x more).
Claude Model Pricing Table
| Model | Input (per 1M tokens) | Output (per 1M tokens) | Context Window |
|---|---|---|---|
| Claude Opus 4 | $15.00 | $75.00 | 200K |
| Claude Sonnet 4 | $3.00 | $15.00 | 200K |
| Claude Haiku 3.5 | $0.80 | $4.00 | 200K |
How to Calculate Costs
Formula:
Cost = (input_tokens × input_price) + (output_tokens × output_price)
Practical example with Claude Sonnet 4:
- Sending: 1,000 input tokens
- Receiving: 500 output tokens
- Cost = (1,000/1,000,000 × $3.00) + (500/1,000,000 × $15.00)
- Cost = $0.003 + $0.0075 = $0.0105 (just over one cent!)
Code Example: Tracking Costs
// Track cost after each API call
const response = await client.messages.create({
model: "claude-sonnet-4-20250514",
max_tokens: 1024,
messages: [{ role: "user", content: "What is the capital of France?" }],
});
const inputTokens = response.usage.input_tokens;
const outputTokens = response.usage.output_tokens;
// Claude Sonnet 4 pricing
const inputCost = (inputTokens / 1_000_000) * 3.0;
const outputCost = (outputTokens / 1_000_000) * 15.0;
const totalCost = inputCost + outputCost;
console.log("Input tokens:", inputTokens);
console.log("Output tokens:", outputTokens);
console.log("Total cost: $" + totalCost.toFixed(6));Python Example
import anthropic
client = anthropic.Anthropic()
response = client.messages.create(
model="claude-sonnet-4-20250514",
max_tokens=1024,
messages=[{"role": "user", "content": "What is the capital of France?"}],
)
input_tokens = response.usage.input_tokens
output_tokens = response.usage.output_tokens
# Claude Sonnet 4 pricing
input_cost = (input_tokens / 1_000_000) * 3.0
output_cost = (output_tokens / 1_000_000) * 15.0
total_cost = input_cost + output_cost
print(f"Input tokens: {input_tokens}")
print(f"Output tokens: {output_tokens}")
print(f"Total cost: ${total_cost:.6f}")Usage Tiers and Rate Limits
Anthropic has different usage tiers:
| Tier | Monthly Spend Limit | Requests per Minute |
|---|---|---|
| Free | Very limited | 5 RPM |
| Tier 1 | $100 | 50 RPM |
| Tier 2 | $500 | 100 RPM |
| Tier 3 | $1,000 | 200 RPM |
| Tier 4 | $5,000+ | 400 RPM |
Tips to Save Money
- Use the right model — Don't use Opus for simple tasks that Haiku can handle
- Minimize input tokens — Remove unnecessary text from your prompts
- Set max_tokens wisely — Don't request more than you need
- Use prompt caching — Save up to 90% on repeated prompts
- Batch processing — 50% discount when you don't need real-time responses
- Monitor usage — Track costs daily to avoid surprises
Estimating Real Project Costs
Let's say you're building a customer support chatbot:
| Parameter | Value |
|---|---|
| Messages per day | 1,000 |
| Avg input tokens per message | 500 |
| Avg output tokens per message | 300 |
| Model | Claude Haiku 3.5 |
| Daily cost | (500K/1M × $0.80) + (300K/1M × $4.00) = $1.60 |
| Monthly cost | ~$48 |
With Sonnet 4 instead of Haiku, the monthly cost would be ~$195. With Opus 4, it would be ~$975.
Cost Comparison: Claude vs Competitors
| Provider | Model | Input/1M | Output/1M |
|---|---|---|---|
| Anthropic | Claude Sonnet 4 | $3.00 | $15.00 |
| OpenAI | GPT-4o | $2.50 | $10.00 |
| Gemini 1.5 Pro | $3.50 | $10.50 | |
| Anthropic | Claude Haiku 3.5 | $0.80 | $4.00 |
| OpenAI | GPT-4o mini | $0.15 | $0.60 |
Prices change frequently. Always check the official pricing pages for the latest numbers.
Useful Tools
- Anthropic Console — Dashboard showing usage and costs
- Usage API — Programmatically track usage
- Response usage field — Every API response includes token counts used
Key Takeaways
- You pay per token, not per request
- Output tokens cost 3-5x more than input tokens
- Choose the smallest model that meets your quality needs
- Use caching and batching to save significantly
- Always monitor and set spending limits
Next: We'll learn how to choose the right Claude model for your specific use case.