Optimizing Prompts for Claude: Token Reduction and Formatting Guide

2026-03-28

Claude 3.5 Sonnet and Claude 3 Opus offer some of the largest context windows in the industry — up to 200K tokens. But tokens cost money, and a larger context window doesn't mean you should fill it carelessly. Here's how to optimize your Claude prompts for cost and quality.

Claude's pricing model

| Model | Input Cost | Output Cost | Context Window |
|---|---|---|---|
| Claude 3.5 Sonnet | $3 / 1M tokens | $15 / 1M tokens | 200K |
| Claude 3 Opus | $15 / 1M tokens | $75 / 1M tokens | 200K |
| Claude 3 Haiku | $0.25 / 1M tokens | $1.25 / 1M tokens | 200K |

At Opus pricing, a 50K-token input costs $0.75. If you're running 100 queries a day, that's $75/day — $2,250/month just for input tokens. Reducing input by 30% saves $675/month.
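The arithmetic above is easy to verify with a few lines of Python (prices taken from the table; the workload numbers are the article's example):

```python
def monthly_input_cost(tokens_per_query, queries_per_day, price_per_m_tokens, days=30):
    """Monthly input-token cost in dollars for a fixed daily workload."""
    daily = tokens_per_query * queries_per_day * price_per_m_tokens / 1_000_000
    return daily * days

# Opus input pricing: $15 per 1M tokens
baseline = monthly_input_cost(50_000, 100, 15.0)
print(f"Opus baseline: ${baseline:,.2f}/month")          # $2,250.00/month
print(f"30% reduction saves: ${baseline * 0.30:,.2f}")   # $675.00
```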

Strategy 1: Token compression

Token compression removes filler words, redundant phrases, and unnecessary formatting without changing the meaning. CleanMyPrompt's "compress text for Claude" tool does this automatically.

Before compression (87 tokens):

I would like you to please analyze the following customer support ticket and provide me with a detailed summary of the main issues that the customer is experiencing, along with any potential solutions that you might be able to suggest based on the information provided.

After compression (42 tokens):

Analyze this support ticket. Summarize main issues and suggest solutions based on the information.

Same meaning, 52% fewer tokens. Over thousands of queries, this adds up.
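A naive version of this idea can be sketched as a filler-phrase stripper. The phrase list below is illustrative only; a real compressor (like the tool above) uses far more sophisticated, context-aware rules:

```python
import re

# Illustrative filler phrases; a production compressor uses a much larger list.
FILLERS = [
    r"\bI would like you to\b",
    r"\bplease\b",
    r"\bprovide me with\b",
    r"\bthat you might be able to\b",
]

def compress(prompt: str) -> str:
    """Strip common filler phrases, then collapse leftover whitespace."""
    for pattern in FILLERS:
        prompt = re.sub(pattern, "", prompt, flags=re.IGNORECASE)
    return re.sub(r"\s+", " ", prompt).strip()

print(compress("I would like you to please analyze the ticket."))
# "analyze the ticket."
```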

Strategy 2: Structured input formatting

Claude processes structured data more efficiently than natural language blocks. Convert your input from prose to structure:

Instead of:

The customer's name is John Smith and they live at 123 Main St. They purchased order #45678 on March 15 and the total was $299.99. They're complaining about a damaged item.

Use:

Customer: [REDACTED]
Order: #45678
Date: 2026-03-15
Amount: $299.99
Issue: Damaged item received
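If your data already lives in a record or database row, the key/value layout above can be generated rather than hand-written. A minimal sketch (the field names are the example's, not a required schema):

```python
def to_structured(fields: dict) -> str:
    """Render key/value fields as compact 'Key: value' lines."""
    return "\n".join(f"{key}: {value}" for key, value in fields.items())

ticket = {
    "Customer": "[REDACTED]",
    "Order": "#45678",
    "Date": "2026-03-15",
    "Amount": "$299.99",
    "Issue": "Damaged item received",
}
print(to_structured(ticket))
```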

Benefits:

  - Fewer tokens than the equivalent prose
  - No ambiguity about which value belongs to which field
  - Easier for Claude to reference specific fields in its response

Strategy 3: System prompt optimization

Your system prompt is sent with every request. If it's 500 tokens, that's 500 tokens billed per call. Compress it aggressively:

Bloated system prompt (180 tokens):

You are a helpful customer support assistant working for our company. Your job is to analyze customer inquiries and provide helpful, empathetic, and accurate responses. You should always be polite and professional. If you don't know the answer, say so. Always try to resolve the issue in one response if possible.

Optimized (45 tokens):

Customer support assistant. Analyze inquiries, provide empathetic and accurate responses. Resolve in one response when possible. Say "I don't know" if uncertain.

Same behavior, 75% fewer tokens per request.

Strategy 4: PII redaction as token reduction

Personal information takes tokens. Names, addresses, email addresses, and phone numbers contribute to token count without adding analytical value. Redacting PII serves double duty — privacy protection AND cost reduction.

Consider a customer support ticket:

From: john.smith.enterprise@longcompanyname.com
Subject: Issue with order #ORD-2026-03-15-78456-SMITH

Dear Support Team,

My name is Jonathan Alexander Smith III and I'm writing from 4521 West Magnolia Boulevard, Suite 302, Burbank, California 91505. My phone number is (818) 555-0147 and my secondary email is j.smith.personal@anotherlongdomain.com...

After redaction:

From: [EMAIL]
Subject: Issue with order [ORDER-ID]

Dear Support Team,

My name is [NAME] and I'm writing from [ADDRESS]. My phone number is [PHONE] and my secondary email is [EMAIL]...

The redacted version is both safer and cheaper to process.
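Emails and phone numbers follow predictable formats, so a regex pass catches them; the patterns below are illustrative and intentionally narrow. Names and street addresses don't follow fixed patterns and generally need entity recognition, which is why a dedicated redaction tool is the safer choice:

```python
import re

# Illustrative patterns only; production redaction needs much broader coverage,
# and names/addresses require NER rather than regexes.
PATTERNS = [
    (r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+", "[EMAIL]"),
    (r"\(\d{3}\)\s*\d{3}-\d{4}", "[PHONE]"),
]

def redact(text: str) -> str:
    """Replace matched PII spans with placeholder tokens."""
    for pattern, token in PATTERNS:
        text = re.sub(pattern, token, text)
    return text

print(redact("Reach me at (818) 555-0147 or j.smith@example.com."))
# "Reach me at [PHONE] or [EMAIL]."
```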

Strategy 5: Context window management

Claude's 200K context window is generous, but not every token is equal. Research shows that information in the middle of long contexts gets less attention than information at the beginning or end.

Best practices:

  1. Put instructions first: System prompt → task description → data
  2. Put critical data at the beginning and end: Claude attends most to the edges
  3. Chunk long documents: Instead of pasting a 100-page report, extract the relevant sections first
  4. Use CleanMyPrompt's formatting tools: The PDF cleaner removes page numbers, headers, and formatting artifacts that waste context space
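Point 3 above (chunking) is straightforward to sketch: split on paragraph boundaries and pack paragraphs into chunks under a size limit. This version uses characters as a proxy for tokens, which is an approximation:

```python
def chunk(text: str, max_chars: int = 4000) -> list:
    """Split text on paragraph boundaries, keeping each chunk under max_chars."""
    chunks, current = [], ""
    for para in text.split("\n\n"):
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)       # current chunk is full; start a new one
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks
```

You would then send only the chunks relevant to the question, rather than the whole report.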

Strategy 6: Model selection

Not every query needs Opus. Use a tiered approach:

  1. Haiku for routine tasks: classification, extraction, short summaries
  2. Sonnet for moderate complexity: multi-step analysis, drafting, rewriting
  3. Opus for the hardest tasks: nuanced reasoning over long or ambiguous inputs

Many teams default to the most powerful model. Running your queries through Haiku first and only escalating complex ones to Sonnet/Opus can reduce costs by 80%.
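A tiered router can be as simple as a heuristic that escalates only when a task looks complex. Everything here is a hypothetical sketch: the keyword list, the thresholds, and the shortened model labels are illustrative, not official guidance (real API calls use full versioned model IDs):

```python
# Hypothetical complexity hints; tune these for your own workload.
COMPLEX_HINTS = ("analyze", "reason", "multi-step", "legal", "compare")

def pick_model(prompt: str) -> str:
    """Route a prompt to the cheapest model that plausibly handles it."""
    text = prompt.lower()
    hits = sum(hint in text for hint in COMPLEX_HINTS)
    if len(text) > 8000 or hits >= 2:
        return "claude-3-opus"        # long or clearly complex: top tier
    if hits == 1:
        return "claude-3-5-sonnet"    # some complexity: middle tier
    return "claude-3-haiku"           # routine: cheapest tier

print(pick_model("Classify this ticket as billing or technical."))
# claude-3-haiku
```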

Putting it all together

The optimal Claude workflow:

  1. Clean the text: Remove formatting noise with CleanMyPrompt
  2. Redact PII: Strip personal data to reduce tokens and protect privacy
  3. Compress: Use Squeeze Mode to eliminate filler words
  4. Structure: Convert prose to structured format
  5. Choose the right model: Match complexity to capability
  6. Send: Paste the optimized prompt into Claude

This pipeline reduces input tokens by 30-50% while improving response quality (cleaner input = more focused output).
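The pipeline composes naturally as a chain of functions. The helpers below are minimal stand-ins for steps 1-3 (CleanMyPrompt's actual tools do far more); the point is the ordering, not the implementations:

```python
import re

def clean_text(s: str) -> str:
    """Step 1 stand-in: collapse whitespace and formatting noise."""
    return re.sub(r"\s+", " ", s).strip()

def redact_pii(s: str) -> str:
    """Step 2 stand-in: redact emails only (real redaction covers much more)."""
    return re.sub(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+", "[EMAIL]", s)

def compress(s: str) -> str:
    """Step 3 stand-in: drop a couple of filler words."""
    return re.sub(r"\b(please|kindly)\b\s*", "", s, flags=re.IGNORECASE)

def optimize(raw: str) -> str:
    """Run the cleanup steps in pipeline order."""
    for step in (clean_text, redact_pii, compress):
        raw = step(raw)
    return raw

print(optimize("Please   review a@b.com's ticket."))
# "review [EMAIL]'s ticket."
```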

Try it

Paste a real prompt into the Claude token compressor and see how many tokens you save. Everything runs in your browser — nothing is sent to any server.