InferShrink
Cut your LLM costs by 80%+ with one line of code
How It Works
Intelligent routing without changing your workflow
01 Classify
Rule-based complexity scoring analyzes your prompts instantly to determine task difficulty.
02 Route
Simple tasks are routed to cheaper models automatically: gpt-4o → gpt-4o-mini, seamlessly.
03 Track
See your savings in real-time. Every request is logged with cost comparison metrics.
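The classify → route flow above can be sketched as a rule-based scorer feeding a same-provider downgrade map. The heuristics, weights, threshold, and the `DOWNGRADE` table below are illustrative assumptions, not InferShrink's actual rules:

```python
# Hypothetical sketch of rule-based complexity scoring and same-provider
# routing. All heuristics and model pairs here are illustrative assumptions.

DOWNGRADE = {
    "gpt-4o": "gpt-4o-mini",
    "claude-opus-4-6": "claude-sonnet-4-5",
    "gemini-2.5-pro": "gemini-2.5-flash",
}

def complexity_score(prompt: str) -> float:
    """Score a prompt from 0.0 (trivial) to 1.0 (complex) using cheap surface features."""
    score = 0.0
    if len(prompt) > 500:          # long prompts tend to be harder
        score += 0.4
    if "```" in prompt:            # embedded code suggests a reasoning task
        score += 0.3
    hard_words = ("prove", "refactor", "analyze", "step by step")
    if any(w in prompt.lower() for w in hard_words):
        score += 0.6               # reasoning keywords dominate the score
    return min(score, 1.0)

def route(model: str, prompt: str, threshold: float = 0.5) -> str:
    """Return a cheaper same-provider model for simple prompts, else the original."""
    if complexity_score(prompt) < threshold:
        return DOWNGRADE.get(model, model)
    return model
```

A trivial prompt like "What is 2+2?" scores 0.0 and is downgraded, while a prompt containing reasoning keywords crosses the threshold and stays on the original model; unknown models pass through unchanged.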
Drop-in Replacement
Works with your existing OpenAI and Anthropic clients
OpenAI

```python
import openai
from infershrink import optimize

client = optimize(openai.Client())

# Use exactly as before — InferShrink handles the rest
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What is 2+2?"}],
)
# Simple question → routed to gpt-4o-mini (95% cheaper)
# Complex tasks stay on gpt-4o automatically
```

Anthropic

```python
import anthropic
from infershrink import optimize

client = optimize(anthropic.Anthropic())

# claude-opus → claude-sonnet for simple tasks
response = client.messages.create(
    model="claude-opus-4-6",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello world"}],
)
```

Google

```python
from openai import OpenAI
from infershrink import optimize

# Gemini via OpenAI-compatible endpoint
client = optimize(OpenAI(
    api_key="your-gemini-key",
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/",
))

response = client.chat.completions.create(
    model="gemini-2.5-pro",
    messages=[{"role": "user", "content": "What is 2+2?"}],
)
# Simple → routed to gemini-2.5-flash (90%+ cheaper)
```
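The "cheaper" percentages in the examples come from per-token price ratios, which is also what the cost-comparison logging in step 03 (Track) boils down to. A minimal sketch of that arithmetic, assuming per-million-token prices (the figures and function names below are illustrative placeholders, not InferShrink's pricing tables):

```python
# Sketch of per-request cost comparison. Prices are illustrative
# placeholders in USD per 1M tokens, not authoritative list prices.
PRICE_PER_1M = {
    "gpt-4o":      {"in": 2.50, "out": 10.00},
    "gpt-4o-mini": {"in": 0.15, "out": 0.60},
}

def request_cost(model: str, tokens_in: int, tokens_out: int) -> float:
    """Cost of one request in USD given input/output token counts."""
    p = PRICE_PER_1M[model]
    return (tokens_in * p["in"] + tokens_out * p["out"]) / 1_000_000

def savings(original: str, routed: str, tokens_in: int, tokens_out: int):
    """Return (actual_cost, saved, percent_saved) for one routed request."""
    baseline = request_cost(original, tokens_in, tokens_out)
    actual = request_cost(routed, tokens_in, tokens_out)
    return actual, baseline - actual, 100 * (1 - actual / baseline)
```

With these placeholder prices, routing a 1,000-in / 1,000-out token request from gpt-4o to gpt-4o-mini saves roughly 94% of the cost, which is the kind of per-request metric step 03 logs.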
Features
Zero dependencies (core)
Same-provider routing
Streaming support
OpenAI + Anthropic + Google
CLI included
582 tests, CI/CD
Pricing
14-day free trial on all plans. No credit card required.
| | Pro $19/mo | Team $49/mo |
|---|---|---|
| Requests/mo | 50,000 | 500,000 |
| Model routing | ✅ | ✅ |
| Compression | ✅ | ✅ |
| Retrieval | ✅ | ✅ |
| Deduplication | ✅ | ✅ |
| Support | Priority | Priority + team management |
| | Start 14-Day Trial → | Start 14-Day Trial → |
Get in Touch
Questions, enterprise needs, or just want to chat about LLM costs?