InferShrink

Cut your LLM costs 80%+ with one line of code

View on PyPI →

How It Works

Intelligent routing without changing your workflow

01 Classify

Rule-based complexity scoring analyzes your prompts instantly to determine task difficulty.
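
As a rough illustration of what rule-based complexity scoring can look like, the sketch below scores a prompt from its length and a few keyword cues. The function name `score_complexity`, the cue list, and the weights are all assumptions made for this example, not InferShrink's actual rules.

```python
import re

# Hypothetical cues that suggest a harder task; not InferShrink's real rule set.
COMPLEX_CUES = ("prove", "refactor", "analyze", "step by step", "architecture")


def score_complexity(prompt: str) -> float:
    """Return a score in [0, 1]; higher means the prompt looks harder."""
    text = prompt.lower()
    words = re.findall(r"\w+", text)
    # Length contributes up to 0.5, saturating at 200 words.
    length_score = min(len(words) / 200, 1.0) * 0.5
    # Each matching cue adds 0.25, capped at 0.5 in total.
    cue_score = min(sum(cue in text for cue in COMPLEX_CUES) * 0.25, 0.5)
    return length_score + cue_score


simple = score_complexity("What is 2+2?")
hard = score_complexity(
    "Refactor this module and analyze the architecture step by step"
)
print(f"simple={simple:.3f} hard={hard:.3f}")
```

Because the score uses only string operations, it runs locally before any API call, which is what makes instant classification plausible.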

02 Route

Simple tasks go to cheaper models from the same provider automatically, for example gpt-4o → gpt-4o-mini.
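
A minimal sketch of the routing step, assuming a complexity score between 0 and 1 is already available. The `route` function, the `CHEAPER_MODEL` table, and the 0.3 threshold are hypothetical; only the model pairs themselves come from the examples on this page.

```python
# Hypothetical same-provider downgrade table; pairs taken from this page's examples.
CHEAPER_MODEL = {
    "gpt-4o": "gpt-4o-mini",
    "gemini-2.5-pro": "gemini-2.5-flash",
}


def route(model: str, complexity: float, threshold: float = 0.3) -> str:
    """Pick the cheaper sibling for simple prompts, else keep the requested model."""
    if complexity < threshold and model in CHEAPER_MODEL:
        return CHEAPER_MODEL[model]
    # Complex prompts and unknown models pass through unchanged.
    return model


print(route("gpt-4o", complexity=0.05))  # simple prompt
print(route("gpt-4o", complexity=0.80))  # complex prompt
```

Keeping the mapping within one provider means the request and response shapes stay the same, so the calling code does not need to change.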

03 Track

See your savings in real time. Every request is logged with the cost of the routed model next to the cost of the model you requested.
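
The cost comparison behind each log entry could be computed along these lines. This is a sketch under stated assumptions: the `RequestLog` class is hypothetical, it counts input tokens only, and the per-token prices are illustrative placeholders rather than quoted rates.

```python
from dataclasses import dataclass

# Illustrative USD prices per 1M input tokens; placeholders, not quoted rates.
PRICE_PER_MTOK = {"gpt-4o": 2.50, "gpt-4o-mini": 0.15}


@dataclass
class RequestLog:
    requested_model: str
    routed_model: str
    input_tokens: int

    def cost(self, model: str) -> float:
        """Cost in USD of sending this request's input tokens to `model`."""
        return self.input_tokens / 1_000_000 * PRICE_PER_MTOK[model]

    @property
    def savings(self) -> float:
        """USD saved versus sending the request to the requested model."""
        return self.cost(self.requested_model) - self.cost(self.routed_model)

    @property
    def savings_pct(self) -> float:
        baseline = self.cost(self.requested_model)
        return 0.0 if baseline == 0 else 100 * self.savings / baseline


log = RequestLog("gpt-4o", "gpt-4o-mini", input_tokens=1_000_000)
print(f"saved ${log.savings:.2f} ({log.savings_pct:.0f}%)")
```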

Drop-in Replacement

Works with your existing OpenAI and Anthropic clients, and with Gemini through its OpenAI-compatible endpoint

OpenAI

import openai
from infershrink import optimize

client = optimize(openai.Client())

# Use exactly as before; InferShrink handles the rest
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What is 2+2?"}],
)
# Simple question → routed to gpt-4o-mini (95% cheaper)
# Complex tasks stay on gpt-4o automatically

Anthropic

import anthropic
from infershrink import optimize

client = optimize(anthropic.Anthropic())

# claude-opus → claude-sonnet for simple tasks
response = client.messages.create(
    model="claude-opus-4-6",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello world"}],
)

Google

from openai import OpenAI
from infershrink import optimize

# Gemini via OpenAI-compatible endpoint
client = optimize(OpenAI(
    api_key="your-gemini-key",
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/",
))

response = client.chat.completions.create(
    model="gemini-2.5-pro",
    messages=[{"role": "user", "content": "What is 2+2?"}],
)
# Simple → routed to gemini-2.5-flash (90%+ cheaper)

Features

Zero dependencies (core)
Same-provider routing
Streaming support
OpenAI + Anthropic + Google
CLI included
582 tests, CI/CD

Pricing

14-day free trial on all plans. No credit card required.

                Pro        Team
Price           $19/mo     $49/mo
Requests/mo     50,000     500,000
Model routing
Compression
Retrieval
Deduplication
Support         Email      Priority + team management

Start 14-Day Trial →

Get in Touch

Questions, enterprise needs, or just want to chat about LLM costs?
