Source Repository
This documentation is from amiable-dev/litellm-langfuse-railway.
Last synced: 2026-01-03 | Commit: 5a45454
# LiteLLM + Langfuse: LLM Gateway with Full Observability
A production-ready LLM gateway that provides a unified API for 100+ LLM providers with full observability, cost tracking, and rate limiting.
## 🎯 What You Get

```
┌──────────────────────────────────────────────────────────────────────┐
│                          Your Applications                           │
│         (Any app using OpenAI SDK format - Python, JS, etc.)         │
└──────────────────────────────────┬───────────────────────────────────┘
                                   │
                                   ▼
┌──────────────────────────────────────────────────────────────────────┐
│                            LiteLLM Proxy                             │
│  • Unified OpenAI-compatible API                                     │
│  • 100+ LLM providers (OpenAI, Claude, Gemini, Bedrock, etc.)        │
│  • Virtual keys with budgets                                         │
│  • Rate limiting & load balancing                                    │
│  • Cost tracking per key/team                                        │
│  • Automatic fallbacks                                               │
└──────────────────────────────────┬───────────────────────────────────┘
                                   │
                                   ▼
┌──────────────────────────────────────────────────────────────────────┐
│                               Langfuse                               │
│  • Full trace visibility                                             │
│  • Token usage & cost analytics                                      │
│  • Prompt management & versioning                                    │
│  • Evaluation pipelines                                              │
│  • Team collaboration                                                │
└──────────────────────────────────────────────────────────────────────┘
```
## 🏗️ Architecture
| Service | Purpose | Port |
|---|---|---|
| LiteLLM | LLM Gateway/Proxy | 4000 |
| Langfuse Web | Observability UI & API | 3000 |
| Langfuse Worker | Async trace processing | 3030 |
| PostgreSQL | Transactional data | 5432 |
| ClickHouse | Analytics (traces, scores) | 8123/9000 |
| Redis | Caching & queues | 6379 |
| MinIO | Object storage (S3-compatible) | 9000 |
## 🚀 Quick Start

### 1. Deploy to Railway
Click the button above, or use the Railway CLI.
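A minimal sketch of the CLI flow, assuming you have the Railway CLI installed and have already created the project from this template:

```bash
# Log in, link your local checkout to the Railway project, then deploy
railway login
railway link   # select the project created from the template
railway up     # build and deploy the current service
```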
### 2. Get Your Endpoints

After deployment, you'll have two public URLs:

- LiteLLM: `https://litellm-xxx.up.railway.app`
- Langfuse: `https://langfuse-web-xxx.up.railway.app`
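To confirm both services are up, you can hit their health endpoints (the paths below are the ones recent LiteLLM and Langfuse versions expose; adjust if yours differ):

```bash
# LiteLLM readiness probe
curl https://litellm-xxx.up.railway.app/health/readiness

# Langfuse health check
curl https://langfuse-web-xxx.up.railway.app/api/public/health
```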
### 3. Configure LiteLLM with Your API Keys

Access the LiteLLM Admin UI:

- URL: `https://litellm-xxx.up.railway.app/ui`
- Username: `admin`
- Password: from the `LITELLM_MASTER_KEY` or `UI_PASSWORD` env var
Add your LLM provider keys via the UI or API:
```bash
# Add OpenAI
curl -X POST 'https://litellm-xxx.up.railway.app/model/new' \
  -H 'Authorization: Bearer YOUR_LITELLM_MASTER_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "model_name": "gpt-4o",
    "litellm_params": {
      "model": "openai/gpt-4o",
      "api_key": "sk-YOUR_OPENAI_KEY"
    }
  }'

# Add Claude
curl -X POST 'https://litellm-xxx.up.railway.app/model/new' \
  -H 'Authorization: Bearer YOUR_LITELLM_MASTER_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "model_name": "claude-sonnet",
    "litellm_params": {
      "model": "anthropic/claude-sonnet-4-20250514",
      "api_key": "sk-ant-YOUR_ANTHROPIC_KEY"
    }
  }'
```
### 4. Connect Langfuse to LiteLLM

Get your Langfuse API keys from the Langfuse UI:

1. Open `https://langfuse-web-xxx.up.railway.app`
2. Create an account and project
3. Go to Settings → API Keys
4. Copy the public and secret keys
Update the LiteLLM environment variables in Railway.
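The variables to set are the three tracing entries from the configuration table below; for example, on the LiteLLM service's Variables tab in Railway:

```bash
# Values come from Langfuse → Settings → API Keys
LANGFUSE_PUBLIC_KEY=pk-lf-xxx
LANGFUSE_SECRET_KEY=sk-lf-xxx
LANGFUSE_HOST=https://langfuse-web-xxx.up.railway.app
```

Redeploy the LiteLLM service after saving so the new variables take effect.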
### 5. Start Using It!
```python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_LITELLM_MASTER_KEY",  # or a virtual key
    base_url="https://litellm-xxx.up.railway.app",
)

response = client.chat.completions.create(
    model="gpt-4o",  # or "claude-sonnet", etc.
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```
All requests are automatically traced in Langfuse! 🎉
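You can also attach per-request metadata that LiteLLM forwards to the Langfuse trace (user, session, tags); a sketch using the OpenAI SDK's `extra_body` pass-through:

```python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_LITELLM_MASTER_KEY",
    base_url="https://litellm-xxx.up.railway.app",
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
    # LiteLLM forwards this metadata object to Langfuse
    extra_body={
        "metadata": {
            "trace_user_id": "user-123",
            "session_id": "session-abc",
            "tags": ["production"],
        }
    },
)
```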
## 💰 Cost Tracking & Budgets

### Create Virtual Keys with Budgets
```bash
# Create a key with a $100/month budget
curl -X POST 'https://litellm-xxx.up.railway.app/key/generate' \
  -H 'Authorization: Bearer YOUR_LITELLM_MASTER_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "models": ["gpt-4o", "claude-sonnet"],
    "max_budget": 100,
    "budget_duration": "1mo",
    "metadata": {"team": "engineering"}
  }'
```
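The response includes the generated key in a `key` field (prefixed `sk-`); hand that to applications instead of the master key, and spend counts against its budget:

```bash
# Call the gateway with the virtual key instead of the master key
curl -X POST 'https://litellm-xxx.up.railway.app/chat/completions' \
  -H 'Authorization: Bearer sk-GENERATED_VIRTUAL_KEY' \
  -H 'Content-Type: application/json' \
  -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello!"}]}'
```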
### Track Spending
```bash
# Get spend by key
curl 'https://litellm-xxx.up.railway.app/spend/keys' \
  -H 'Authorization: Bearer YOUR_LITELLM_MASTER_KEY'

# Get spend by model
curl 'https://litellm-xxx.up.railway.app/spend/models' \
  -H 'Authorization: Bearer YOUR_LITELLM_MASTER_KEY'
```
## 📊 Observability with Langfuse

### View Traces

Open the Langfuse UI to see:

- Every LLM request with full context
- Token usage and costs
- Latency metrics
- Error rates
- User sessions
### Prompt Management

- Create prompt templates in the Langfuse UI
- Version and A/B test prompts
- Fetch prompts via the API:
```python
from langfuse import Langfuse

langfuse = Langfuse(
    public_key="pk-lf-xxx",
    secret_key="sk-lf-xxx",
    host="https://langfuse-web-xxx.up.railway.app",
)

prompt = langfuse.get_prompt("my-prompt-template")
compiled = prompt.compile(variable="value")
```
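A compiled prompt can be sent straight through the gateway. A sketch tying the two clients together, assuming `my-prompt-template` is a text prompt with a `{{variable}}` placeholder:

```python
from langfuse import Langfuse
from openai import OpenAI

langfuse = Langfuse(
    public_key="pk-lf-xxx",
    secret_key="sk-lf-xxx",
    host="https://langfuse-web-xxx.up.railway.app",
)
client = OpenAI(
    api_key="YOUR_LITELLM_MASTER_KEY",
    base_url="https://litellm-xxx.up.railway.app",
)

# Fetch the template from Langfuse and fill in its variables
prompt = langfuse.get_prompt("my-prompt-template")
compiled = prompt.compile(variable="value")

# Send the compiled prompt through the LiteLLM gateway
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": compiled}],
)
print(response.choices[0].message.content)
```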
### Evaluations

Run evaluations (e.g., LLM-as-judge) and attach the results as scores on traces:
```python
from langfuse import Langfuse

langfuse = Langfuse(...)

# Score a trace
langfuse.score(
    trace_id="xxx",
    name="helpfulness",
    value=0.9,
    comment="Response was helpful",
)
```
## 🔧 Configuration

### LiteLLM Environment Variables

| Variable | Description | Required |
|---|---|---|
| `LITELLM_MASTER_KEY` | Admin API key (starts with `sk-`) | Yes |
| `LITELLM_SALT_KEY` | Encryption key for stored credentials | Yes |
| `DATABASE_URL` | PostgreSQL connection string | Yes |
| `LANGFUSE_PUBLIC_KEY` | Langfuse public key | For tracing |
| `LANGFUSE_SECRET_KEY` | Langfuse secret key | For tracing |
| `LANGFUSE_HOST` | Langfuse URL | For tracing |
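If you manage models through a config file instead of the `/model/new` API, tracing is enabled via LiteLLM's `success_callback` setting; a minimal `config.yaml` sketch (the model entry mirrors the Quick Start example):

```yaml
model_list:
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o
      # LiteLLM's os.environ/ syntax reads the key from the environment
      api_key: os.environ/OPENAI_API_KEY

litellm_settings:
  # Send every successful request to Langfuse; credentials are read from
  # LANGFUSE_PUBLIC_KEY / LANGFUSE_SECRET_KEY / LANGFUSE_HOST
  success_callback: ["langfuse"]
```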
### Langfuse Environment Variables

| Variable | Description | Required |
|---|---|---|
| `NEXTAUTH_SECRET` | Session encryption | Yes |
| `SALT` | Data encryption salt | Yes |
| `ENCRYPTION_KEY` | 32-byte hex encryption key | Yes |
| `DATABASE_URL` | PostgreSQL connection string | Yes |
| `CLICKHOUSE_URL` | ClickHouse HTTP URL | Yes |
| `REDIS_HOST` | Redis hostname | Yes |
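These secrets can be generated locally with `openssl`; the one-liners below match what the Langfuse self-hosting docs suggest (lengths per the table above):

```bash
openssl rand -base64 32   # NEXTAUTH_SECRET
openssl rand -base64 32   # SALT
openssl rand -hex 32      # ENCRYPTION_KEY (32 bytes = 64 hex chars)
```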
## 📈 Scaling

### Horizontal Scaling
For high-throughput scenarios:
- LiteLLM: Add more replicas via Railway settings
- Langfuse Worker: Scale workers for faster trace processing
- Redis: Consider Railway Redis add-on for HA
### Recommended Resources
| Load Level | LiteLLM | Langfuse | PostgreSQL | ClickHouse |
|---|---|---|---|---|
| Low (<100 req/min) | 512MB | 512MB | 256MB | 512MB |
| Medium (<1k req/min) | 1GB | 1GB | 512MB | 1GB |
| High (<10k req/min) | 2GB | 2GB | 1GB | 2GB |
## 🔐 Security Best Practices

- **Rotate keys regularly**: Generate a new `LITELLM_MASTER_KEY` periodically
- **Use virtual keys**: Don't expose the master key to applications
- **Set budgets**: Prevent runaway costs with key budgets
- **Enable RBAC**: Use Langfuse teams for access control
- **Audit logs**: Review Langfuse traces for anomalies
## 🛠️ Troubleshooting

### LiteLLM not connecting to models
```bash
# Test model connection
curl -X POST 'https://litellm-xxx.up.railway.app/chat/completions' \
  -H 'Authorization: Bearer YOUR_KEY' \
  -H 'Content-Type: application/json' \
  -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "test"}]}'
```
Check:

- API keys are correct in the LiteLLM model config
- The model name matches what you configured
### Traces not appearing in Langfuse

- Verify the Langfuse keys are set in LiteLLM
- Check the Langfuse worker logs: `railway logs -s langfuse-worker`
- Ensure Redis is healthy: `railway logs -s redis`
### ClickHouse migrations failing

```bash
# Check ClickHouse logs
railway logs -s clickhouse

# Verify connection
railway run -s langfuse-web -- wget -qO- http://clickhouse:8123/ping
```
## 💵 Estimated Costs

| Component | Railway Usage | Est. Cost/Month |
|---|---|---|
| LiteLLM | Compute | ~$5-15 |
| Langfuse Web | Compute | ~$5-10 |
| Langfuse Worker | Compute | ~$3-8 |
| PostgreSQL | Compute + Storage | ~$5-10 |
| ClickHouse | Compute + Storage | ~$5-15 |
| Redis | Compute | ~$3-5 |
| MinIO | Compute + Storage | ~$3-5 |
| **Total** | | **$29-68/month** |
Actual costs depend on usage. Railway charges based on resource consumption.
## 🔗 Resources
## 📝 License

This template combines open-source projects:

- LiteLLM: MIT License
- Langfuse: MIT License (self-hosted)
Built with ❤️ for the AI developer community.
Questions? Open an issue or reach out on Railway Discord.