Testing Methodology
How we measure and monitor LLM latency across providers
Martian Status runs comprehensive latency tests against multiple LLM providers to help you understand real-world performance characteristics. The methodology is designed to simulate different usage patterns and provide actionable insights.
Test Types
Short → Short: Short input prompt (~100 chars) → short output (256 chars). Simulates quick Q&A, simple commands, or brief interactions.
Short → Long: Short input prompt (~100 chars) → long output (6,400 chars). Simulates content generation, story writing, or detailed explanations.
Long → Short: Long input prompt (~3,000 chars) → short output (256 chars). Simulates summarization, extraction, or analysis tasks.
Long → Long: Long input prompt (~3,000 chars) → long output (6,400 chars). Simulates document processing, code generation, or comprehensive analysis.
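The four test types above form a simple matrix of input and output sizes. A minimal sketch of that configuration (the key names are illustrative, not the production identifiers):

```python
# The four test configurations described above, keyed by an illustrative name.
# Sizes are the character counts stated in the methodology.
TEST_TYPES = {
    "short_short": {"input_chars": 100,  "output_chars": 256},
    "short_long":  {"input_chars": 100,  "output_chars": 6400},
    "long_short":  {"input_chars": 3000, "output_chars": 256},
    "long_long":   {"input_chars": 3000, "output_chars": 6400},
}
```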
Client Types
Direct HTTP Requests
Raw HTTP POST requests to provider endpoints. Tests both /chat/completions (OpenAI-compatible) and /messages (Anthropic) endpoints.
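The two endpoint styles differ mainly in path and authentication headers. A hedged sketch of how a raw request might be assembled (base URLs, model names, and the exact production payloads are not shown in this doc; the auth conventions below follow each API's public documentation):

```python
# Endpoint shapes exercised by the direct-HTTP client.
ENDPOINTS = {
    "chat":     {"path": "/chat/completions", "auth_header": "Authorization"},  # OpenAI-compatible
    "messages": {"path": "/messages",         "auth_header": "x-api-key"},      # Anthropic
}

def build_request(style: str, model: str, prompt: str, max_tokens: int, api_key: str):
    """Return (path, headers, json_body) for a raw POST in the given style."""
    spec = ENDPOINTS[style]
    headers = {"content-type": "application/json"}
    if spec["auth_header"] == "Authorization":
        headers["Authorization"] = f"Bearer {api_key}"
    else:
        headers["x-api-key"] = api_key
        headers["anthropic-version"] = "2023-06-01"  # required by the /messages API
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return spec["path"], headers, body
```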
Official SDKs
Uses official OpenAI and Anthropic SDKs to test through their native client libraries.
Streaming Modes
Streaming Response
Measures time to receive the complete streamed response. Simulates real-time applications where tokens are processed as they arrive.
Batch Response
Measures time to receive the complete response in one batch. Simulates batch processing or when streaming isn't needed.
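The two modes call for slightly different timing: batch mode needs one end-to-end measurement, while streaming can also record time-to-first-token. A minimal sketch of both (the function names are illustrative):

```python
import time

def time_batch(call):
    """Time a blocking call that returns the full response in one batch."""
    start = time.perf_counter()
    response = call()
    return response, time.perf_counter() - start

def time_stream(stream):
    """Consume a token iterator, recording time-to-first-token and total time."""
    start = time.perf_counter()
    first_token_at = None
    chunks = []
    for chunk in stream:
        if first_token_at is None:
            first_token_at = time.perf_counter() - start
        chunks.append(chunk)
    total = time.perf_counter() - start
    return "".join(chunks), first_token_at, total
```

For example, `time_stream(iter(["Hel", "lo"]))` returns the joined text `"Hello"` along with the two timings.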
Latency Measurement
Latency is measured from request initiation to response completion:
Direct Providers
Martian Proxies
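Whichever path a request takes (direct to a provider or through a Martian proxy), the measurement window is the same: start the clock at request initiation and stop it at response completion. A generic sketch:

```python
import time

def measure(call):
    """Measure wall-clock latency from request initiation to response completion.

    `call` is any zero-argument function that performs the request; only the
    endpoint inside it differs between direct-provider and proxy tests.
    """
    start = time.perf_counter()
    result = call()
    latency_ms = (time.perf_counter() - start) * 1000
    return result, latency_ms
```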
Automated Testing
Tests run automatically every 10 minutes via cron. Each cycle executes every configured model/provider/client/stream/test-type combination in parallel, minimizing total execution time and capturing a point-in-time snapshot of performance across all configurations.
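A test cycle is effectively a Cartesian product of the configuration axes, fanned out concurrently. A sketch under assumed (illustrative) configuration values:

```python
import itertools
from concurrent.futures import ThreadPoolExecutor

# Illustrative configuration axes; the real lists are not shown in this doc.
MODELS = ["model-a"]
PROVIDERS = ["provider-x"]
CLIENTS = ["http", "sdk"]
STREAM_MODES = [True, False]
TEST_TYPES = ["short_short", "short_long", "long_short", "long_long"]

def run_cycle(run_test):
    """Execute every model/provider/client/stream/test-type combination in parallel."""
    combos = list(itertools.product(MODELS, PROVIDERS, CLIENTS, STREAM_MODES, TEST_TYPES))
    with ThreadPoolExecutor(max_workers=len(combos)) as pool:
        return list(pool.map(lambda combo: run_test(*combo), combos))
```

With the axes above, one cycle runs 1 × 1 × 2 × 2 × 4 = 16 combinations.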
Error Handling
Tests capture and categorize different failure modes:
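The specific failure taxonomy is not listed here, but the general shape is mapping a raised exception to a coarse category so failures can be aggregated. A hedged sketch with assumed category names:

```python
def categorize_failure(exc: BaseException) -> str:
    """Map an exception raised during a test to a coarse failure category.

    The categories below are illustrative assumptions, not the real taxonomy.
    """
    if isinstance(exc, TimeoutError):
        return "timeout"
    if isinstance(exc, ConnectionError):
        return "connection_error"
    return "other"
```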
Data Storage
Results are stored in a PostgreSQL database, with data older than 30 days automatically deleted. This keeps a rolling 30-day performance history while bounding storage growth.