Rate Limits by Model
costa/enterprise-coder-v1
Standard Coding Model
Enterprise Tier:
- Requests: 5,000 per minute
- Tokens: 1,000,000 per minute
- Concurrent requests: 100

- Requests: 500 per minute
- Tokens: 200,000 per minute
- Concurrent requests: 20
costa/secure-claude-3-5-sonnet
Advanced Reasoning Model
Enterprise Tier:
- Requests: 2,000 per minute
- Tokens: 800,000 per minute
- Concurrent requests: 50

- Requests: 200 per minute
- Tokens: 100,000 per minute
- Concurrent requests: 10
costa/enterprise-gpt-4-turbo
Fast General Purpose Model
Enterprise Tier:
- Requests: 8,000 per minute
- Tokens: 1,500,000 per minute
- Concurrent requests: 120

- Requests: 800 per minute
- Tokens: 300,000 per minute
- Concurrent requests: 25
costa/compliance-assistant
Compliance Specialist Model
Enterprise Tier:
- Requests: 1,000 per minute
- Tokens: 500,000 per minute
- Concurrent requests: 30

- Requests: 100 per minute
- Tokens: 50,000 per minute
- Concurrent requests: 5
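The per-minute limits above can also be enforced on the client before a request is sent, so that most throttling happens locally. The following is a minimal sketch only: `ModelLimits` and `MinuteWindowBudget` are hypothetical helpers, not part of any Costa SDK, and the trailing 60-second window is an assumption, since this page does not say how Costa measures a minute. The concurrent-request limit is recorded but not enforced here.

```python
import time
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelLimits:
    requests_per_minute: int
    tokens_per_minute: int
    max_concurrent: int


# Enterprise Tier values copied from the lists above.
ENTERPRISE_LIMITS = {
    "costa/enterprise-coder-v1": ModelLimits(5_000, 1_000_000, 100),
    "costa/secure-claude-3-5-sonnet": ModelLimits(2_000, 800_000, 50),
    "costa/enterprise-gpt-4-turbo": ModelLimits(8_000, 1_500_000, 120),
    "costa/compliance-assistant": ModelLimits(1_000, 500_000, 30),
}


class MinuteWindowBudget:
    """Tracks requests and tokens spent in the trailing 60 seconds.

    Hypothetical client-side helper; the window model is an assumption.
    """

    def __init__(self, limits, clock=time.monotonic):
        self.limits = limits
        self.clock = clock
        self.events = deque()  # (timestamp, tokens) per admitted request
        self.tokens_in_window = 0

    def _evict(self, now):
        # Drop requests that are at least 60 seconds old.
        while self.events and now - self.events[0][0] >= 60.0:
            _, tokens = self.events.popleft()
            self.tokens_in_window -= tokens

    def try_acquire(self, tokens):
        """Return True and record the request if it fits both budgets."""
        now = self.clock()
        self._evict(now)
        if len(self.events) + 1 > self.limits.requests_per_minute:
            return False
        if self.tokens_in_window + tokens > self.limits.tokens_per_minute:
            return False
        self.events.append((now, tokens))
        self.tokens_in_window += tokens
        return True
```

A caller would create one budget per model, for example `MinuteWindowBudget(ENTERPRISE_LIMITS["costa/enterprise-coder-v1"])`, and delay or queue a request whenever `try_acquire` returns `False`.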
Rate Limit Headers
Costa returns standard rate limit headers with every API response.
Error Responses
When rate limits are exceeded, Costa returns a 429 Too Many Requests status.
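A client typically reacts to a 429 by waiting and retrying. The sketch below shows one common approach, exponential backoff with jitter, under stated assumptions: `send` is a hypothetical callable that performs the request and returns `(status_code, headers, body)`, and the response may carry the standard HTTP `Retry-After` header in seconds. This page does not confirm that Costa sends `Retry-After`, so the code falls back to backoff when it is absent.

```python
import random
import time


def retry_delay(headers, attempt, base=1.0, cap=60.0):
    """Seconds to wait before retrying after failed attempt `attempt` (0-based).

    `headers` is assumed to be a plain mapping keyed by canonical header names.
    """
    retry_after = headers.get("Retry-After")
    if retry_after is not None:
        try:
            return min(cap, max(0.0, float(retry_after)))
        except ValueError:
            pass  # Retry-After can also be an HTTP date; fall back to backoff
    # Exponential backoff with full jitter, capped at `cap` seconds.
    return random.uniform(0.0, min(cap, base * 2 ** attempt))


def call_with_retry(send, max_attempts=5, sleep=time.sleep):
    """Call `send()` until it returns a non-429 status or attempts run out.

    Returns the last (status_code, headers, body) tuple received.
    """
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429:
            return status, headers, body
        if attempt < max_attempts - 1:
            sleep(retry_delay(headers, attempt))
    return status, headers, body
```

The jitter spreads retries from many clients over time, which avoids all of them hitting the limit again at the same instant.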
Rate Limit Optimization
Token Optimization Strategies
Input Optimization
Reduce Input Tokens
• Remove unnecessary whitespace and comments
• Use concise, specific prompts
• Exclude irrelevant code context
• Summarize large code blocks
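The first point in the list above can be automated before code context is added to a prompt. The function below is a deliberately naive sketch: `compact_code_context` is a hypothetical helper, it only drops blank lines, trailing whitespace, and full-line comments, and it does not parse the language, so it would also drop shebang lines or comment-like lines inside multi-line strings.

```python
def compact_code_context(source: str, comment_prefix: str = "#") -> str:
    """Drop blank lines, trailing whitespace, and full-line comments.

    Indentation and inline comments are left untouched.
    """
    kept = []
    for line in source.splitlines():
        stripped = line.rstrip()
        if not stripped.strip():
            continue  # blank or whitespace-only line
        if stripped.lstrip().startswith(comment_prefix):
            continue  # full-line comment
        kept.append(stripped)
    return "\n".join(kept)
```

For languages with `//` comments, pass `comment_prefix="//"`. Removing inline comments safely would require a real tokenizer for the language in question.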
Output Optimization
Control Output Tokens
• Set appropriate max_tokens limits
• Use specific instructions for concise responses
• Request code snippets instead of full files
• Use streaming for real-time applications
Enterprise Features
Dedicated Rate Limits
Enterprise customers can request dedicated rate limit pools:
Team Isolation
Separate Limits per Team
• Independent rate limits for each development team
• Prevent one team from affecting others
• Custom limits based on team size and usage
Project Allocation
Project-Specific Limits
• Allocate rate limits to specific projects
• Priority queuing for critical applications
• Burst capacity for deployment periods
Rate Limit Monitoring
Dashboard Monitoring
Real-time Usage Tracking
• Live rate limit consumption graphs
• Historical usage patterns
• Team and project breakdowns
• Alert thresholds and notifications
API Monitoring
Programmatic Monitoring
• Rate limit usage API endpoints
• Webhook notifications when usage approaches a limit
• Custom alerting integrations
• Usage forecasting and planning
Support
Rate Limit Issues
Contact support for rate limit increases or technical issues
Enterprise Sales
Discuss custom rate limits and dedicated infrastructure options
Rate Limit Increases: Enterprise customers can request rate limit increases based on legitimate business needs. Contact our support team with your use case details.