# AI Error Handling Guide

## Error Types

The AI system classifies errors into these categories:

| Error Type | Cause | User Message | Retryable |
|------------|-------|--------------|-----------|
| `rate_limit` | Too many requests | "Rate limit exceeded. Wait a few minutes." | Yes |
| `quota_exceeded` | Billing limit | "API quota exceeded. Check billing." | No |
| `model_not_found` | Invalid model | "Model not available. Check settings." | No |
| `invalid_api_key` | Bad API key | "Invalid API key. Check settings." | No |
| `context_length` | Prompt too large | "Request too large. Try fewer items." | Yes\* |
| `parse_error` | AI returned invalid JSON | "Response parse error. Flagged for review." | Yes |
| `timeout` | Request took too long | "Request timed out. Try again." | Yes |
| `network_error` | Connection issue | "Network error. Check connection." | Yes |
| `content_filter` | Content blocked | "Content filtered. Check input data." | No |
| `server_error` | OpenAI server issue | "Server error. Try again later." | Yes |

\*Context length errors can be retried with smaller batches.

## Error Classification

```typescript
import { classifyAIError, shouldRetry, getRetryDelay } from '@/server/services/ai-errors'

try {
  const response = await openai.chat.completions.create(params)
} catch (error) {
  const classified = classifyAIError(error)
  console.error(`AI Error: ${classified.type} - ${classified.message}`)

  if (shouldRetry(classified.type)) {
    const delay = getRetryDelay(classified.type)
    // Wait and retry
  } else {
    // Fall back to algorithm
  }
}
```

## Graceful Degradation

When AI fails, the platform automatically handles it:

### AI Assignment

1. Logs the error
2. Falls back to algorithmic assignment:
   - Matches by expertise tag overlap
   - Balances workload across jurors
   - Respects constraints (max assignments)

### AI Filtering

1. Logs the error
2. Flags all projects for manual review
3. Returns an error message to the admin

### Award Eligibility

1. Logs the error
2. Returns all projects as "needs manual review"
3. Admin can apply deterministic rules instead

### Mentor Matching

1. Logs the error
2. Falls back to keyword-based matching
3. Uses availability scoring

## Retry Strategy

| Error Type | Retry Count | Delay |
|------------|-------------|-------|
| `rate_limit` | 3 | Exponential (1s, 2s, 4s) |
| `timeout` | 2 | Fixed 5s |
| `network_error` | 3 | Exponential (1s, 2s, 4s) |
| `server_error` | 3 | Exponential (2s, 4s, 8s) |
| `parse_error` | 1 | None |

## Monitoring

### Error Logging

All AI errors are logged to:

1. Console (development)
2. `AIUsageLog` table with `status: 'ERROR'`
3. `AuditLog` for security-relevant failures

### Checking Errors

```sql
-- Recent AI errors
SELECT created_at, action, model, error_message
FROM ai_usage_log
WHERE status = 'ERROR'
ORDER BY created_at DESC
LIMIT 20;

-- Error rate by action
SELECT
  action,
  COUNT(*) FILTER (WHERE status = 'ERROR') AS errors,
  COUNT(*) AS total,
  ROUND(100.0 * COUNT(*) FILTER (WHERE status = 'ERROR') / COUNT(*), 2) AS error_rate
FROM ai_usage_log
GROUP BY action;
```

## Troubleshooting

### High Error Rate

1. Check the OpenAI status page for outages
2. Verify the API key is valid and not rate-limited
3. Review error messages in the logs
4. Consider switching to a different model

### Consistent Parse Errors

1. The AI model may be returning malformed JSON
2. Try a more capable model (gpt-4o instead of gpt-3.5)
3. Check whether prompts are being truncated
4. Review recent responses in the logs

### All Requests Failing

1. Test the connection in Settings → AI
2. Verify the API key hasn't been revoked
3. Check billing status in the OpenAI dashboard
4. Review network connectivity

### Slow Responses

1. Consider using gpt-4o-mini for speed
2. Reduce batch sizes
3. Check for rate limiting (429 errors)
4. Monitor OpenAI latency

## Error Response Format

When errors occur, services return structured responses:

```typescript
// AI Assignment error response
{
  success: false,
  suggestions: [],
  error: "Rate limit exceeded. Wait a few minutes and try again.",
  fallbackUsed: true,
}

// AI Filtering error response
{
  projectId: "...",
  meetsCriteria: false,
  confidence: 0,
  reasoning: "AI error: Rate limit exceeded",
  flagForReview: true,
}
```

## Implementing Custom Error Handling

```typescript
import {
  classifyAIError,
  shouldRetry,
  getRetryDelay,
  getUserFriendlyMessage,
  logAIError,
} from '@/server/services/ai-errors'

async function callAIWithRetry<T>(
  operation: () => Promise<T>,
  serviceName: string,
  maxRetries: number = 3
): Promise<T> {
  let lastError: Error | null = null

  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      return await operation()
    } catch (error) {
      const classified = classifyAIError(error)
      logAIError(serviceName, 'operation', classified)

      if (!shouldRetry(classified.type) || attempt === maxRetries) {
        throw new Error(getUserFriendlyMessage(classified.type))
      }

      // Exponential backoff, matching the Retry Strategy table above
      const delay = getRetryDelay(classified.type) * 2 ** (attempt - 1)
      await new Promise(resolve => setTimeout(resolve, delay))
      lastError = error as Error
    }
  }

  throw lastError
}
```

## See Also

- [AI System Architecture](./ai-system.md)
- [AI Configuration Guide](./ai-configuration.md)
- [AI Services Reference](./ai-services.md)
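As a supplement, the Error Types and Retry Strategy tables above can be encoded as a lookup table. This is a minimal sketch, not the actual `@/server/services/ai-errors` implementation: the `AIErrorType` union, `RETRY_POLICY` map, and the two-argument `getRetryDelay(type, attempt)` signature are assumptions for illustration (the guide's own `getRetryDelay` takes only the error type).

```typescript
// Hypothetical encoding of the Retry Strategy table. Non-retryable types
// simply have no policy entry.
type AIErrorType =
  | 'rate_limit' | 'quota_exceeded' | 'model_not_found' | 'invalid_api_key'
  | 'context_length' | 'parse_error' | 'timeout' | 'network_error'
  | 'content_filter' | 'server_error'

interface RetryPolicy {
  maxRetries: number
  baseDelayMs: number
  backoff: 'exponential' | 'fixed' | 'none'
}

const RETRY_POLICY: Partial<Record<AIErrorType, RetryPolicy>> = {
  rate_limit:    { maxRetries: 3, baseDelayMs: 1000, backoff: 'exponential' },
  timeout:       { maxRetries: 2, baseDelayMs: 5000, backoff: 'fixed' },
  network_error: { maxRetries: 3, baseDelayMs: 1000, backoff: 'exponential' },
  server_error:  { maxRetries: 3, baseDelayMs: 2000, backoff: 'exponential' },
  parse_error:   { maxRetries: 1, baseDelayMs: 0,    backoff: 'none' },
}

function shouldRetry(type: AIErrorType): boolean {
  return type in RETRY_POLICY
}

// Delay in ms before the given attempt (1-based), per the table:
// exponential types double each attempt, fixed types stay constant.
function getRetryDelay(type: AIErrorType, attempt: number): number {
  const policy = RETRY_POLICY[type]
  if (!policy) return 0
  return policy.backoff === 'exponential'
    ? policy.baseDelayMs * 2 ** (attempt - 1)
    : policy.baseDelayMs
}

console.log(shouldRetry('rate_limit'))        // true
console.log(shouldRetry('invalid_api_key'))   // false
console.log(getRetryDelay('server_error', 3)) // 8000
```

Keeping the policy in one table makes the retry behavior auditable against the documentation and easy to tune without touching control flow.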
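The algorithmic fallback for AI Assignment described under Graceful Degradation (tag overlap, workload balancing, max-assignment constraint) might look like the following greedy sketch. The `Juror`/`Project` shapes and the `fallbackAssign` name are hypothetical; the guide only specifies the three criteria, not the algorithm.

```typescript
// Greedy fallback assignment: for each project, pick the juror with the
// highest expertise tag overlap, breaking ties by lowest current workload,
// and never exceeding the max-assignments constraint.
interface Juror { id: string; expertise: string[]; assigned: number }
interface Project { id: string; tags: string[] }

function fallbackAssign(
  projects: Project[],
  jurors: Juror[],
  maxAssignments: number
): Map<string, string> {
  const result = new Map<string, string>() // projectId -> jurorId
  for (const project of projects) {
    const overlap = (j: Juror) =>
      j.expertise.filter(t => project.tags.includes(t)).length
    const candidates = jurors
      .filter(j => j.assigned < maxAssignments)     // respect constraints
      .sort((a, b) => overlap(b) - overlap(a)       // best tag overlap first
                   || a.assigned - b.assigned)      // then balance workload
    const best = candidates[0]
    if (best) {
      best.assigned++
      result.set(project.id, best.id)
    }
  }
  return result
}
```

A real implementation would also need to handle the case where every juror is at capacity (here the project is simply left unassigned).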