# AI Error Handling Guide

## Error Types
The AI system classifies errors into these categories:
| Error Type | Cause | User Message | Retryable |
|---|---|---|---|
| `rate_limit` | Too many requests | "Rate limit exceeded. Wait a few minutes." | Yes |
| `quota_exceeded` | Billing limit | "API quota exceeded. Check billing." | No |
| `model_not_found` | Invalid model | "Model not available. Check settings." | No |
| `invalid_api_key` | Bad API key | "Invalid API key. Check settings." | No |
| `context_length` | Prompt too large | "Request too large. Try fewer items." | Yes* |
| `parse_error` | AI returned invalid JSON | "Response parse error. Flagged for review." | Yes |
| `timeout` | Request took too long | "Request timed out. Try again." | Yes |
| `network_error` | Connection issue | "Network error. Check connection." | Yes |
| `content_filter` | Content blocked | "Content filtered. Check input data." | No |
| `server_error` | OpenAI server issue | "Server error. Try again later." | Yes |
*Context length errors can be retried with smaller batches.
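As the footnote says, `context_length` failures are retried with smaller batches. A minimal sketch of that splitting strategy, assuming a recursive halve-and-retry approach — the helper and its error predicate are hypothetical, not part of the shipped service:

```typescript
type BatchFn<T, R> = (items: T[]) => Promise<R[]>

// Hypothetical helper: on a context-length error, halve the batch and
// retry each half recursively; results are concatenated in order.
async function runWithBatchSplitting<T, R>(
  items: T[],
  run: BatchFn<T, R>,
  isContextLengthError: (e: unknown) => boolean
): Promise<R[]> {
  try {
    return await run(items)
  } catch (error) {
    // A single item that is still too large cannot be split further.
    if (!isContextLengthError(error) || items.length <= 1) throw error
    const mid = Math.ceil(items.length / 2)
    const left = await runWithBatchSplitting(items.slice(0, mid), run, isContextLengthError)
    const right = await runWithBatchSplitting(items.slice(mid), run, isContextLengthError)
    return [...left, ...right]
  }
}
```

Splitting in halves keeps the retry count logarithmic in the batch size, and the single-item guard ensures a genuinely oversized prompt still surfaces as an error.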
## Error Classification
```ts
import { classifyAIError, shouldRetry, getRetryDelay } from '@/server/services/ai-errors'

try {
  const response = await openai.chat.completions.create(params)
} catch (error) {
  const classified = classifyAIError(error)
  console.error(`AI Error: ${classified.type} - ${classified.message}`)

  if (shouldRetry(classified.type)) {
    const delay = getRetryDelay(classified.type)
    // Wait and retry
  } else {
    // Fall back to algorithm
  }
}
```
## Graceful Degradation

When AI fails, the platform automatically handles it:

### AI Assignment

- Logs the error
- Falls back to algorithmic assignment:
  - Matches by expertise tag overlap
  - Balances workload across jurors
  - Respects constraints (max assignments)
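The algorithmic fallback described above can be sketched as a single scoring pass. The `Juror`/`Project` shapes and the `maxAssignments` parameter are illustrative assumptions, not the platform's actual types:

```typescript
interface Juror { id: string; expertiseTags: string[]; assignedCount: number }
interface Project { id: string; tags: string[] }

// Pick the juror with the largest expertise-tag overlap, breaking ties
// toward the lightest current workload; jurors at maxAssignments are skipped.
function assignFallback(
  project: Project,
  jurors: Juror[],
  maxAssignments: number
): Juror | null {
  const eligible = jurors.filter(j => j.assignedCount < maxAssignments)
  if (eligible.length === 0) return null

  const overlap = (j: Juror) =>
    j.expertiseTags.filter(t => project.tags.includes(t)).length

  return eligible.reduce((best, j) => {
    const d = overlap(j) - overlap(best)
    if (d > 0) return j
    if (d === 0 && j.assignedCount < best.assignedCount) return j
    return best
  })
}
```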
### AI Filtering
- Logs the error
- Flags all projects for manual review
- Returns error message to admin
### Award Eligibility
- Logs the error
- Returns all projects as "needs manual review"
- Admin can apply deterministic rules instead
### Mentor Matching
- Logs the error
- Falls back to keyword-based matching
- Uses availability scoring
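The keyword-based fallback with availability scoring can be sketched as below. The `Mentor` shape and the simple `overlap + availability` score are illustrative assumptions, not the platform's actual matching logic:

```typescript
// availability is assumed to be a weight in [0, 1]
interface Mentor { id: string; keywords: string[]; availability: number }

// Score each mentor by keyword overlap with the project, using
// availability as a tiebreaker-sized bonus; highest score wins.
function matchMentor(projectKeywords: string[], mentors: Mentor[]): Mentor | null {
  let best: Mentor | null = null
  let bestScore = -Infinity
  for (const m of mentors) {
    const overlap = m.keywords.filter(k => projectKeywords.includes(k)).length
    const score = overlap + m.availability
    if (score > bestScore) {
      best = m
      bestScore = score
    }
  }
  return best
}
```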
## Retry Strategy
| Error Type | Retry Count | Delay |
|---|---|---|
| `rate_limit` | 3 | Exponential (1s, 2s, 4s) |
| `timeout` | 2 | Fixed 5s |
| `network_error` | 3 | Exponential (1s, 2s, 4s) |
| `server_error` | 3 | Exponential (2s, 4s, 8s) |
| `parse_error` | 1 | None |
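The schedule above can be expressed as a lookup. `retryDelayMs` is an illustrative helper with base delays and 1-based attempt numbering read off the table — not the `getRetryDelay` exported by the service:

```typescript
const RETRY_POLICY: Record<string, { max: number; baseMs: number; exponential: boolean }> = {
  rate_limit:    { max: 3, baseMs: 1000, exponential: true },
  timeout:       { max: 2, baseMs: 5000, exponential: false },
  network_error: { max: 3, baseMs: 1000, exponential: true },
  server_error:  { max: 3, baseMs: 2000, exponential: true },
  parse_error:   { max: 1, baseMs: 0,    exponential: false },
}

// Returns the delay before retry `attempt` (1-based), or null when the
// error type is unknown or its retry budget is exhausted.
function retryDelayMs(errorType: string, attempt: number): number | null {
  const policy = RETRY_POLICY[errorType]
  if (!policy || attempt > policy.max) return null
  return policy.exponential
    ? policy.baseMs * 2 ** (attempt - 1) // e.g. 1s, 2s, 4s
    : policy.baseMs
}
```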
## Monitoring

### Error Logging
All AI errors are logged to:
- Console (development)
- `AIUsageLog` table with `status: 'ERROR'`
- `AuditLog` for security-relevant failures
### Checking Errors
```sql
-- Recent AI errors
SELECT
  created_at,
  action,
  model,
  error_message
FROM ai_usage_log
WHERE status = 'ERROR'
ORDER BY created_at DESC
LIMIT 20;
```

```sql
-- Error rate by action
SELECT
  action,
  COUNT(*) FILTER (WHERE status = 'ERROR') AS errors,
  COUNT(*) AS total,
  ROUND(100.0 * COUNT(*) FILTER (WHERE status = 'ERROR') / COUNT(*), 2) AS error_rate
FROM ai_usage_log
GROUP BY action;
```
## Troubleshooting

### High Error Rate
- Check OpenAI status page for outages
- Verify API key is valid and not rate-limited
- Review error messages in logs
- Consider switching to a different model
### Consistent Parse Errors
- The AI model may be returning malformed JSON
- Try a more capable model (gpt-4o instead of gpt-3.5)
- Check if prompts are being truncated
- Review recent responses in logs
### All Requests Failing
- Test connection in Settings → AI
- Verify API key hasn't been revoked
- Check billing status in OpenAI dashboard
- Review network connectivity
### Slow Responses
- Consider using gpt-4o-mini for speed
- Reduce batch sizes
- Check for rate limiting (429 errors)
- Monitor OpenAI latency
## Error Response Format
When errors occur, services return structured responses:
```ts
// AI Assignment error response
{
  success: false,
  suggestions: [],
  error: "Rate limit exceeded. Wait a few minutes and try again.",
  fallbackUsed: true,
}

// AI Filtering error response
{
  projectId: "...",
  meetsCriteria: false,
  confidence: 0,
  reasoning: "AI error: Rate limit exceeded",
  flagForReview: true,
}
```
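The shapes above can be captured as TypeScript interfaces. These are a sketch inferred from the example responses, not the platform's actual type definitions, and `summarizeAssignmentError` is a hypothetical helper:

```typescript
interface AssignmentErrorResponse {
  success: false
  suggestions: unknown[]
  error: string
  fallbackUsed: boolean
}

interface FilterErrorResponse {
  projectId: string
  meetsCriteria: false
  confidence: 0
  reasoning: string
  flagForReview: true
}

// Hypothetical helper showing how a caller might surface the result to an admin.
function summarizeAssignmentError(res: AssignmentErrorResponse): string {
  return res.fallbackUsed
    ? `${res.error} (algorithmic fallback was used)`
    : `${res.error} (no fallback available)`
}
```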
## Implementing Custom Error Handling
```ts
import {
  classifyAIError,
  shouldRetry,
  getRetryDelay,
  getUserFriendlyMessage,
  logAIError,
} from '@/server/services/ai-errors'

async function callAIWithRetry<T>(
  operation: () => Promise<T>,
  serviceName: string,
  maxRetries: number = 3
): Promise<T> {
  let lastError: Error | null = null

  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      return await operation()
    } catch (error) {
      lastError = error as Error
      const classified = classifyAIError(error)
      logAIError(serviceName, 'operation', classified)

      if (!shouldRetry(classified.type) || attempt === maxRetries) {
        throw new Error(getUserFriendlyMessage(classified.type))
      }

      const delay = getRetryDelay(classified.type) * attempt
      await new Promise(resolve => setTimeout(resolve, delay))
    }
  }

  throw lastError
}
```