Optimize AI system with batching, token tracking, and GDPR compliance

- Add AIUsageLog model for persistent token/cost tracking
- Implement batched processing for all AI services:
  - Assignment: 15 projects/batch
  - Filtering: 20 projects/batch
  - Award eligibility: 20 projects/batch
  - Mentor matching: 15 projects/batch
- Create unified error classification (ai-errors.ts)
- Enhance anonymization with comprehensive project data
- Add AI usage dashboard to Settings page
- Add usage stats endpoints to settings router
- Create AI system documentation (5 files)
- Create GDPR compliance documentation (2 files)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Matt 2026-02-03 11:58:12 +01:00
parent a72e815d3a
commit 928b1c65dc
19 changed files with 4103 additions and 601 deletions


@@ -0,0 +1,176 @@
# AI Configuration Guide
## Admin Settings
Navigate to **Settings → AI** to configure AI features.
### Available Settings
| Setting | Description | Default |
|---------|-------------|---------|
| `ai_enabled` | Master switch for AI features | `true` |
| `ai_provider` | AI provider (currently OpenAI only) | `openai` |
| `ai_model` | Model to use | `gpt-4o` |
| `openai_api_key` | API key (encrypted) | - |
| `ai_send_descriptions` | Include project descriptions | `true` |
## Supported Models
### Standard Models (GPT)
| Model | Speed | Quality | Cost | Recommended For |
|-------|-------|---------|------|-----------------|
| `gpt-4o` | Fast | Best | Medium | Production use |
| `gpt-4o-mini` | Very Fast | Good | Low | High-volume, cost-sensitive |
| `gpt-4-turbo` | Medium | Very Good | High | Complex analysis |
| `gpt-3.5-turbo` | Very Fast | Basic | Very Low | Simple tasks only |
### Reasoning Models (o-series)
| Model | Speed | Quality | Cost | Recommended For |
|-------|-------|---------|------|-----------------|
| `o1` | Slow | Excellent | Very High | Complex reasoning tasks |
| `o1-mini` | Medium | Very Good | High | Moderate complexity |
| `o3-mini` | Medium | Good | Medium | Cost-effective reasoning |
**Note:** Reasoning models use different API parameters:
- `max_completion_tokens` instead of `max_tokens`
- No `temperature` parameter
- No `response_format: json_object`
- System messages become "developer" role
The platform automatically handles these differences via `buildCompletionParams()`.
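As an illustration only (a sketch of the idea, not the actual implementation in `lib/openai.ts`), model-aware parameter building might look like this:

```typescript
// Hypothetical sketch; the real buildCompletionParams() may differ.
type ChatMessage = { role: string; content: string }

function isReasoningModel(model: string): boolean {
  // o1, o1-mini, o3-mini, etc.
  return /^o\d/.test(model)
}

function buildCompletionParams(
  model: string,
  messages: ChatMessage[],
  maxTokens: number
): Record<string, unknown> {
  if (isReasoningModel(model)) {
    return {
      model,
      // System messages become the "developer" role for o-series models
      messages: messages.map(m =>
        m.role === 'system' ? { ...m, role: 'developer' } : m
      ),
      max_completion_tokens: maxTokens,
      // no temperature, no response_format for reasoning models
    }
  }
  return {
    model,
    messages,
    max_tokens: maxTokens,
    temperature: 0.2,
    response_format: { type: 'json_object' },
  }
}
```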
## Cost Estimates
### Per 1M Tokens (USD)
| Model | Input | Output |
|-------|-------|--------|
| gpt-4o | $2.50 | $10.00 |
| gpt-4o-mini | $0.15 | $0.60 |
| gpt-4-turbo | $10.00 | $30.00 |
| gpt-3.5-turbo | $0.50 | $1.50 |
| o1 | $15.00 | $60.00 |
| o1-mini | $3.00 | $12.00 |
| o3-mini | $1.10 | $4.40 |
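The table above can be turned into a per-call cost estimate. A minimal sketch (the platform's own helper is `calculateCost()` in `server/utils/ai-usage.ts`; only a subset of models is shown here):

```typescript
// Prices in USD per 1M tokens, taken from the table above (subset).
const PRICING_PER_1M: Record<string, { input: number; output: number }> = {
  'gpt-4o': { input: 2.5, output: 10 },
  'gpt-4o-mini': { input: 0.15, output: 0.6 },
  'o3-mini': { input: 1.1, output: 4.4 },
}

function estimateCostUSD(
  model: string,
  promptTokens: number,
  completionTokens: number
): number {
  const p = PRICING_PER_1M[model]
  if (!p) return 0 // unknown model: no estimate
  return (promptTokens * p.input + completionTokens * p.output) / 1_000_000
}
```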
### Typical Usage Per Operation
| Operation | Projects | Est. Tokens | Est. Cost (gpt-4o) |
|-----------|----------|-------------|-------------------|
| Filter 100 projects | 100 | ~10,000 | ~$0.10 |
| Assign 50 projects | 50 | ~15,000 | ~$0.15 |
| Award eligibility | 100 | ~10,000 | ~$0.10 |
| Mentor matching | 60 | ~12,000 | ~$0.12 |
## Rate Limits
OpenAI enforces rate limits based on your account tier:
| Tier | Requests/Min | Tokens/Min |
|------|--------------|------------|
| Tier 1 | 500 | 30,000 |
| Tier 2 | 5,000 | 450,000 |
| Tier 3+ | Higher | Higher |
The platform handles rate limits with:
- Batch processing (reduces request count)
- Error classification (detects rate limit errors)
- Manual retry guidance in UI
## Environment Variables
```env
# Required for AI features
OPENAI_API_KEY=sk-your-api-key
# Optional overrides (normally set via admin UI)
OPENAI_MODEL=gpt-4o
```
## Testing Connection
1. Go to **Settings → AI**
2. Enter your OpenAI API key
3. Click **Save AI Settings**
4. Click **Test Connection**
The test verifies:
- API key validity
- Model availability
- Basic request/response
## Monitoring Usage
### Admin Dashboard
Navigate to **Settings → AI** to see:
- Current month cost
- Token usage by feature
- Usage by model
- 30-day usage trend
### Database Queries
```sql
-- Current month usage
SELECT
  action,
  SUM(total_tokens) as tokens,
  SUM(estimated_cost_usd) as cost
FROM ai_usage_log
WHERE created_at >= date_trunc('month', NOW())
GROUP BY action;

-- Top users by cost
SELECT
  u.email,
  SUM(l.estimated_cost_usd) as total_cost
FROM ai_usage_log l
JOIN users u ON l.user_id = u.id
GROUP BY u.id
ORDER BY total_cost DESC
LIMIT 10;
```
## Troubleshooting
### "Model not found"
- Verify the model is available with your API key tier
- Some models (o1, o3) require specific API access
- Try a more common model like `gpt-4o-mini`
### "Rate limit exceeded"
- Wait a few minutes before retrying
- Consider using a smaller batch size
- Upgrade your OpenAI account tier
### "All projects flagged"
1. Check **Settings → AI** for correct API key
2. Verify model is available
3. Check console logs for specific error messages
4. Test connection with the button in settings
### "Invalid API key"
1. Verify the key starts with `sk-`
2. Check the key hasn't been revoked in OpenAI dashboard
3. Ensure no extra whitespace in the key
## Best Practices
1. **Use gpt-4o-mini** for high-volume operations (filtering many projects)
2. **Use gpt-4o** for critical decisions (final assignments)
3. **Monitor costs** regularly via the usage dashboard
4. **Test with small batches** before running on full dataset
5. **Keep descriptions enabled** for better matching accuracy
## See Also
- [AI System Architecture](./ai-system.md)
- [AI Services Reference](./ai-services.md)
- [AI Error Handling](./ai-errors.md)


@@ -0,0 +1,208 @@
# AI Error Handling Guide
## Error Types
The AI system classifies errors into these categories:
| Error Type | Cause | User Message | Retryable |
|------------|-------|--------------|-----------|
| `rate_limit` | Too many requests | "Rate limit exceeded. Wait a few minutes." | Yes |
| `quota_exceeded` | Billing limit | "API quota exceeded. Check billing." | No |
| `model_not_found` | Invalid model | "Model not available. Check settings." | No |
| `invalid_api_key` | Bad API key | "Invalid API key. Check settings." | No |
| `context_length` | Prompt too large | "Request too large. Try fewer items." | Yes* |
| `parse_error` | AI returned invalid JSON | "Response parse error. Flagged for review." | Yes |
| `timeout` | Request took too long | "Request timed out. Try again." | Yes |
| `network_error` | Connection issue | "Network error. Check connection." | Yes |
| `content_filter` | Content blocked | "Content filtered. Check input data." | No |
| `server_error` | OpenAI server issue | "Server error. Try again later." | Yes |
*Context length errors can be retried with smaller batches.
## Error Classification
```typescript
import { classifyAIError, shouldRetry, getRetryDelay } from '@/server/services/ai-errors'

try {
  const response = await openai.chat.completions.create(params)
} catch (error) {
  const classified = classifyAIError(error)
  console.error(`AI Error: ${classified.type} - ${classified.message}`)

  if (shouldRetry(classified.type)) {
    const delay = getRetryDelay(classified.type)
    // Wait and retry
  } else {
    // Fall back to algorithm
  }
}
```
## Graceful Degradation
When AI fails, the platform automatically handles it:
### AI Assignment
1. Logs the error
2. Falls back to algorithmic assignment:
- Matches by expertise tag overlap
- Balances workload across jurors
- Respects constraints (max assignments)
### AI Filtering
1. Logs the error
2. Flags all projects for manual review
3. Returns error message to admin
### Award Eligibility
1. Logs the error
2. Returns all projects as "needs manual review"
3. Admin can apply deterministic rules instead
### Mentor Matching
1. Logs the error
2. Falls back to keyword-based matching
3. Uses availability scoring
## Retry Strategy
| Error Type | Retry Count | Delay |
|------------|-------------|-------|
| `rate_limit` | 3 | Exponential (1s, 2s, 4s) |
| `timeout` | 2 | Fixed 5s |
| `network_error` | 3 | Exponential (1s, 2s, 4s) |
| `server_error` | 3 | Exponential (2s, 4s, 8s) |
| `parse_error` | 1 | None |
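The schedule above might be expressed roughly as follows (an illustrative sketch; the actual `getRetryDelay()` in `ai-errors.ts` may differ):

```typescript
type AIErrorType =
  | 'rate_limit'
  | 'timeout'
  | 'network_error'
  | 'server_error'
  | 'parse_error'

// attempt is 1-based; returns the delay in milliseconds before the next try
function getRetryDelayMs(type: AIErrorType, attempt: number): number {
  switch (type) {
    case 'rate_limit':
    case 'network_error':
      return 1000 * 2 ** (attempt - 1) // exponential: 1s, 2s, 4s
    case 'server_error':
      return 2000 * 2 ** (attempt - 1) // exponential: 2s, 4s, 8s
    case 'timeout':
      return 5000 // fixed 5s
    case 'parse_error':
      return 0 // single immediate retry
  }
}
```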
## Monitoring
### Error Logging
All AI errors are logged to:
1. Console (development)
2. `AIUsageLog` table with `status: 'ERROR'`
3. `AuditLog` for security-relevant failures
### Checking Errors
```sql
-- Recent AI errors
SELECT
  created_at,
  action,
  model,
  error_message
FROM ai_usage_log
WHERE status = 'ERROR'
ORDER BY created_at DESC
LIMIT 20;

-- Error rate by action
SELECT
  action,
  COUNT(*) FILTER (WHERE status = 'ERROR') as errors,
  COUNT(*) as total,
  ROUND(100.0 * COUNT(*) FILTER (WHERE status = 'ERROR') / COUNT(*), 2) as error_rate
FROM ai_usage_log
GROUP BY action;
```
## Troubleshooting
### High Error Rate
1. Check OpenAI status page for outages
2. Verify API key is valid and not rate-limited
3. Review error messages in logs
4. Consider switching to a different model
### Consistent Parse Errors
1. The AI model may be returning malformed JSON
2. Try a more capable model (e.g. `gpt-4o` instead of `gpt-3.5-turbo`)
3. Check if prompts are being truncated
4. Review recent responses in logs
### All Requests Failing
1. Test connection in Settings → AI
2. Verify API key hasn't been revoked
3. Check billing status in OpenAI dashboard
4. Review network connectivity
### Slow Responses
1. Consider using gpt-4o-mini for speed
2. Reduce batch sizes
3. Check for rate limiting (429 errors)
4. Monitor OpenAI latency
## Error Response Format
When errors occur, services return structured responses:
```typescript
// AI Assignment error response
{
  success: false,
  suggestions: [],
  error: "Rate limit exceeded. Wait a few minutes and try again.",
  fallbackUsed: true,
}

// AI Filtering error response
{
  projectId: "...",
  meetsCriteria: false,
  confidence: 0,
  reasoning: "AI error: Rate limit exceeded",
  flagForReview: true,
}
```
## Implementing Custom Error Handling
```typescript
import {
  classifyAIError,
  shouldRetry,
  getRetryDelay,
  getUserFriendlyMessage,
  logAIError,
} from '@/server/services/ai-errors'

async function callAIWithRetry<T>(
  operation: () => Promise<T>,
  serviceName: string,
  maxRetries: number = 3
): Promise<T> {
  let lastError: Error | null = null
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      return await operation()
    } catch (error) {
      const classified = classifyAIError(error)
      logAIError(serviceName, 'operation', classified)
      if (!shouldRetry(classified.type) || attempt === maxRetries) {
        throw new Error(getUserFriendlyMessage(classified.type))
      }
      const delay = getRetryDelay(classified.type) * attempt
      await new Promise(resolve => setTimeout(resolve, delay))
      lastError = error as Error
    }
  }
  throw lastError
}
```
## See Also
- [AI System Architecture](./ai-system.md)
- [AI Configuration Guide](./ai-configuration.md)
- [AI Services Reference](./ai-services.md)


@@ -0,0 +1,222 @@
# AI Prompts Reference
This document describes the prompts used by each AI service. All prompts are optimized for token efficiency while maintaining accuracy.
## Design Principles
1. **Concise system prompts** - Under 100 tokens where possible
2. **Structured output** - JSON format for reliable parsing
3. **Clear field names** - Consistent naming across services
4. **Score ranges** - 0-1 for confidence, 1-10 for quality
## Filtering Prompt
**Purpose:** Evaluate projects against admin-defined criteria
### System Prompt
```
Project screening assistant. Evaluate each project against the criteria.
Return JSON: {"projects": [{project_id, meets_criteria: bool, confidence: 0-1, reasoning: str, quality_score: 1-10, spam_risk: bool}]}
Assess description quality and relevance objectively.
```
### User Prompt Template
```
CRITERIA: {criteria_text}
PROJECTS: {anonymized_project_array}
Evaluate each project against the criteria. Return JSON.
```
### Example Response
```json
{
  "projects": [
    {
      "project_id": "P1",
      "meets_criteria": true,
      "confidence": 0.9,
      "reasoning": "Project focuses on coral reef restoration, matching ocean conservation criteria",
      "quality_score": 8,
      "spam_risk": false
    }
  ]
}
```
---
## Assignment Prompt
**Purpose:** Match jurors to projects by expertise
### System Prompt
```
Match jurors to projects by expertise. Return JSON assignments.
Each: {juror_id, project_id, confidence_score: 0-1, expertise_match_score: 0-1, reasoning: str (1-2 sentences)}
Distribute workload fairly. Avoid assigning jurors at capacity.
```
### User Prompt Template
```
JURORS: {anonymized_juror_array}
PROJECTS: {anonymized_project_array}
CONSTRAINTS: {N} reviews/project, max {M}/juror
EXISTING: {existing_assignments}
Return JSON: {"assignments": [...]}
```
### Example Response
```json
{
  "assignments": [
    {
      "juror_id": "juror_001",
      "project_id": "project_005",
      "confidence_score": 0.85,
      "expertise_match_score": 0.9,
      "reasoning": "Juror expertise in marine biology aligns with coral restoration project"
    }
  ]
}
```
---
## Award Eligibility Prompt
**Purpose:** Determine project eligibility for special awards
### System Prompt
```
Award eligibility evaluator. Evaluate projects against criteria, return JSON.
Format: {"evaluations": [{project_id, eligible: bool, confidence: 0-1, reasoning: str}]}
Be objective. Base evaluation only on provided data. No personal identifiers in reasoning.
```
### User Prompt Template
```
CRITERIA: {criteria_text}
PROJECTS: {anonymized_project_array}
Evaluate eligibility for each project.
```
### Example Response
```json
{
  "evaluations": [
    {
      "project_id": "P3",
      "eligible": true,
      "confidence": 0.95,
      "reasoning": "Project is based in Italy and focuses on Mediterranean biodiversity"
    }
  ]
}
```
---
## Mentor Matching Prompt
**Purpose:** Recommend mentors for projects
### System Prompt
```
Match mentors to projects by expertise. Return JSON.
Format for each project: {"matches": [{project_id, mentor_matches: [{mentor_index, confidence_score: 0-1, expertise_match_score: 0-1, reasoning: str}]}]}
Rank by suitability. Consider expertise alignment and availability.
```
### User Prompt Template
```
PROJECTS:
P1: Category=STARTUP, Issue=HABITAT_RESTORATION, Tags=[coral, reef], Desc=Project description...
P2: ...
MENTORS:
0: Expertise=[marine biology, coral], Availability=2/5
1: Expertise=[business development], Availability=0/3
...
For each project, rank top {N} mentors.
```
### Example Response
```json
{
  "matches": [
    {
      "project_id": "P1",
      "mentor_matches": [
        {
          "mentor_index": 0,
          "confidence_score": 0.92,
          "expertise_match_score": 0.95,
          "reasoning": "Marine biology expertise directly matches coral restoration focus"
        }
      ]
    }
  ]
}
```
---
## Anonymized Data Structure
All projects sent to AI use this structure:
```typescript
interface AnonymizedProjectForAI {
  project_id: string            // P1, P2, etc.
  title: string                 // Sanitized (PII removed)
  description: string           // Truncated + PII stripped
  category: string | null       // STARTUP | BUSINESS_CONCEPT
  ocean_issue: string | null
  country: string | null
  region: string | null
  institution: string | null
  tags: string[]
  founded_year: number | null
  team_size: number
  has_description: boolean
  file_count: number
  file_types: string[]
  wants_mentorship: boolean
  submission_source: string
  submitted_date: string | null // YYYY-MM-DD
}
```
### What Gets Stripped
- Team/company names
- Email addresses
- Phone numbers
- External URLs
- Real project/user IDs
- Internal comments
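A heavily simplified sketch of the stripping step (the production logic in `server/services/anonymization.ts` covers more patterns, e.g. phone numbers and internal IDs):

```typescript
// Illustrative only; the real anonymizer handles many more PII patterns.
const EMAIL_RE = /[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/g
const URL_RE = /https?:\/\/\S+/g

function stripPII(text: string): string {
  // Replace each match with a neutral placeholder token
  return text.replace(EMAIL_RE, '[email]').replace(URL_RE, '[url]')
}
```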
---
## Token Optimization Tips
1. **Batch projects** - Process 15-20 per request
2. **Truncate descriptions** - 300-500 chars based on task
3. **Use abbreviated fields** - `desc` vs `description`
4. **Compress constraints** - Inline in prompt
5. **Request specific fields** - Only what you need
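For example, tip 2 (truncation) could be implemented as a simple word-boundary cut (an illustrative helper, not the platform's actual code):

```typescript
// Cut descriptions to a token budget without leaving a dangling word fragment.
function truncateDescription(text: string, maxChars: number): string {
  if (text.length <= maxChars) return text
  const cut = text.slice(0, maxChars)
  const lastSpace = cut.lastIndexOf(' ')
  // Prefer cutting at the last word boundary; fall back to a hard cut
  return (lastSpace > 0 ? cut.slice(0, lastSpace) : cut) + '…'
}
```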
## Prompt Versioning
| Service | Version | Last Updated |
|---------|---------|--------------|
| Filtering | 2.0 | 2025-01 |
| Assignment | 2.0 | 2025-01 |
| Award Eligibility | 2.0 | 2025-01 |
| Mentor Matching | 2.0 | 2025-01 |
## See Also
- [AI System Architecture](./ai-system.md)
- [AI Services Reference](./ai-services.md)
- [AI Configuration Guide](./ai-configuration.md)


@@ -0,0 +1,249 @@
# AI Services Reference
## 1. AI Filtering Service
**File:** `src/server/services/ai-filtering.ts`
**Purpose:** Evaluate projects against admin-defined criteria text
### Input
- List of projects (anonymized)
- Criteria text (e.g., "Projects must be based in Mediterranean region")
- Rule configuration (PASS/REJECT/FLAG actions)
### Output
Per-project result:
- `meets_criteria` - Boolean
- `confidence` - 0-1 score
- `reasoning` - Explanation
- `quality_score` - 1-10 rating
- `spam_risk` - Boolean flag
### Configuration
- **Batch Size:** 20 projects per API call
- **Description Limit:** 500 characters
- **Token Usage:** ~1500-2500 tokens per batch
### Example Criteria
- "Filter out any project without a description"
- "Only include projects founded after 2020"
- "Reject projects with fewer than 2 team members"
- "Projects must be based in Mediterranean region"
### Usage
```typescript
import { aiFilterProjects } from '@/server/services/ai-filtering'

const results = await aiFilterProjects(
  projects,
  'Only include projects with ocean conservation focus',
  userId,
  roundId
)
```
---
## 2. AI Assignment Service
**File:** `src/server/services/ai-assignment.ts`
**Purpose:** Match jurors to projects based on expertise alignment
### Input
- List of jurors with expertise tags
- List of projects with tags/category
- Constraints:
- Required reviews per project
- Max assignments per juror
- Existing assignments (to avoid duplicates)
### Output
Suggested assignments:
- `jurorId` - Juror to assign
- `projectId` - Project to assign
- `confidenceScore` - 0-1 match confidence
- `expertiseMatchScore` - 0-1 expertise overlap
- `reasoning` - Explanation
### Configuration
- **Batch Size:** 15 projects per batch (all jurors included)
- **Description Limit:** 300 characters
- **Token Usage:** ~2000-3500 tokens per batch
### Fallback Algorithm
When AI is unavailable, uses:
1. Tag overlap scoring (60% weight)
2. Load balancing (40% weight)
3. Constraint satisfaction
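The weighted score implied by the steps above might be sketched like this (the weights come from the list; the function signature is hypothetical):

```typescript
// Fallback score for one juror/project pair:
// 60% expertise tag overlap + 40% remaining capacity.
function fallbackScore(
  jurorTags: string[],
  projectTags: string[],
  currentAssignments: number,
  maxAssignments: number
): number {
  const overlap =
    projectTags.length === 0
      ? 0
      : projectTags.filter(t => jurorTags.includes(t)).length / projectTags.length
  const loadHeadroom = 1 - currentAssignments / maxAssignments
  return 0.6 * overlap + 0.4 * loadHeadroom
}
```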
### Usage
```typescript
import { generateAIAssignments } from '@/server/services/ai-assignment'

const result = await generateAIAssignments(
  jurors,
  projects,
  {
    requiredReviewsPerProject: 3,
    maxAssignmentsPerJuror: 10,
    existingAssignments: [],
  },
  userId,
  roundId
)
```
---
## 3. Award Eligibility Service
**File:** `src/server/services/ai-award-eligibility.ts`
**Purpose:** Determine which projects qualify for special awards
### Input
- Award criteria text (plain language)
- List of projects (anonymized)
- Optional: Auto-tag rules (field-based matching)
### Output
Per-project:
- `eligible` - Boolean
- `confidence` - 0-1 score
- `reasoning` - Explanation
- `method` - 'AI' or 'AUTO'
### Configuration
- **Batch Size:** 20 projects per API call
- **Description Limit:** 400 characters
- **Token Usage:** ~1500-2500 tokens per batch
### Auto-Tag Rules
Deterministic rules can be combined with AI:
```typescript
const rules: AutoTagRule[] = [
  { field: 'country', operator: 'equals', value: 'Italy' },
  { field: 'competitionCategory', operator: 'equals', value: 'STARTUP' },
]
```
### Usage
```typescript
import { aiInterpretCriteria, applyAutoTagRules } from '@/server/services/ai-award-eligibility'

// Deterministic matching
const autoResults = applyAutoTagRules(rules, projects)

// AI-based criteria interpretation
const aiResults = await aiInterpretCriteria(
  'Projects focusing on marine biodiversity',
  projects,
  userId,
  awardId
)
```
---
## 4. Mentor Matching Service
**File:** `src/server/services/mentor-matching.ts`
**Purpose:** Recommend mentors for projects based on expertise
### Input
- Project details (single or batch)
- Available mentors with expertise tags and availability
### Output
Ranked list of mentor matches:
- `mentorId` - Mentor ID
- `confidenceScore` - 0-1 overall match
- `expertiseMatchScore` - 0-1 expertise overlap
- `reasoning` - Explanation
### Configuration
- **Batch Size:** 15 projects per batch
- **Description Limit:** 350 characters
- **Token Usage:** ~1500-2500 tokens per batch
### Fallback Algorithm
Keyword-based matching when AI unavailable:
1. Extract keywords from project tags/description
2. Match against mentor expertise tags
3. Factor in availability (assignments vs max)
### Usage
```typescript
import {
  getAIMentorSuggestions,
  getAIMentorSuggestionsBatch,
} from '@/server/services/mentor-matching'

// Single project
const matches = await getAIMentorSuggestions(prisma, projectId, 5, userId)

// Batch processing
const batchResults = await getAIMentorSuggestionsBatch(
  prisma,
  projectIds,
  5,
  userId
)
```
---
## Common Patterns
### Token Logging
All services log usage to `AIUsageLog`:
```typescript
await logAIUsage({
  userId,
  action: 'FILTERING',
  entityType: 'Round',
  entityId: roundId,
  model,
  promptTokens: usage.promptTokens,
  completionTokens: usage.completionTokens,
  totalTokens: usage.totalTokens,
  batchSize: projects.length,
  itemsProcessed: projects.length,
  status: 'SUCCESS',
})
```
### Error Handling
All services use unified error classification:
```typescript
try {
  // AI call
} catch (error) {
  const classified = classifyAIError(error)
  logAIError('ServiceName', 'functionName', classified)
  if (classified.retryable) {
    // Retry logic
  } else {
    // Fall back to algorithm
  }
}
```
### Anonymization
All services anonymize before sending to AI:
```typescript
const { anonymized, mappings } = anonymizeProjectsForAI(projects, 'FILTERING')

if (!validateAnonymizedProjects(anonymized)) {
  throw new Error('GDPR compliance check failed')
}
```
## See Also
- [AI System Architecture](./ai-system.md)
- [AI Configuration Guide](./ai-configuration.md)
- [AI Error Handling](./ai-errors.md)


@@ -0,0 +1,143 @@
# MOPC AI System Architecture
## Overview
The MOPC platform uses AI (OpenAI GPT models) for four core functions:
1. **Project Filtering** - Automated eligibility screening against admin-defined criteria
2. **Jury Assignment** - Smart juror-project matching based on expertise alignment
3. **Award Eligibility** - Special award qualification determination
4. **Mentor Matching** - Mentor-project recommendations based on expertise
## System Architecture
```
┌─────────────────────────────────────────────────────────────────┐
│ ADMIN INTERFACE │
│ (Rounds, Filtering, Awards, Assignments, Mentor Assignment) │
└─────────────────────────┬───────────────────────────────────────┘
                          ▼
┌─────────────────────────────────────────────────────────────────┐
│ tRPC ROUTERS │
│ filtering.ts │ assignment.ts │ specialAward.ts │ mentor.ts │
└─────────────────────────┬───────────────────────────────────────┘
                          ▼
┌─────────────────────────────────────────────────────────────────┐
│ AI SERVICES │
│ ai-filtering.ts │ ai-assignment.ts │ ai-award-eligibility.ts │
│ │ mentor-matching.ts │
└─────────────────────────┬───────────────────────────────────────┘
                          ▼
┌─────────────────────────────────────────────────────────────────┐
│ ANONYMIZATION LAYER │
│ anonymization.ts │
│ - PII stripping - ID replacement - Text sanitization │
└─────────────────────────┬───────────────────────────────────────┘
                          ▼
┌─────────────────────────────────────────────────────────────────┐
│ OPENAI CLIENT │
│ lib/openai.ts │
│ - Model detection - Parameter building - Token tracking │
└─────────────────────────┬───────────────────────────────────────┘
                          ▼
┌─────────────────────────────────────────────────────────────────┐
│ OPENAI API │
│ GPT-4o │ GPT-4o-mini │ o1 │ o3-mini (configurable) │
└─────────────────────────────────────────────────────────────────┘
```
## Data Flow
1. **Admin triggers AI action** (filter projects, suggest assignments)
2. **Router validates permissions** and fetches data from database
3. **AI Service prepares data** for processing
4. **Anonymization Layer strips PII**, replaces IDs, sanitizes text
5. **OpenAI Client builds request** with correct parameters for model type
6. **Request sent to OpenAI API**
7. **Response parsed and de-anonymized**
8. **Results stored in database**, usage logged
9. **UI updated** with results
## Key Components
### OpenAI Client (`lib/openai.ts`)
Handles communication with OpenAI API:
- `getOpenAI()` - Get configured OpenAI client
- `getConfiguredModel()` - Get the admin-selected model
- `buildCompletionParams()` - Build API parameters (handles reasoning vs standard models)
- `isReasoningModel()` - Detect o1/o3/o4 series models
### Anonymization Service (`server/services/anonymization.ts`)
GDPR-compliant data preparation:
- `anonymizeForAI()` - Basic anonymization for assignment
- `anonymizeProjectsForAI()` - Comprehensive project anonymization for filtering/awards
- `validateAnonymization()` - Verify no PII in anonymized data
- `deanonymizeResults()` - Map AI results back to real IDs
### Token Tracking (`server/utils/ai-usage.ts`)
Cost and usage monitoring:
- `logAIUsage()` - Log API calls to database
- `calculateCost()` - Compute estimated cost by model
- `getAIUsageStats()` - Retrieve usage statistics
- `getCurrentMonthCost()` - Get current billing period totals
### Error Handling (`server/services/ai-errors.ts`)
Unified error classification:
- `classifyAIError()` - Categorize API errors
- `shouldRetry()` - Determine if error is retryable
- `getUserFriendlyMessage()` - Get human-readable error messages
## Batching Strategy
All AI services process data in batches to avoid token limits:
| Service | Batch Size | Reason |
|---------|------------|--------|
| AI Assignment | 15 projects | Include all jurors per batch |
| AI Filtering | 20 projects | Balance throughput and cost |
| Award Eligibility | 20 projects | Consistent with filtering |
| Mentor Matching | 15 projects | All mentors per batch |
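All four services rely on the same chunking pattern; a generic sketch:

```typescript
// Split a work list into batches of the configured size (15 or 20 above).
function chunk<T>(items: T[], size: number): T[][] {
  const batches: T[][] = []
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size))
  }
  return batches
}
```

Each batch is then sent as one API request, so 100 projects at a batch size of 20 costs 5 requests rather than 100.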
## Fallback Behavior
All AI services have algorithmic fallbacks when AI is unavailable:
1. **Assignment** - Expertise tag matching + load balancing
2. **Filtering** - Flag all projects for manual review
3. **Award Eligibility** - Flag all for manual review
4. **Mentor Matching** - Keyword-based matching algorithm
## Security Considerations
1. **API keys** stored encrypted in database
2. **No PII** sent to OpenAI (enforced by anonymization)
3. **Audit logging** of all AI operations
4. **Role-based access** to AI features (admin only)
## Files Reference
| File | Purpose |
|------|---------|
| `lib/openai.ts` | OpenAI client configuration |
| `server/services/ai-filtering.ts` | Project filtering service |
| `server/services/ai-assignment.ts` | Jury assignment service |
| `server/services/ai-award-eligibility.ts` | Award eligibility service |
| `server/services/mentor-matching.ts` | Mentor matching service |
| `server/services/anonymization.ts` | Data anonymization |
| `server/services/ai-errors.ts` | Error classification |
| `server/utils/ai-usage.ts` | Token tracking |
## See Also
- [AI Services Reference](./ai-services.md)
- [AI Configuration Guide](./ai-configuration.md)
- [AI Error Handling](./ai-errors.md)
- [AI Prompts Reference](./ai-prompts.md)


@@ -0,0 +1,217 @@
# AI Data Processing - GDPR Compliance Documentation
## Overview
This document describes how project data is processed by AI services in the MOPC Platform, ensuring compliance with GDPR Articles 5, 6, 13-14, 25, and 32.
## Legal Basis
| Processing Activity | Legal Basis | GDPR Article |
|---------------------|-------------|--------------|
| AI-powered project filtering | Legitimate interest | Art. 6(1)(f) |
| AI-powered jury assignment | Legitimate interest | Art. 6(1)(f) |
| AI-powered award eligibility | Legitimate interest | Art. 6(1)(f) |
| AI-powered mentor matching | Legitimate interest | Art. 6(1)(f) |
**Legitimate Interest Justification:** AI processing is used to efficiently evaluate ocean conservation projects and match appropriate reviewers, directly serving the platform's purpose of managing the Monaco Ocean Protection Challenge.
## Data Minimization (Article 5(1)(c))
The AI system applies strict data minimization:
- **Only necessary fields** sent to AI (no names, emails, phone numbers)
- **Descriptions truncated** to 300-500 characters maximum
- **Team size** sent as count only (no member details)
- **Dates** sent as year-only or ISO date (no timestamps)
- **IDs replaced** with sequential anonymous identifiers (P1, P2, etc.)
## Anonymization Measures
### Data NEVER Sent to AI
| Data Type | Reason |
|-----------|--------|
| Personal names | PII - identifying |
| Email addresses | PII - identifying |
| Phone numbers | PII - identifying |
| Physical addresses | PII - identifying |
| External URLs | Could identify individuals |
| Internal project/user IDs | Could be cross-referenced |
| Team member details | PII - identifying |
| Internal comments | May contain PII |
| File content | May contain PII |
### Data Sent to AI (Anonymized)
| Field | Type | Purpose | Anonymization |
|-------|------|---------|---------------|
| project_id | String | Reference | Replaced with P1, P2, etc. |
| title | String | Spam detection | PII patterns removed |
| description | String | Criteria matching | Truncated, PII stripped |
| category | Enum | Filtering | As-is (no PII) |
| ocean_issue | Enum | Topic filtering | As-is (no PII) |
| country | String | Geographic eligibility | As-is (country name only) |
| region | String | Regional eligibility | As-is (zone name only) |
| institution | String | Student identification | As-is (institution name only) |
| tags | Array | Keyword matching | As-is (no PII expected) |
| founded_year | Number | Age filtering | Year only, not full date |
| team_size | Number | Team requirements | Count only |
| file_count | Number | Document checks | Count only |
| file_types | Array | File requirements | Type names only |
| wants_mentorship | Boolean | Mentorship filtering | As-is |
| submission_source | Enum | Source filtering | As-is |
| submitted_date | String | Deadline checks | Date only, no time |
## Technical Safeguards
### PII Detection and Stripping
```typescript
// Patterns detected and removed before AI processing
const PII_PATTERNS = {
  email: /[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/g,
  phone: /(\+?\d{1,3}[-.\s]?)?\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}/g,
  url: /https?:\/\/[^\s]+/g,
  ssn: /\d{3}-\d{2}-\d{4}/g,
  ipv4: /\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b/g,
}
```
### Validation Before Every AI Call
```typescript
// GDPR compliance enforced before EVERY API call
export function enforceGDPRCompliance(data: unknown[]): void {
  for (const item of data) {
    const { valid, violations } = validateNoPersonalData(item)
    if (!valid) {
      throw new Error(`GDPR compliance check failed: ${violations.join(', ')}`)
    }
  }
}
```
### ID Anonymization
Real IDs are never sent to AI. Instead:
- Projects: `cm1abc123...` → `P1`, `P2`, `P3`
- Jurors: `cm2def456...` → `juror_001`, `juror_002`
- Results mapped back using secure mapping tables
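In sketch form (the real mapping helpers live in `server/services/anonymization.ts`; the function name here is hypothetical):

```typescript
// Build forward and reverse maps between real IDs and sequential aliases.
function anonymizeIds(realIds: string[], prefix: string) {
  const toAnon = new Map<string, string>()
  const toReal = new Map<string, string>()
  realIds.forEach((id, i) => {
    const anon = `${prefix}${i + 1}` // P1, P2, ... (or juror_001-style)
    toAnon.set(id, anon)
    toReal.set(anon, id)
  })
  return { toAnon, toReal }
}
```

Only `toAnon` values ever leave the server; `toReal` is used to map AI results back to database records.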
## Data Retention
| Data Type | Retention | Deletion Method |
|-----------|-----------|-----------------|
| AI usage logs | 12 months | Automatic deletion |
| Anonymized prompts | Not stored | Sent directly to API |
| AI responses | Not stored | Parsed and discarded |
**Note:** OpenAI does not use API data for model training (per its API terms). API data may be retained for up to 30 days for abuse monitoring, configurable down to zero retention.
## Subprocessor: OpenAI
| Aspect | Details |
|--------|---------|
| Subprocessor | OpenAI, Inc. |
| Location | United States |
| DPA Status | Data Processing Agreement in place |
| Safeguards | Standard Contractual Clauses (SCCs) |
| Compliance | SOC 2 Type II, GDPR-compliant |
| Data Use | API data NOT used for model training |
**OpenAI DPA:** https://openai.com/policies/data-processing-agreement
## Audit Trail
All AI processing is logged:
```typescript
await prisma.aIUsageLog.create({
  data: {
    userId: ctx.user.id,   // Who initiated
    action: 'FILTERING',   // What type
    entityType: 'Round',   // What entity
    entityId: roundId,     // Which entity
    model: 'gpt-4o',       // What model
    totalTokens: 1500,     // Resource usage
    status: 'SUCCESS',     // Outcome
  },
})
```
## Data Subject Rights
### Right of Access (Article 15)
Users can request:
- What data was processed by AI
- When AI processing occurred
- What decisions were made
**Implementation:** Export AI usage logs for user's projects.
### Right to Erasure (Article 17)
When a user requests deletion:
- AI usage logs for their projects can be deleted
- No data remains at OpenAI (API data not retained for training)
**Note:** Since only anonymized data is sent to AI, there is no personal data at OpenAI to delete.
### Right to Object (Article 21)
Users can request to opt out of AI processing:
- Admin can disable AI features per round
- Manual review fallback available for all AI features
## Risk Assessment
### Risk: PII Leakage to AI Provider
| Factor | Assessment |
|--------|------------|
| Likelihood | Very Low |
| Impact | Medium |
| Mitigation | Automated PII detection, validation before every call |
| Residual Risk | Very Low |
### Risk: AI Decision Bias
| Factor | Assessment |
|--------|------------|
| Likelihood | Low |
| Impact | Low |
| Mitigation | Human review of all AI suggestions, algorithmic fallback |
| Residual Risk | Very Low |
### Risk: Data Breach at Subprocessor
| Factor | Assessment |
|--------|------------|
| Likelihood | Very Low |
| Impact | Low (only anonymized data) |
| Mitigation | OpenAI SOC 2 compliance, no PII sent |
| Residual Risk | Very Low |
## Compliance Checklist
- [x] Data minimization applied (only necessary fields)
- [x] PII stripped before AI processing
- [x] Anonymization validated before every API call
- [x] DPA in place with OpenAI
- [x] Audit logging of all AI operations
- [x] Fallback available when AI use is declined or disabled
- [x] Usage logs retained for 12 months only
- [x] No personal data stored at subprocessor
## Contact
For questions about AI data processing:
- Data Protection Officer: [DPO email]
- Technical Contact: [Tech contact email]
## See Also
- [Platform GDPR Compliance](./platform-gdpr-compliance.md)
- [AI System Architecture](../architecture/ai-system.md)
- [AI Services Reference](../architecture/ai-services.md)


@@ -0,0 +1,324 @@
# MOPC Platform - GDPR Compliance Documentation
## 1. Data Controller Information
| Field | Value |
|-------|-------|
| **Data Controller** | Monaco Ocean Protection Challenge |
| **Contact** | [Data Protection Officer email] |
| **Platform** | monaco-opc.com |
| **Jurisdiction** | Monaco |
---
## 2. Personal Data Collected
### 2.1 User Account Data
| Data Type | Purpose | Legal Basis | Retention |
|-----------|---------|-------------|-----------|
| Email address | Account identification, notifications | Contract performance | Account lifetime + 2 years |
| Name | Display in platform, certificates | Contract performance | Account lifetime + 2 years |
| Phone number (optional) | WhatsApp notifications | Consent | Until consent withdrawn |
| Profile photo (optional) | Platform personalization | Consent | Until deleted by user |
| Role | Access control | Contract performance | Account lifetime |
| IP address | Security, audit logging | Legitimate interest | 12 months |
| User agent | Security, debugging | Legitimate interest | 12 months |
### 2.2 Project/Application Data
| Data Type | Purpose | Legal Basis | Retention |
|-----------|---------|-------------|-----------|
| Project title | Competition entry | Contract performance | Program lifetime + 5 years |
| Project description | Evaluation | Contract performance | Program lifetime + 5 years |
| Team information | Contact, evaluation | Contract performance | Program lifetime + 5 years |
| Uploaded files | Evaluation | Contract performance | Program lifetime + 5 years |
| Country/Region | Geographic eligibility | Contract performance | Program lifetime + 5 years |
### 2.3 Evaluation Data
| Data Type | Purpose | Legal Basis | Retention |
|-----------|---------|-------------|-----------|
| Jury evaluations | Competition judging | Contract performance | Program lifetime + 5 years |
| Scores and comments | Competition judging | Contract performance | Program lifetime + 5 years |
| Evaluation timestamps | Audit trail | Legitimate interest | Program lifetime + 5 years |
### 2.4 Technical Data
| Data Type | Purpose | Legal Basis | Retention |
|-----------|---------|-------------|-----------|
| Session tokens | Authentication | Contract performance | Session duration |
| Magic link tokens | Passwordless login | Contract performance | 15 minutes |
| Audit logs | Security, compliance | Legitimate interest | 12 months |
| AI usage logs | Cost tracking, debugging | Legitimate interest | 12 months |
---
## 3. Data Processing Purposes
### 3.1 Primary Purposes
1. **Competition Management** - Managing project submissions, evaluations, and results
2. **User Authentication** - Secure access to the platform
3. **Communication** - Sending notifications about evaluations, deadlines, results
### 3.2 Secondary Purposes
1. **Analytics** - Understanding platform usage (aggregated, anonymized)
2. **Security** - Detecting and preventing unauthorized access
3. **AI Processing** - Automated filtering and matching (anonymized data only)
---
## 4. Third-Party Data Sharing
### 4.1 Subprocessors
| Subprocessor | Purpose | Data Shared | Location | DPA |
|--------------|---------|-------------|----------|-----|
| OpenAI | AI processing | Anonymized project data only | USA | Yes |
| MinIO/S3 | File storage | Uploaded files | [Location] | Yes |
| Poste.io | Email delivery | Email addresses, notification content | [Location] | Yes |
### 4.2 Data Shared with OpenAI
**Sent to OpenAI:**
- Anonymized project titles (PII sanitized)
- Truncated descriptions (500 chars max)
- Project category, tags, country
- Team size (count only)
- Founded year (year only)
**NEVER sent to OpenAI:**
- Names of any individuals
- Email addresses
- Phone numbers
- Physical addresses
- External URLs
- Internal database IDs
- File contents
For full details, see [AI Data Processing](./ai-data-processing.md).
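As a rough illustration of the rules above, an outbound project record might look like the following. The field names are assumptions for illustration; the actual payload shape is defined in the anonymization module.

```typescript
// Hypothetical outbound record obeying the allow-list above.
interface AnonymizedProject {
  id: string           // sequential placeholder such as 'P1', never a database ID
  title: string        // PII-sanitized
  description: string  // truncated to 500 characters
  category: string
  tags: string[]
  country: string
  teamSize: number     // count only, no member names
  foundedYear?: number // year only
}

const DESCRIPTION_LIMIT = 500

function truncateDescription(text: string, limit = DESCRIPTION_LIMIT): string {
  return text.length <= limit ? text : text.slice(0, limit)
}

const sample: AnonymizedProject = {
  id: 'P1',
  title: 'Ocean cleanup drone',
  description: truncateDescription('x'.repeat(800)),
  category: 'ocean-tech',
  tags: ['robotics'],
  country: 'MC',
  teamSize: 4,
}
```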
---
## 5. Data Subject Rights
### 5.1 Right of Access (Article 15)
Users can request a copy of their personal data via:
- Profile → Settings → Download My Data
- Email to [DPO email]
**Response Time:** Within 30 days
### 5.2 Right to Rectification (Article 16)
Users can update their data via:
- Profile → Settings → Edit Profile
- Contact support for assistance
**Response Time:** Immediately for self-service, 72 hours for support
### 5.3 Right to Erasure (Article 17)
Users can request deletion via:
- Profile → Settings → Delete Account
- Email to [DPO email]
**Exceptions:** Data required for legal obligations or ongoing competitions
**Response Time:** Within 30 days
### 5.4 Right to Restrict Processing (Article 18)
Users can request processing restrictions by contacting [DPO email]
**Response Time:** Within 72 hours
### 5.5 Right to Data Portability (Article 20)
Users can export their data in machine-readable format (JSON) via:
- Profile → Settings → Export Data
**Format:** JSON file containing all user data
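A sketch of assembling that machine-readable export is below. The section names mirror the data categories in section 2 and are assumptions, not the platform's actual export schema.

```typescript
// Illustrative Article 20 export assembly; field names are assumptions.
interface UserDataExport {
  exportedAt: string
  account: { email: string; name: string; role: string }
  projects: unknown[]
  evaluations: unknown[]
}

function buildUserExport(
  account: UserDataExport['account'],
  projects: unknown[],
  evaluations: unknown[]
): string {
  const payload: UserDataExport = {
    exportedAt: new Date().toISOString(),
    account,
    projects,
    evaluations,
  }
  // Pretty-printed JSON keeps the file human-readable as well as machine-readable.
  return JSON.stringify(payload, null, 2)
}

const json = buildUserExport(
  { email: 'user@example.com', name: 'Example User', role: 'JURY_MEMBER' },
  [],
  []
)
```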
### 5.6 Right to Object (Article 21)
Users can object to processing based on legitimate interests by contacting [DPO email]
**Response Time:** Within 72 hours
---
## 6. Security Measures (Article 32)
### 6.1 Technical Measures
| Measure | Implementation |
|---------|----------------|
| Encryption in transit | TLS 1.3 for all connections |
| Encryption at rest | AES-256 for sensitive data |
| Authentication | Magic link (passwordless) or OAuth |
| Rate limiting | 100 requests/minute per IP |
| Session management | Secure cookies, automatic expiry |
| Input validation | Zod schema validation on all inputs |
### 6.2 Access Controls
| Control | Implementation |
|---------|----------------|
| RBAC | Role-based permissions (SUPER_ADMIN, PROGRAM_ADMIN, JURY_MEMBER, etc.) |
| Least privilege | Users only see assigned projects/programs |
| Session expiry | Configurable timeout (default 24 hours) |
| Audit logging | All sensitive actions logged |
### 6.3 Infrastructure Security
| Measure | Implementation |
|---------|----------------|
| Firewall | iptables rules on VPS |
| DDoS protection | Cloudflare (if configured) |
| Updates | Regular security patches |
| Backups | Daily encrypted backups, 90-day retention |
| Monitoring | Error logging, performance monitoring |
---
## 7. Data Retention Policy
| Data Category | Retention Period | Deletion Method |
|---------------|------------------|-----------------|
| Active user accounts | Account lifetime | Soft delete → hard delete after 30 days |
| Inactive accounts | 2 years after last login | Automatic anonymization |
| Project data | Program lifetime + 5 years | Archived, then anonymized |
| Audit logs | 12 months | Automatic deletion |
| AI usage logs | 12 months | Automatic deletion |
| Session data | Session duration | Automatic expiration |
| Backup data | 90 days | Automatic rotation |
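The automatic-deletion rows above can be sketched as a periodic purge job. This is a hedged sketch: the Prisma accessors use the model and field names visible in this commit (`aIUsageLog.createdAt`, `auditLog.timestamp`), but the scheduling mechanism (cron, queue) is omitted and the client interface below is a stand-in for illustration.

```typescript
// Stand-in for the subset of the Prisma client this sketch needs.
type DeleteMany = (args: { where: Record<string, unknown> }) => Promise<unknown>

interface RetentionClient {
  aIUsageLog: { deleteMany: DeleteMany }
  auditLog: { deleteMany: DeleteMany }
}

// Compute the cutoff date for an N-month retention window.
function retentionCutoff(months: number, now = new Date()): Date {
  const cutoff = new Date(now)
  cutoff.setMonth(cutoff.getMonth() - months)
  return cutoff
}

// Delete log rows older than the 12-month retention period.
async function purgeExpiredLogs(prisma: RetentionClient): Promise<void> {
  const cutoff = retentionCutoff(12)
  await prisma.aIUsageLog.deleteMany({ where: { createdAt: { lt: cutoff } } })
  await prisma.auditLog.deleteMany({ where: { timestamp: { lt: cutoff } } })
}
```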
---
## 8. International Data Transfers
### 8.1 OpenAI (USA)
| Aspect | Details |
|--------|---------|
| Transfer Mechanism | Standard Contractual Clauses (SCCs) |
| DPA | OpenAI Data Processing Agreement |
| Data Minimization | Only anonymized data transferred |
| Risk Assessment | Low (no PII transferred) |
### 8.2 Data Localization
| Service | Location |
|---------|----------|
| Primary database | [EU location] |
| File storage | [Location] |
| Email service | [Location] |
---
## 9. Cookies and Tracking
### 9.1 Essential Cookies
| Cookie | Purpose | Duration |
|--------|---------|----------|
| `session_token` | User authentication | Session |
| `csrf_token` | CSRF protection | Session |
### 9.2 Optional Cookies
The platform does **not** use:
- Marketing cookies
- Analytics cookies that track individuals
- Third-party tracking
---
## 10. Data Protection Impact Assessment (DPIA)
### 10.1 AI Processing DPIA
| Factor | Assessment |
|--------|------------|
| **Risk** | Personal data sent to third-party AI |
| **Mitigation** | Strict anonymization before processing |
| **Residual Risk** | Low (no PII transferred) |
### 10.2 File Upload DPIA
| Factor | Assessment |
|--------|------------|
| **Risk** | Sensitive documents uploaded |
| **Mitigation** | Pre-signed URLs, access controls, virus scanning |
| **Residual Risk** | Medium (users control uploads) |
### 10.3 Evaluation Data DPIA
| Factor | Assessment |
|--------|------------|
| **Risk** | Subjective opinions about projects/teams |
| **Mitigation** | Access controls, audit logging |
| **Residual Risk** | Low |
---
## 11. Breach Notification Procedure
### 11.1 Detection (Within 24 hours)
1. Automated monitoring alerts
2. User reports
3. Security audit findings
### 11.2 Assessment (Within 48 hours)
1. Identify affected data and individuals
2. Assess severity and risk
3. Document incident details
### 11.3 Notification (Within 72 hours)
**Supervisory Authority:**
- Notify if risk to individuals
- Include: nature of breach, categories of data, number affected, consequences, measures taken
**Affected Individuals:**
- Notify without undue delay if high risk
- Include: nature of breach, likely consequences, measures taken, contact for information
### 11.4 Documentation
All breaches documented regardless of notification requirement.
---
## 12. Contact Information
| Role | Contact |
|------|---------|
| **Data Protection Officer** | [DPO name] |
| **Email** | [DPO email] |
| **Address** | [Physical address] |
**Supervisory Authority:**
Commission de Contrôle des Informations Nominatives (CCIN)
[Address in Monaco]
---
## 13. Document History
| Version | Date | Changes |
|---------|------|---------|
| 1.0 | 2025-01 | Initial version |
---
## See Also
- [AI Data Processing](./ai-data-processing.md)
- [AI System Architecture](../architecture/ai-system.md)


@@ -684,6 +684,46 @@ model AuditLog {
  @@index([timestamp])
}
// =============================================================================
// AI USAGE TRACKING
// =============================================================================
model AIUsageLog {
id String @id @default(cuid())
createdAt DateTime @default(now())
// Who/what triggered it
userId String?
action String // ASSIGNMENT, FILTERING, AWARD_ELIGIBILITY, MENTOR_MATCHING
entityType String? // Round, Project, Award
entityId String?
// What was used
model String // gpt-4o, gpt-4o-mini, o1, etc.
promptTokens Int
completionTokens Int
totalTokens Int
// Cost tracking
estimatedCostUsd Decimal? @db.Decimal(10, 6)
// Request context
batchSize Int?
itemsProcessed Int?
// Status
status String // SUCCESS, PARTIAL, ERROR
errorMessage String?
// Detailed data (optional)
detailsJson Json? @db.JsonB
@@index([userId])
@@index([action])
@@index([createdAt])
@@index([model])
}
// =============================================================================
// NOTIFICATION LOG (Phase 2)
// =============================================================================


@@ -0,0 +1,294 @@
'use client'
import { trpc } from '@/lib/trpc/client'
import {
Card,
CardContent,
CardDescription,
CardHeader,
CardTitle,
} from '@/components/ui/card'
import { Skeleton } from '@/components/ui/skeleton'
import { Badge } from '@/components/ui/badge'
import {
Coins,
Zap,
TrendingUp,
Activity,
Brain,
Filter,
Users,
Award,
} from 'lucide-react'
import { cn } from '@/lib/utils'
const ACTION_ICONS: Record<string, typeof Zap> = {
ASSIGNMENT: Users,
FILTERING: Filter,
AWARD_ELIGIBILITY: Award,
MENTOR_MATCHING: Brain,
}
const ACTION_LABELS: Record<string, string> = {
ASSIGNMENT: 'Jury Assignment',
FILTERING: 'Project Filtering',
AWARD_ELIGIBILITY: 'Award Eligibility',
MENTOR_MATCHING: 'Mentor Matching',
}
function StatCard({
label,
value,
subValue,
icon: Icon,
trend,
}: {
label: string
value: string
subValue?: string
icon: typeof Zap
trend?: 'up' | 'down' | 'neutral'
}) {
return (
<div className="flex items-start gap-3 rounded-lg border bg-card p-4">
<div className="rounded-md bg-muted p-2">
<Icon className="h-4 w-4 text-muted-foreground" />
</div>
<div className="flex-1 space-y-1">
<p className="text-sm font-medium text-muted-foreground">{label}</p>
<div className="flex items-baseline gap-2">
<p className="text-2xl font-bold">{value}</p>
{trend && trend !== 'neutral' && (
<TrendingUp
className={cn(
'h-4 w-4',
trend === 'up' ? 'text-green-500' : 'rotate-180 text-red-500'
)}
/>
)}
</div>
{subValue && (
<p className="text-xs text-muted-foreground">{subValue}</p>
)}
</div>
</div>
)
}
function UsageBar({
label,
value,
maxValue,
color,
}: {
label: string
value: number
maxValue: number
color: string
}) {
const percentage = maxValue > 0 ? (value / maxValue) * 100 : 0
return (
<div className="space-y-1">
<div className="flex justify-between text-sm">
<span className="text-muted-foreground">{label}</span>
<span className="font-medium">{value.toLocaleString()}</span>
</div>
<div className="h-2 overflow-hidden rounded-full bg-muted">
<div
className={cn('h-full transition-all duration-500', color)}
style={{ width: `${percentage}%` }}
/>
</div>
</div>
)
}
export function AIUsageCard() {
const {
data: monthCost,
isLoading: monthLoading,
} = trpc.settings.getAICurrentMonthCost.useQuery(undefined, {
staleTime: 60 * 1000, // 1 minute
})
const {
data: stats,
isLoading: statsLoading,
} = trpc.settings.getAIUsageStats.useQuery({}, {
staleTime: 60 * 1000,
})
const { data: history } = trpc.settings.getAIUsageHistory.useQuery({ days: 30 }, {
  staleTime: 60 * 1000,
})
const isLoading = monthLoading || statsLoading
if (isLoading) {
return (
<Card>
<CardHeader>
<CardTitle className="flex items-center gap-2">
<Activity className="h-5 w-5" />
AI Usage & Costs
</CardTitle>
<CardDescription>Loading usage data...</CardDescription>
</CardHeader>
<CardContent className="space-y-6">
<div className="grid gap-4 sm:grid-cols-2">
<Skeleton className="h-24" />
<Skeleton className="h-24" />
</div>
<Skeleton className="h-32" />
</CardContent>
</Card>
)
}
const hasUsage = monthCost && monthCost.requestCount > 0
const maxTokensByAction = stats?.byAction
? Math.max(...Object.values(stats.byAction).map((a) => a.tokens))
: 0
return (
<Card>
<CardHeader>
<CardTitle className="flex items-center gap-2">
<Activity className="h-5 w-5" />
AI Usage & Costs
</CardTitle>
<CardDescription>
Token usage and estimated costs for AI features
</CardDescription>
</CardHeader>
<CardContent className="space-y-6">
{/* Current month summary */}
<div className="grid gap-4 sm:grid-cols-2 lg:grid-cols-3">
<StatCard
label="This Month Cost"
value={monthCost?.costFormatted || '$0.00'}
subValue={`${monthCost?.requestCount || 0} requests`}
icon={Coins}
/>
<StatCard
label="Tokens Used"
value={monthCost?.tokens?.toLocaleString() || '0'}
subValue="This month"
icon={Zap}
/>
{stats && (
<StatCard
label="All-Time Cost"
value={stats.totalCostFormatted || '$0.00'}
subValue={`${stats.totalTokens?.toLocaleString() || 0} tokens`}
icon={TrendingUp}
/>
)}
</div>
{/* Usage by action */}
{hasUsage && stats?.byAction && Object.keys(stats.byAction).length > 0 && (
<div className="space-y-4">
<h4 className="text-sm font-semibold">Usage by Feature</h4>
<div className="space-y-3">
{Object.entries(stats.byAction)
.sort(([, a], [, b]) => b.tokens - a.tokens)
.map(([action, data]) => {
const Icon = ACTION_ICONS[action] || Zap
return (
<div key={action} className="flex items-center gap-3">
<div className="rounded-md bg-muted p-1.5">
<Icon className="h-3.5 w-3.5 text-muted-foreground" />
</div>
<div className="flex-1">
<UsageBar
label={ACTION_LABELS[action] || action}
value={data.tokens}
maxValue={maxTokensByAction}
color="bg-primary"
/>
</div>
<Badge variant="secondary" className="ml-2 text-xs">
{(data as { costFormatted?: string }).costFormatted}
</Badge>
</div>
)
})}
</div>
</div>
)}
{/* Usage by model */}
{hasUsage && stats?.byModel && Object.keys(stats.byModel).length > 0 && (
<div className="space-y-4">
<h4 className="text-sm font-semibold">Usage by Model</h4>
<div className="flex flex-wrap gap-2">
{Object.entries(stats.byModel)
.sort(([, a], [, b]) => b.cost - a.cost)
.map(([model, data]) => (
<Badge
key={model}
variant="outline"
className="flex items-center gap-2"
>
<Brain className="h-3 w-3" />
<span>{model}</span>
<span className="text-muted-foreground">
{(data as { costFormatted?: string }).costFormatted}
</span>
</Badge>
))}
</div>
</div>
)}
{/* Usage history mini chart */}
{hasUsage && history && history.length > 0 && (
<div className="space-y-4">
<h4 className="text-sm font-semibold">Last 30 Days</h4>
<div className="flex h-16 items-end gap-0.5">
{(() => {
const maxCost = Math.max(...history.map((d) => d.cost), 0.001)
return history.slice(-30).map((day, i) => {
const height = (day.cost / maxCost) * 100
return (
<div
key={day.date}
className="group relative flex-1 cursor-pointer"
title={`${day.date}: ${day.costFormatted}`}
>
<div
className="w-full rounded-t bg-primary/60 transition-colors hover:bg-primary"
style={{ height: `${Math.max(height, 4)}%` }}
/>
</div>
)
})
})()}
</div>
<div className="flex justify-between text-xs text-muted-foreground">
<span>{history[0]?.date}</span>
<span>{history[history.length - 1]?.date}</span>
</div>
</div>
)}
{/* No usage message */}
{!hasUsage && (
<div className="rounded-lg border border-dashed p-8 text-center">
<Activity className="mx-auto h-8 w-8 text-muted-foreground" />
<h4 className="mt-2 text-sm font-semibold">No AI usage yet</h4>
<p className="mt-1 text-sm text-muted-foreground">
AI usage will be tracked when you use filtering, assignments, or
other AI-powered features.
</p>
</div>
)}
</CardContent>
</Card>
)
}


@@ -19,6 +19,7 @@ import {
   Settings as SettingsIcon,
 } from 'lucide-react'
 import { AISettingsForm } from './ai-settings-form'
+import { AIUsageCard } from './ai-usage-card'
 import { BrandingSettingsForm } from './branding-settings-form'
 import { EmailSettingsForm } from './email-settings-form'
 import { StorageSettingsForm } from './storage-settings-form'
@@ -134,7 +135,7 @@ export function SettingsContent({ initialSettings }: SettingsContentProps) {
           </TabsTrigger>
         </TabsList>
-        <TabsContent value="ai">
+        <TabsContent value="ai" className="space-y-6">
           <Card>
             <CardHeader>
               <CardTitle>AI Configuration</CardTitle>
@@ -146,6 +147,7 @@ export function SettingsContent({ initialSettings }: SettingsContentProps) {
               <AISettingsForm settings={aiSettings} />
             </CardContent>
           </Card>
+          <AIUsageCard />
         </TabsContent>
         <TabsContent value="branding">


@@ -1,4 +1,5 @@
 import OpenAI from 'openai'
+import type { ChatCompletionCreateParamsNonStreaming } from 'openai/resources/chat/completions'
 import { prisma } from './prisma'
 // OpenAI client singleton with lazy initialization
@@ -7,6 +8,103 @@ const globalForOpenAI = globalThis as unknown as {
   openaiInitialized: boolean
 }
// ─── Model Type Detection ────────────────────────────────────────────────────
/**
* Reasoning models that require different API parameters:
* - Use max_completion_tokens instead of max_tokens
* - Don't support response_format: json_object (must instruct JSON in prompt)
* - Don't support temperature parameter
* - Don't support system messages (use developer or user role instead)
*/
const REASONING_MODEL_PREFIXES = ['o1', 'o3', 'o4']
/**
* Check if a model is a reasoning model (o1, o3, o4 series)
*/
export function isReasoningModel(model: string): boolean {
const modelLower = model.toLowerCase()
return REASONING_MODEL_PREFIXES.some(prefix =>
modelLower.startsWith(prefix) ||
modelLower.includes(`/${prefix}`) ||
modelLower.includes(`-${prefix}`)
)
}
// ─── Chat Completion Parameter Builder ───────────────────────────────────────
type MessageRole = 'system' | 'user' | 'assistant' | 'developer'
export interface ChatCompletionOptions {
messages: Array<{ role: MessageRole; content: string }>
maxTokens?: number
temperature?: number
jsonMode?: boolean
}
/**
* Build chat completion parameters with correct settings for the model type.
* Handles differences between standard models and reasoning models.
*/
export function buildCompletionParams(
model: string,
options: ChatCompletionOptions
): ChatCompletionCreateParamsNonStreaming {
const isReasoning = isReasoningModel(model)
// Convert messages for reasoning models (system -> developer)
const messages = options.messages.map(msg => {
if (isReasoning && msg.role === 'system') {
return { role: 'developer' as const, content: msg.content }
}
return msg as { role: 'system' | 'user' | 'assistant' | 'developer'; content: string }
})
// For reasoning models requesting JSON, append JSON instruction to last user message
if (isReasoning && options.jsonMode) {
// Find last user message index (polyfill for findLastIndex)
let lastUserIdx = -1
for (let i = messages.length - 1; i >= 0; i--) {
if (messages[i].role === 'user') {
lastUserIdx = i
break
}
}
if (lastUserIdx !== -1) {
messages[lastUserIdx] = {
...messages[lastUserIdx],
content: messages[lastUserIdx].content + '\n\nIMPORTANT: Respond with valid JSON only, no other text.',
}
}
}
const params: ChatCompletionCreateParamsNonStreaming = {
model,
messages: messages as ChatCompletionCreateParamsNonStreaming['messages'],
}
// Token limit parameter differs between model types
if (options.maxTokens) {
if (isReasoning) {
params.max_completion_tokens = options.maxTokens
} else {
params.max_tokens = options.maxTokens
}
}
// Reasoning models don't support temperature
if (!isReasoning && options.temperature !== undefined) {
params.temperature = options.temperature
}
// Reasoning models don't support response_format: json_object
if (!isReasoning && options.jsonMode) {
params.response_format = { type: 'json_object' }
}
return params
}
 /**
  * Get OpenAI API key from SystemSettings
  */
@@ -118,13 +216,14 @@ export async function validateModel(modelId: string): Promise<{
       }
     }
-    // Try a minimal completion with the model
-    await client.chat.completions.create({
-      model: modelId,
-      messages: [{ role: 'user', content: 'test' }],
-      max_tokens: 1,
-    })
+    // Try a minimal completion with the model using correct parameters
+    const params = buildCompletionParams(modelId, {
+      messages: [{ role: 'user', content: 'test' }],
+      maxTokens: 1,
+    })
+    await client.chat.completions.create(params)
     return { valid: true }
   } catch (error) {
     const message = error instanceof Error ? error.message : 'Unknown error'
@@ -164,13 +263,14 @@ export async function testOpenAIConnection(): Promise<{
     // Get the configured model
     const configuredModel = await getConfiguredModel()
-    // Test with the configured model
-    const response = await client.chat.completions.create({
-      model: configuredModel,
-      messages: [{ role: 'user', content: 'Hello' }],
-      max_tokens: 5,
-    })
+    // Test with the configured model using correct parameters
+    const params = buildCompletionParams(configuredModel, {
+      messages: [{ role: 'user', content: 'Hello' }],
+      maxTokens: 5,
+    })
+    const response = await client.chat.completions.create(params)
     return {
       success: true,
       model: response.model,


@@ -1,6 +1,20 @@
 import { z } from 'zod'
 import { router, adminProcedure, superAdminProcedure, protectedProcedure } from '../trpc'
 import { getWhatsAppProvider, getWhatsAppProviderType } from '@/lib/whatsapp'
import { listAvailableModels, testOpenAIConnection, isReasoningModel } from '@/lib/openai'
import { getAIUsageStats, getCurrentMonthCost, formatCost } from '@/server/utils/ai-usage'
/**
* Categorize an OpenAI model for display
*/
function categorizeModel(modelId: string): string {
const id = modelId.toLowerCase()
if (id.startsWith('gpt-4o')) return 'gpt-4o'
if (id.startsWith('gpt-4')) return 'gpt-4'
if (id.startsWith('gpt-3.5')) return 'gpt-3.5'
if (id.startsWith('o1') || id.startsWith('o3') || id.startsWith('o4')) return 'reasoning'
return 'other'
}
export const settingsRouter = router({ export const settingsRouter = router({
/** /**
@@ -177,33 +191,47 @@
   }),
   /**
-   * Test AI connection
+   * Test AI connection with the configured model
    */
-  testAIConnection: superAdminProcedure.mutation(async ({ ctx }) => {
-    const apiKeySetting = await ctx.prisma.systemSettings.findUnique({
-      where: { key: 'openai_api_key' },
-    })
-    if (!apiKeySetting?.value) {
-      return { success: false, error: 'API key not configured' }
-    }
-    try {
-      // Test OpenAI connection with a minimal request
-      const response = await fetch('https://api.openai.com/v1/models', {
-        headers: {
-          Authorization: `Bearer ${apiKeySetting.value}`,
-        },
-      })
-      if (response.ok) {
-        return { success: true }
-      } else {
-        const error = await response.json()
-        return { success: false, error: error.error?.message || 'Unknown error' }
-      }
-    } catch (error) {
-      return { success: false, error: 'Connection failed' }
-    }
-  }),
+  testAIConnection: superAdminProcedure.mutation(async () => {
+    const result = await testOpenAIConnection()
+    return result
+  }),
+  /**
+   * List available AI models from OpenAI
+   */
+  listAIModels: superAdminProcedure.query(async () => {
+    const result = await listAvailableModels()
+    if (!result.success || !result.models) {
+      return {
+        success: false,
+        error: result.error || 'Failed to fetch models',
+        models: [],
+      }
+    }
+    // Categorize and annotate models
+    const categorizedModels = result.models.map(model => ({
+      id: model,
+      name: model,
+      isReasoning: isReasoningModel(model),
+      category: categorizeModel(model),
+    }))
+    // Sort: GPT-4o first, then other GPT-4, then GPT-3.5, then reasoning models
+    const sorted = categorizedModels.sort((a, b) => {
+      const order = ['gpt-4o', 'gpt-4', 'gpt-3.5', 'reasoning']
+      const aOrder = order.findIndex(cat => a.category.startsWith(cat))
+      const bOrder = order.findIndex(cat => b.category.startsWith(cat))
+      if (aOrder !== bOrder) return aOrder - bOrder
+      return a.id.localeCompare(b.id)
+    })
+    return {
+      success: true,
+      models: sorted,
+    }
+  }),
@@ -373,4 +401,105 @@
       ),
     }
   }),
/**
* Get AI usage statistics (admin only)
*/
getAIUsageStats: adminProcedure
.input(
z.object({
startDate: z.string().datetime().optional(),
endDate: z.string().datetime().optional(),
})
)
.query(async ({ input }) => {
const startDate = input.startDate ? new Date(input.startDate) : undefined
const endDate = input.endDate ? new Date(input.endDate) : undefined
const stats = await getAIUsageStats(startDate, endDate)
return {
totalTokens: stats.totalTokens,
totalCost: stats.totalCost,
totalCostFormatted: formatCost(stats.totalCost),
byAction: Object.fromEntries(
Object.entries(stats.byAction).map(([action, data]) => [
action,
{
...data,
costFormatted: formatCost(data.cost),
},
])
),
byModel: Object.fromEntries(
Object.entries(stats.byModel).map(([model, data]) => [
model,
{
...data,
costFormatted: formatCost(data.cost),
},
])
),
}
}),
/**
* Get current month AI usage cost (admin only)
*/
getAICurrentMonthCost: adminProcedure.query(async () => {
const { cost, tokens, requestCount } = await getCurrentMonthCost()
return {
cost,
costFormatted: formatCost(cost),
tokens,
requestCount,
}
}),
/**
* Get AI usage history (last 30 days grouped by day)
*/
getAIUsageHistory: adminProcedure
.input(
z.object({
days: z.number().min(1).max(90).default(30),
})
)
.query(async ({ ctx, input }) => {
const startDate = new Date()
startDate.setDate(startDate.getDate() - input.days)
startDate.setHours(0, 0, 0, 0)
const logs = await ctx.prisma.aIUsageLog.findMany({
where: {
createdAt: { gte: startDate },
},
select: {
createdAt: true,
totalTokens: true,
estimatedCostUsd: true,
action: true,
},
orderBy: { createdAt: 'asc' },
})
// Group by day
const dailyData: Record<string, { date: string; tokens: number; cost: number; count: number }> = {}
for (const log of logs) {
const dateKey = log.createdAt.toISOString().split('T')[0]
if (!dailyData[dateKey]) {
dailyData[dateKey] = { date: dateKey, tokens: 0, cost: 0, count: 0 }
}
dailyData[dateKey].tokens += log.totalTokens
dailyData[dateKey].cost += log.estimatedCostUsd?.toNumber() ?? 0
dailyData[dateKey].count += 1
}
return Object.values(dailyData).map((day) => ({
...day,
costFormatted: formatCost(day.cost),
}))
}),
})


@@ -3,17 +3,41 @@
 *
 * Uses GPT to analyze juror expertise and project requirements
 * to generate optimal assignment suggestions.
 *
 * Optimization:
 * - Batched processing (15 projects per batch)
 * - Description truncation (300 chars)
 * - Token tracking and cost logging
 *
 * GDPR Compliance:
 * - All data anonymized before AI processing
 * - IDs replaced with sequential identifiers
 * - No personal information sent to OpenAI
 */
import { getOpenAI, getConfiguredModel, buildCompletionParams } from '@/lib/openai'
import { logAIUsage, extractTokenUsage } from '@/server/utils/ai-usage'
import { classifyAIError, createParseError, logAIError } from './ai-errors'
import {
  anonymizeForAI,
  deanonymizeResults,
  validateAnonymization,
  DESCRIPTION_LIMITS,
  truncateAndSanitize,
  type AnonymizationResult,
} from './anonymization'

// ─── Constants ───────────────────────────────────────────────────────────────

const ASSIGNMENT_BATCH_SIZE = 15

// Optimized system prompt
const ASSIGNMENT_SYSTEM_PROMPT = `Match jurors to projects by expertise. Return JSON assignments.
Each: {juror_id, project_id, confidence_score: 0-1, expertise_match_score: 0-1, reasoning: str (1-2 sentences)}
Distribute workload fairly. Avoid assigning jurors at capacity.`

// ─── Types ───────────────────────────────────────────────────────────────────

export interface AIAssignmentSuggestion {
  jurorId: string
  projectId: string
@@ -61,118 +85,71 @@ interface AssignmentConstraints
  }>
}
// ─── AI Processing ───────────────────────────────────────────────────────────

/**
 * Process a batch of projects for assignment suggestions
 */
async function processAssignmentBatch(
  openai: NonNullable<Awaited<ReturnType<typeof getOpenAI>>>,
  model: string,
  anonymizedData: AnonymizationResult,
  batchProjects: typeof anonymizedData.projects,
  batchMappings: typeof anonymizedData.projectMappings,
  constraints: AssignmentConstraints,
  userId?: string,
  entityId?: string
): Promise<{
  suggestions: AIAssignmentSuggestion[]
  tokensUsed: number
}> {
  const suggestions: AIAssignmentSuggestion[] = []
  let tokensUsed = 0

  // Build prompt with batch-specific data
  const userPrompt = buildBatchPrompt(
    anonymizedData.jurors,
    batchProjects,
    constraints,
    anonymizedData.jurorMappings,
    batchMappings
  )

  try {
    const params = buildCompletionParams(model, {
      messages: [
        { role: 'system', content: ASSIGNMENT_SYSTEM_PROMPT },
        { role: 'user', content: userPrompt },
      ],
      jsonMode: true,
      temperature: 0.3,
      maxTokens: 4000,
    })

    const response = await openai.chat.completions.create(params)
    const usage = extractTokenUsage(response)
    tokensUsed = usage.totalTokens

    // Log batch usage
    await logAIUsage({
      userId,
      action: 'ASSIGNMENT',
      entityType: 'Round',
      entityId,
      model,
      promptTokens: usage.promptTokens,
      completionTokens: usage.completionTokens,
      totalTokens: usage.totalTokens,
      batchSize: batchProjects.length,
      itemsProcessed: batchProjects.length,
      status: 'SUCCESS',
    })

    const content = response.choices[0]?.message?.content
    if (!content) {
      throw new Error('No response from AI')
    }

    const parsed = JSON.parse(content) as {
      assignments: Array<{
        juror_id: string
@@ -183,31 +160,69 @@ async function callAIForAssignments(
      }>
    }
    // De-anonymize and add to suggestions
    const deanonymized = deanonymizeResults(
      (parsed.assignments || []).map((a) => ({
        jurorId: a.juror_id,
        projectId: a.project_id,
        confidenceScore: Math.min(1, Math.max(0, a.confidence_score)),
        expertiseMatchScore: Math.min(1, Math.max(0, a.expertise_match_score)),
        reasoning: a.reasoning,
      })),
anonymizedData.jurorMappings,
batchMappings
)
for (const item of deanonymized) {
suggestions.push({
jurorId: item.realJurorId,
projectId: item.realProjectId,
confidenceScore: item.confidenceScore,
reasoning: item.reasoning,
expertiseMatchScore: item.expertiseMatchScore,
})
}
} catch (error) {
if (error instanceof SyntaxError) {
const parseError = createParseError(error.message)
logAIError('Assignment', 'batch processing', parseError)
await logAIUsage({
userId,
action: 'ASSIGNMENT',
entityType: 'Round',
entityId,
model,
promptTokens: 0,
completionTokens: 0,
totalTokens: tokensUsed,
batchSize: batchProjects.length,
itemsProcessed: 0,
status: 'ERROR',
errorMessage: parseError.message,
})
} else {
throw error
}
}
return { suggestions, tokensUsed }
}
/**
 * Build prompt for a batch of projects
 */
function buildBatchPrompt(
  jurors: AnonymizationResult['jurors'],
  projects: AnonymizationResult['projects'],
  constraints: AssignmentConstraints,
  jurorMappings: AnonymizationResult['jurorMappings'],
  projectMappings: AnonymizationResult['projectMappings']
): string {
  // Map existing assignments to anonymous IDs
  const jurorIdMap = new Map(jurorMappings.map((m) => [m.realId, m.anonymousId]))
  const projectIdMap = new Map(projectMappings.map((m) => [m.realId, m.anonymousId]))

  const anonymousExisting = constraints.existingAssignments
    .map((a) => ({
@@ -216,29 +231,110 @@ function buildAssignmentPrompt(
    }))
    .filter((a) => a.jurorId && a.projectId)
  return `JURORS: ${JSON.stringify(jurors)}
PROJECTS: ${JSON.stringify(projects)}
CONSTRAINTS: ${constraints.requiredReviewsPerProject} reviews/project, max ${constraints.maxAssignmentsPerJuror || 'unlimited'}/juror
EXISTING: ${JSON.stringify(anonymousExisting)}
Return JSON: {"assignments": [...]}`
}
/**
* Generate AI-powered assignment suggestions with batching
*/
export async function generateAIAssignments(
jurors: JurorForAssignment[],
projects: ProjectForAssignment[],
constraints: AssignmentConstraints,
userId?: string,
entityId?: string
): Promise<AIAssignmentResult> {
// Truncate descriptions before anonymization
const truncatedProjects = projects.map((p) => ({
...p,
description: truncateAndSanitize(p.description, DESCRIPTION_LIMITS.ASSIGNMENT),
}))
// Anonymize data before sending to AI
const anonymizedData = anonymizeForAI(jurors, truncatedProjects)
// Validate anonymization
if (!validateAnonymization(anonymizedData)) {
console.error('[AI Assignment] Anonymization validation failed, falling back to algorithm')
return generateFallbackAssignments(jurors, projects, constraints)
}
try {
const openai = await getOpenAI()
if (!openai) {
console.log('[AI Assignment] OpenAI not configured, using fallback algorithm')
return generateFallbackAssignments(jurors, projects, constraints)
}
const model = await getConfiguredModel()
console.log(`[AI Assignment] Using model: ${model} for ${projects.length} projects in batches of ${ASSIGNMENT_BATCH_SIZE}`)
const allSuggestions: AIAssignmentSuggestion[] = []
let totalTokens = 0
// Process projects in batches
for (let i = 0; i < anonymizedData.projects.length; i += ASSIGNMENT_BATCH_SIZE) {
const batchProjects = anonymizedData.projects.slice(i, i + ASSIGNMENT_BATCH_SIZE)
const batchMappings = anonymizedData.projectMappings.slice(i, i + ASSIGNMENT_BATCH_SIZE)
console.log(`[AI Assignment] Processing batch ${Math.floor(i / ASSIGNMENT_BATCH_SIZE) + 1}/${Math.ceil(anonymizedData.projects.length / ASSIGNMENT_BATCH_SIZE)}`)
const { suggestions, tokensUsed } = await processAssignmentBatch(
openai,
model,
anonymizedData,
batchProjects,
batchMappings,
constraints,
userId,
entityId
)
allSuggestions.push(...suggestions)
totalTokens += tokensUsed
}
console.log(`[AI Assignment] Completed. Total suggestions: ${allSuggestions.length}, Total tokens: ${totalTokens}`)
return {
success: true,
suggestions: allSuggestions,
tokensUsed: totalTokens,
fallbackUsed: false,
}
} catch (error) {
const classified = classifyAIError(error)
logAIError('Assignment', 'generateAIAssignments', classified)
// Log failed attempt
await logAIUsage({
userId,
action: 'ASSIGNMENT',
entityType: 'Round',
entityId,
model: 'unknown',
promptTokens: 0,
completionTokens: 0,
totalTokens: 0,
batchSize: projects.length,
itemsProcessed: 0,
status: 'ERROR',
errorMessage: classified.message,
})
console.error('[AI Assignment] AI assignment failed, using fallback:', classified.message)
return generateFallbackAssignments(jurors, projects, constraints)
}
}
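The batching loop in `generateAIAssignments` slices two parallel arrays (anonymized projects and their ID mappings) in lockstep so each batch keeps its de-anonymization mappings aligned. A generic sketch of that pattern, using a hypothetical `chunkPairs` helper that is not part of this commit:

```typescript
// Sketch of the lockstep batching pattern used by generateAIAssignments:
// slice two parallel arrays with the same indices so related items stay paired.
// `chunkPairs` is a hypothetical illustration helper, not part of the commit.
function chunkPairs<A, B>(xs: A[], ys: B[], size: number): Array<[A[], B[]]> {
  const out: Array<[A[], B[]]> = []
  for (let i = 0; i < xs.length; i += size) {
    out.push([xs.slice(i, i + size), ys.slice(i, i + size)])
  }
  return out
}
```

Because `slice` uses identical bounds on both arrays, the mapping at position `k` of a batch always corresponds to the project at position `k`, which is what lets each batch be de-anonymized independently.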
// ─── Fallback Algorithm ──────────────────────────────────────────────────────
/**
 * Fallback algorithm-based assignment when AI is unavailable
 */


@@ -4,9 +4,33 @@
 * Determines project eligibility for special awards using:
 * - Deterministic field matching (tags, country, category)
 * - AI interpretation of plain-language criteria
 *
 * GDPR Compliance:
 * - All project data is anonymized before AI processing
 * - IDs replaced with sequential identifiers
 * - No personal information sent to OpenAI
 */
import { getOpenAI, getConfiguredModel, buildCompletionParams } from '@/lib/openai'
import { logAIUsage, extractTokenUsage } from '@/server/utils/ai-usage'
import { classifyAIError, createParseError, logAIError } from './ai-errors'
import {
  anonymizeProjectsForAI,
  validateAnonymizedProjects,
  type ProjectWithRelations,
  type AnonymizedProjectForAI,
  type ProjectAIMapping,
} from './anonymization'
import type { SubmissionSource } from '@prisma/client'

// ─── Constants ───────────────────────────────────────────────────────────────

const BATCH_SIZE = 20

// Optimized system prompt
const AI_ELIGIBILITY_SYSTEM_PROMPT = `Award eligibility evaluator. Evaluate projects against criteria, return JSON.
Format: {"evaluations": [{project_id, eligible: bool, confidence: 0-1, reasoning: str}]}
Be objective. Base evaluation only on provided data. No personal identifiers in reasoning.`

// ─── Types ──────────────────────────────────────────────────────────────────
@@ -33,6 +57,16 @@ interface ProjectForEligibility
  geographicZone?: string | null
  tags: string[]
  oceanIssue?: string | null
institution?: string | null
foundedAt?: Date | null
wantsMentorship?: boolean
submissionSource?: SubmissionSource
submittedAt?: Date | null
_count?: {
teamMembers?: number
files?: number
}
files?: Array<{ fileType: string | null }>
}

// ─── Auto Tag Rules ─────────────────────────────────────────────────────────
@@ -97,32 +131,162 @@
// ─── AI Criteria Interpretation ─────────────────────────────────────────────
/**
 * Convert project to enhanced format for anonymization
 */
function toProjectWithRelations(project: ProjectForEligibility): ProjectWithRelations {
  return {
    id: project.id,
    title: project.title,
    description: project.description,
    competitionCategory: project.competitionCategory as any,
    oceanIssue: project.oceanIssue as any,
    country: project.country,
    geographicZone: project.geographicZone,
    institution: project.institution,
    tags: project.tags,
    foundedAt: project.foundedAt,
    wantsMentorship: project.wantsMentorship ?? false,
    submissionSource: project.submissionSource ?? 'MANUAL',
    submittedAt: project.submittedAt,
    _count: {
      teamMembers: project._count?.teamMembers ?? 0,
      files: project._count?.files ?? 0,
    },
    files: project.files?.map(f => ({ fileType: f.fileType as any })) ?? [],
  }
}

/**
 * Process a batch for AI eligibility evaluation
 */
async function processEligibilityBatch(
openai: NonNullable<Awaited<ReturnType<typeof getOpenAI>>>,
model: string,
criteriaText: string,
anonymized: AnonymizedProjectForAI[],
mappings: ProjectAIMapping[],
userId?: string,
entityId?: string
): Promise<{
results: EligibilityResult[]
tokensUsed: number
}> {
const results: EligibilityResult[] = []
let tokensUsed = 0
const userPrompt = `CRITERIA: ${criteriaText}
PROJECTS: ${JSON.stringify(anonymized)}
Evaluate eligibility for each project.`
try {
const params = buildCompletionParams(model, {
messages: [
{ role: 'system', content: AI_ELIGIBILITY_SYSTEM_PROMPT },
{ role: 'user', content: userPrompt },
],
jsonMode: true,
temperature: 0.3,
maxTokens: 4000,
})
const response = await openai.chat.completions.create(params)
const usage = extractTokenUsage(response)
tokensUsed = usage.totalTokens
// Log usage
await logAIUsage({
userId,
action: 'AWARD_ELIGIBILITY',
entityType: 'Award',
entityId,
model,
promptTokens: usage.promptTokens,
completionTokens: usage.completionTokens,
totalTokens: usage.totalTokens,
batchSize: anonymized.length,
itemsProcessed: anonymized.length,
status: 'SUCCESS',
})
const content = response.choices[0]?.message?.content
if (!content) {
throw new Error('Empty response from AI')
}
const parsed = JSON.parse(content) as {
evaluations: Array<{
project_id: string
eligible: boolean
confidence: number
reasoning: string
}>
}
// Map results back to real IDs
for (const eval_ of parsed.evaluations || []) {
const mapping = mappings.find((m) => m.anonymousId === eval_.project_id)
if (mapping) {
results.push({
projectId: mapping.realId,
eligible: eval_.eligible,
confidence: eval_.confidence,
reasoning: eval_.reasoning,
method: 'AI',
})
}
}
} catch (error) {
if (error instanceof SyntaxError) {
const parseError = createParseError(error.message)
logAIError('AwardEligibility', 'batch processing', parseError)
await logAIUsage({
userId,
action: 'AWARD_ELIGIBILITY',
entityType: 'Award',
entityId,
model,
promptTokens: 0,
completionTokens: 0,
totalTokens: tokensUsed,
batchSize: anonymized.length,
itemsProcessed: 0,
status: 'ERROR',
errorMessage: parseError.message,
})
// Flag all for manual review
for (const mapping of mappings) {
results.push({
projectId: mapping.realId,
eligible: false,
confidence: 0,
reasoning: 'AI response parse error — requires manual review',
method: 'AI',
})
}
} else {
throw error
}
}
return { results, tokensUsed }
}
export async function aiInterpretCriteria(
  criteriaText: string,
  projects: ProjectForEligibility[],
  userId?: string,
  awardId?: string
): Promise<EligibilityResult[]> {
  const results: EligibilityResult[] = []

  try {
    const openai = await getOpenAI()
    if (!openai) {
      console.warn('[AI Eligibility] OpenAI not configured')
      return projects.map((p) => ({
        projectId: p.id,
        eligible: false,
@@ -133,91 +297,69 @@ export async function aiInterpretCriteria(
    }

    const model = await getConfiguredModel()
console.log(`[AI Eligibility] Using model: ${model} for ${projects.length} projects`)
    // Convert and anonymize projects
    const projectsWithRelations = projects.map(toProjectWithRelations)
    const { anonymized, mappings } = anonymizeProjectsForAI(projectsWithRelations, 'ELIGIBILITY')

    // Validate anonymization
    if (!validateAnonymizedProjects(anonymized)) {
      console.error('[AI Eligibility] Anonymization validation failed')
      throw new Error('GDPR compliance check failed: PII detected in anonymized data')
    }

    let totalTokens = 0

    // Process in batches
    for (let i = 0; i < anonymized.length; i += BATCH_SIZE) {
      const batchAnon = anonymized.slice(i, i + BATCH_SIZE)
      const batchMappings = mappings.slice(i, i + BATCH_SIZE)

      console.log(`[AI Eligibility] Processing batch ${Math.floor(i / BATCH_SIZE) + 1}/${Math.ceil(anonymized.length / BATCH_SIZE)}`)

      const { results: batchResults, tokensUsed } = await processEligibilityBatch(
        openai,
        model,
        criteriaText,
        batchAnon,
        batchMappings,
        userId,
        awardId
      )

      results.push(...batchResults)
      totalTokens += tokensUsed
    }

    console.log(`[AI Eligibility] Completed. Total tokens: ${totalTokens}`)
  } catch (error) {
    const classified = classifyAIError(error)
    logAIError('AwardEligibility', 'aiInterpretCriteria', classified)

    // Log failed attempt
    await logAIUsage({
      userId,
      action: 'AWARD_ELIGIBILITY',
      entityType: 'Award',
      entityId: awardId,
      model: 'unknown',
      promptTokens: 0,
      completionTokens: 0,
      totalTokens: 0,
      batchSize: projects.length,
      itemsProcessed: 0,
      status: 'ERROR',
      errorMessage: classified.message,
    })

    // Return all as needing manual review
    return projects.map((p) => ({
      projectId: p.id,
      eligible: false,
      confidence: 0,
      reasoning: `AI error: ${classified.message}`,
      method: 'AI' as const,
    }))
  }


@@ -0,0 +1,318 @@
/**
* AI Error Classification Service
*
* Provides unified error handling and classification for all AI services.
* Converts technical API errors into user-friendly messages.
*/
// ─── Error Types ─────────────────────────────────────────────────────────────
export type AIErrorType =
| 'rate_limit'
| 'quota_exceeded'
| 'model_not_found'
| 'invalid_api_key'
| 'context_length'
| 'parse_error'
| 'timeout'
| 'network_error'
| 'content_filter'
| 'server_error'
| 'unknown'
export interface ClassifiedError {
type: AIErrorType
message: string
originalMessage: string
retryable: boolean
suggestedAction?: string
}
// ─── Error Patterns ──────────────────────────────────────────────────────────
interface ErrorPattern {
type: AIErrorType
patterns: Array<string | RegExp>
retryable: boolean
userMessage: string
suggestedAction?: string
}
const ERROR_PATTERNS: ErrorPattern[] = [
{
type: 'rate_limit',
patterns: [
'rate_limit',
'rate limit',
'too many requests',
'429',
'quota exceeded',
'Rate limit reached',
],
retryable: true,
userMessage: 'Rate limit exceeded. Please wait a few minutes and try again.',
suggestedAction: 'Wait 1-2 minutes before retrying, or reduce batch size.',
},
{
type: 'quota_exceeded',
patterns: [
'insufficient_quota',
'billing',
'exceeded your current quota',
'payment required',
'account deactivated',
],
retryable: false,
userMessage: 'API quota exceeded. Please check your OpenAI billing settings.',
suggestedAction: 'Add payment method or increase spending limit in OpenAI dashboard.',
},
{
type: 'model_not_found',
patterns: [
'model_not_found',
'does not exist',
'The model',
'invalid model',
'model not available',
],
retryable: false,
userMessage: 'The selected AI model is not available. Please check your settings.',
suggestedAction: 'Go to Settings → AI and select a different model.',
},
{
type: 'invalid_api_key',
patterns: [
'invalid_api_key',
'Incorrect API key',
'authentication',
'unauthorized',
'401',
'invalid api key',
],
retryable: false,
userMessage: 'Invalid API key. Please check your OpenAI API key in settings.',
suggestedAction: 'Go to Settings → AI and enter a valid API key.',
},
{
type: 'context_length',
patterns: [
'context_length',
'maximum context length',
'tokens',
'too long',
'reduce the length',
'max_tokens',
],
retryable: true,
userMessage: 'Request too large. Try processing fewer items at once.',
suggestedAction: 'Process items in smaller batches.',
},
{
type: 'content_filter',
patterns: [
'content_filter',
'content policy',
'flagged',
'inappropriate',
'safety system',
],
retryable: false,
userMessage: 'Content was flagged by the AI safety system. Please review the input data.',
suggestedAction: 'Check project descriptions for potentially sensitive content.',
},
{
type: 'timeout',
patterns: [
'timeout',
'timed out',
'ETIMEDOUT',
'ECONNABORTED',
'deadline exceeded',
],
retryable: true,
userMessage: 'Request timed out. Please try again.',
suggestedAction: 'Try again or process fewer items at once.',
},
{
type: 'network_error',
patterns: [
'ENOTFOUND',
'ECONNREFUSED',
'network',
'connection',
'DNS',
'getaddrinfo',
],
retryable: true,
userMessage: 'Network error. Please check your connection and try again.',
suggestedAction: 'Check network connectivity and firewall settings.',
},
{
type: 'server_error',
patterns: [
'500',
'502',
'503',
'504',
'internal error',
'server error',
'service unavailable',
],
retryable: true,
userMessage: 'OpenAI service temporarily unavailable. Please try again later.',
suggestedAction: 'Wait a few minutes and retry. Check status.openai.com for outages.',
},
]
// ─── Error Classification ────────────────────────────────────────────────────
/**
* Classify an error from the OpenAI API
*/
export function classifyAIError(error: Error | unknown): ClassifiedError {
const errorMessage = error instanceof Error ? error.message : String(error)
const errorString = errorMessage.toLowerCase()
// Check against known patterns
for (const pattern of ERROR_PATTERNS) {
for (const matcher of pattern.patterns) {
const matches =
typeof matcher === 'string'
? errorString.includes(matcher.toLowerCase())
: matcher.test(errorString)
if (matches) {
return {
type: pattern.type,
message: pattern.userMessage,
originalMessage: errorMessage,
retryable: pattern.retryable,
suggestedAction: pattern.suggestedAction,
}
}
}
}
// Unknown error
return {
type: 'unknown',
message: 'An unexpected error occurred. Please try again.',
originalMessage: errorMessage,
retryable: true,
suggestedAction: 'If the problem persists, check the AI settings or contact support.',
}
}
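The classification above is plain first-match-wins scanning over an ordered pattern table, with string patterns matched case-insensitively via `includes` and regex patterns via `test`. A trimmed-down sketch of the same approach (two patterns only, hypothetical simplified shapes, not the real `ERROR_PATTERNS` table):

```typescript
// Trimmed-down sketch of classifyAIError's matching strategy: lowercase the
// message, scan an ordered pattern table, first match wins, unknown otherwise.
interface MiniPattern { type: string; patterns: Array<string | RegExp>; retryable: boolean }

const MINI_PATTERNS: MiniPattern[] = [
  { type: 'rate_limit', patterns: ['rate limit', '429'], retryable: true },
  { type: 'invalid_api_key', patterns: ['incorrect api key', /\b401\b/], retryable: false },
]

function classify(message: string): { type: string; retryable: boolean } {
  const lower = message.toLowerCase()
  for (const p of MINI_PATTERNS) {
    for (const m of p.patterns) {
      const hit = typeof m === 'string' ? lower.includes(m) : m.test(lower)
      if (hit) return { type: p.type, retryable: p.retryable }
    }
  }
  return { type: 'unknown', retryable: true }
}
```

Note that pattern order matters: an earlier, broader pattern shadows later ones, which is why the real table puts the most specific phrasings first within each group.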
/**
* Check if an error is a JSON parse error
*/
export function isParseError(error: Error | unknown): boolean {
const message = error instanceof Error ? error.message : String(error)
return (
message.includes('JSON') ||
message.includes('parse') ||
message.includes('Unexpected token') ||
message.includes('SyntaxError')
)
}
/**
* Create a classified parse error
*/
export function createParseError(originalMessage: string): ClassifiedError {
return {
type: 'parse_error',
message: 'AI returned an invalid response. Items flagged for manual review.',
originalMessage,
retryable: true,
suggestedAction: 'Review flagged items manually. Consider using a different model.',
}
}
// ─── User-Friendly Messages ──────────────────────────────────────────────────
const USER_FRIENDLY_MESSAGES: Record<AIErrorType, string> = {
rate_limit: 'Rate limit exceeded. Please wait a few minutes and try again.',
quota_exceeded: 'API quota exceeded. Please check your OpenAI billing settings.',
model_not_found: 'Selected AI model is not available. Please check your settings.',
invalid_api_key: 'Invalid API key. Please verify your OpenAI API key.',
context_length: 'Request too large. Please try with fewer items.',
parse_error: 'AI response could not be processed. Items flagged for review.',
timeout: 'Request timed out. Please try again.',
network_error: 'Network connection error. Please check your connection.',
content_filter: 'Content flagged by AI safety system. Please review input data.',
server_error: 'AI service temporarily unavailable. Please try again later.',
unknown: 'An unexpected error occurred. Please try again.',
}
/**
* Get a user-friendly message for an error type
*/
export function getUserFriendlyMessage(errorType: AIErrorType): string {
return USER_FRIENDLY_MESSAGES[errorType]
}
// ─── Error Handling Helpers ──────────────────────────────────────────────────
/**
* Wrap an async function with standardized AI error handling
*/
export async function withAIErrorHandling<T>(
fn: () => Promise<T>,
fallback: T
): Promise<{ result: T; error?: ClassifiedError }> {
try {
const result = await fn()
return { result }
} catch (error) {
const classified = classifyAIError(error)
console.error(`[AI Error] ${classified.type}:`, classified.originalMessage)
return { result: fallback, error: classified }
}
}
/**
* Log an AI error with context
*/
export function logAIError(
service: string,
operation: string,
error: ClassifiedError,
context?: Record<string, unknown>
): void {
console.error(
`[AI ${service}] ${operation} failed:`,
JSON.stringify({
type: error.type,
message: error.message,
originalMessage: error.originalMessage,
retryable: error.retryable,
...context,
})
)
}
// ─── Retry Logic ─────────────────────────────────────────────────────────────
/**
* Determine if an operation should be retried based on error type
*/
export function shouldRetry(error: ClassifiedError, attempt: number, maxAttempts: number = 3): boolean {
if (!error.retryable) return false
if (attempt >= maxAttempts) return false
// Rate limits need longer delays
if (error.type === 'rate_limit') {
return attempt < 2 // Only retry once for rate limits
}
return true
}
/**
* Calculate delay before retry (exponential backoff)
*/
export function getRetryDelay(error: ClassifiedError, attempt: number): number {
const baseDelay = error.type === 'rate_limit' ? 30000 : 1000 // 30s for rate limit, 1s otherwise
return baseDelay * Math.pow(2, attempt)
}
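A caller might combine the retry predicate and backoff delay above into a loop like the following. This is a simplified, self-contained sketch (the `RetryableError` type, `retryDelayMs`, `withRetry`, and the injectable `sleep` are restatements or hypothetical helpers, not exports of this commit); it mirrors the exponential-backoff math of `getRetryDelay` and the "retry rate limits only once" rule of `shouldRetry`.

```typescript
// Sketch of a retry loop built on the shouldRetry/getRetryDelay semantics.
type RetryableError = { type: string; retryable: boolean }

function retryDelayMs(error: RetryableError, attempt: number): number {
  // Mirrors getRetryDelay: 30s base for rate limits, 1s otherwise, doubled per attempt
  const base = error.type === 'rate_limit' ? 30000 : 1000
  return base * Math.pow(2, attempt)
}

async function withRetry<T>(
  fn: () => Promise<T>,
  classify: (e: unknown) => RetryableError,
  maxAttempts = 3,
  sleep: (ms: number) => Promise<void> = (ms) => new Promise((r) => setTimeout(r, ms))
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn()
    } catch (e) {
      const err = classify(e)
      const retry =
        err.retryable &&
        attempt + 1 < maxAttempts &&
        (err.type !== 'rate_limit' || attempt < 1) // rate limits: retry once only
      if (!retry) throw e
      await sleep(retryDelayMs(err, attempt))
    }
  }
}
```

Injecting `sleep` keeps the loop testable: a no-op sleep makes retries instantaneous in tests while production uses a real timer.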


@@ -5,10 +5,24 @@
 * - Field-based rules (age checks, category, country, etc.)
 * - Document checks (file existence/types)
 * - AI screening (GPT interprets criteria text, flags spam)
 *
 * GDPR Compliance:
 * - All project data is anonymized before AI processing
 * - Only necessary fields sent to OpenAI
 * - No personal identifiers in prompts or responses
 */
import { getOpenAI, getConfiguredModel, buildCompletionParams } from '@/lib/openai'
import { logAIUsage, extractTokenUsage } from '@/server/utils/ai-usage'
import { classifyAIError, createParseError, logAIError } from './ai-errors'
import {
  anonymizeProjectsForAI,
  validateAnonymizedProjects,
  type ProjectWithRelations,
  type AnonymizedProjectForAI,
  type ProjectAIMapping,
} from './anonymization'
import type { Prisma, FileType, SubmissionSource } from '@prisma/client'

// ─── Types ──────────────────────────────────────────────────────────────────
@ -80,7 +94,14 @@ interface ProjectForFiltering {
tags: string[] tags: string[]
oceanIssue?: string | null oceanIssue?: string | null
wantsMentorship?: boolean | null wantsMentorship?: boolean | null
files: Array<{ id: string; fileName: string; fileType?: string | null }> institution?: string | null
submissionSource?: SubmissionSource
submittedAt?: Date | null
files: Array<{ id: string; fileName: string; fileType?: FileType | null }>
_count?: {
teamMembers?: number
files?: number
}
} }
interface FilteringRuleInput { interface FilteringRuleInput {
@ -92,6 +113,15 @@ interface FilteringRuleInput {
isActive: boolean isActive: boolean
} }
// ─── Constants ───────────────────────────────────────────────────────────────
const BATCH_SIZE = 20
// Optimized system prompt (compressed for token efficiency)
const AI_SCREENING_SYSTEM_PROMPT = `Project screening assistant. Evaluate against criteria, return JSON.
Format: {"projects": [{project_id, meets_criteria: bool, confidence: 0-1, reasoning: str, quality_score: 1-10, spam_risk: bool}]}
Be objective. Base evaluation only on provided data. No personal identifiers in reasoning.`
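The batch sizes above translate into plain slice-based chunking. A quick sketch of the loop shape the services below share inline (the `chunk` helper is illustrative, not part of the codebase):

```typescript
const BATCH_SIZE = 20

// Illustrative helper: split items into BATCH_SIZE-sized slices, as the
// `for (let i = 0; i < items.length; i += BATCH_SIZE)` loops below do inline.
function chunk<T>(items: T[], size: number): T[][] {
  const batches: T[][] = []
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size))
  }
  return batches
}

// 45 projects → batches of 20, 20, 5
console.log(chunk([...Array(45).keys()], BATCH_SIZE).map((b) => b.length))
// → [20, 20, 5]
```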
// ─── Field-Based Rule Evaluation ────────────────────────────────────────────

function evaluateCondition(

@@ -185,14 +215,9 @@ export function evaluateFieldRule(
? results.every(Boolean)
: results.some(Boolean)
// If conditions met, the rule's action applies
// For PASS action: conditions met = passed, not met = not passed
// For REJECT action: conditions met = rejected (not passed)
// For FLAG action: conditions met = flagged
if (config.action === 'PASS') {
return { passed: allConditionsMet, action: config.action }
}
// For REJECT/FLAG: conditions matching means the project should be rejected/flagged
return { passed: !allConditionsMet, action: config.action }
}
@@ -226,55 +251,173 @@ export function evaluateDocumentRule(

// ─── AI Screening ───────────────────────────────────────────────────────────

interface AIScreeningResult {
meetsCriteria: boolean
confidence: number
reasoning: string
qualityScore: number
spamRisk: boolean
}

/**
* Convert project to enhanced format for anonymization
*/
function toProjectWithRelations(project: ProjectForFiltering): ProjectWithRelations {
return {
id: project.id,
title: project.title,
description: project.description,
competitionCategory: project.competitionCategory as any,
oceanIssue: project.oceanIssue as any,
country: project.country,
geographicZone: project.geographicZone,
institution: project.institution,
tags: project.tags,
foundedAt: project.foundedAt,
wantsMentorship: project.wantsMentorship ?? false,
submissionSource: project.submissionSource ?? 'MANUAL',
submittedAt: project.submittedAt,
_count: {
teamMembers: project._count?.teamMembers ?? 0,
files: project.files?.length ?? 0,
},
files: project.files?.map(f => ({ fileType: f.fileType ?? null })) ?? [],
}
}
/**
* Execute AI screening on a batch of projects
*/
async function processAIBatch(
openai: NonNullable<Awaited<ReturnType<typeof getOpenAI>>>,
model: string,
criteriaText: string,
anonymized: AnonymizedProjectForAI[],
mappings: ProjectAIMapping[],
userId?: string,
entityId?: string
): Promise<{
results: Map<string, AIScreeningResult>
tokensUsed: number
}> {
const results = new Map<string, AIScreeningResult>()
let tokensUsed = 0
// Build optimized prompt
const userPrompt = `CRITERIA: ${criteriaText}
PROJECTS: ${JSON.stringify(anonymized)}
Evaluate and return JSON.`
try {
const params = buildCompletionParams(model, {
messages: [
{ role: 'system', content: AI_SCREENING_SYSTEM_PROMPT },
{ role: 'user', content: userPrompt },
],
jsonMode: true,
temperature: 0.3,
maxTokens: 4000,
})
const response = await openai.chat.completions.create(params)
const usage = extractTokenUsage(response)
tokensUsed = usage.totalTokens
// Log usage
await logAIUsage({
userId,
action: 'FILTERING',
entityType: 'Round',
entityId,
model,
promptTokens: usage.promptTokens,
completionTokens: usage.completionTokens,
totalTokens: usage.totalTokens,
batchSize: anonymized.length,
itemsProcessed: anonymized.length,
status: 'SUCCESS',
})
const content = response.choices[0]?.message?.content
if (!content) {
throw new Error('Empty response from AI')
}
const parsed = JSON.parse(content) as {
projects: Array<{
project_id: string
meets_criteria: boolean
confidence: number
reasoning: string
quality_score: number
spam_risk: boolean
}>
}
// Map results back to real IDs
for (const result of parsed.projects || []) {
const mapping = mappings.find((m) => m.anonymousId === result.project_id)
if (mapping) {
results.set(mapping.realId, {
meetsCriteria: result.meets_criteria,
confidence: result.confidence,
reasoning: result.reasoning,
qualityScore: result.quality_score,
spamRisk: result.spam_risk,
})
}
}
} catch (error) {
// Check if parse error
if (error instanceof SyntaxError) {
const parseError = createParseError(error.message)
logAIError('Filtering', 'batch processing', parseError)
await logAIUsage({
userId,
action: 'FILTERING',
entityType: 'Round',
entityId,
model,
promptTokens: 0,
completionTokens: 0,
totalTokens: tokensUsed,
batchSize: anonymized.length,
itemsProcessed: 0,
status: 'ERROR',
errorMessage: parseError.message,
})
// Flag all for manual review
for (const mapping of mappings) {
results.set(mapping.realId, {
meetsCriteria: false,
confidence: 0,
reasoning: 'AI response parse error — flagged for manual review',
qualityScore: 5,
spamRisk: false,
})
}
} else {
throw error // Re-throw for outer catch
}
}
return { results, tokensUsed }
}
export async function executeAIScreening(
config: AIScreeningConfig,
projects: ProjectForFiltering[],
userId?: string,
entityId?: string
): Promise<Map<string, AIScreeningResult>> {
const results = new Map<string, AIScreeningResult>()

try {
const openai = await getOpenAI()
if (!openai) {
console.warn('[AI Filtering] OpenAI not configured')
for (const p of projects) {
results.set(p.id, {
meetsCriteria: false,

@@ -290,133 +433,71 @@ export async function executeAIScreening(
const model = await getConfiguredModel()
console.log(`[AI Filtering] Using model: ${model} for ${projects.length} projects`)

// Convert and anonymize projects
const projectsWithRelations = projects.map(toProjectWithRelations)
const { anonymized, mappings } = anonymizeProjectsForAI(projectsWithRelations, 'FILTERING')
// Validate anonymization
if (!validateAnonymizedProjects(anonymized)) {
console.error('[AI Filtering] Anonymization validation failed')
throw new Error('GDPR compliance check failed: PII detected in anonymized data')
}

let totalTokens = 0

// Process in batches
for (let i = 0; i < anonymized.length; i += BATCH_SIZE) {
const batchAnon = anonymized.slice(i, i + BATCH_SIZE)
const batchMappings = mappings.slice(i, i + BATCH_SIZE)

console.log(`[AI Filtering] Processing batch ${Math.floor(i / BATCH_SIZE) + 1}/${Math.ceil(anonymized.length / BATCH_SIZE)}`)

const { results: batchResults, tokensUsed } = await processAIBatch(
openai,
model,
config.criteriaText,
batchAnon,
batchMappings,
userId,
entityId
)

totalTokens += tokensUsed

// Merge batch results
for (const [id, result] of batchResults) {
results.set(id, result)
}
}

console.log(`[AI Filtering] Completed. Total tokens: ${totalTokens}`)
} catch (error) {
const classified = classifyAIError(error)
logAIError('Filtering', 'executeAIScreening', classified)

// Log failed attempt
await logAIUsage({
userId,
action: 'FILTERING',
entityType: 'Round',
entityId,
model: 'unknown',
promptTokens: 0,
completionTokens: 0,
totalTokens: 0,
batchSize: projects.length,
itemsProcessed: 0,
status: 'ERROR',
errorMessage: classified.message,
})

for (const p of projects) {
results.set(p.id, {
meetsCriteria: false,
confidence: 0,
reasoning: `AI screening error: ${classified.message}`,
qualityScore: 5,
spamRisk: false,
})
@@ -430,7 +511,9 @@ Return your evaluation as JSON.`
export async function executeFilteringRules(
rules: FilteringRuleInput[],
projects: ProjectForFiltering[],
userId?: string,
roundId?: string
): Promise<ProjectFilteringResult[]> {
const activeRules = rules
.filter((r) => r.isActive)
@@ -441,23 +524,11 @@ export async function executeFilteringRules(
const nonAiRules = activeRules.filter((r) => r.ruleType !== 'AI_SCREENING')

// Pre-compute AI screening results if needed
const aiResults = new Map<string, Map<string, AIScreeningResult>>()
for (const aiRule of aiRules) {
const config = aiRule.configJson as unknown as AIScreeningConfig
const screeningResults = await executeAIScreening(config, projects, userId, roundId)
aiResults.set(aiRule.id, screeningResults)
}

@@ -3,8 +3,44 @@
*
* Strips PII (names, emails, etc.) from data before sending to AI services.
* Returns ID mappings for de-anonymization of results.
*
* GDPR Compliance:
* - All personal identifiers are stripped before AI processing
* - Project/user IDs are replaced with sequential anonymous IDs
* - Text content is sanitized to remove emails, phones, URLs
* - Validation ensures no PII leakage before each AI call
*/
import type {
CompetitionCategory,
OceanIssue,
FileType,
SubmissionSource,
} from '@prisma/client'
// ─── Description Limits ──────────────────────────────────────────────────────
export const DESCRIPTION_LIMITS = {
ASSIGNMENT: 300,
FILTERING: 500,
ELIGIBILITY: 400,
MENTOR: 350,
} as const
export type DescriptionContext = keyof typeof DESCRIPTION_LIMITS
// ─── PII Patterns ────────────────────────────────────────────────────────────
const PII_PATTERNS = {
email: /[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/g,
phone: /(\+?\d{1,3}[-.\s]?)?\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}/g,
url: /https?:\/\/[^\s]+/g,
ssn: /\d{3}-\d{2}-\d{4}/g,
ipv4: /\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b/g,
} as const
// ─── Basic Anonymization Types (Assignment Service) ──────────────────────────
export interface AnonymizedJuror {
anonymousId: string
expertiseTags: string[]

@@ -37,9 +73,67 @@
projectMappings: ProjectMapping[]
}
// ─── Enhanced Project Types (Filtering/Awards) ───────────────────────────────
/**
* Comprehensive anonymized project data for AI filtering
* Includes all fields needed for flexible filtering criteria
*/
export interface AnonymizedProjectForAI {
project_id: string // P1, P2, etc.
title: string // Sanitized
description: string // Truncated + PII stripped
category: CompetitionCategory | null // STARTUP | BUSINESS_CONCEPT
ocean_issue: OceanIssue | null // Enum value
country: string | null
region: string | null // geographicZone
institution: string | null
tags: string[]
founded_year: number | null // Just the year
team_size: number
has_description: boolean
file_count: number
file_types: string[] // FileType values
wants_mentorship: boolean
submission_source: SubmissionSource
submitted_date: string | null // YYYY-MM-DD only
}
/**
* Project input with all relations needed for comprehensive anonymization
*/
export interface ProjectWithRelations {
id: string
title: string
description?: string | null
teamName?: string | null
competitionCategory?: CompetitionCategory | null
oceanIssue?: OceanIssue | null
country?: string | null
geographicZone?: string | null
institution?: string | null
tags: string[]
foundedAt?: Date | null
wantsMentorship?: boolean
submissionSource: SubmissionSource
submittedAt?: Date | null
_count?: {
teamMembers?: number
files?: number
}
files?: Array<{ fileType: FileType | null }>
}
/**
* Mapping for de-anonymization
*/
export interface ProjectAIMapping {
anonymousId: string
realId: string
}
// ─── Basic Anonymization (Assignment Service) ────────────────────────────────
interface JurorInput {
id: string
name?: string | null

@@ -51,9 +145,6 @@ interface JurorInput {
}
}
/**
* Project data from database
*/
interface ProjectInput {
id: string
title: string

@@ -63,13 +154,7 @@
}

/**
* Anonymize juror and project data for AI processing (Assignment service)
*
* This function:
* 1. Strips all PII (names, emails) from juror data
* 2. Replaces real IDs with sequential anonymous IDs
* 3. Keeps only expertise tags and assignment counts
* 4. Returns mappings for de-anonymization
*/
export function anonymizeForAI(
jurors: JurorInput[],

@@ -78,7 +163,6 @@ export function anonymizeForAI(
const jurorMappings: JurorMapping[] = []
const projectMappings: ProjectMapping[] = []
// Anonymize jurors
const anonymizedJurors: AnonymizedJuror[] = jurors.map((juror, index) => {
const anonymousId = `juror_${(index + 1).toString().padStart(3, '0')}`
@@ -95,7 +179,6 @@
}
})
// Anonymize projects (keep content but replace IDs)
const anonymizedProjects: AnonymizedProject[] = projects.map(
(project, index) => {
const anonymousId = `project_${(index + 1).toString().padStart(3, '0')}`

@@ -109,10 +192,9 @@
anonymousId,
title: sanitizeText(project.title),
description: project.description
? truncateAndSanitize(project.description, DESCRIPTION_LIMITS.ASSIGNMENT)
: null,
tags: project.tags,
// Replace specific team names with generic identifier
teamName: project.teamName ? `Team ${index + 1}` : null,
}
}

@@ -126,10 +208,77 @@
}
}
// ─── Enhanced Anonymization (Filtering/Awards) ───────────────────────────────
/**
* Anonymize a single project with comprehensive data for AI filtering
*
* GDPR Compliance:
* - Strips team names, email references, phone numbers, URLs
* - Replaces IDs with sequential anonymous IDs
* - Truncates descriptions to limit data exposure
* - Keeps only necessary fields for filtering criteria
*/
export function anonymizeProjectForAI(
project: ProjectWithRelations,
index: number,
context: DescriptionContext = 'FILTERING'
): AnonymizedProjectForAI {
const descriptionLimit = DESCRIPTION_LIMITS[context]
return {
project_id: `P${index + 1}`,
title: sanitizeText(project.title),
description: truncateAndSanitize(project.description, descriptionLimit),
category: project.competitionCategory ?? null,
ocean_issue: project.oceanIssue ?? null,
country: project.country ?? null,
region: project.geographicZone ?? null,
institution: project.institution ?? null,
tags: project.tags,
founded_year: project.foundedAt?.getFullYear() ?? null,
team_size: project._count?.teamMembers ?? 0,
has_description: !!project.description?.trim(),
file_count: project._count?.files ?? 0,
file_types: project.files
?.map((f) => f.fileType)
.filter((ft): ft is FileType => ft !== null) ?? [],
wants_mentorship: project.wantsMentorship ?? false,
submission_source: project.submissionSource,
submitted_date: project.submittedAt?.toISOString().split('T')[0] ?? null,
}
}
/**
* Anonymize multiple projects and return mappings
*/
export function anonymizeProjectsForAI(
projects: ProjectWithRelations[],
context: DescriptionContext = 'FILTERING'
): {
anonymized: AnonymizedProjectForAI[]
mappings: ProjectAIMapping[]
} {
const mappings: ProjectAIMapping[] = []
const anonymized = projects.map((project, index) => {
mappings.push({
anonymousId: `P${index + 1}`,
realId: project.id,
})
return anonymizeProjectForAI(project, index, context)
})
return { anonymized, mappings }
}
// ─── De-anonymization ────────────────────────────────────────────────────────
/**
* De-anonymize AI results back to real IDs
*/
export function deanonymizeResults<
T extends { jurorId: string; projectId: string }
>(
results: T[],
jurorMappings: JurorMapping[],
projectMappings: ProjectMapping[]

@@ -149,50 +298,155 @@
}
/**
* De-anonymize project-only results (for filtering/awards)
*/
export function deanonymizeProjectResults<T extends { project_id: string }>(
results: T[],
mappings: ProjectAIMapping[]
): (T & { realProjectId: string })[] {
const projectMap = new Map(mappings.map((m) => [m.anonymousId, m.realId]))
return results.map((result) => ({
...result,
realProjectId: projectMap.get(result.project_id) || result.project_id,
}))
}

// ─── Text Sanitization ───────────────────────────────────────────────────────

/**
* Sanitize text to remove potential PII patterns
* Removes emails, phone numbers, URLs, and other identifying information
*/
export function sanitizeText(text: string): string {
let sanitized = text

// Remove email addresses
sanitized = sanitized.replace(PII_PATTERNS.email, '[email removed]')

// Remove phone numbers (various formats)
sanitized = sanitized.replace(PII_PATTERNS.phone, '[phone removed]')

// Remove URLs
sanitized = sanitized.replace(PII_PATTERNS.url, '[url removed]')

// Remove SSN-like patterns
sanitized = sanitized.replace(PII_PATTERNS.ssn, '[id removed]')

return sanitized
}
/**
* Truncate text to a maximum length and sanitize
*/
export function truncateAndSanitize(
text: string | null | undefined,
maxLength: number
): string {
if (!text) return ''
const sanitized = sanitizeText(text)
if (sanitized.length <= maxLength) {
return sanitized
}
return sanitized.slice(0, maxLength - 3) + '...'
}
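A reduced sketch of the two helpers in action, using only the email and URL patterns (the full pipeline above also strips phone numbers and SSN-like strings):

```typescript
// Subset of PII_PATTERNS; phone/ssn patterns omitted for brevity.
const PII_PATTERNS = {
  email: /[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/g,
  url: /https?:\/\/[^\s]+/g,
} as const

function sanitizeText(text: string): string {
  return text
    .replace(PII_PATTERNS.email, '[email removed]')
    .replace(PII_PATTERNS.url, '[url removed]')
}

function truncateAndSanitize(text: string | null | undefined, maxLength: number): string {
  if (!text) return ''
  const sanitized = sanitizeText(text)
  return sanitized.length <= maxLength ? sanitized : sanitized.slice(0, maxLength - 3) + '...'
}

console.log(sanitizeText('Contact jane@example.org via https://example.org/team'))
// → Contact [email removed] via [url removed]

// A 600-char description under the FILTERING limit (500) comes back exactly 500 chars
console.log(truncateAndSanitize('A'.repeat(600), 500).length)
// → 500
```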
// ─── GDPR Compliance Validation ──────────────────────────────────────────────
export interface PIIValidationResult {
valid: boolean
violations: string[]
}
/**
* Validate that data contains no personal information
* Used for GDPR compliance before sending data to AI
*/
export function validateNoPersonalData(
data: Record<string, unknown>
): PIIValidationResult {
const violations: string[] = []
const textContent = JSON.stringify(data)
// Check each PII pattern
for (const [type, pattern] of Object.entries(PII_PATTERNS)) {
// Reset regex state (global flag)
pattern.lastIndex = 0
if (pattern.test(textContent)) {
violations.push(`Potential ${type} detected in data`)
}
}
// Additional checks for common PII fields
const sensitiveFields = [
'email',
'phone',
'password',
'ssn',
'socialSecurity',
'creditCard',
'bankAccount',
'drivingLicense',
]
const keys = Object.keys(data).map((k) => k.toLowerCase())
for (const field of sensitiveFields) {
if (keys.includes(field)) {
violations.push(`Sensitive field "${field}" present in data`)
}
}
return {
valid: violations.length === 0,
violations,
}
}
/**
* Enforce GDPR compliance before EVERY AI call
* Throws an error if PII is detected
*/
export function enforceGDPRCompliance(data: unknown[]): void {
for (let i = 0; i < data.length; i++) {
const item = data[i]
if (typeof item === 'object' && item !== null) {
const { valid, violations } = validateNoPersonalData(
item as Record<string, unknown>
)
if (!valid) {
console.error(
`[GDPR] PII validation failed for item ${i}:`,
violations
)
throw new Error(
`GDPR compliance check failed: ${violations.join(', ')}`
)
}
}
}
}
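How the two checks inside `validateNoPersonalData` fire can be seen with an email-only subset of `PII_PATTERNS` (a reduced sketch, not the full implementation):

```typescript
// Email-only subset of PII_PATTERNS, enough to show both checks firing.
const PII_PATTERNS = { email: /[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/g } as const

function validateNoPersonalData(data: Record<string, unknown>): { valid: boolean; violations: string[] } {
  const violations: string[] = []
  const textContent = JSON.stringify(data)

  // Pattern check over the serialized payload
  for (const [type, pattern] of Object.entries(PII_PATTERNS)) {
    pattern.lastIndex = 0 // global regexes keep state between .test() calls
    if (pattern.test(textContent)) violations.push(`Potential ${type} detected in data`)
  }

  // Field-name check for obviously sensitive keys
  if (Object.keys(data).map((k) => k.toLowerCase()).includes('email')) {
    violations.push('Sensitive field "email" present in data')
  }

  return { valid: violations.length === 0, violations }
}

console.log(validateNoPersonalData({ project_id: 'P1', title: 'Kelp restoration' }).valid)
// → true
console.log(validateNoPersonalData({ email: 'jane@example.org' }).violations.length)
// → 2 (pattern match plus sensitive field name)
```

Resetting `lastIndex` matters: a `g`-flagged regex resumes `.test()` from its last match position, so skipping the reset makes the check alternately pass and fail on identical input.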
/**
* Validate that data has been properly anonymized
* Returns true if no PII patterns are detected
*/
export function validateAnonymization(data: AnonymizationResult): boolean {
const checkText = (text: string | null | undefined): boolean => {
if (!text) return true
// Reset regex state for each check
for (const pattern of Object.values(PII_PATTERNS)) {
pattern.lastIndex = 0
if (pattern.test(text)) return false
}
return true
}

// Check jurors
for (const juror of data.jurors) {
for (const tag of juror.expertiseTags) {
if (!checkText(tag)) return false
}
@@ -209,3 +463,30 @@
return true
}
/**
* Validate anonymized projects for AI (enhanced version)
*/
export function validateAnonymizedProjects(
projects: AnonymizedProjectForAI[]
): boolean {
const checkText = (text: string | null | undefined): boolean => {
if (!text) return true
for (const pattern of Object.values(PII_PATTERNS)) {
pattern.lastIndex = 0
if (pattern.test(text)) return false
}
return true
}
for (const project of projects) {
if (!checkText(project.title)) return false
if (!checkText(project.description)) return false
if (!checkText(project.institution)) return false
for (const tag of project.tags) {
if (!checkText(tag)) return false
}
}
return true
}

@@ -1,5 +1,33 @@
/**
* AI-Powered Mentor Matching Service
*
* Matches mentors to projects based on expertise alignment.
*
* Optimization:
* - Batched processing (15 projects per batch)
* - Token tracking and cost logging
* - Fallback to algorithmic matching
*
* GDPR Compliance:
* - All data anonymized before AI processing
* - No personal information sent to OpenAI
*/
import { PrismaClient, OceanIssue, CompetitionCategory } from '@prisma/client'
import { getOpenAI, getConfiguredModel, buildCompletionParams } from '@/lib/openai'
import { logAIUsage, extractTokenUsage } from '@/server/utils/ai-usage'
import { classifyAIError, createParseError, logAIError } from './ai-errors'
// ─── Constants ───────────────────────────────────────────────────────────────
const MENTOR_BATCH_SIZE = 15
// Optimized system prompt
const MENTOR_MATCHING_SYSTEM_PROMPT = `Match mentors to projects by expertise. Return JSON.
Format for each project: {"matches": [{project_id, mentor_matches: [{mentor_index, confidence_score: 0-1, expertise_match_score: 0-1, reasoning: str}]}]}
Rank by suitability. Consider expertise alignment and availability.`
// ─── Types ───────────────────────────────────────────────────────────────────
interface ProjectInfo {
id: string

@@ -26,17 +54,162 @@ interface MentorMatch {
reasoning: string
}
// ─── Batched AI Matching ─────────────────────────────────────────────────────
/**
* Process a batch of projects for mentor matching
*/
async function processMatchingBatch(
openai: NonNullable<Awaited<ReturnType<typeof getOpenAI>>>,
model: string,
projects: ProjectInfo[],
mentors: MentorInfo[],
limit: number,
userId?: string
): Promise<{
results: Map<string, MentorMatch[]>
tokensUsed: number
}> {
const results = new Map<string, MentorMatch[]>()
let tokensUsed = 0
// Anonymize project data
const anonymizedProjects = projects.map((p, index) => ({
project_id: `P${index + 1}`,
real_id: p.id,
description: p.description?.slice(0, 350) || 'No description',
category: p.competitionCategory,
oceanIssue: p.oceanIssue,
tags: p.tags,
}))
// Anonymize mentor data
const anonymizedMentors = mentors.map((m, index) => ({
index,
expertise: m.expertiseTags,
availability: m.maxAssignments
? `${m.currentAssignments}/${m.maxAssignments}`
: 'unlimited',
}))
const userPrompt = `PROJECTS:
${anonymizedProjects.map(p => `${p.project_id}: Category=${p.category || 'N/A'}, Issue=${p.oceanIssue || 'N/A'}, Tags=[${p.tags.join(', ')}], Desc=${p.description.slice(0, 200)}`).join('\n')}
MENTORS:
${anonymizedMentors.map(m => `${m.index}: Expertise=[${m.expertise.join(', ')}], Availability=${m.availability}`).join('\n')}
For each project, rank top ${limit} mentors.`
try {
const params = buildCompletionParams(model, {
messages: [
{ role: 'system', content: MENTOR_MATCHING_SYSTEM_PROMPT },
{ role: 'user', content: userPrompt },
],
jsonMode: true,
temperature: 0.3,
maxTokens: 4000,
})
const response = await openai.chat.completions.create(params)
const usage = extractTokenUsage(response)
tokensUsed = usage.totalTokens
// Log usage
await logAIUsage({
userId,
action: 'MENTOR_MATCHING',
entityType: 'Project',
model,
promptTokens: usage.promptTokens,
completionTokens: usage.completionTokens,
totalTokens: usage.totalTokens,
batchSize: projects.length,
itemsProcessed: projects.length,
status: 'SUCCESS',
})
const content = response.choices[0]?.message?.content
if (!content) {
throw new Error('No response from AI')
}
const parsed = JSON.parse(content) as {
matches: Array<{
project_id: string
mentor_matches: Array<{
mentor_index: number
confidence_score: number
expertise_match_score: number
reasoning: string
}>
}>
}
// Map results back to real IDs
for (const projectMatch of parsed.matches || []) {
const project = anonymizedProjects.find(p => p.project_id === projectMatch.project_id)
if (!project) continue
const mentorMatches: MentorMatch[] = []
for (const match of projectMatch.mentor_matches || []) {
if (match.mentor_index >= 0 && match.mentor_index < mentors.length) {
mentorMatches.push({
mentorId: mentors[match.mentor_index].id,
confidenceScore: Math.min(1, Math.max(0, match.confidence_score)),
expertiseMatchScore: Math.min(1, Math.max(0, match.expertise_match_score)),
reasoning: match.reasoning,
})
}
}
results.set(project.real_id, mentorMatches)
}
} catch (error) {
if (error instanceof SyntaxError) {
const parseError = createParseError(error.message)
logAIError('MentorMatching', 'batch processing', parseError)
await logAIUsage({
userId,
action: 'MENTOR_MATCHING',
entityType: 'Project',
model,
promptTokens: 0,
completionTokens: 0,
totalTokens: tokensUsed,
batchSize: projects.length,
itemsProcessed: 0,
status: 'ERROR',
errorMessage: parseError.message,
})
// Return empty results for batch (will fall back to algorithm)
for (const project of projects) {
results.set(project.id, [])
}
} else {
throw error
}
}
return { results, tokensUsed }
}
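The batch above anonymizes by position: the model only ever sees pseudonymous IDs (`P1`, `P2`, …), while `real_id` stays local and is used to map results back, with scores clamped to [0, 1]. A standalone sketch of that round trip; the names `pseudonymize` and `remap` are illustrative, not from the codebase:

```typescript
// Illustrative sketch of the pseudonymize-then-remap pattern used above.
interface ScoredMatch { project_id: string; score: number }

function pseudonymize(ids: string[]): { project_id: string; real_id: string }[] {
  // P1, P2, ... keep ordering stable so the model never sees real IDs
  return ids.map((id, i) => ({ project_id: `P${i + 1}`, real_id: id }))
}

function remap(
  mapping: { project_id: string; real_id: string }[],
  aiResults: ScoredMatch[]
): Map<string, number> {
  const out = new Map<string, number>()
  for (const r of aiResults) {
    const entry = mapping.find(m => m.project_id === r.project_id)
    if (!entry) continue // skip IDs the model invented
    out.set(entry.real_id, Math.min(1, Math.max(0, r.score))) // clamp to [0, 1]
  }
  return out
}

const remapped = remap(pseudonymize(['ckx1', 'ckx2']), [
  { project_id: 'P2', score: 1.4 }, // out-of-range score gets clamped
  { project_id: 'P9', score: 0.5 }, // unknown pseudonym is dropped
])
```

Dropping unknown pseudonyms (rather than throwing) mirrors the batch code's tolerance for partially malformed model output.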
/**
* Get AI-suggested mentor matches for multiple projects (batched)
*/
export async function getAIMentorSuggestionsBatch(
  prisma: PrismaClient,
  projectIds: string[],
  limit: number = 5,
  userId?: string
): Promise<Map<string, MentorMatch[]>> {
  const allResults = new Map<string, MentorMatch[]>()

  // Get projects
  const projects = await prisma.project.findMany({
    where: { id: { in: projectIds } },
    select: {
      id: true,
      title: true,
@ -47,14 +220,16 @@ export async function getAIMentorSuggestions(
    },
  })
  if (projects.length === 0) {
    return allResults
  }

  // Get available mentors
  const mentors = await prisma.user.findMany({
    where: {
      OR: [
        { expertiseTags: { isEmpty: false } },
        { role: 'JURY_MEMBER' },
      ],
      status: 'ACTIVE',
    },
@ -86,118 +261,111 @@ export async function getAIMentorSuggestions(
  }))

  if (availableMentors.length === 0) {
    return allResults
  }

  // Try AI matching
  try {
    const openai = await getOpenAI()
    if (!openai) {
      console.log('[Mentor Matching] OpenAI not configured, using algorithm')
      return getAlgorithmicMatchesBatch(projects, availableMentors, limit)
    }

    const model = await getConfiguredModel()
    console.log(`[Mentor Matching] Using model: ${model} for ${projects.length} projects in batches of ${MENTOR_BATCH_SIZE}`)

    let totalTokens = 0

    // Process in batches
    for (let i = 0; i < projects.length; i += MENTOR_BATCH_SIZE) {
      const batchProjects = projects.slice(i, i + MENTOR_BATCH_SIZE)
      console.log(`[Mentor Matching] Processing batch ${Math.floor(i / MENTOR_BATCH_SIZE) + 1}/${Math.ceil(projects.length / MENTOR_BATCH_SIZE)}`)

      const { results, tokensUsed } = await processMatchingBatch(
        openai,
        model,
        batchProjects,
        availableMentors,
        limit,
        userId
      )

      totalTokens += tokensUsed

      // Merge results
      for (const [projectId, matches] of results) {
        allResults.set(projectId, matches)
      }
    }

    console.log(`[Mentor Matching] Completed. Total tokens: ${totalTokens}`)

    // Fill in any missing projects with algorithmic fallback
    for (const project of projects) {
      if (!allResults.has(project.id) || allResults.get(project.id)?.length === 0) {
        const fallbackMatches = getAlgorithmicMatches(project, availableMentors, limit)
        allResults.set(project.id, fallbackMatches)
      }
    }

    return allResults
  } catch (error) {
    const classified = classifyAIError(error)
    logAIError('MentorMatching', 'getAIMentorSuggestionsBatch', classified)

    // Log failed attempt
    await logAIUsage({
      userId,
      action: 'MENTOR_MATCHING',
      entityType: 'Project',
      model: 'unknown',
      promptTokens: 0,
      completionTokens: 0,
      totalTokens: 0,
      batchSize: projects.length,
      itemsProcessed: 0,
      status: 'ERROR',
      errorMessage: classified.message,
    })

    console.error('[Mentor Matching] AI failed, using algorithm:', classified.message)
    return getAlgorithmicMatchesBatch(projects, availableMentors, limit)
  }
}

/**
 * Get AI-suggested mentor matches for a single project
 */
export async function getAIMentorSuggestions(
  prisma: PrismaClient,
  projectId: string,
  limit: number = 5,
  userId?: string
): Promise<MentorMatch[]> {
  const results = await getAIMentorSuggestionsBatch(prisma, [projectId], limit, userId)
  return results.get(projectId) || []
}

// ─── Algorithmic Fallback ────────────────────────────────────────────────────

/**
 * Algorithmic fallback for multiple projects
 */
function getAlgorithmicMatchesBatch(
  projects: ProjectInfo[],
  mentors: MentorInfo[],
  limit: number
): Map<string, MentorMatch[]> {
  const results = new Map<string, MentorMatch[]>()
  for (const project of projects) {
    results.set(project.id, getAlgorithmicMatches(project, mentors, limit))
  }
  return results
}
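The progress arithmetic in the batching loop (`Math.floor(i / SIZE) + 1` out of `Math.ceil(total / SIZE)`) is easy to sanity-check in isolation; `batchLabels` is a hypothetical helper written only for this check:

```typescript
// Hypothetical helper reproducing the loop's "batch X/Y" progress arithmetic.
function batchLabels(total: number, size: number): string[] {
  const labels: string[] = []
  for (let i = 0; i < total; i += size) {
    labels.push(`${Math.floor(i / size) + 1}/${Math.ceil(total / size)}`)
  }
  return labels
}

// 35 projects in batches of 15 → windows 0-14, 15-29, 30-34
const labels = batchLabels(35, 15)
```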
/**
@ -226,7 +394,6 @@ function getAlgorithmicMatches(
  })

  if (project.description) {
    const words = project.description.toLowerCase().split(/\s+/)
    words.forEach((word) => {
      if (word.length > 4) projectKeywords.add(word.replace(/[^a-z]/g, ''))
@ -267,7 +434,7 @@ function getAlgorithmicMatches(
      mentorId: mentor.id,
      confidenceScore: Math.round(confidenceScore * 100) / 100,
      expertiseMatchScore: Math.round(expertiseMatchScore * 100) / 100,
      reasoning: `Matched ${matchCount} keyword(s). Availability: ${availabilityScore > 0.5 ? 'Good' : 'Limited'}.`,
    }
  })
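The keyword extraction the algorithmic fallback relies on (words longer than four characters, stripped to letters, deduplicated via a `Set`) can be exercised standalone; `extractKeywords` is an illustrative name, not from the codebase:

```typescript
// Sketch of the fallback's keyword extraction: words over 4 characters,
// stripped to a-z only, deduplicated.
function extractKeywords(description: string): Set<string> {
  const keywords = new Set<string>()
  for (const word of description.toLowerCase().split(/\s+/)) {
    if (word.length > 4) keywords.add(word.replace(/[^a-z]/g, ''))
  }
  return keywords
}

// "from" (4 chars) is dropped; trailing punctuation is stripped from "waters."
const kw = extractKeywords('Cleaning plastic waste from coastal waters.')
```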


@ -0,0 +1,323 @@
/**
* AI Usage Tracking Utility
*
* Logs AI API usage to the database for cost tracking and monitoring.
* Calculates estimated costs based on model pricing.
*/
import { prisma } from '@/lib/prisma'
import { Decimal } from '@prisma/client/runtime/library'
import type { Prisma } from '@prisma/client'
// ─── Types ───────────────────────────────────────────────────────────────────
export type AIAction =
| 'ASSIGNMENT'
| 'FILTERING'
| 'AWARD_ELIGIBILITY'
| 'MENTOR_MATCHING'
export type AIStatus = 'SUCCESS' | 'PARTIAL' | 'ERROR'
export interface LogAIUsageInput {
userId?: string
action: AIAction
entityType?: string
entityId?: string
model: string
promptTokens: number
completionTokens: number
totalTokens: number
batchSize?: number
itemsProcessed?: number
status: AIStatus
errorMessage?: string
detailsJson?: Record<string, unknown>
}
export interface TokenUsageResult {
promptTokens: number
completionTokens: number
totalTokens: number
}
// ─── Model Pricing (per 1M tokens) ───────────────────────────────────────────
interface ModelPricing {
input: number // $ per 1M input tokens
output: number // $ per 1M output tokens
}
/**
* OpenAI model pricing as of 2024/2025
* Prices in USD per 1 million tokens
*/
const MODEL_PRICING: Record<string, ModelPricing> = {
// GPT-4o series
'gpt-4o': { input: 2.5, output: 10.0 },
'gpt-4o-2024-11-20': { input: 2.5, output: 10.0 },
'gpt-4o-2024-08-06': { input: 2.5, output: 10.0 },
'gpt-4o-2024-05-13': { input: 5.0, output: 15.0 },
'gpt-4o-mini': { input: 0.15, output: 0.6 },
'gpt-4o-mini-2024-07-18': { input: 0.15, output: 0.6 },
// GPT-4 Turbo series
'gpt-4-turbo': { input: 10.0, output: 30.0 },
'gpt-4-turbo-2024-04-09': { input: 10.0, output: 30.0 },
'gpt-4-turbo-preview': { input: 10.0, output: 30.0 },
'gpt-4-1106-preview': { input: 10.0, output: 30.0 },
'gpt-4-0125-preview': { input: 10.0, output: 30.0 },
// GPT-4 (base)
'gpt-4': { input: 30.0, output: 60.0 },
'gpt-4-0613': { input: 30.0, output: 60.0 },
'gpt-4-32k': { input: 60.0, output: 120.0 },
'gpt-4-32k-0613': { input: 60.0, output: 120.0 },
// GPT-3.5 Turbo series
'gpt-3.5-turbo': { input: 0.5, output: 1.5 },
'gpt-3.5-turbo-0125': { input: 0.5, output: 1.5 },
'gpt-3.5-turbo-1106': { input: 1.0, output: 2.0 },
'gpt-3.5-turbo-16k': { input: 3.0, output: 4.0 },
// o1 reasoning models
'o1': { input: 15.0, output: 60.0 },
'o1-2024-12-17': { input: 15.0, output: 60.0 },
'o1-preview': { input: 15.0, output: 60.0 },
'o1-preview-2024-09-12': { input: 15.0, output: 60.0 },
'o1-mini': { input: 3.0, output: 12.0 },
'o1-mini-2024-09-12': { input: 3.0, output: 12.0 },
// o3 reasoning models
'o3-mini': { input: 1.1, output: 4.4 },
'o3-mini-2025-01-31': { input: 1.1, output: 4.4 },
// o4 reasoning models (future-proofing)
'o4-mini': { input: 1.1, output: 4.4 },
}
// Default pricing for unknown models (conservative estimate)
const DEFAULT_PRICING: ModelPricing = { input: 5.0, output: 15.0 }
// ─── Cost Calculation ────────────────────────────────────────────────────────
/**
* Get pricing for a model, with fallback for unknown models
*/
function getModelPricing(model: string): ModelPricing {
// Exact match
if (MODEL_PRICING[model]) {
return MODEL_PRICING[model]
}
  // Try to match by prefix (longest keys first, so e.g. 'gpt-4o-mini-…'
  // matches 'gpt-4o-mini' pricing rather than the more expensive 'gpt-4o')
  const modelLower = model.toLowerCase()
  const byLongestKey = Object.entries(MODEL_PRICING).sort((a, b) => b[0].length - a[0].length)
  for (const [key, pricing] of byLongestKey) {
    if (modelLower.startsWith(key.toLowerCase())) {
      return pricing
    }
  }
// Fallback based on model type
if (modelLower.startsWith('gpt-4o-mini')) {
return MODEL_PRICING['gpt-4o-mini']
}
if (modelLower.startsWith('gpt-4o')) {
return MODEL_PRICING['gpt-4o']
}
if (modelLower.startsWith('gpt-4')) {
return MODEL_PRICING['gpt-4-turbo']
}
if (modelLower.startsWith('gpt-3.5')) {
return MODEL_PRICING['gpt-3.5-turbo']
}
if (modelLower.startsWith('o1-mini')) {
return MODEL_PRICING['o1-mini']
}
if (modelLower.startsWith('o1')) {
return MODEL_PRICING['o1']
}
if (modelLower.startsWith('o3-mini')) {
return MODEL_PRICING['o3-mini']
}
if (modelLower.startsWith('o3')) {
return MODEL_PRICING['o3-mini'] // Conservative estimate
}
if (modelLower.startsWith('o4')) {
return MODEL_PRICING['o4-mini'] || DEFAULT_PRICING
}
return DEFAULT_PRICING
}
/**
* Calculate estimated cost in USD for a given model and token usage
*/
export function calculateCost(
model: string,
promptTokens: number,
completionTokens: number
): number {
const pricing = getModelPricing(model)
const inputCost = (promptTokens / 1_000_000) * pricing.input
const outputCost = (completionTokens / 1_000_000) * pricing.output
return inputCost + outputCost
}
/**
* Format cost for display
*/
export function formatCost(costUsd: number): string {
if (costUsd < 0.01) {
    return `${(costUsd * 100).toFixed(3)}¢`
}
return `$${costUsd.toFixed(4)}`
}
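As a worked example of the arithmetic above: 12,000 prompt tokens plus 3,000 completion tokens on `gpt-4o` ($2.50 / $10.00 per 1M) cost $0.03 + $0.03 = $0.06. A self-contained restatement (the cent branch of the formatter is written without a leading `$`, since mixing `$` and `¢` would be contradictory):

```typescript
// Self-contained restatement of the per-1M-token cost formula (gpt-4o rates).
const rates = { input: 2.5, output: 10.0 } // USD per 1M tokens

function estimateCost(promptTokens: number, completionTokens: number): number {
  return (promptTokens / 1_000_000) * rates.input +
         (completionTokens / 1_000_000) * rates.output
}

// Mirrors formatCost: sub-cent amounts are displayed in cents.
function fmt(usd: number): string {
  return usd < 0.01 ? `${(usd * 100).toFixed(3)}¢` : `$${usd.toFixed(4)}`
}

const usd = estimateCost(12_000, 3_000) // 0.03 + 0.03 = 0.06
```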
// ─── Logging ─────────────────────────────────────────────────────────────────
/**
* Log AI usage to the database
*/
export async function logAIUsage(input: LogAIUsageInput): Promise<void> {
try {
const estimatedCost = calculateCost(
input.model,
input.promptTokens,
input.completionTokens
)
await prisma.aIUsageLog.create({
data: {
userId: input.userId,
action: input.action,
entityType: input.entityType,
entityId: input.entityId,
model: input.model,
promptTokens: input.promptTokens,
completionTokens: input.completionTokens,
totalTokens: input.totalTokens,
estimatedCostUsd: new Decimal(estimatedCost),
batchSize: input.batchSize,
itemsProcessed: input.itemsProcessed,
status: input.status,
errorMessage: input.errorMessage,
detailsJson: input.detailsJson as Prisma.InputJsonValue | undefined,
},
})
} catch (error) {
// Don't let logging failures break the main operation
console.error('[AI Usage] Failed to log usage:', error)
}
}
/**
* Extract token usage from OpenAI API response
*/
export function extractTokenUsage(
response: { usage?: { prompt_tokens?: number; completion_tokens?: number; total_tokens?: number } }
): TokenUsageResult {
return {
promptTokens: response.usage?.prompt_tokens ?? 0,
completionTokens: response.usage?.completion_tokens ?? 0,
totalTokens: response.usage?.total_tokens ?? 0,
}
}
// ─── Statistics ──────────────────────────────────────────────────────────────
export interface AIUsageStats {
totalTokens: number
totalCost: number
byAction: Record<string, { tokens: number; cost: number; count: number }>
byModel: Record<string, { tokens: number; cost: number; count: number }>
}
/**
* Get AI usage statistics for a date range
*/
export async function getAIUsageStats(
startDate?: Date,
endDate?: Date
): Promise<AIUsageStats> {
const where: { createdAt?: { gte?: Date; lte?: Date } } = {}
if (startDate || endDate) {
where.createdAt = {}
if (startDate) where.createdAt.gte = startDate
if (endDate) where.createdAt.lte = endDate
}
const logs = await prisma.aIUsageLog.findMany({
where,
select: {
action: true,
model: true,
totalTokens: true,
estimatedCostUsd: true,
},
})
const stats: AIUsageStats = {
totalTokens: 0,
totalCost: 0,
byAction: {},
byModel: {},
}
for (const log of logs) {
const cost = log.estimatedCostUsd?.toNumber() ?? 0
stats.totalTokens += log.totalTokens
stats.totalCost += cost
// By action
if (!stats.byAction[log.action]) {
stats.byAction[log.action] = { tokens: 0, cost: 0, count: 0 }
}
stats.byAction[log.action].tokens += log.totalTokens
stats.byAction[log.action].cost += cost
stats.byAction[log.action].count += 1
// By model
if (!stats.byModel[log.model]) {
stats.byModel[log.model] = { tokens: 0, cost: 0, count: 0 }
}
stats.byModel[log.model].tokens += log.totalTokens
stats.byModel[log.model].cost += cost
stats.byModel[log.model].count += 1
}
return stats
}
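The accumulation in `getAIUsageStats` reduces to a group-by over in-memory rows; `aggregate` below is an illustrative helper showing just that pattern for the per-action buckets:

```typescript
// Illustrative group-by accumulation, mirroring getAIUsageStats' byAction buckets.
type Bucket = { tokens: number; cost: number; count: number }

function aggregate(
  logs: { action: string; totalTokens: number; cost: number }[]
): Record<string, Bucket> {
  const byAction: Record<string, Bucket> = {}
  for (const log of logs) {
    // Lazily create the bucket on first sight of an action
    const b = (byAction[log.action] ??= { tokens: 0, cost: 0, count: 0 })
    b.tokens += log.totalTokens
    b.cost += log.cost
    b.count += 1
  }
  return byAction
}

const agg = aggregate([
  { action: 'FILTERING', totalTokens: 1200, cost: 0.01 },
  { action: 'FILTERING', totalTokens: 800, cost: 0.005 },
  { action: 'ASSIGNMENT', totalTokens: 500, cost: 0.002 },
])
```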
/**
* Get current month's AI usage cost
*/
export async function getCurrentMonthCost(): Promise<{
cost: number
tokens: number
requestCount: number
}> {
const startOfMonth = new Date()
startOfMonth.setDate(1)
startOfMonth.setHours(0, 0, 0, 0)
const logs = await prisma.aIUsageLog.findMany({
where: {
createdAt: { gte: startOfMonth },
},
select: {
totalTokens: true,
estimatedCostUsd: true,
},
})
return {
cost: logs.reduce((sum, log) => sum + (log.estimatedCostUsd?.toNumber() ?? 0), 0),
tokens: logs.reduce((sum, log) => sum + log.totalTokens, 0),
requestCount: logs.length,
}
}
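`getCurrentMonthCost` derives its query window by resetting the current date to the first of the month at local midnight. The same mutation sequence, extracted into a standalone sketch (`startOfMonth` is an illustrative name):

```typescript
// Illustrative: the same Date mutations getCurrentMonthCost applies in place.
function startOfMonth(now: Date): Date {
  const d = new Date(now) // copy so the caller's Date is untouched
  d.setDate(1)            // first day of the current month
  d.setHours(0, 0, 0, 0)  // local midnight
  return d
}

const s = startOfMonth(new Date(2026, 1, 3, 11, 58, 12)) // 3 Feb 2026, local time
```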