Optimize AI system with batching, token tracking, and GDPR compliance
Build and Push Docker Image / build (push) Successful in 9m11s
- Add `AIUsageLog` model for persistent token/cost tracking
- Implement batched processing for all AI services:
  - Assignment: 15 projects/batch
  - Filtering: 20 projects/batch
  - Award eligibility: 20 projects/batch
  - Mentor matching: 15 projects/batch
- Create unified error classification (`ai-errors.ts`)
- Enhance anonymization with comprehensive project data
- Add AI usage dashboard to Settings page
- Add usage stats endpoints to settings router
- Create AI system documentation (5 files)
- Create GDPR compliance documentation (2 files)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
This commit is contained in:
parent
a72e815d3a
commit
928b1c65dc

# AI Configuration Guide

## Admin Settings

Navigate to **Settings → AI** to configure AI features.

### Available Settings

| Setting | Description | Default |
|---------|-------------|---------|
| `ai_enabled` | Master switch for AI features | `true` |
| `ai_provider` | AI provider (currently OpenAI only) | `openai` |
| `ai_model` | Model to use | `gpt-4o` |
| `openai_api_key` | API key (encrypted) | - |
| `ai_send_descriptions` | Include project descriptions | `true` |

## Supported Models

### Standard Models (GPT)

| Model | Speed | Quality | Cost | Recommended For |
|-------|-------|---------|------|-----------------|
| `gpt-4o` | Fast | Best | Medium | Production use |
| `gpt-4o-mini` | Very Fast | Good | Low | High-volume, cost-sensitive |
| `gpt-4-turbo` | Medium | Very Good | High | Complex analysis |
| `gpt-3.5-turbo` | Very Fast | Basic | Very Low | Simple tasks only |

### Reasoning Models (o-series)

| Model | Speed | Quality | Cost | Recommended For |
|-------|-------|---------|------|-----------------|
| `o1` | Slow | Excellent | Very High | Complex reasoning tasks |
| `o1-mini` | Medium | Very Good | High | Moderate complexity |
| `o3-mini` | Medium | Good | Medium | Cost-effective reasoning |

**Note:** Reasoning models use different API parameters:
- `max_completion_tokens` instead of `max_tokens`
- No `temperature` parameter
- No `response_format: json_object`
- System messages use the "developer" role instead of "system"

The platform handles these differences automatically via `buildCompletionParams()`.
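As an illustration, the parameter switching can be sketched like this (a simplified sketch, not the actual `lib/openai.ts` implementation; the helper shapes are assumptions):

```typescript
// Sketch only: the real buildCompletionParams() lives in lib/openai.ts.
type ChatMessage = { role: string; content: string }

function isReasoningModel(model: string): boolean {
  // o-series models: o1, o1-mini, o3-mini, ...
  return /^o\d/.test(model)
}

function buildCompletionParams(
  model: string,
  messages: ChatMessage[],
  maxTokens: number
): Record<string, unknown> {
  if (isReasoningModel(model)) {
    return {
      model,
      // System messages become the "developer" role; no temperature or response_format.
      messages: messages.map(m =>
        m.role === 'system' ? { ...m, role: 'developer' } : m
      ),
      max_completion_tokens: maxTokens,
    }
  }
  return {
    model,
    messages,
    max_tokens: maxTokens,
    temperature: 0.2,
    response_format: { type: 'json_object' },
  }
}
```

Callers never need to know which family the configured model belongs to; they pass the same messages either way.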

## Cost Estimates

### Per 1M Tokens (USD)

| Model | Input | Output |
|-------|-------|--------|
| gpt-4o | $2.50 | $10.00 |
| gpt-4o-mini | $0.15 | $0.60 |
| gpt-4-turbo | $10.00 | $30.00 |
| gpt-3.5-turbo | $0.50 | $1.50 |
| o1 | $15.00 | $60.00 |
| o1-mini | $3.00 | $12.00 |
| o3-mini | $1.10 | $4.40 |
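Using the table above, a per-call estimate can be computed as `input_tokens × input_price + output_tokens × output_price`, divided by one million. A sketch (the platform's own `calculateCost()` in `server/utils/ai-usage.ts` may differ):

```typescript
// Prices in USD per 1M tokens, taken from the table above.
const PRICES: Record<string, { input: number; output: number }> = {
  'gpt-4o': { input: 2.5, output: 10 },
  'gpt-4o-mini': { input: 0.15, output: 0.6 },
  'gpt-4-turbo': { input: 10, output: 30 },
  'gpt-3.5-turbo': { input: 0.5, output: 1.5 },
  'o1': { input: 15, output: 60 },
  'o1-mini': { input: 3, output: 12 },
  'o3-mini': { input: 1.1, output: 4.4 },
}

function estimateCostUsd(model: string, promptTokens: number, completionTokens: number): number {
  const p = PRICES[model]
  if (!p) return 0 // unknown model: no estimate
  return (promptTokens * p.input + completionTokens * p.output) / 1_000_000
}
```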

### Typical Usage Per Operation

| Operation | Projects | Est. Tokens | Est. Cost (gpt-4o) |
|-----------|----------|-------------|--------------------|
| Filter 100 projects | 100 | ~10,000 | ~$0.10 |
| Assign 50 projects | 50 | ~15,000 | ~$0.15 |
| Award eligibility | 100 | ~10,000 | ~$0.10 |
| Mentor matching | 60 | ~12,000 | ~$0.12 |

## Rate Limits

OpenAI enforces rate limits based on your account tier:

| Tier | Requests/Min | Tokens/Min |
|------|--------------|------------|
| Tier 1 | 500 | 30,000 |
| Tier 2 | 5,000 | 450,000 |
| Tier 3+ | Higher | Higher |

The platform handles rate limits with:
- Batch processing (reduces request count)
- Error classification (detects rate-limit errors)
- Manual retry guidance in the UI

## Environment Variables

```env
# Required for AI features
OPENAI_API_KEY=sk-your-api-key

# Optional overrides (normally set via admin UI)
OPENAI_MODEL=gpt-4o
```

## Testing the Connection

1. Go to **Settings → AI**
2. Enter your OpenAI API key
3. Click **Save AI Settings**
4. Click **Test Connection**

The test verifies:
- API key validity
- Model availability
- A basic request/response round-trip

## Monitoring Usage

### Admin Dashboard

Navigate to **Settings → AI** to see:
- Current month cost
- Token usage by feature
- Usage by model
- 30-day usage trend

### Database Queries

```sql
-- Current month usage
SELECT
  action,
  SUM(total_tokens) AS tokens,
  SUM(estimated_cost_usd) AS cost
FROM ai_usage_log
WHERE created_at >= date_trunc('month', NOW())
GROUP BY action;

-- Top users by cost
SELECT
  u.email,
  SUM(l.estimated_cost_usd) AS total_cost
FROM ai_usage_log l
JOIN users u ON l.user_id = u.id
GROUP BY u.id
ORDER BY total_cost DESC
LIMIT 10;
```

## Troubleshooting

### "Model not found"

- Verify the model is available on your API key's tier
- Some models (o1, o3) require specific API access
- Try a widely available model such as `gpt-4o-mini`

### "Rate limit exceeded"

- Wait a few minutes before retrying
- Consider using a smaller batch size
- Upgrade your OpenAI account tier

### "All projects flagged"

1. Check **Settings → AI** for a correct API key
2. Verify the model is available
3. Check console logs for specific error messages
4. Test the connection with the button in settings

### "Invalid API key"

1. Verify the key starts with `sk-`
2. Check that the key hasn't been revoked in the OpenAI dashboard
3. Ensure there is no extra whitespace in the key

## Best Practices

1. **Use gpt-4o-mini** for high-volume operations (e.g. filtering many projects)
2. **Use gpt-4o** for critical decisions (final assignments)
3. **Monitor costs** regularly via the usage dashboard
4. **Test with small batches** before running on the full dataset
5. **Keep descriptions enabled** for better matching accuracy

## See Also

- [AI System Architecture](./ai-system.md)
- [AI Services Reference](./ai-services.md)
- [AI Error Handling](./ai-errors.md)
# AI Error Handling Guide

## Error Types

The AI system classifies errors into these categories:

| Error Type | Cause | User Message | Retryable |
|------------|-------|--------------|-----------|
| `rate_limit` | Too many requests | "Rate limit exceeded. Wait a few minutes." | Yes |
| `quota_exceeded` | Billing limit | "API quota exceeded. Check billing." | No |
| `model_not_found` | Invalid model | "Model not available. Check settings." | No |
| `invalid_api_key` | Bad API key | "Invalid API key. Check settings." | No |
| `context_length` | Prompt too large | "Request too large. Try fewer items." | Yes* |
| `parse_error` | AI returned invalid JSON | "Response parse error. Flagged for review." | Yes |
| `timeout` | Request took too long | "Request timed out. Try again." | Yes |
| `network_error` | Connection issue | "Network error. Check connection." | Yes |
| `content_filter` | Content blocked | "Content filtered. Check input data." | No |
| `server_error` | OpenAI server issue | "Server error. Try again later." | Yes |

*Context-length errors can be retried with smaller batches.

## Error Classification

```typescript
import { classifyAIError, shouldRetry, getRetryDelay } from '@/server/services/ai-errors'

try {
  const response = await openai.chat.completions.create(params)
} catch (error) {
  const classified = classifyAIError(error)

  console.error(`AI Error: ${classified.type} - ${classified.message}`)

  if (shouldRetry(classified.type)) {
    const delay = getRetryDelay(classified.type)
    // Wait `delay` ms, then retry the request
  } else {
    // Fall back to the algorithmic implementation
  }
}
```

## Graceful Degradation

When AI fails, the platform handles it automatically:

### AI Assignment
1. Logs the error
2. Falls back to algorithmic assignment:
   - Matches by expertise tag overlap
   - Balances workload across jurors
   - Respects constraints (max assignments)

### AI Filtering
1. Logs the error
2. Flags all projects for manual review
3. Returns an error message to the admin

### Award Eligibility
1. Logs the error
2. Returns all projects as "needs manual review"
3. The admin can apply deterministic rules instead

### Mentor Matching
1. Logs the error
2. Falls back to keyword-based matching
3. Uses availability scoring

## Retry Strategy

| Error Type | Retry Count | Delay |
|------------|-------------|-------|
| `rate_limit` | 3 | Exponential (1s, 2s, 4s) |
| `timeout` | 2 | Fixed 5s |
| `network_error` | 3 | Exponential (1s, 2s, 4s) |
| `server_error` | 3 | Exponential (2s, 4s, 8s) |
| `parse_error` | 1 | None |

## Monitoring

### Error Logging

All AI errors are logged to:
1. The console (development)
2. The `AIUsageLog` table with `status: 'ERROR'`
3. The `AuditLog` for security-relevant failures

### Checking Errors

```sql
-- Recent AI errors
SELECT
  created_at,
  action,
  model,
  error_message
FROM ai_usage_log
WHERE status = 'ERROR'
ORDER BY created_at DESC
LIMIT 20;

-- Error rate by action
SELECT
  action,
  COUNT(*) FILTER (WHERE status = 'ERROR') AS errors,
  COUNT(*) AS total,
  ROUND(100.0 * COUNT(*) FILTER (WHERE status = 'ERROR') / COUNT(*), 2) AS error_rate
FROM ai_usage_log
GROUP BY action;
```

## Troubleshooting

### High Error Rate

1. Check the OpenAI status page for outages
2. Verify the API key is valid and not rate-limited
3. Review error messages in the logs
4. Consider switching to a different model

### Consistent Parse Errors

1. The model may be returning malformed JSON
2. Try a more capable model (gpt-4o instead of gpt-3.5-turbo)
3. Check whether prompts are being truncated
4. Review recent responses in the logs

### All Requests Failing

1. Test the connection in Settings → AI
2. Verify the API key hasn't been revoked
3. Check billing status in the OpenAI dashboard
4. Review network connectivity

### Slow Responses

1. Consider using gpt-4o-mini for speed
2. Reduce batch sizes
3. Check for rate limiting (429 errors)
4. Monitor OpenAI latency

## Error Response Format

When errors occur, services return structured responses:

```typescript
// AI Assignment error response
{
  success: false,
  suggestions: [],
  error: "Rate limit exceeded. Wait a few minutes and try again.",
  fallbackUsed: true,
}

// AI Filtering error response
{
  projectId: "...",
  meetsCriteria: false,
  confidence: 0,
  reasoning: "AI error: Rate limit exceeded",
  flagForReview: true,
}
```

## Implementing Custom Error Handling

```typescript
import {
  classifyAIError,
  shouldRetry,
  getRetryDelay,
  getUserFriendlyMessage,
  logAIError,
} from '@/server/services/ai-errors'

async function callAIWithRetry<T>(
  operation: () => Promise<T>,
  serviceName: string,
  maxRetries: number = 3
): Promise<T> {
  let lastError: Error | null = null

  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      return await operation()
    } catch (error) {
      lastError = error as Error
      const classified = classifyAIError(error)
      logAIError(serviceName, 'operation', classified)

      if (!shouldRetry(classified.type) || attempt === maxRetries) {
        throw new Error(getUserFriendlyMessage(classified.type))
      }

      // Exponential backoff: the base delay doubles with each attempt
      const delay = getRetryDelay(classified.type) * 2 ** (attempt - 1)
      await new Promise(resolve => setTimeout(resolve, delay))
    }
  }

  throw lastError
}
```

## See Also

- [AI System Architecture](./ai-system.md)
- [AI Configuration Guide](./ai-configuration.md)
- [AI Services Reference](./ai-services.md)
# AI Prompts Reference

This document describes the prompts used by each AI service. All prompts are optimized for token efficiency while maintaining accuracy.

## Design Principles

1. **Concise system prompts** - Under 100 tokens where possible
2. **Structured output** - JSON format for reliable parsing
3. **Clear field names** - Consistent naming across services
4. **Score ranges** - 0-1 for confidence, 1-10 for quality

## Filtering Prompt

**Purpose:** Evaluate projects against admin-defined criteria

### System Prompt
```
Project screening assistant. Evaluate each project against the criteria.
Return JSON: {"projects": [{project_id, meets_criteria: bool, confidence: 0-1, reasoning: str, quality_score: 1-10, spam_risk: bool}]}
Assess description quality and relevance objectively.
```

### User Prompt Template
```
CRITERIA: {criteria_text}
PROJECTS: {anonymized_project_array}
Evaluate each project against the criteria. Return JSON.
```

### Example Response
```json
{
  "projects": [
    {
      "project_id": "P1",
      "meets_criteria": true,
      "confidence": 0.9,
      "reasoning": "Project focuses on coral reef restoration, matching ocean conservation criteria",
      "quality_score": 8,
      "spam_risk": false
    }
  ]
}
```

---

## Assignment Prompt

**Purpose:** Match jurors to projects by expertise

### System Prompt
```
Match jurors to projects by expertise. Return JSON assignments.
Each: {juror_id, project_id, confidence_score: 0-1, expertise_match_score: 0-1, reasoning: str (1-2 sentences)}
Distribute workload fairly. Avoid assigning jurors at capacity.
```

### User Prompt Template
```
JURORS: {anonymized_juror_array}
PROJECTS: {anonymized_project_array}
CONSTRAINTS: {N} reviews/project, max {M}/juror
EXISTING: {existing_assignments}
Return JSON: {"assignments": [...]}
```

### Example Response
```json
{
  "assignments": [
    {
      "juror_id": "juror_001",
      "project_id": "project_005",
      "confidence_score": 0.85,
      "expertise_match_score": 0.9,
      "reasoning": "Juror expertise in marine biology aligns with coral restoration project"
    }
  ]
}
```

---

## Award Eligibility Prompt

**Purpose:** Determine project eligibility for special awards

### System Prompt
```
Award eligibility evaluator. Evaluate projects against criteria, return JSON.
Format: {"evaluations": [{project_id, eligible: bool, confidence: 0-1, reasoning: str}]}
Be objective. Base evaluation only on provided data. No personal identifiers in reasoning.
```

### User Prompt Template
```
CRITERIA: {criteria_text}
PROJECTS: {anonymized_project_array}
Evaluate eligibility for each project.
```

### Example Response
```json
{
  "evaluations": [
    {
      "project_id": "P3",
      "eligible": true,
      "confidence": 0.95,
      "reasoning": "Project is based in Italy and focuses on Mediterranean biodiversity"
    }
  ]
}
```

---

## Mentor Matching Prompt

**Purpose:** Recommend mentors for projects

### System Prompt
```
Match mentors to projects by expertise. Return JSON.
Format for each project: {"matches": [{project_id, mentor_matches: [{mentor_index, confidence_score: 0-1, expertise_match_score: 0-1, reasoning: str}]}]}
Rank by suitability. Consider expertise alignment and availability.
```

### User Prompt Template
```
PROJECTS:
P1: Category=STARTUP, Issue=HABITAT_RESTORATION, Tags=[coral, reef], Desc=Project description...
P2: ...

MENTORS:
0: Expertise=[marine biology, coral], Availability=2/5
1: Expertise=[business development], Availability=0/3
...

For each project, rank the top {N} mentors.
```

### Example Response
```json
{
  "matches": [
    {
      "project_id": "P1",
      "mentor_matches": [
        {
          "mentor_index": 0,
          "confidence_score": 0.92,
          "expertise_match_score": 0.95,
          "reasoning": "Marine biology expertise directly matches coral restoration focus"
        }
      ]
    }
  ]
}
```

---

## Anonymized Data Structure

All projects sent to the AI use this structure:

```typescript
interface AnonymizedProjectForAI {
  project_id: string            // P1, P2, etc.
  title: string                 // Sanitized (PII removed)
  description: string           // Truncated + PII stripped
  category: string | null       // STARTUP | BUSINESS_CONCEPT
  ocean_issue: string | null
  country: string | null
  region: string | null
  institution: string | null
  tags: string[]
  founded_year: number | null
  team_size: number
  has_description: boolean
  file_count: number
  file_types: string[]
  wants_mentorship: boolean
  submission_source: string
  submitted_date: string | null // YYYY-MM-DD
}
```

### What Gets Stripped
- Team/company names
- Email addresses
- Phone numbers
- External URLs
- Real project/user IDs
- Internal comments
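A simplified sketch of the kind of stripping involved (the real logic lives in `server/services/anonymization.ts`; the patterns here are illustrative, not exhaustive):

```typescript
// Illustrative only: the production stripping lives in server/services/anonymization.ts.
const EMAIL_RE = /[\w.+-]+@[\w-]+\.[\w.]+/g
const URL_RE = /https?:\/\/\S+/g
const PHONE_RE = /\+?\d[\d\s().-]{7,}\d/g

function stripPII(text: string): string {
  return text
    .replace(EMAIL_RE, '[email]')
    .replace(URL_RE, '[url]')
    .replace(PHONE_RE, '[phone]')
}
```

Replacing with placeholders such as `[email]` (rather than deleting) keeps the surrounding sentence readable for the model while removing the identifier itself.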

---

## Token Optimization Tips

1. **Batch projects** - Process 15-20 per request
2. **Truncate descriptions** - 300-500 characters, depending on the task
3. **Use abbreviated fields** - `desc` vs `description`
4. **Compress constraints** - Inline them in the prompt
5. **Request specific fields** - Only what you need

## Prompt Versioning

| Service | Version | Last Updated |
|---------|---------|--------------|
| Filtering | 2.0 | 2025-01 |
| Assignment | 2.0 | 2025-01 |
| Award Eligibility | 2.0 | 2025-01 |
| Mentor Matching | 2.0 | 2025-01 |

## See Also

- [AI System Architecture](./ai-system.md)
- [AI Services Reference](./ai-services.md)
- [AI Configuration Guide](./ai-configuration.md)
# AI Services Reference

## 1. AI Filtering Service

**File:** `src/server/services/ai-filtering.ts`

**Purpose:** Evaluate projects against admin-defined criteria text

### Input
- List of projects (anonymized)
- Criteria text (e.g., "Projects must be based in the Mediterranean region")
- Rule configuration (PASS/REJECT/FLAG actions)

### Output
Per-project result:
- `meets_criteria` - Boolean
- `confidence` - 0-1 score
- `reasoning` - Explanation
- `quality_score` - 1-10 rating
- `spam_risk` - Boolean flag

### Configuration
- **Batch Size:** 20 projects per API call
- **Description Limit:** 500 characters
- **Token Usage:** ~1,500-2,500 tokens per batch

### Example Criteria
- "Filter out any project without a description"
- "Only include projects founded after 2020"
- "Reject projects with fewer than 2 team members"
- "Projects must be based in the Mediterranean region"

### Usage
```typescript
import { aiFilterProjects } from '@/server/services/ai-filtering'

const results = await aiFilterProjects(
  projects,
  'Only include projects with ocean conservation focus',
  userId,
  roundId
)
```

---

## 2. AI Assignment Service

**File:** `src/server/services/ai-assignment.ts`

**Purpose:** Match jurors to projects based on expertise alignment

### Input
- List of jurors with expertise tags
- List of projects with tags/category
- Constraints:
  - Required reviews per project
  - Max assignments per juror
  - Existing assignments (to avoid duplicates)

### Output
Suggested assignments:
- `jurorId` - Juror to assign
- `projectId` - Project to assign
- `confidenceScore` - 0-1 match confidence
- `expertiseMatchScore` - 0-1 expertise overlap
- `reasoning` - Explanation

### Configuration
- **Batch Size:** 15 projects per batch (all jurors included)
- **Description Limit:** 300 characters
- **Token Usage:** ~2,000-3,500 tokens per batch

### Fallback Algorithm
When AI is unavailable, the service uses:
1. Tag overlap scoring (60% weight)
2. Load balancing (40% weight)
3. Constraint satisfaction
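A sketch of what that weighting could look like (the 60/40 split comes from the list above; the shapes and names are otherwise illustrative, the real fallback is in `ai-assignment.ts`):

```typescript
// Illustrative fallback scoring; the real logic lives in src/server/services/ai-assignment.ts.
interface Juror {
  id: string
  expertiseTags: string[]
  currentAssignments: number
  maxAssignments: number
}

function fallbackScore(juror: Juror, projectTags: string[]): number {
  // Constraint satisfaction: a juror at capacity is never assigned.
  if (juror.currentAssignments >= juror.maxAssignments) return 0

  // Tag overlap: fraction of the project's tags the juror covers.
  const expertise = new Set(juror.expertiseTags.map(t => t.toLowerCase()))
  const overlap = projectTags.filter(t => expertise.has(t.toLowerCase())).length
  const tagScore = projectTags.length ? overlap / projectTags.length : 0

  // Load balancing: less-loaded jurors score higher.
  const loadScore = 1 - juror.currentAssignments / juror.maxAssignments

  return 0.6 * tagScore + 0.4 * loadScore
}
```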

### Usage
```typescript
import { generateAIAssignments } from '@/server/services/ai-assignment'

const result = await generateAIAssignments(
  jurors,
  projects,
  {
    requiredReviewsPerProject: 3,
    maxAssignmentsPerJuror: 10,
    existingAssignments: [],
  },
  userId,
  roundId
)
```

---

## 3. Award Eligibility Service

**File:** `src/server/services/ai-award-eligibility.ts`

**Purpose:** Determine which projects qualify for special awards

### Input
- Award criteria text (plain language)
- List of projects (anonymized)
- Optional: auto-tag rules (field-based matching)

### Output
Per-project:
- `eligible` - Boolean
- `confidence` - 0-1 score
- `reasoning` - Explanation
- `method` - 'AI' or 'AUTO'

### Configuration
- **Batch Size:** 20 projects per API call
- **Description Limit:** 400 characters
- **Token Usage:** ~1,500-2,500 tokens per batch

### Auto-Tag Rules
Deterministic rules can be combined with AI:
```typescript
const rules: AutoTagRule[] = [
  { field: 'country', operator: 'equals', value: 'Italy' },
  { field: 'competitionCategory', operator: 'equals', value: 'STARTUP' },
]
```
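A sketch of how such rules might be evaluated against a project (the real `applyAutoTagRules` lives in `ai-award-eligibility.ts`; the `contains` operator and the matching semantics shown here are assumptions):

```typescript
// Illustrative; the production applyAutoTagRules lives in ai-award-eligibility.ts.
interface AutoTagRule {
  field: string
  operator: 'equals' | 'contains'
  value: string
}

function matchesRules(project: Record<string, unknown>, rules: AutoTagRule[]): boolean {
  // A project matches only if every rule holds (AND semantics).
  return rules.every(rule => {
    const actual = String(project[rule.field] ?? '')
    return rule.operator === 'equals'
      ? actual === rule.value
      : actual.toLowerCase().includes(rule.value.toLowerCase())
  })
}
```

Because these checks are deterministic, they cost no tokens and can pre-tag clear-cut cases before any AI call.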

### Usage
```typescript
import { aiInterpretCriteria, applyAutoTagRules } from '@/server/services/ai-award-eligibility'

// Deterministic matching
const autoResults = applyAutoTagRules(rules, projects)

// AI-based criteria interpretation
const aiResults = await aiInterpretCriteria(
  'Projects focusing on marine biodiversity',
  projects,
  userId,
  awardId
)
```

---

## 4. Mentor Matching Service

**File:** `src/server/services/mentor-matching.ts`

**Purpose:** Recommend mentors for projects based on expertise

### Input
- Project details (single or batch)
- Available mentors with expertise tags and availability

### Output
Ranked list of mentor matches:
- `mentorId` - Mentor ID
- `confidenceScore` - 0-1 overall match
- `expertiseMatchScore` - 0-1 expertise overlap
- `reasoning` - Explanation

### Configuration
- **Batch Size:** 15 projects per batch
- **Description Limit:** 350 characters
- **Token Usage:** ~1,500-2,500 tokens per batch

### Fallback Algorithm
Keyword-based matching when AI is unavailable:
1. Extract keywords from project tags/description
2. Match against mentor expertise tags
3. Factor in availability (assignments vs. max)
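The steps above can be sketched as follows (illustrative only; the real logic is in `mentor-matching.ts`, and the exact combination of match and availability is an assumption):

```typescript
// Illustrative keyword fallback; the real version lives in src/server/services/mentor-matching.ts.
interface Mentor {
  id: string
  expertiseTags: string[]
  assignments: number
  maxAssignments: number
}

function mentorFallbackScore(mentor: Mentor, projectKeywords: string[]): number {
  // A fully booked mentor is never suggested.
  if (mentor.assignments >= mentor.maxAssignments) return 0

  // Keyword hits against the mentor's expertise tags.
  const expertise = new Set(mentor.expertiseTags.map(t => t.toLowerCase()))
  const hits = projectKeywords.filter(k => expertise.has(k.toLowerCase())).length
  const matchScore = projectKeywords.length ? hits / projectKeywords.length : 0

  // Scale by remaining availability (e.g. 2/5 assigned → 0.6).
  const availability = 1 - mentor.assignments / mentor.maxAssignments
  return matchScore * availability
}
```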

### Usage
```typescript
import {
  getAIMentorSuggestions,
  getAIMentorSuggestionsBatch,
} from '@/server/services/mentor-matching'

// Single project
const matches = await getAIMentorSuggestions(prisma, projectId, 5, userId)

// Batch processing
const batchResults = await getAIMentorSuggestionsBatch(
  prisma,
  projectIds,
  5,
  userId
)
```

---

## Common Patterns

### Token Logging
All services log usage to `AIUsageLog`:
```typescript
await logAIUsage({
  userId,
  action: 'FILTERING',
  entityType: 'Round',
  entityId: roundId,
  model,
  promptTokens: usage.promptTokens,
  completionTokens: usage.completionTokens,
  totalTokens: usage.totalTokens,
  batchSize: projects.length,
  itemsProcessed: projects.length,
  status: 'SUCCESS',
})
```

### Error Handling
All services use unified error classification:
```typescript
try {
  // AI call
} catch (error) {
  const classified = classifyAIError(error)
  logAIError('ServiceName', 'functionName', classified)

  if (classified.retryable) {
    // Retry logic
  } else {
    // Fall back to the algorithm
  }
}
```

### Anonymization
All services anonymize data before sending it to the AI:
```typescript
const { anonymized, mappings } = anonymizeProjectsForAI(projects, 'FILTERING')

if (!validateAnonymizedProjects(anonymized)) {
  throw new Error('GDPR compliance check failed')
}
```

## See Also

- [AI System Architecture](./ai-system.md)
- [AI Configuration Guide](./ai-configuration.md)
- [AI Error Handling](./ai-errors.md)
# MOPC AI System Architecture

## Overview

The MOPC platform uses AI (OpenAI GPT models) for four core functions:
1. **Project Filtering** - Automated eligibility screening against admin-defined criteria
2. **Jury Assignment** - Smart juror-project matching based on expertise alignment
3. **Award Eligibility** - Special award qualification determination
4. **Mentor Matching** - Mentor-project recommendations based on expertise

## System Architecture

```
┌─────────────────────────────────────────────────────────────────┐
│                         ADMIN INTERFACE                         │
│  (Rounds, Filtering, Awards, Assignments, Mentor Assignment)    │
└─────────────────────────┬───────────────────────────────────────┘
                          │
                          ▼
┌─────────────────────────────────────────────────────────────────┐
│                          tRPC ROUTERS                           │
│  filtering.ts │ assignment.ts │ specialAward.ts │ mentor.ts     │
└─────────────────────────┬───────────────────────────────────────┘
                          │
                          ▼
┌─────────────────────────────────────────────────────────────────┐
│                          AI SERVICES                            │
│  ai-filtering.ts │ ai-assignment.ts │ ai-award-eligibility.ts   │
│                  │ mentor-matching.ts                           │
└─────────────────────────┬───────────────────────────────────────┘
                          │
                          ▼
┌─────────────────────────────────────────────────────────────────┐
│                       ANONYMIZATION LAYER                       │
│                        anonymization.ts                         │
│  - PII stripping   - ID replacement   - Text sanitization       │
└─────────────────────────┬───────────────────────────────────────┘
                          │
                          ▼
┌─────────────────────────────────────────────────────────────────┐
│                          OPENAI CLIENT                          │
│                          lib/openai.ts                          │
│  - Model detection   - Parameter building   - Token tracking    │
└─────────────────────────┬───────────────────────────────────────┘
                          │
                          ▼
┌─────────────────────────────────────────────────────────────────┐
│                           OPENAI API                            │
│  GPT-4o │ GPT-4o-mini │ o1 │ o3-mini (configurable)             │
└─────────────────────────────────────────────────────────────────┘
```

## Data Flow

1. **Admin triggers an AI action** (filter projects, suggest assignments)
2. **Router validates permissions** and fetches data from the database
3. **AI service prepares data** for processing
4. **Anonymization layer strips PII**, replaces IDs, and sanitizes text
5. **OpenAI client builds the request** with the correct parameters for the model type
6. **Request is sent to the OpenAI API**
7. **Response is parsed and de-anonymized**
8. **Results are stored in the database** and usage is logged
9. **UI is updated** with the results

## Key Components

### OpenAI Client (`lib/openai.ts`)

Handles communication with the OpenAI API:
- `getOpenAI()` - Get the configured OpenAI client
- `getConfiguredModel()` - Get the admin-selected model
- `buildCompletionParams()` - Build API parameters (handles reasoning vs. standard models)
- `isReasoningModel()` - Detect o1/o3/o4-series models

### Anonymization Service (`server/services/anonymization.ts`)

GDPR-compliant data preparation:
- `anonymizeForAI()` - Basic anonymization for assignment
- `anonymizeProjectsForAI()` - Comprehensive project anonymization for filtering/awards
- `validateAnonymization()` - Verify no PII remains in anonymized data
- `deanonymizeResults()` - Map AI results back to real IDs

### Token Tracking (`server/utils/ai-usage.ts`)

Cost and usage monitoring:
- `logAIUsage()` - Log API calls to the database
- `calculateCost()` - Compute estimated cost by model
- `getAIUsageStats()` - Retrieve usage statistics
- `getCurrentMonthCost()` - Get current billing-period totals

### Error Handling (`server/services/ai-errors.ts`)

Unified error classification:
- `classifyAIError()` - Categorize API errors
- `shouldRetry()` - Determine whether an error is retryable
- `getUserFriendlyMessage()` - Get human-readable error messages

## Batching Strategy

All AI services process data in batches to avoid token limits:

| Service | Batch Size | Reason |
|---------|------------|--------|
| AI Assignment | 15 projects | Include all jurors per batch |
| AI Filtering | 20 projects | Balance throughput and cost |
| Award Eligibility | 20 projects | Consistent with filtering |
| Mentor Matching | 15 projects | All mentors per batch |
|
||||
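Splitting a project list into fixed-size batches is a generic slicing loop. A minimal sketch (the constant names here are assumptions; the batch sizes match the table above):

```typescript
// Batch sizes from the table above; constant name is an assumption.
const BATCH_SIZES = {
  assignment: 15,
  filtering: 20,
  awardEligibility: 20,
  mentorMatching: 15,
} as const

// Split items into consecutive batches of at most `size` elements.
function chunk<T>(items: T[], size: number): T[][] {
  const batches: T[][] = []
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size))
  }
  return batches
}
```

For example, 45 projects at the filtering batch size of 20 become three batches of 20, 20, and 5.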

## Fallback Behavior

All AI services have algorithmic fallbacks when AI is unavailable:

1. **Assignment** - Expertise tag matching + load balancing
2. **Filtering** - Flag all projects for manual review
3. **Award Eligibility** - Flag all for manual review
4. **Mentor Matching** - Keyword-based matching algorithm
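The general pattern is to attempt the AI path and fall back to the algorithmic path on failure. A hedged sketch of such a wrapper (the helper name and return shape are assumptions, not the platform's actual code):

```typescript
// Sketch: run the AI path, fall back to the algorithmic path on any failure.
// The concrete fallbacks (tag matching, flag-for-review, keyword matching)
// are the ones summarized in the list above.
async function withFallback<T>(
  aiPath: () => Promise<T>,
  algorithmicFallback: () => T
): Promise<{ result: T; usedFallback: boolean }> {
  try {
    return { result: await aiPath(), usedFallback: false }
  } catch {
    return { result: algorithmicFallback(), usedFallback: true }
  }
}
```

Returning a `usedFallback` flag lets the caller surface in the UI that results came from the algorithmic path rather than AI.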

## Security Considerations

1. **API keys** stored encrypted in database
2. **No PII** sent to OpenAI (enforced by anonymization)
3. **Audit logging** of all AI operations
4. **Role-based access** to AI features (admin only)

## Files Reference

| File | Purpose |
|------|---------|
| `lib/openai.ts` | OpenAI client configuration |
| `server/services/ai-filtering.ts` | Project filtering service |
| `server/services/ai-assignment.ts` | Jury assignment service |
| `server/services/ai-award-eligibility.ts` | Award eligibility service |
| `server/services/mentor-matching.ts` | Mentor matching service |
| `server/services/anonymization.ts` | Data anonymization |
| `server/services/ai-errors.ts` | Error classification |
| `server/utils/ai-usage.ts` | Token tracking |

## See Also

- [AI Services Reference](./ai-services.md)
- [AI Configuration Guide](./ai-configuration.md)
- [AI Error Handling](./ai-errors.md)
- [AI Prompts Reference](./ai-prompts.md)

@@ -0,0 +1,217 @@
# AI Data Processing - GDPR Compliance Documentation

## Overview

This document describes how project data is processed by AI services in the MOPC Platform, ensuring compliance with GDPR Articles 5, 6, 13-14, 25, and 32.

## Legal Basis

| Processing Activity | Legal Basis | GDPR Article |
|---------------------|-------------|--------------|
| AI-powered project filtering | Legitimate interest | Art. 6(1)(f) |
| AI-powered jury assignment | Legitimate interest | Art. 6(1)(f) |
| AI-powered award eligibility | Legitimate interest | Art. 6(1)(f) |
| AI-powered mentor matching | Legitimate interest | Art. 6(1)(f) |

**Legitimate Interest Justification:** AI processing is used to efficiently evaluate ocean conservation projects and match appropriate reviewers, directly serving the platform's purpose of managing the Monaco Ocean Protection Challenge.

## Data Minimization (Article 5(1)(c))

The AI system applies strict data minimization:

- **Only necessary fields** sent to AI (no names, emails, phone numbers)
- **Descriptions truncated** to 300-500 characters maximum
- **Team size** sent as count only (no member details)
- **Dates** sent as year only or ISO date (no timestamps)
- **IDs replaced** with sequential anonymous identifiers (P1, P2, etc.)
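The minimization rules above can be sketched as a projection function that keeps only the allowed fields. The input and output field names here are illustrative, not the platform's actual schema:

```typescript
// Sketch of the minimization rules listed above; field names are illustrative.
type AnonymizedProject = {
  project_id: string   // sequential alias, e.g. "P1"
  description: string  // truncated to the character cap
  team_size: number    // count only, no member details
  founded_year: number // year only, no full date
}

function minimizeProject(
  p: { description: string; members: string[]; foundedAt: Date },
  index: number
): AnonymizedProject {
  return {
    project_id: `P${index + 1}`,                // sequential anonymous ID
    description: p.description.slice(0, 500),   // 300-500 char maximum
    team_size: p.members.length,                // count only
    founded_year: p.foundedAt.getUTCFullYear(), // year only
  }
}
```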

## Anonymization Measures

### Data NEVER Sent to AI

| Data Type | Reason |
|-----------|--------|
| Personal names | PII - identifying |
| Email addresses | PII - identifying |
| Phone numbers | PII - identifying |
| Physical addresses | PII - identifying |
| External URLs | Could identify individuals |
| Internal project/user IDs | Could be cross-referenced |
| Team member details | PII - identifying |
| Internal comments | May contain PII |
| File content | May contain PII |

### Data Sent to AI (Anonymized)

| Field | Type | Purpose | Anonymization |
|-------|------|---------|---------------|
| project_id | String | Reference | Replaced with P1, P2, etc. |
| title | String | Spam detection | PII patterns removed |
| description | String | Criteria matching | Truncated, PII stripped |
| category | Enum | Filtering | As-is (no PII) |
| ocean_issue | Enum | Topic filtering | As-is (no PII) |
| country | String | Geographic eligibility | As-is (country name only) |
| region | String | Regional eligibility | As-is (zone name only) |
| institution | String | Student identification | As-is (institution name only) |
| tags | Array | Keyword matching | As-is (no PII expected) |
| founded_year | Number | Age filtering | Year only, not full date |
| team_size | Number | Team requirements | Count only |
| file_count | Number | Document checks | Count only |
| file_types | Array | File requirements | Type names only |
| wants_mentorship | Boolean | Mentorship filtering | As-is |
| submission_source | Enum | Source filtering | As-is |
| submitted_date | String | Deadline checks | Date only, no time |

## Technical Safeguards

### PII Detection and Stripping

```typescript
// Patterns detected and removed before AI processing
const PII_PATTERNS = {
  email: /[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/g,
  phone: /(\+?\d{1,3}[-.\s]?)?\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}/g,
  url: /https?:\/\/[^\s]+/g,
  ssn: /\d{3}-\d{2}-\d{4}/g,
  ipv4: /\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b/g,
}
```
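Applying these patterns is a simple loop over the regexes. A minimal sketch (the `[REDACTED]` placeholder and helper name are assumptions; a subset of the patterns above is shown):

```typescript
// Minimal sketch of applying the PII patterns above.
// The [REDACTED] placeholder token is an assumption.
const PII_PATTERNS: Record<string, RegExp> = {
  email: /[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/g,
  url: /https?:\/\/[^\s]+/g,
}

function stripPII(text: string): string {
  let result = text
  for (const pattern of Object.values(PII_PATTERNS)) {
    result = result.replace(pattern, '[REDACTED]')
  }
  return result
}
```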

### Validation Before Every AI Call

```typescript
// GDPR compliance enforced before EVERY API call
export function enforceGDPRCompliance(data: unknown[]): void {
  for (const item of data) {
    const { valid, violations } = validateNoPersonalData(item)
    if (!valid) {
      throw new Error(`GDPR compliance check failed: ${violations.join(', ')}`)
    }
  }
}
```

### ID Anonymization

Real IDs are never sent to AI. Instead:
- Projects: `cm1abc123...` → `P1`, `P2`, `P3`
- Jurors: `cm2def456...` → `juror_001`, `juror_002`
- Results mapped back using secure mapping tables
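The ID round trip can be sketched as a pair of maps: real IDs are replaced by aliases before the API call, and AI results keyed by alias are mapped back afterwards. The map shape and helper names are assumptions; the IDs in the test are made-up examples:

```typescript
// Sketch of the alias mapping table described above (shape is an assumption).
function buildIdMapping(realIds: string[]): {
  toAlias: Map<string, string>
  toReal: Map<string, string>
} {
  const toAlias = new Map<string, string>()
  const toReal = new Map<string, string>()
  realIds.forEach((id, i) => {
    const alias = `P${i + 1}` // sequential alias, e.g. P1, P2
    toAlias.set(id, alias)
    toReal.set(alias, id)
  })
  return { toAlias, toReal }
}

// Map AI results (keyed by alias) back to real project IDs; unknown
// aliases are dropped rather than guessed.
function deanonymizeResults<T>(
  results: Record<string, T>,
  toReal: Map<string, string>
): Record<string, T> {
  const out: Record<string, T> = {}
  for (const [alias, value] of Object.entries(results)) {
    const real = toReal.get(alias)
    if (real) out[real] = value
  }
  return out
}
```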

## Data Retention

| Data Type | Retention | Deletion Method |
|-----------|-----------|-----------------|
| AI usage logs | 12 months | Automatic deletion |
| Anonymized prompts | Not stored | Sent directly to API |
| AI responses | Not stored | Parsed and discarded |

**Note:** OpenAI does not retain API data for training (per their API Terms). API data is retained for up to 30 days for abuse monitoring, configurable to 0 days.

## Subprocessor: OpenAI

| Aspect | Details |
|--------|---------|
| Subprocessor | OpenAI, Inc. |
| Location | United States |
| DPA Status | Data Processing Agreement in place |
| Safeguards | Standard Contractual Clauses (SCCs) |
| Compliance | SOC 2 Type II, GDPR-compliant |
| Data Use | API data NOT used for model training |

**OpenAI DPA:** https://openai.com/policies/data-processing-agreement

## Audit Trail

All AI processing is logged:

```typescript
await prisma.aIUsageLog.create({
  data: {
    userId: ctx.user.id,   // Who initiated
    action: 'FILTERING',   // What type
    entityType: 'Round',   // What entity
    entityId: roundId,     // Which entity
    model: 'gpt-4o',       // What model
    totalTokens: 1500,     // Resource usage
    status: 'SUCCESS',     // Outcome
  },
})
```

## Data Subject Rights

### Right of Access (Article 15)

Users can request:
- What data was processed by AI
- When AI processing occurred
- What decisions were made

**Implementation:** Export AI usage logs for the user's projects.

### Right to Erasure (Article 17)

When a user requests deletion:
- AI usage logs for their projects can be deleted
- No data remains at OpenAI (API data not retained for training)

**Note:** Since only anonymized data is sent to AI, there is no personal data at OpenAI to delete.

### Right to Object (Article 21)

Users can request to opt out of AI processing:
- Admins can disable AI features per round
- A manual review fallback is available for all AI features

## Risk Assessment

### Risk: PII Leakage to AI Provider

| Factor | Assessment |
|--------|------------|
| Likelihood | Very Low |
| Impact | Medium |
| Mitigation | Automated PII detection, validation before every call |
| Residual Risk | Very Low |

### Risk: AI Decision Bias

| Factor | Assessment |
|--------|------------|
| Likelihood | Low |
| Impact | Low |
| Mitigation | Human review of all AI suggestions, algorithmic fallback |
| Residual Risk | Very Low |

### Risk: Data Breach at Subprocessor

| Factor | Assessment |
|--------|------------|
| Likelihood | Very Low |
| Impact | Low (only anonymized data) |
| Mitigation | OpenAI SOC 2 compliance, no PII sent |
| Residual Risk | Very Low |

## Compliance Checklist

- [x] Data minimization applied (only necessary fields)
- [x] PII stripped before AI processing
- [x] Anonymization validated before every API call
- [x] DPA in place with OpenAI
- [x] Audit logging of all AI operations
- [x] Fallback available when AI is declined or unavailable
- [x] Usage logs retained for 12 months only
- [x] No personal data stored at subprocessor

## Contact

For questions about AI data processing:
- Data Protection Officer: [DPO email]
- Technical Contact: [Tech contact email]

## See Also

- [Platform GDPR Compliance](./platform-gdpr-compliance.md)
- [AI System Architecture](../architecture/ai-system.md)
- [AI Services Reference](../architecture/ai-services.md)

@@ -0,0 +1,324 @@
# MOPC Platform - GDPR Compliance Documentation

## 1. Data Controller Information

| Field | Value |
|-------|-------|
| **Data Controller** | Monaco Ocean Protection Challenge |
| **Contact** | [Data Protection Officer email] |
| **Platform** | monaco-opc.com |
| **Jurisdiction** | Monaco |

---

## 2. Personal Data Collected

### 2.1 User Account Data

| Data Type | Purpose | Legal Basis | Retention |
|-----------|---------|-------------|-----------|
| Email address | Account identification, notifications | Contract performance | Account lifetime + 2 years |
| Name | Display in platform, certificates | Contract performance | Account lifetime + 2 years |
| Phone number (optional) | WhatsApp notifications | Consent | Until consent withdrawn |
| Profile photo (optional) | Platform personalization | Consent | Until deleted by user |
| Role | Access control | Contract performance | Account lifetime |
| IP address | Security, audit logging | Legitimate interest | 12 months |
| User agent | Security, debugging | Legitimate interest | 12 months |

### 2.2 Project/Application Data

| Data Type | Purpose | Legal Basis | Retention |
|-----------|---------|-------------|-----------|
| Project title | Competition entry | Contract performance | Program lifetime + 5 years |
| Project description | Evaluation | Contract performance | Program lifetime + 5 years |
| Team information | Contact, evaluation | Contract performance | Program lifetime + 5 years |
| Uploaded files | Evaluation | Contract performance | Program lifetime + 5 years |
| Country/Region | Geographic eligibility | Contract performance | Program lifetime + 5 years |

### 2.3 Evaluation Data

| Data Type | Purpose | Legal Basis | Retention |
|-----------|---------|-------------|-----------|
| Jury evaluations | Competition judging | Contract performance | Program lifetime + 5 years |
| Scores and comments | Competition judging | Contract performance | Program lifetime + 5 years |
| Evaluation timestamps | Audit trail | Legitimate interest | Program lifetime + 5 years |

### 2.4 Technical Data

| Data Type | Purpose | Legal Basis | Retention |
|-----------|---------|-------------|-----------|
| Session tokens | Authentication | Contract performance | Session duration |
| Magic link tokens | Passwordless login | Contract performance | 15 minutes |
| Audit logs | Security, compliance | Legitimate interest | 12 months |
| AI usage logs | Cost tracking, debugging | Legitimate interest | 12 months |

---

## 3. Data Processing Purposes

### 3.1 Primary Purposes

1. **Competition Management** - Managing project submissions, evaluations, and results
2. **User Authentication** - Secure access to the platform
3. **Communication** - Sending notifications about evaluations, deadlines, and results

### 3.2 Secondary Purposes

1. **Analytics** - Understanding platform usage (aggregated, anonymized)
2. **Security** - Detecting and preventing unauthorized access
3. **AI Processing** - Automated filtering and matching (anonymized data only)

---

## 4. Third-Party Data Sharing

### 4.1 Subprocessors

| Subprocessor | Purpose | Data Shared | Location | DPA |
|--------------|---------|-------------|----------|-----|
| OpenAI | AI processing | Anonymized project data only | USA | Yes |
| MinIO/S3 | File storage | Uploaded files | [Location] | Yes |
| Poste.io | Email delivery | Email addresses, notification content | [Location] | Yes |

### 4.2 Data Shared with OpenAI

**Sent to OpenAI:**
- Anonymized project titles (PII sanitized)
- Truncated descriptions (500 chars max)
- Project category, tags, country
- Team size (count only)
- Founded year (year only)

**NEVER sent to OpenAI:**
- Names of any individuals
- Email addresses
- Phone numbers
- Physical addresses
- External URLs
- Internal database IDs
- File contents

For full details, see [AI Data Processing](./ai-data-processing.md).

---

## 5. Data Subject Rights

### 5.1 Right of Access (Article 15)

Users can request a copy of their personal data via:
- Profile → Settings → Download My Data
- Email to [DPO email]

**Response Time:** Within 30 days

### 5.2 Right to Rectification (Article 16)

Users can update their data via:
- Profile → Settings → Edit Profile
- Contact support for assistance

**Response Time:** Immediate for self-service, 72 hours for support

### 5.3 Right to Erasure (Article 17)

Users can request deletion via:
- Profile → Settings → Delete Account
- Email to [DPO email]

**Exceptions:** Data required for legal obligations or ongoing competitions

**Response Time:** Within 30 days

### 5.4 Right to Restrict Processing (Article 18)

Users can request processing restrictions by contacting [DPO email].

**Response Time:** Within 72 hours

### 5.5 Right to Data Portability (Article 20)

Users can export their data in a machine-readable format (JSON) via:
- Profile → Settings → Export Data

**Format:** JSON file containing all user data

### 5.6 Right to Object (Article 21)

Users can object to processing based on legitimate interests by contacting [DPO email].

**Response Time:** Within 72 hours

---

## 6. Security Measures (Article 32)

### 6.1 Technical Measures

| Measure | Implementation |
|---------|----------------|
| Encryption in transit | TLS 1.3 for all connections |
| Encryption at rest | AES-256 for sensitive data |
| Authentication | Magic link (passwordless) or OAuth |
| Rate limiting | 100 requests/minute per IP |
| Session management | Secure cookies, automatic expiry |
| Input validation | Zod schema validation on all inputs |

### 6.2 Access Controls

| Control | Implementation |
|---------|----------------|
| RBAC | Role-based permissions (SUPER_ADMIN, PROGRAM_ADMIN, JURY_MEMBER, etc.) |
| Least privilege | Users only see assigned projects/programs |
| Session expiry | Configurable timeout (default 24 hours) |
| Audit logging | All sensitive actions logged |

### 6.3 Infrastructure Security

| Measure | Implementation |
|---------|----------------|
| Firewall | iptables rules on VPS |
| DDoS protection | Cloudflare (if configured) |
| Updates | Regular security patches |
| Backups | Daily encrypted backups, 90-day retention |
| Monitoring | Error logging, performance monitoring |

---

## 7. Data Retention Policy

| Data Category | Retention Period | Deletion Method |
|---------------|------------------|-----------------|
| Active user accounts | Account lifetime | Soft delete → hard delete after 30 days |
| Inactive accounts | 2 years after last login | Automatic anonymization |
| Project data | Program lifetime + 5 years | Archived, then anonymized |
| Audit logs | 12 months | Automatic deletion |
| AI usage logs | 12 months | Automatic deletion |
| Session data | Session duration | Automatic expiration |
| Backup data | 90 days | Automatic rotation |

---

## 8. International Data Transfers

### 8.1 OpenAI (USA)

| Aspect | Details |
|--------|---------|
| Transfer Mechanism | Standard Contractual Clauses (SCCs) |
| DPA | OpenAI Data Processing Agreement |
| Data Minimization | Only anonymized data transferred |
| Risk Assessment | Low (no PII transferred) |

### 8.2 Data Localization

| Service | Location |
|---------|----------|
| Primary database | [EU location] |
| File storage | [Location] |
| Email service | [Location] |

---

## 9. Cookies and Tracking

### 9.1 Essential Cookies

| Cookie | Purpose | Duration |
|--------|---------|----------|
| `session_token` | User authentication | Session |
| `csrf_token` | CSRF protection | Session |

### 9.2 Optional Cookies

The platform does **not** use:
- Marketing cookies
- Analytics cookies that track individuals
- Third-party tracking

---

## 10. Data Protection Impact Assessment (DPIA)

### 10.1 AI Processing DPIA

| Factor | Assessment |
|--------|------------|
| **Risk** | Personal data sent to third-party AI |
| **Mitigation** | Strict anonymization before processing |
| **Residual Risk** | Low (no PII transferred) |

### 10.2 File Upload DPIA

| Factor | Assessment |
|--------|------------|
| **Risk** | Sensitive documents uploaded |
| **Mitigation** | Pre-signed URLs, access controls, virus scanning |
| **Residual Risk** | Medium (users control uploads) |

### 10.3 Evaluation Data DPIA

| Factor | Assessment |
|--------|------------|
| **Risk** | Subjective opinions about projects/teams |
| **Mitigation** | Access controls, audit logging |
| **Residual Risk** | Low |

---

## 11. Breach Notification Procedure

### 11.1 Detection (Within 24 hours)

1. Automated monitoring alerts
2. User reports
3. Security audit findings

### 11.2 Assessment (Within 48 hours)

1. Identify affected data and individuals
2. Assess severity and risk
3. Document incident details

### 11.3 Notification (Within 72 hours)

**Supervisory Authority:**
- Notify if there is a risk to individuals
- Include: nature of breach, categories of data, number affected, consequences, measures taken

**Affected Individuals:**
- Notify without undue delay if high risk
- Include: nature of breach, likely consequences, measures taken, contact for information

### 11.4 Documentation

All breaches are documented regardless of the notification requirement.

---

## 12. Contact Information

| Role | Contact |
|------|---------|
| **Data Protection Officer** | [DPO name] |
| **Email** | [DPO email] |
| **Address** | [Physical address] |

**Supervisory Authority:**
Commission de Contrôle des Informations Nominatives (CCIN)
[Address in Monaco]

---

## 13. Document History

| Version | Date | Changes |
|---------|------|---------|
| 1.0 | 2025-01 | Initial version |

---

## See Also

- [AI Data Processing](./ai-data-processing.md)
- [AI System Architecture](../architecture/ai-system.md)

@@ -684,6 +684,46 @@ model AuditLog {
  @@index([timestamp])
}

// =============================================================================
// AI USAGE TRACKING
// =============================================================================

model AIUsageLog {
  id        String   @id @default(cuid())
  createdAt DateTime @default(now())

  // Who/what triggered it
  userId     String?
  action     String  // ASSIGNMENT, FILTERING, AWARD_ELIGIBILITY, MENTOR_MATCHING
  entityType String? // Round, Project, Award
  entityId   String?

  // What was used
  model            String // gpt-4o, gpt-4o-mini, o1, etc.
  promptTokens     Int
  completionTokens Int
  totalTokens      Int

  // Cost tracking
  estimatedCostUsd Decimal? @db.Decimal(10, 6)

  // Request context
  batchSize      Int?
  itemsProcessed Int?

  // Status
  status       String // SUCCESS, PARTIAL, ERROR
  errorMessage String?

  // Detailed data (optional)
  detailsJson Json? @db.JsonB

  @@index([userId])
  @@index([action])
  @@index([createdAt])
  @@index([model])
}

// =============================================================================
// NOTIFICATION LOG (Phase 2)
// =============================================================================

@@ -0,0 +1,294 @@
'use client'
|
||||
|
||||
import { trpc } from '@/lib/trpc/client'
|
||||
import {
|
||||
Card,
|
||||
CardContent,
|
||||
CardDescription,
|
||||
CardHeader,
|
||||
CardTitle,
|
||||
} from '@/components/ui/card'
|
||||
import { Skeleton } from '@/components/ui/skeleton'
|
||||
import { Badge } from '@/components/ui/badge'
|
||||
import {
|
||||
Coins,
|
||||
Zap,
|
||||
TrendingUp,
|
||||
Activity,
|
||||
Brain,
|
||||
Filter,
|
||||
Users,
|
||||
Award,
|
||||
} from 'lucide-react'
|
||||
import { cn } from '@/lib/utils'
|
||||
|
||||
const ACTION_ICONS: Record<string, typeof Zap> = {
|
||||
ASSIGNMENT: Users,
|
||||
FILTERING: Filter,
|
||||
AWARD_ELIGIBILITY: Award,
|
||||
MENTOR_MATCHING: Brain,
|
||||
}
|
||||
|
||||
const ACTION_LABELS: Record<string, string> = {
|
||||
ASSIGNMENT: 'Jury Assignment',
|
||||
FILTERING: 'Project Filtering',
|
||||
AWARD_ELIGIBILITY: 'Award Eligibility',
|
||||
MENTOR_MATCHING: 'Mentor Matching',
|
||||
}
|
||||
|
||||
function StatCard({
|
||||
label,
|
||||
value,
|
||||
subValue,
|
||||
icon: Icon,
|
||||
trend,
|
||||
}: {
|
||||
label: string
|
||||
value: string
|
||||
subValue?: string
|
||||
icon: typeof Zap
|
||||
trend?: 'up' | 'down' | 'neutral'
|
||||
}) {
|
||||
return (
|
||||
<div className="flex items-start gap-3 rounded-lg border bg-card p-4">
|
||||
<div className="rounded-md bg-muted p-2">
|
||||
<Icon className="h-4 w-4 text-muted-foreground" />
|
||||
</div>
|
||||
<div className="flex-1 space-y-1">
|
||||
<p className="text-sm font-medium text-muted-foreground">{label}</p>
|
||||
<div className="flex items-baseline gap-2">
|
||||
<p className="text-2xl font-bold">{value}</p>
|
||||
{trend && trend !== 'neutral' && (
|
||||
<TrendingUp
|
||||
className={cn(
|
||||
'h-4 w-4',
|
||||
trend === 'up' ? 'text-green-500' : 'rotate-180 text-red-500'
|
||||
)}
|
||||
/>
|
||||
)}
|
||||
</div>
|
||||
{subValue && (
|
||||
<p className="text-xs text-muted-foreground">{subValue}</p>
|
||||
)}
|
||||
</div>
|
||||
</div>
|
||||
)
|
||||
}
|
||||
|
||||
function UsageBar({
|
||||
label,
|
||||
value,
|
||||
maxValue,
|
||||
color,
|
||||
}: {
|
||||
label: string
|
||||
value: number
|
||||
maxValue: number
|
||||
color: string
|
||||
}) {
|
||||
const percentage = maxValue > 0 ? (value / maxValue) * 100 : 0
|
||||
|
||||
return (
|
||||
<div className="space-y-1">
|
||||
<div className="flex justify-between text-sm">
|
||||
<span className="text-muted-foreground">{label}</span>
|
||||
<span className="font-medium">{value.toLocaleString()}</span>
|
||||
</div>
|
||||
<div className="h-2 overflow-hidden rounded-full bg-muted">
|
||||
<div
|
||||
className={cn('h-full transition-all duration-500', color)}
|
||||
style={{ width: `${percentage}%` }}
|
||||
/>
|
||||
</div>
|
||||
</div>
|
||||
)
|
||||
}
|
||||
|
||||
export function AIUsageCard() {
|
||||
const {
|
||||
data: monthCost,
|
||||
isLoading: monthLoading,
|
||||
} = trpc.settings.getAICurrentMonthCost.useQuery(undefined, {
|
||||
staleTime: 60 * 1000, // 1 minute
|
||||
})
|
||||
|
||||
const {
|
||||
data: stats,
|
||||
isLoading: statsLoading,
|
||||
} = trpc.settings.getAIUsageStats.useQuery({}, {
|
||||
staleTime: 60 * 1000,
|
||||
})
|
||||
|
||||
const {
|
||||
data: history,
|
||||
isLoading: historyLoading,
|
||||
} = trpc.settings.getAIUsageHistory.useQuery({ days: 30 }, {
|
||||
staleTime: 60 * 1000,
|
||||
})
|
||||
|
||||
const isLoading = monthLoading || statsLoading
|
||||
|
||||
if (isLoading) {
|
||||
return (
|
||||
<Card>
|
||||
<CardHeader>
|
||||
<CardTitle className="flex items-center gap-2">
|
||||
<Activity className="h-5 w-5" />
|
||||
AI Usage & Costs
|
||||
</CardTitle>
|
||||
<CardDescription>Loading usage data...</CardDescription>
|
||||
</CardHeader>
|
||||
<CardContent className="space-y-6">
|
||||
<div className="grid gap-4 sm:grid-cols-2">
|
||||
<Skeleton className="h-24" />
|
||||
<Skeleton className="h-24" />
|
||||
</div>
|
||||
<Skeleton className="h-32" />
|
||||
</CardContent>
|
||||
</Card>
|
||||
)
|
||||
}
|
||||
|
||||
const hasUsage = monthCost && monthCost.requestCount > 0
|
||||
const maxTokensByAction = stats?.byAction
|
||||
? Math.max(...Object.values(stats.byAction).map((a) => a.tokens))
|
||||
: 0
|
||||
|
||||
return (
|
||||
<Card>
|
||||
<CardHeader>
|
||||
<CardTitle className="flex items-center gap-2">
|
||||
<Activity className="h-5 w-5" />
|
||||
AI Usage & Costs
|
||||
</CardTitle>
|
||||
<CardDescription>
|
||||
Token usage and estimated costs for AI features
|
||||
</CardDescription>
|
||||
</CardHeader>
|
||||
<CardContent className="space-y-6">
|
||||
{/* Current month summary */}
|
||||
<div className="grid gap-4 sm:grid-cols-2 lg:grid-cols-3">
|
||||
<StatCard
|
||||
label="This Month Cost"
|
||||
value={monthCost?.costFormatted || '$0.00'}
|
||||
subValue={`${monthCost?.requestCount || 0} requests`}
|
||||
icon={Coins}
|
||||
/>
|
||||
<StatCard
|
||||
label="Tokens Used"
|
||||
value={monthCost?.tokens?.toLocaleString() || '0'}
|
||||
subValue="This month"
|
||||
icon={Zap}
|
||||
/>
|
||||
{stats && (
|
||||
<StatCard
|
||||
label="All-Time Cost"
|
||||
value={stats.totalCostFormatted || '$0.00'}
|
||||
subValue={`${stats.totalTokens?.toLocaleString() || 0} tokens`}
|
||||
icon={TrendingUp}
|
||||
/>
|
||||
)}
|
||||
</div>
|
||||
|
||||
{/* Usage by action */}
|
||||
{hasUsage && stats?.byAction && Object.keys(stats.byAction).length > 0 && (
|
||||
<div className="space-y-4">
|
||||
<h4 className="text-sm font-semibold">Usage by Feature</h4>
|
||||
<div className="space-y-3">
|
||||
{Object.entries(stats.byAction)
|
||||
.sort(([, a], [, b]) => b.tokens - a.tokens)
|
||||
.map(([action, data]) => {
|
||||
const Icon = ACTION_ICONS[action] || Zap
|
||||
return (
|
||||
<div key={action} className="flex items-center gap-3">
|
||||
<div className="rounded-md bg-muted p-1.5">
|
||||
<Icon className="h-3.5 w-3.5 text-muted-foreground" />
|
||||
</div>
|
||||
<div className="flex-1">
|
||||
<UsageBar
|
||||
label={ACTION_LABELS[action] || action}
|
||||
value={data.tokens}
|
||||
maxValue={maxTokensByAction}
|
||||
color="bg-primary"
|
||||
/>
|
||||
</div>
|
||||
<Badge variant="secondary" className="ml-2 text-xs">
|
||||
{(data as { costFormatted?: string }).costFormatted}
|
                    </Badge>
                  </div>
                )
              })}
            </div>
          </div>
        )}

        {/* Usage by model */}
        {hasUsage && stats?.byModel && Object.keys(stats.byModel).length > 0 && (
          <div className="space-y-4">
            <h4 className="text-sm font-semibold">Usage by Model</h4>
            <div className="flex flex-wrap gap-2">
              {Object.entries(stats.byModel)
                .sort(([, a], [, b]) => b.cost - a.cost)
                .map(([model, data]) => (
                  <Badge
                    key={model}
                    variant="outline"
                    className="flex items-center gap-2"
                  >
                    <Brain className="h-3 w-3" />
                    <span>{model}</span>
                    <span className="text-muted-foreground">
                      {(data as { costFormatted?: string }).costFormatted}
                    </span>
                  </Badge>
                ))}
            </div>
          </div>
        )}

        {/* Usage history mini chart */}
        {hasUsage && history && history.length > 0 && (
          <div className="space-y-4">
            <h4 className="text-sm font-semibold">Last 30 Days</h4>
            <div className="flex h-16 items-end gap-0.5">
              {(() => {
                const maxCost = Math.max(...history.map((d) => d.cost), 0.001)
                return history.slice(-30).map((day, i) => {
                  const height = (day.cost / maxCost) * 100
                  return (
                    <div
                      key={day.date}
                      className="group relative flex-1 cursor-pointer"
                      title={`${day.date}: ${day.costFormatted}`}
                    >
                      <div
                        className="w-full rounded-t bg-primary/60 transition-colors hover:bg-primary"
                        style={{ height: `${Math.max(height, 4)}%` }}
                      />
                    </div>
                  )
                })
              })()}
            </div>
            <div className="flex justify-between text-xs text-muted-foreground">
              <span>{history[0]?.date}</span>
              <span>{history[history.length - 1]?.date}</span>
            </div>
          </div>
        )}

        {/* No usage message */}
        {!hasUsage && (
          <div className="rounded-lg border border-dashed p-8 text-center">
            <Activity className="mx-auto h-8 w-8 text-muted-foreground" />
            <h4 className="mt-2 text-sm font-semibold">No AI usage yet</h4>
            <p className="mt-1 text-sm text-muted-foreground">
              AI usage will be tracked when you use filtering, assignments, or
              other AI-powered features.
            </p>
          </div>
        )}
      </CardContent>
    </Card>
  )
}

@@ -19,6 +19,7 @@ import {
   Settings as SettingsIcon,
 } from 'lucide-react'
 import { AISettingsForm } from './ai-settings-form'
+import { AIUsageCard } from './ai-usage-card'
 import { BrandingSettingsForm } from './branding-settings-form'
 import { EmailSettingsForm } from './email-settings-form'
 import { StorageSettingsForm } from './storage-settings-form'

@@ -134,7 +135,7 @@ export function SettingsContent({ initialSettings }: SettingsContentProps) {
           </TabsTrigger>
         </TabsList>

-        <TabsContent value="ai">
+        <TabsContent value="ai" className="space-y-6">
           <Card>
             <CardHeader>
               <CardTitle>AI Configuration</CardTitle>

@@ -146,6 +147,7 @@
               <AISettingsForm settings={aiSettings} />
             </CardContent>
           </Card>
+          <AIUsageCard />
         </TabsContent>

         <TabsContent value="branding">

@@ -1,4 +1,5 @@
 import OpenAI from 'openai'
+import type { ChatCompletionCreateParamsNonStreaming } from 'openai/resources/chat/completions'
 import { prisma } from './prisma'

 // OpenAI client singleton with lazy initialization

@@ -7,6 +8,103 @@ const globalForOpenAI = globalThis as unknown as {
   openaiInitialized: boolean
 }

+// ─── Model Type Detection ────────────────────────────────────────────────────
+
+/**
+ * Reasoning models that require different API parameters:
+ * - Use max_completion_tokens instead of max_tokens
+ * - Don't support response_format: json_object (must instruct JSON in prompt)
+ * - Don't support temperature parameter
+ * - Don't support system messages (use developer or user role instead)
+ */
+const REASONING_MODEL_PREFIXES = ['o1', 'o3', 'o4']
+
+/**
+ * Check if a model is a reasoning model (o1, o3, o4 series)
+ */
+export function isReasoningModel(model: string): boolean {
+  const modelLower = model.toLowerCase()
+  return REASONING_MODEL_PREFIXES.some(prefix =>
+    modelLower.startsWith(prefix) ||
+    modelLower.includes(`/${prefix}`) ||
+    modelLower.includes(`-${prefix}`)
+  )
+}
+
+// ─── Chat Completion Parameter Builder ───────────────────────────────────────
+
+type MessageRole = 'system' | 'user' | 'assistant' | 'developer'
+
+export interface ChatCompletionOptions {
+  messages: Array<{ role: MessageRole; content: string }>
+  maxTokens?: number
+  temperature?: number
+  jsonMode?: boolean
+}
+
+/**
+ * Build chat completion parameters with correct settings for the model type.
+ * Handles differences between standard models and reasoning models.
+ */
+export function buildCompletionParams(
+  model: string,
+  options: ChatCompletionOptions
+): ChatCompletionCreateParamsNonStreaming {
+  const isReasoning = isReasoningModel(model)
+
+  // Convert messages for reasoning models (system -> developer)
+  const messages = options.messages.map(msg => {
+    if (isReasoning && msg.role === 'system') {
+      return { role: 'developer' as const, content: msg.content }
+    }
+    return msg as { role: 'system' | 'user' | 'assistant' | 'developer'; content: string }
+  })
+
+  // For reasoning models requesting JSON, append JSON instruction to last user message
+  if (isReasoning && options.jsonMode) {
+    // Find last user message index (polyfill for findLastIndex)
+    let lastUserIdx = -1
+    for (let i = messages.length - 1; i >= 0; i--) {
+      if (messages[i].role === 'user') {
+        lastUserIdx = i
+        break
+      }
+    }
+    if (lastUserIdx !== -1) {
+      messages[lastUserIdx] = {
+        ...messages[lastUserIdx],
+        content: messages[lastUserIdx].content + '\n\nIMPORTANT: Respond with valid JSON only, no other text.',
+      }
+    }
+  }
+
+  const params: ChatCompletionCreateParamsNonStreaming = {
+    model,
+    messages: messages as ChatCompletionCreateParamsNonStreaming['messages'],
+  }
+
+  // Token limit parameter differs between model types
+  if (options.maxTokens) {
+    if (isReasoning) {
+      params.max_completion_tokens = options.maxTokens
+    } else {
+      params.max_tokens = options.maxTokens
+    }
+  }
+
+  // Reasoning models don't support temperature
+  if (!isReasoning && options.temperature !== undefined) {
+    params.temperature = options.temperature
+  }
+
+  // Reasoning models don't support response_format: json_object
+  if (!isReasoning && options.jsonMode) {
+    params.response_format = { type: 'json_object' }
+  }
+
+  return params
+}
+
 /**
  * Get OpenAI API key from SystemSettings
  */

@@ -118,13 +216,14 @@ export async function validateModel(modelId: string): Promise<{
       }
     }

-    // Try a minimal completion with the model
-    await client.chat.completions.create({
-      model: modelId,
+    // Try a minimal completion with the model using correct parameters
+    const params = buildCompletionParams(modelId, {
       messages: [{ role: 'user', content: 'test' }],
-      max_tokens: 1,
+      maxTokens: 1,
     })
+
+    await client.chat.completions.create(params)

     return { valid: true }
   } catch (error) {
     const message = error instanceof Error ? error.message : 'Unknown error'

@@ -164,13 +263,14 @@ export async function testOpenAIConnection(): Promise<{
     // Get the configured model
     const configuredModel = await getConfiguredModel()

-    // Test with the configured model
-    const response = await client.chat.completions.create({
-      model: configuredModel,
+    // Test with the configured model using correct parameters
+    const params = buildCompletionParams(configuredModel, {
       messages: [{ role: 'user', content: 'Hello' }],
-      max_tokens: 5,
+      maxTokens: 5,
     })
+
+    const response = await client.chat.completions.create(params)

     return {
       success: true,
       model: response.model,

@@ -1,6 +1,20 @@
 import { z } from 'zod'
 import { router, adminProcedure, superAdminProcedure, protectedProcedure } from '../trpc'
 import { getWhatsAppProvider, getWhatsAppProviderType } from '@/lib/whatsapp'
+import { listAvailableModels, testOpenAIConnection, isReasoningModel } from '@/lib/openai'
+import { getAIUsageStats, getCurrentMonthCost, formatCost } from '@/server/utils/ai-usage'
+
+/**
+ * Categorize an OpenAI model for display
+ */
+function categorizeModel(modelId: string): string {
+  const id = modelId.toLowerCase()
+  if (id.startsWith('gpt-4o')) return 'gpt-4o'
+  if (id.startsWith('gpt-4')) return 'gpt-4'
+  if (id.startsWith('gpt-3.5')) return 'gpt-3.5'
+  if (id.startsWith('o1') || id.startsWith('o3') || id.startsWith('o4')) return 'reasoning'
+  return 'other'
+}

 export const settingsRouter = router({
   /**

@@ -177,33 +191,47 @@
     }),

     /**
-     * Test AI connection
+     * Test AI connection with the configured model
      */
-    testAIConnection: superAdminProcedure.mutation(async ({ ctx }) => {
-      const apiKeySetting = await ctx.prisma.systemSettings.findUnique({
-        where: { key: 'openai_api_key' },
-      })
+    testAIConnection: superAdminProcedure.mutation(async () => {
+      const result = await testOpenAIConnection()
+      return result
+    }),

-      if (!apiKeySetting?.value) {
-        return { success: false, error: 'API key not configured' }
+    /**
+     * List available AI models from OpenAI
+     */
+    listAIModels: superAdminProcedure.query(async () => {
+      const result = await listAvailableModels()
+
+      if (!result.success || !result.models) {
+        return {
+          success: false,
+          error: result.error || 'Failed to fetch models',
+          models: [],
+        }
       }

-      try {
-        // Test OpenAI connection with a minimal request
-        const response = await fetch('https://api.openai.com/v1/models', {
-          headers: {
-            Authorization: `Bearer ${apiKeySetting.value}`,
-          },
+      // Categorize and annotate models
+      const categorizedModels = result.models.map(model => ({
+        id: model,
+        name: model,
+        isReasoning: isReasoningModel(model),
+        category: categorizeModel(model),
+      }))
+
+      // Sort: GPT-4o first, then other GPT-4, then GPT-3.5, then reasoning models
+      const sorted = categorizedModels.sort((a, b) => {
+        const order = ['gpt-4o', 'gpt-4', 'gpt-3.5', 'reasoning']
+        const aOrder = order.findIndex(cat => a.category.startsWith(cat))
+        const bOrder = order.findIndex(cat => b.category.startsWith(cat))
+        if (aOrder !== bOrder) return aOrder - bOrder
+        return a.id.localeCompare(b.id)
       })

-      if (response.ok) {
-        return { success: true }
-      } else {
-        const error = await response.json()
-        return { success: false, error: error.error?.message || 'Unknown error' }
-      }
-    } catch (error) {
-      return { success: false, error: 'Connection failed' }
+      return {
+        success: true,
+        models: sorted,
       }
     }),
@@ -373,4 +401,105 @@
       ),
     }
   }),
+
+  /**
+   * Get AI usage statistics (admin only)
+   */
+  getAIUsageStats: adminProcedure
+    .input(
+      z.object({
+        startDate: z.string().datetime().optional(),
+        endDate: z.string().datetime().optional(),
+      })
+    )
+    .query(async ({ input }) => {
+      const startDate = input.startDate ? new Date(input.startDate) : undefined
+      const endDate = input.endDate ? new Date(input.endDate) : undefined
+
+      const stats = await getAIUsageStats(startDate, endDate)
+
+      return {
+        totalTokens: stats.totalTokens,
+        totalCost: stats.totalCost,
+        totalCostFormatted: formatCost(stats.totalCost),
+        byAction: Object.fromEntries(
+          Object.entries(stats.byAction).map(([action, data]) => [
+            action,
+            {
+              ...data,
+              costFormatted: formatCost(data.cost),
+            },
+          ])
+        ),
+        byModel: Object.fromEntries(
+          Object.entries(stats.byModel).map(([model, data]) => [
+            model,
+            {
+              ...data,
+              costFormatted: formatCost(data.cost),
+            },
+          ])
+        ),
+      }
+    }),
+
+  /**
+   * Get current month AI usage cost (admin only)
+   */
+  getAICurrentMonthCost: adminProcedure.query(async () => {
+    const { cost, tokens, requestCount } = await getCurrentMonthCost()
+
+    return {
+      cost,
+      costFormatted: formatCost(cost),
+      tokens,
+      requestCount,
+    }
+  }),
+
+  /**
+   * Get AI usage history (last 30 days grouped by day)
+   */
+  getAIUsageHistory: adminProcedure
+    .input(
+      z.object({
+        days: z.number().min(1).max(90).default(30),
+      })
+    )
+    .query(async ({ ctx, input }) => {
+      const startDate = new Date()
+      startDate.setDate(startDate.getDate() - input.days)
+      startDate.setHours(0, 0, 0, 0)
+
+      const logs = await ctx.prisma.aIUsageLog.findMany({
+        where: {
+          createdAt: { gte: startDate },
+        },
+        select: {
+          createdAt: true,
+          totalTokens: true,
+          estimatedCostUsd: true,
+          action: true,
+        },
+        orderBy: { createdAt: 'asc' },
+      })
+
+      // Group by day
+      const dailyData: Record<string, { date: string; tokens: number; cost: number; count: number }> = {}
+
+      for (const log of logs) {
+        const dateKey = log.createdAt.toISOString().split('T')[0]
+        if (!dailyData[dateKey]) {
+          dailyData[dateKey] = { date: dateKey, tokens: 0, cost: 0, count: 0 }
+        }
+        dailyData[dateKey].tokens += log.totalTokens
+        dailyData[dateKey].cost += log.estimatedCostUsd?.toNumber() ?? 0
+        dailyData[dateKey].count += 1
+      }
+
+      return Object.values(dailyData).map((day) => ({
+        ...day,
+        costFormatted: formatCost(day.cost),
+      }))
+    }),
 })

@@ -3,17 +3,41 @@
  *
  * Uses GPT to analyze juror expertise and project requirements
  * to generate optimal assignment suggestions.
+ *
+ * Optimization:
+ * - Batched processing (15 projects per batch)
+ * - Description truncation (300 chars)
+ * - Token tracking and cost logging
+ *
+ * GDPR Compliance:
+ * - All data anonymized before AI processing
+ * - IDs replaced with sequential identifiers
+ * - No personal information sent to OpenAI
  */

-import { getOpenAI, getConfiguredModel } from '@/lib/openai'
+import { getOpenAI, getConfiguredModel, buildCompletionParams } from '@/lib/openai'
+import { logAIUsage, extractTokenUsage } from '@/server/utils/ai-usage'
+import { classifyAIError, createParseError, logAIError } from './ai-errors'
 import {
   anonymizeForAI,
   deanonymizeResults,
   validateAnonymization,
+  DESCRIPTION_LIMITS,
+  truncateAndSanitize,
+  type AnonymizationResult,
 } from './anonymization'

-// Types for AI assignment
+// ─── Constants ───────────────────────────────────────────────────────────────
+
+const ASSIGNMENT_BATCH_SIZE = 15
+
+// Optimized system prompt
+const ASSIGNMENT_SYSTEM_PROMPT = `Match jurors to projects by expertise. Return JSON assignments.
+Each: {juror_id, project_id, confidence_score: 0-1, expertise_match_score: 0-1, reasoning: str (1-2 sentences)}
+Distribute workload fairly. Avoid assigning jurors at capacity.`
+
+// ─── Types ───────────────────────────────────────────────────────────────────

 export interface AIAssignmentSuggestion {
   jurorId: string
   projectId: string

@@ -61,118 +85,71 @@ interface AssignmentConstraints {
   }>
 }

-/**
- * System prompt for AI assignment
- */
-const ASSIGNMENT_SYSTEM_PROMPT = `You are an expert at matching jury members to projects based on expertise alignment.
-
-Your task is to suggest optimal juror-project assignments that:
-1. Match juror expertise tags with project tags and content
-2. Distribute workload fairly among jurors
-3. Ensure each project gets the required number of reviews
-4. Avoid assigning jurors who are already at their limit
-
-For each suggestion, provide:
-- A confidence score (0-1) based on how well the juror's expertise matches the project
-- An expertise match score (0-1) based purely on tag/content alignment
-- A brief reasoning explaining why this is a good match
-
-Return your response as a JSON array of assignments.`
+// ─── AI Processing ───────────────────────────────────────────────────────────

 /**
- * Generate AI-powered assignment suggestions
+ * Process a batch of projects for assignment suggestions
  */
-export async function generateAIAssignments(
-  jurors: JurorForAssignment[],
-  projects: ProjectForAssignment[],
-  constraints: AssignmentConstraints
-): Promise<AIAssignmentResult> {
-  // Anonymize data before sending to AI
-  const anonymizedData = anonymizeForAI(jurors, projects)
+async function processAssignmentBatch(
+  openai: NonNullable<Awaited<ReturnType<typeof getOpenAI>>>,
+  model: string,
+  anonymizedData: AnonymizationResult,
+  batchProjects: typeof anonymizedData.projects,
+  batchMappings: typeof anonymizedData.projectMappings,
+  constraints: AssignmentConstraints,
+  userId?: string,
+  entityId?: string
+): Promise<{
+  suggestions: AIAssignmentSuggestion[]
+  tokensUsed: number
+}> {
+  const suggestions: AIAssignmentSuggestion[] = []
+  let tokensUsed = 0

-  // Validate anonymization
-  if (!validateAnonymization(anonymizedData)) {
-    console.error('Anonymization validation failed, falling back to algorithm')
-    return generateFallbackAssignments(jurors, projects, constraints)
-  }
-
-  try {
-    const openai = await getOpenAI()
-
-    if (!openai) {
-      console.log('OpenAI not configured, using fallback algorithm')
-      return generateFallbackAssignments(jurors, projects, constraints)
-    }
-
-    const suggestions = await callAIForAssignments(
-      openai,
-      anonymizedData,
-      constraints
+  // Build prompt with batch-specific data
+  const userPrompt = buildBatchPrompt(
+    anonymizedData.jurors,
+    batchProjects,
+    constraints,
+    anonymizedData.jurorMappings,
+    batchMappings
   )

-    // De-anonymize results
-    const deanonymizedSuggestions = deanonymizeResults(
-      suggestions.map((s) => ({
-        ...s,
-        jurorId: s.jurorId,
-        projectId: s.projectId,
-      })),
-      anonymizedData.jurorMappings,
-      anonymizedData.projectMappings
-    ).map((s) => ({
-      jurorId: s.realJurorId,
-      projectId: s.realProjectId,
-      confidenceScore: s.confidenceScore,
-      reasoning: s.reasoning,
-      expertiseMatchScore: s.expertiseMatchScore,
-    }))
-
-    return {
-      success: true,
-      suggestions: deanonymizedSuggestions,
-      fallbackUsed: false,
-    }
-  } catch (error) {
-    console.error('AI assignment failed, using fallback:', error)
-    return generateFallbackAssignments(jurors, projects, constraints)
-  }
-}
-
-/**
- * Call OpenAI API for assignment suggestions
- */
-async function callAIForAssignments(
-  openai: Awaited<ReturnType<typeof getOpenAI>>,
-  anonymizedData: AnonymizationResult,
-  constraints: AssignmentConstraints
-): Promise<AIAssignmentSuggestion[]> {
-  if (!openai) {
-    throw new Error('OpenAI client not available')
-  }
-
-  // Build the user prompt
-  const userPrompt = buildAssignmentPrompt(anonymizedData, constraints)
-
-  const model = await getConfiguredModel()
-
-  const response = await openai.chat.completions.create({
-    model,
+  try {
+    const params = buildCompletionParams(model, {
       messages: [
         { role: 'system', content: ASSIGNMENT_SYSTEM_PROMPT },
         { role: 'user', content: userPrompt },
       ],
-    response_format: { type: 'json_object' },
-    temperature: 0.3, // Lower temperature for more consistent results
-    max_tokens: 4000,
+      jsonMode: true,
+      temperature: 0.3,
+      maxTokens: 4000,
     })

+    const response = await openai.chat.completions.create(params)
+    const usage = extractTokenUsage(response)
+    tokensUsed = usage.totalTokens
+
+    // Log batch usage
+    await logAIUsage({
+      userId,
+      action: 'ASSIGNMENT',
+      entityType: 'Round',
+      entityId,
+      model,
+      promptTokens: usage.promptTokens,
+      completionTokens: usage.completionTokens,
+      totalTokens: usage.totalTokens,
+      batchSize: batchProjects.length,
+      itemsProcessed: batchProjects.length,
+      status: 'SUCCESS',
+    })
+
     const content = response.choices[0]?.message?.content

     if (!content) {
       throw new Error('No response from AI')
     }

     // Parse the response
     const parsed = JSON.parse(content) as {
       assignments: Array<{
         juror_id: string

@@ -183,31 +160,69 @@ async function callAIForAssignments(
       }>
     }

-  return (parsed.assignments || []).map((a) => ({
+    // De-anonymize and add to suggestions
+    const deanonymized = deanonymizeResults(
+      (parsed.assignments || []).map((a) => ({
         jurorId: a.juror_id,
         projectId: a.project_id,
         confidenceScore: Math.min(1, Math.max(0, a.confidence_score)),
         expertiseMatchScore: Math.min(1, Math.max(0, a.expertise_match_score)),
         reasoning: a.reasoning,
-  }))
+      })),
+      anonymizedData.jurorMappings,
+      batchMappings
+    )
+
+    for (const item of deanonymized) {
+      suggestions.push({
+        jurorId: item.realJurorId,
+        projectId: item.realProjectId,
+        confidenceScore: item.confidenceScore,
+        reasoning: item.reasoning,
+        expertiseMatchScore: item.expertiseMatchScore,
+      })
+    }
+  } catch (error) {
+    if (error instanceof SyntaxError) {
+      const parseError = createParseError(error.message)
+      logAIError('Assignment', 'batch processing', parseError)
+
+      await logAIUsage({
+        userId,
+        action: 'ASSIGNMENT',
+        entityType: 'Round',
+        entityId,
+        model,
+        promptTokens: 0,
+        completionTokens: 0,
+        totalTokens: tokensUsed,
+        batchSize: batchProjects.length,
+        itemsProcessed: 0,
+        status: 'ERROR',
+        errorMessage: parseError.message,
+      })
+    } else {
+      throw error
+    }
+  }
+
+  return { suggestions, tokensUsed }
 }

 /**
- * Build the prompt for AI assignment
+ * Build prompt for a batch of projects
  */
-function buildAssignmentPrompt(
-  data: AnonymizationResult,
-  constraints: AssignmentConstraints
+function buildBatchPrompt(
+  jurors: AnonymizationResult['jurors'],
+  projects: AnonymizationResult['projects'],
+  constraints: AssignmentConstraints,
+  jurorMappings: AnonymizationResult['jurorMappings'],
+  projectMappings: AnonymizationResult['projectMappings']
 ): string {
-  const { jurors, projects } = data
-
-  // Map existing assignments to anonymous IDs
-  const jurorIdMap = new Map(
-    data.jurorMappings.map((m) => [m.realId, m.anonymousId])
-  )
-  const projectIdMap = new Map(
-    data.projectMappings.map((m) => [m.realId, m.anonymousId])
-  )
+  const jurorIdMap = new Map(jurorMappings.map((m) => [m.realId, m.anonymousId]))
+  const projectIdMap = new Map(projectMappings.map((m) => [m.realId, m.anonymousId]))

   const anonymousExisting = constraints.existingAssignments
     .map((a) => ({

@@ -216,29 +231,110 @@ function buildAssignmentPrompt(
     }))
     .filter((a) => a.jurorId && a.projectId)

-  return `## Jurors Available
-${JSON.stringify(jurors, null, 2)}
-
-## Projects to Assign
-${JSON.stringify(projects, null, 2)}
-
-## Constraints
-- Each project needs ${constraints.requiredReviewsPerProject} reviews
-- Maximum assignments per juror: ${constraints.maxAssignmentsPerJuror || 'No limit'}
-- Existing assignments to avoid duplicating:
-${JSON.stringify(anonymousExisting, null, 2)}
-
-## Instructions
-Generate optimal juror-project assignments. Return a JSON object with an "assignments" array where each assignment has:
-- juror_id: The anonymous juror ID
-- project_id: The anonymous project ID
-- confidence_score: 0-1 confidence in this match
-- expertise_match_score: 0-1 expertise alignment score
-- reasoning: Brief explanation (1-2 sentences)
-
-Focus on matching expertise tags with project tags and descriptions. Distribute assignments fairly.`
+  return `JURORS: ${JSON.stringify(jurors)}
+PROJECTS: ${JSON.stringify(projects)}
+CONSTRAINTS: ${constraints.requiredReviewsPerProject} reviews/project, max ${constraints.maxAssignmentsPerJuror || 'unlimited'}/juror
+EXISTING: ${JSON.stringify(anonymousExisting)}
+Return JSON: {"assignments": [...]}`
 }

+/**
+ * Generate AI-powered assignment suggestions with batching
+ */
+export async function generateAIAssignments(
+  jurors: JurorForAssignment[],
+  projects: ProjectForAssignment[],
+  constraints: AssignmentConstraints,
+  userId?: string,
+  entityId?: string
+): Promise<AIAssignmentResult> {
+  // Truncate descriptions before anonymization
+  const truncatedProjects = projects.map((p) => ({
+    ...p,
+    description: truncateAndSanitize(p.description, DESCRIPTION_LIMITS.ASSIGNMENT),
+  }))
+
+  // Anonymize data before sending to AI
+  const anonymizedData = anonymizeForAI(jurors, truncatedProjects)
+
+  // Validate anonymization
+  if (!validateAnonymization(anonymizedData)) {
+    console.error('[AI Assignment] Anonymization validation failed, falling back to algorithm')
+    return generateFallbackAssignments(jurors, projects, constraints)
+  }
+
+  try {
+    const openai = await getOpenAI()
+
+    if (!openai) {
+      console.log('[AI Assignment] OpenAI not configured, using fallback algorithm')
+      return generateFallbackAssignments(jurors, projects, constraints)
+    }
+
+    const model = await getConfiguredModel()
+    console.log(`[AI Assignment] Using model: ${model} for ${projects.length} projects in batches of ${ASSIGNMENT_BATCH_SIZE}`)
+
+    const allSuggestions: AIAssignmentSuggestion[] = []
+    let totalTokens = 0
+
+    // Process projects in batches
+    for (let i = 0; i < anonymizedData.projects.length; i += ASSIGNMENT_BATCH_SIZE) {
+      const batchProjects = anonymizedData.projects.slice(i, i + ASSIGNMENT_BATCH_SIZE)
+      const batchMappings = anonymizedData.projectMappings.slice(i, i + ASSIGNMENT_BATCH_SIZE)
+
+      console.log(`[AI Assignment] Processing batch ${Math.floor(i / ASSIGNMENT_BATCH_SIZE) + 1}/${Math.ceil(anonymizedData.projects.length / ASSIGNMENT_BATCH_SIZE)}`)
+
+      const { suggestions, tokensUsed } = await processAssignmentBatch(
+        openai,
+        model,
+        anonymizedData,
+        batchProjects,
+        batchMappings,
+        constraints,
+        userId,
+        entityId
+      )
+
+      allSuggestions.push(...suggestions)
+      totalTokens += tokensUsed
+    }
+
+    console.log(`[AI Assignment] Completed. Total suggestions: ${allSuggestions.length}, Total tokens: ${totalTokens}`)
+
+    return {
+      success: true,
+      suggestions: allSuggestions,
+      tokensUsed: totalTokens,
+      fallbackUsed: false,
+    }
+  } catch (error) {
+    const classified = classifyAIError(error)
+    logAIError('Assignment', 'generateAIAssignments', classified)
+
+    // Log failed attempt
+    await logAIUsage({
+      userId,
+      action: 'ASSIGNMENT',
+      entityType: 'Round',
+      entityId,
+      model: 'unknown',
+      promptTokens: 0,
+      completionTokens: 0,
+      totalTokens: 0,
+      batchSize: projects.length,
+      itemsProcessed: 0,
+      status: 'ERROR',
+      errorMessage: classified.message,
+    })
+
+    console.error('[AI Assignment] AI assignment failed, using fallback:', classified.message)
+    return generateFallbackAssignments(jurors, projects, constraints)
+  }
+}
+
+// ─── Fallback Algorithm ──────────────────────────────────────────────────────
+
 /**
  * Fallback algorithm-based assignment when AI is unavailable
  */

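The batched `generateAIAssignments` above walks the anonymized project list in `ASSIGNMENT_BATCH_SIZE` steps via `slice(i, i + ASSIGNMENT_BATCH_SIZE)`. The same slicing, as a standalone sketch (the `toBatches` helper is illustrative, not part of the codebase):

```typescript
// Fixed-size batching: the last batch may be smaller than `size`.
function toBatches<T>(items: T[], size: number): T[][] {
  const batches: T[][] = []
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size))
  }
  return batches
}

const sizes = toBatches(Array.from({ length: 35 }, (_, i) => i), 15).map((b) => b.length)
console.log(sizes) // [15, 15, 5]
```

Slicing the projects and their mappings with the same indices, as the loop does, keeps each batch's anonymous IDs aligned with the mapping rows used to de-anonymize the model's output.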
@@ -4,9 +4,33 @@
  * Determines project eligibility for special awards using:
  * - Deterministic field matching (tags, country, category)
  * - AI interpretation of plain-language criteria
+ *
+ * GDPR Compliance:
+ * - All project data is anonymized before AI processing
+ * - IDs replaced with sequential identifiers
+ * - No personal information sent to OpenAI
  */

-import { getOpenAI, getConfiguredModel } from '@/lib/openai'
+import { getOpenAI, getConfiguredModel, buildCompletionParams } from '@/lib/openai'
+import { logAIUsage, extractTokenUsage } from '@/server/utils/ai-usage'
+import { classifyAIError, createParseError, logAIError } from './ai-errors'
 import {
   anonymizeProjectsForAI,
   validateAnonymizedProjects,
+  type ProjectWithRelations,
+  type AnonymizedProjectForAI,
+  type ProjectAIMapping,
 } from './anonymization'
 import type { SubmissionSource } from '@prisma/client'

+// ─── Constants ───────────────────────────────────────────────────────────────
+
+const BATCH_SIZE = 20
+
+// Optimized system prompt
+const AI_ELIGIBILITY_SYSTEM_PROMPT = `Award eligibility evaluator. Evaluate projects against criteria, return JSON.
+Format: {"evaluations": [{project_id, eligible: bool, confidence: 0-1, reasoning: str}]}
+Be objective. Base evaluation only on provided data. No personal identifiers in reasoning.`
+
 // ─── Types ──────────────────────────────────────────────────────────────────

@@ -33,6 +57,16 @@ interface ProjectForEligibility {
   geographicZone?: string | null
   tags: string[]
   oceanIssue?: string | null
+  institution?: string | null
+  foundedAt?: Date | null
+  wantsMentorship?: boolean
+  submissionSource?: SubmissionSource
+  submittedAt?: Date | null
+  _count?: {
+    teamMembers?: number
+    files?: number
+  }
+  files?: Array<{ fileType: string | null }>
 }

 // ─── Auto Tag Rules ─────────────────────────────────────────────────────────

@@ -97,32 +131,162 @@ function getFieldValue(

 // ─── AI Criteria Interpretation ─────────────────────────────────────────────

-const AI_ELIGIBILITY_SYSTEM_PROMPT = `You are a special award eligibility evaluator. Given a list of projects and award criteria, determine which projects are eligible.
-
-Return a JSON object with this structure:
-{
-  "evaluations": [
-    {
-      "project_id": "string",
-      "eligible": boolean,
-      "confidence": number (0-1),
-      "reasoning": "string"
+/**
+ * Convert project to enhanced format for anonymization
+ */
+function toProjectWithRelations(project: ProjectForEligibility): ProjectWithRelations {
+  return {
+    id: project.id,
+    title: project.title,
+    description: project.description,
+    competitionCategory: project.competitionCategory as any,
+    oceanIssue: project.oceanIssue as any,
+    country: project.country,
+    geographicZone: project.geographicZone,
+    institution: project.institution,
+    tags: project.tags,
+    foundedAt: project.foundedAt,
+    wantsMentorship: project.wantsMentorship ?? false,
+    submissionSource: project.submissionSource ?? 'MANUAL',
+    submittedAt: project.submittedAt,
+    _count: {
+      teamMembers: project._count?.teamMembers ?? 0,
+      files: project._count?.files ?? 0,
+    },
+    files: project.files?.map(f => ({ fileType: f.fileType as any })) ?? [],
+  }
-  ]
 }

-Be fair, objective, and base your evaluation only on the provided information. Do not include personal identifiers in reasoning.`
+/**
+ * Process a batch for AI eligibility evaluation
+ */
+async function processEligibilityBatch(
+  openai: NonNullable<Awaited<ReturnType<typeof getOpenAI>>>,
+  model: string,
+  criteriaText: string,
+  anonymized: AnonymizedProjectForAI[],
+  mappings: ProjectAIMapping[],
+  userId?: string,
+  entityId?: string
+): Promise<{
+  results: EligibilityResult[]
+  tokensUsed: number
+}> {
+  const results: EligibilityResult[] = []
+  let tokensUsed = 0
+
+  const userPrompt = `CRITERIA: ${criteriaText}
+PROJECTS: ${JSON.stringify(anonymized)}
+Evaluate eligibility for each project.`
+
+  try {
+    const params = buildCompletionParams(model, {
+      messages: [
+        { role: 'system', content: AI_ELIGIBILITY_SYSTEM_PROMPT },
+        { role: 'user', content: userPrompt },
+      ],
+      jsonMode: true,
+      temperature: 0.3,
+      maxTokens: 4000,
+    })
+
+    const response = await openai.chat.completions.create(params)
+    const usage = extractTokenUsage(response)
+    tokensUsed = usage.totalTokens
+
+    // Log usage
+    await logAIUsage({
+      userId,
+      action: 'AWARD_ELIGIBILITY',
+      entityType: 'Award',
+      entityId,
+      model,
+      promptTokens: usage.promptTokens,
+      completionTokens: usage.completionTokens,
+      totalTokens: usage.totalTokens,
+      batchSize: anonymized.length,
+      itemsProcessed: anonymized.length,
+      status: 'SUCCESS',
+    })
+
+    const content = response.choices[0]?.message?.content
+    if (!content) {
+      throw new Error('Empty response from AI')
+    }
+
+    const parsed = JSON.parse(content) as {
+      evaluations: Array<{
+        project_id: string
+        eligible: boolean
|
||||
confidence: number
|
||||
reasoning: string
|
||||
}>
|
||||
}
|
||||
|
||||
// Map results back to real IDs
|
||||
for (const eval_ of parsed.evaluations || []) {
|
||||
const mapping = mappings.find((m) => m.anonymousId === eval_.project_id)
|
||||
if (mapping) {
|
||||
results.push({
|
||||
projectId: mapping.realId,
|
||||
eligible: eval_.eligible,
|
||||
confidence: eval_.confidence,
|
||||
reasoning: eval_.reasoning,
|
||||
method: 'AI',
|
||||
})
|
||||
}
|
||||
}
|
||||
|
||||
} catch (error) {
|
||||
if (error instanceof SyntaxError) {
|
||||
const parseError = createParseError(error.message)
|
||||
logAIError('AwardEligibility', 'batch processing', parseError)
|
||||
|
||||
await logAIUsage({
|
||||
userId,
|
||||
action: 'AWARD_ELIGIBILITY',
|
||||
entityType: 'Award',
|
||||
entityId,
|
||||
model,
|
||||
promptTokens: 0,
|
||||
completionTokens: 0,
|
||||
totalTokens: tokensUsed,
|
||||
batchSize: anonymized.length,
|
||||
itemsProcessed: 0,
|
||||
status: 'ERROR',
|
||||
errorMessage: parseError.message,
|
||||
})
|
||||
|
||||
// Flag all for manual review
|
||||
for (const mapping of mappings) {
|
||||
results.push({
|
||||
projectId: mapping.realId,
|
||||
eligible: false,
|
||||
confidence: 0,
|
||||
reasoning: 'AI response parse error — requires manual review',
|
||||
method: 'AI',
|
||||
})
|
||||
}
|
||||
} else {
|
||||
throw error
|
||||
}
|
||||
}
|
||||
|
||||
return { results, tokensUsed }
|
||||
}
|
||||
|
||||
export async function aiInterpretCriteria(
|
||||
criteriaText: string,
|
||||
projects: ProjectForEligibility[]
|
||||
projects: ProjectForEligibility[],
|
||||
userId?: string,
|
||||
awardId?: string
|
||||
): Promise<EligibilityResult[]> {
|
||||
const results: EligibilityResult[] = []
|
||||
|
||||
try {
|
||||
const openai = await getOpenAI()
|
||||
if (!openai) {
|
||||
// No OpenAI — mark all as needing manual review
|
||||
console.warn('[AI Eligibility] OpenAI not configured')
|
||||
return projects.map((p) => ({
|
||||
projectId: p.id,
|
||||
eligible: false,
|
||||
|
|
@ -133,91 +297,69 @@ export async function aiInterpretCriteria(
|
|||
}
|
||||
|
||||
const model = await getConfiguredModel()
|
||||
console.log(`[AI Eligibility] Using model: ${model} for ${projects.length} projects`)
|
||||
|
||||
// Anonymize and batch
|
||||
const anonymized = projects.map((p, i) => ({
|
||||
project_id: `P${i + 1}`,
|
||||
real_id: p.id,
|
||||
title: p.title,
|
||||
description: p.description?.slice(0, 500) || '',
|
||||
category: p.competitionCategory || 'Unknown',
|
||||
ocean_issue: p.oceanIssue || 'Unknown',
|
||||
country: p.country || 'Unknown',
|
||||
region: p.geographicZone || 'Unknown',
|
||||
tags: p.tags.join(', '),
|
||||
}))
|
||||
// Convert and anonymize projects
|
||||
const projectsWithRelations = projects.map(toProjectWithRelations)
|
||||
const { anonymized, mappings } = anonymizeProjectsForAI(projectsWithRelations, 'ELIGIBILITY')
|
||||
|
||||
const batchSize = 20
|
||||
for (let i = 0; i < anonymized.length; i += batchSize) {
|
||||
const batch = anonymized.slice(i, i + batchSize)
|
||||
// Validate anonymization
|
||||
if (!validateAnonymizedProjects(anonymized)) {
|
||||
console.error('[AI Eligibility] Anonymization validation failed')
|
||||
throw new Error('GDPR compliance check failed: PII detected in anonymized data')
|
||||
}
|
||||
|
||||
const userPrompt = `Award criteria: ${criteriaText}
|
||||
let totalTokens = 0
|
||||
|
||||
Projects to evaluate:
|
||||
${JSON.stringify(
|
||||
batch.map(({ real_id, ...rest }) => rest),
|
||||
null,
|
||||
2
|
||||
)}
|
||||
// Process in batches
|
||||
for (let i = 0; i < anonymized.length; i += BATCH_SIZE) {
|
||||
const batchAnon = anonymized.slice(i, i + BATCH_SIZE)
|
||||
const batchMappings = mappings.slice(i, i + BATCH_SIZE)
|
||||
|
||||
Evaluate each project against the award criteria.`
|
||||
console.log(`[AI Eligibility] Processing batch ${Math.floor(i / BATCH_SIZE) + 1}/${Math.ceil(anonymized.length / BATCH_SIZE)}`)
|
||||
|
||||
const response = await openai.chat.completions.create({
|
||||
const { results: batchResults, tokensUsed } = await processEligibilityBatch(
|
||||
openai,
|
||||
model,
|
||||
messages: [
|
||||
{ role: 'system', content: AI_ELIGIBILITY_SYSTEM_PROMPT },
|
||||
{ role: 'user', content: userPrompt },
|
||||
],
|
||||
response_format: { type: 'json_object' },
|
||||
temperature: 0.3,
|
||||
max_tokens: 4000,
|
||||
})
|
||||
criteriaText,
|
||||
batchAnon,
|
||||
batchMappings,
|
||||
userId,
|
||||
awardId
|
||||
)
|
||||
|
||||
const content = response.choices[0]?.message?.content
|
||||
if (content) {
|
||||
try {
|
||||
const parsed = JSON.parse(content) as {
|
||||
evaluations: Array<{
|
||||
project_id: string
|
||||
eligible: boolean
|
||||
confidence: number
|
||||
reasoning: string
|
||||
}>
|
||||
results.push(...batchResults)
|
||||
totalTokens += tokensUsed
|
||||
}
|
||||
|
||||
for (const eval_ of parsed.evaluations) {
|
||||
const anon = batch.find((b) => b.project_id === eval_.project_id)
|
||||
if (anon) {
|
||||
results.push({
|
||||
projectId: anon.real_id,
|
||||
eligible: eval_.eligible,
|
||||
confidence: eval_.confidence,
|
||||
reasoning: eval_.reasoning,
|
||||
method: 'AI',
|
||||
console.log(`[AI Eligibility] Completed. Total tokens: ${totalTokens}`)
|
||||
|
||||
} catch (error) {
|
||||
const classified = classifyAIError(error)
|
||||
logAIError('AwardEligibility', 'aiInterpretCriteria', classified)
|
||||
|
||||
// Log failed attempt
|
||||
await logAIUsage({
|
||||
userId,
|
||||
action: 'AWARD_ELIGIBILITY',
|
||||
entityType: 'Award',
|
||||
entityId: awardId,
|
||||
model: 'unknown',
|
||||
promptTokens: 0,
|
||||
completionTokens: 0,
|
||||
totalTokens: 0,
|
||||
batchSize: projects.length,
|
||||
itemsProcessed: 0,
|
||||
status: 'ERROR',
|
||||
errorMessage: classified.message,
|
||||
})
|
||||
}
|
||||
}
|
||||
} catch {
|
||||
// Parse error — mark batch for manual review
|
||||
for (const item of batch) {
|
||||
results.push({
|
||||
projectId: item.real_id,
|
||||
eligible: false,
|
||||
confidence: 0,
|
||||
reasoning: 'AI response parse error — requires manual review',
|
||||
method: 'AI',
|
||||
})
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
} catch {
|
||||
// OpenAI error — mark all for manual review
|
||||
|
||||
// Return all as needing manual review
|
||||
return projects.map((p) => ({
|
||||
projectId: p.id,
|
||||
eligible: false,
|
||||
confidence: 0,
|
||||
reasoning: 'AI error — requires manual eligibility review',
|
||||
reasoning: `AI error: ${classified.message}`,
|
||||
method: 'AI' as const,
|
||||
}))
|
||||
}
|
||||
|
|
|
|||
|
|
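The refactored `aiInterpretCriteria` above slices `anonymized` and `mappings` with the same indices so each batch of anonymized projects stays aligned with its real-ID mappings. A minimal standalone sketch of that parallel-array batching pattern (`chunkPairs` is a hypothetical helper for illustration, not part of this codebase):

```typescript
const BATCH_SIZE = 20;

// Split two parallel arrays into aligned chunks of at most `size` elements.
function chunkPairs<A, B>(a: A[], b: B[], size: number = BATCH_SIZE): Array<[A[], B[]]> {
  const out: Array<[A[], B[]]> = [];
  for (let i = 0; i < a.length; i += size) {
    // Slicing both arrays with the same bounds keeps each anonymized
    // item aligned with its real-ID mapping.
    out.push([a.slice(i, i + size), b.slice(i, i + size)]);
  }
  return out;
}
```

With 45 projects and a batch size of 20 this yields three chunks (20, 20, and 5 items), each carrying its matching slice of mappings.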
@@ -0,0 +1,318 @@
/**
 * AI Error Classification Service
 *
 * Provides unified error handling and classification for all AI services.
 * Converts technical API errors into user-friendly messages.
 */

// ─── Error Types ─────────────────────────────────────────────────────────────

export type AIErrorType =
  | 'rate_limit'
  | 'quota_exceeded'
  | 'model_not_found'
  | 'invalid_api_key'
  | 'context_length'
  | 'parse_error'
  | 'timeout'
  | 'network_error'
  | 'content_filter'
  | 'server_error'
  | 'unknown'

export interface ClassifiedError {
  type: AIErrorType
  message: string
  originalMessage: string
  retryable: boolean
  suggestedAction?: string
}

// ─── Error Patterns ──────────────────────────────────────────────────────────

interface ErrorPattern {
  type: AIErrorType
  patterns: Array<string | RegExp>
  retryable: boolean
  userMessage: string
  suggestedAction?: string
}

const ERROR_PATTERNS: ErrorPattern[] = [
  {
    type: 'rate_limit',
    patterns: [
      'rate_limit',
      'rate limit',
      'too many requests',
      '429',
      'quota exceeded',
      'Rate limit reached',
    ],
    retryable: true,
    userMessage: 'Rate limit exceeded. Please wait a few minutes and try again.',
    suggestedAction: 'Wait 1-2 minutes before retrying, or reduce batch size.',
  },
  {
    type: 'quota_exceeded',
    patterns: [
      'insufficient_quota',
      'billing',
      'exceeded your current quota',
      'payment required',
      'account deactivated',
    ],
    retryable: false,
    userMessage: 'API quota exceeded. Please check your OpenAI billing settings.',
    suggestedAction: 'Add payment method or increase spending limit in OpenAI dashboard.',
  },
  {
    type: 'model_not_found',
    patterns: [
      'model_not_found',
      'does not exist',
      'The model',
      'invalid model',
      'model not available',
    ],
    retryable: false,
    userMessage: 'The selected AI model is not available. Please check your settings.',
    suggestedAction: 'Go to Settings → AI and select a different model.',
  },
  {
    type: 'invalid_api_key',
    patterns: [
      'invalid_api_key',
      'Incorrect API key',
      'authentication',
      'unauthorized',
      '401',
      'invalid api key',
    ],
    retryable: false,
    userMessage: 'Invalid API key. Please check your OpenAI API key in settings.',
    suggestedAction: 'Go to Settings → AI and enter a valid API key.',
  },
  {
    type: 'context_length',
    patterns: [
      'context_length',
      'maximum context length',
      'tokens',
      'too long',
      'reduce the length',
      'max_tokens',
    ],
    retryable: true,
    userMessage: 'Request too large. Try processing fewer items at once.',
    suggestedAction: 'Process items in smaller batches.',
  },
  {
    type: 'content_filter',
    patterns: [
      'content_filter',
      'content policy',
      'flagged',
      'inappropriate',
      'safety system',
    ],
    retryable: false,
    userMessage: 'Content was flagged by the AI safety system. Please review the input data.',
    suggestedAction: 'Check project descriptions for potentially sensitive content.',
  },
  {
    type: 'timeout',
    patterns: [
      'timeout',
      'timed out',
      'ETIMEDOUT',
      'ECONNABORTED',
      'deadline exceeded',
    ],
    retryable: true,
    userMessage: 'Request timed out. Please try again.',
    suggestedAction: 'Try again or process fewer items at once.',
  },
  {
    type: 'network_error',
    patterns: [
      'ENOTFOUND',
      'ECONNREFUSED',
      'network',
      'connection',
      'DNS',
      'getaddrinfo',
    ],
    retryable: true,
    userMessage: 'Network error. Please check your connection and try again.',
    suggestedAction: 'Check network connectivity and firewall settings.',
  },
  {
    type: 'server_error',
    patterns: [
      '500',
      '502',
      '503',
      '504',
      'internal error',
      'server error',
      'service unavailable',
    ],
    retryable: true,
    userMessage: 'OpenAI service temporarily unavailable. Please try again later.',
    suggestedAction: 'Wait a few minutes and retry. Check status.openai.com for outages.',
  },
]

// ─── Error Classification ────────────────────────────────────────────────────

/**
 * Classify an error from the OpenAI API
 */
export function classifyAIError(error: Error | unknown): ClassifiedError {
  const errorMessage = error instanceof Error ? error.message : String(error)
  const errorString = errorMessage.toLowerCase()

  // Check against known patterns
  for (const pattern of ERROR_PATTERNS) {
    for (const matcher of pattern.patterns) {
      const matches =
        typeof matcher === 'string'
          ? errorString.includes(matcher.toLowerCase())
          : matcher.test(errorString)

      if (matches) {
        return {
          type: pattern.type,
          message: pattern.userMessage,
          originalMessage: errorMessage,
          retryable: pattern.retryable,
          suggestedAction: pattern.suggestedAction,
        }
      }
    }
  }

  // Unknown error
  return {
    type: 'unknown',
    message: 'An unexpected error occurred. Please try again.',
    originalMessage: errorMessage,
    retryable: true,
    suggestedAction: 'If the problem persists, check the AI settings or contact support.',
  }
}

/**
 * Check if an error is a JSON parse error
 */
export function isParseError(error: Error | unknown): boolean {
  const message = error instanceof Error ? error.message : String(error)
  return (
    message.includes('JSON') ||
    message.includes('parse') ||
    message.includes('Unexpected token') ||
    message.includes('SyntaxError')
  )
}

/**
 * Create a classified parse error
 */
export function createParseError(originalMessage: string): ClassifiedError {
  return {
    type: 'parse_error',
    message: 'AI returned an invalid response. Items flagged for manual review.',
    originalMessage,
    retryable: true,
    suggestedAction: 'Review flagged items manually. Consider using a different model.',
  }
}

// ─── User-Friendly Messages ──────────────────────────────────────────────────

const USER_FRIENDLY_MESSAGES: Record<AIErrorType, string> = {
  rate_limit: 'Rate limit exceeded. Please wait a few minutes and try again.',
  quota_exceeded: 'API quota exceeded. Please check your OpenAI billing settings.',
  model_not_found: 'Selected AI model is not available. Please check your settings.',
  invalid_api_key: 'Invalid API key. Please verify your OpenAI API key.',
  context_length: 'Request too large. Please try with fewer items.',
  parse_error: 'AI response could not be processed. Items flagged for review.',
  timeout: 'Request timed out. Please try again.',
  network_error: 'Network connection error. Please check your connection.',
  content_filter: 'Content flagged by AI safety system. Please review input data.',
  server_error: 'AI service temporarily unavailable. Please try again later.',
  unknown: 'An unexpected error occurred. Please try again.',
}

/**
 * Get a user-friendly message for an error type
 */
export function getUserFriendlyMessage(errorType: AIErrorType): string {
  return USER_FRIENDLY_MESSAGES[errorType]
}

// ─── Error Handling Helpers ──────────────────────────────────────────────────

/**
 * Wrap an async function with standardized AI error handling
 */
export async function withAIErrorHandling<T>(
  fn: () => Promise<T>,
  fallback: T
): Promise<{ result: T; error?: ClassifiedError }> {
  try {
    const result = await fn()
    return { result }
  } catch (error) {
    const classified = classifyAIError(error)
    console.error(`[AI Error] ${classified.type}:`, classified.originalMessage)
    return { result: fallback, error: classified }
  }
}

/**
 * Log an AI error with context
 */
export function logAIError(
  service: string,
  operation: string,
  error: ClassifiedError,
  context?: Record<string, unknown>
): void {
  console.error(
    `[AI ${service}] ${operation} failed:`,
    JSON.stringify({
      type: error.type,
      message: error.message,
      originalMessage: error.originalMessage,
      retryable: error.retryable,
      ...context,
    })
  )
}

// ─── Retry Logic ─────────────────────────────────────────────────────────────

/**
 * Determine if an operation should be retried based on error type
 */
export function shouldRetry(error: ClassifiedError, attempt: number, maxAttempts: number = 3): boolean {
  if (!error.retryable) return false
  if (attempt >= maxAttempts) return false

  // Rate limits need longer delays
  if (error.type === 'rate_limit') {
    return attempt < 2 // Only retry once for rate limits
  }

  return true
}

/**
 * Calculate delay before retry (exponential backoff)
 */
export function getRetryDelay(error: ClassifiedError, attempt: number): number {
  const baseDelay = error.type === 'rate_limit' ? 30000 : 1000 // 30s for rate limit, 1s otherwise
  return baseDelay * Math.pow(2, attempt)
}

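The retry helpers at the end of `ai-errors.ts` are meant to be composed by callers. A sketch of how they behave (the `ClassifiedError` shape is simplified here, and the two functions are reproduced from the file above so the example is self-contained):

```typescript
// Simplified ClassifiedError plus the retry helpers from ai-errors.ts above.
interface ClassifiedError {
  type: string;
  retryable: boolean;
  message: string;
}

function shouldRetry(error: ClassifiedError, attempt: number, maxAttempts: number = 3): boolean {
  if (!error.retryable) return false;      // e.g. invalid_api_key never retries
  if (attempt >= maxAttempts) return false;
  if (error.type === 'rate_limit') return attempt < 2; // rate limits are capped earlier
  return true;
}

function getRetryDelay(error: ClassifiedError, attempt: number): number {
  // 30s base for rate limits, 1s otherwise; doubles each attempt.
  const baseDelay = error.type === 'rate_limit' ? 30000 : 1000;
  return baseDelay * Math.pow(2, attempt);
}
```

So a retryable `server_error` backs off 1s, 2s, 4s across attempts, while a `rate_limit` starts at a 30s wait and is cut off sooner; non-retryable types such as `invalid_api_key` fail immediately.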
@@ -5,10 +5,24 @@
 * - Field-based rules (age checks, category, country, etc.)
 * - Document checks (file existence/types)
 * - AI screening (GPT interprets criteria text, flags spam)
 *
 * GDPR Compliance:
 * - All project data is anonymized before AI processing
 * - Only necessary fields sent to OpenAI
 * - No personal identifiers in prompts or responses
 */

import { getOpenAI, getConfiguredModel } from '@/lib/openai'
import type { Prisma } from '@prisma/client'
import { getOpenAI, getConfiguredModel, buildCompletionParams } from '@/lib/openai'
import { logAIUsage, extractTokenUsage } from '@/server/utils/ai-usage'
import { classifyAIError, createParseError, logAIError } from './ai-errors'
import {
  anonymizeProjectsForAI,
  validateAnonymizedProjects,
  type ProjectWithRelations,
  type AnonymizedProjectForAI,
  type ProjectAIMapping,
} from './anonymization'
import type { Prisma, FileType, SubmissionSource } from '@prisma/client'

// ─── Types ──────────────────────────────────────────────────────────────────

@@ -80,7 +94,14 @@ interface ProjectForFiltering {
  tags: string[]
  oceanIssue?: string | null
  wantsMentorship?: boolean | null
  files: Array<{ id: string; fileName: string; fileType?: string | null }>
  institution?: string | null
  submissionSource?: SubmissionSource
  submittedAt?: Date | null
  files: Array<{ id: string; fileName: string; fileType?: FileType | null }>
  _count?: {
    teamMembers?: number
    files?: number
  }
}

interface FilteringRuleInput {

@@ -92,6 +113,15 @@ interface FilteringRuleInput {
  isActive: boolean
}

// ─── Constants ───────────────────────────────────────────────────────────────

const BATCH_SIZE = 20

// Optimized system prompt (compressed for token efficiency)
const AI_SCREENING_SYSTEM_PROMPT = `Project screening assistant. Evaluate against criteria, return JSON.
Format: {"projects": [{project_id, meets_criteria: bool, confidence: 0-1, reasoning: str, quality_score: 1-10, spam_risk: bool}]}
Be objective. Base evaluation only on provided data. No personal identifiers in reasoning.`

// ─── Field-Based Rule Evaluation ────────────────────────────────────────────

function evaluateCondition(

@@ -185,14 +215,9 @@ export function evaluateFieldRule(
    ? results.every(Boolean)
    : results.some(Boolean)

  // If conditions met, the rule's action applies
  // For PASS action: conditions met = passed, not met = not passed
  // For REJECT action: conditions met = rejected (not passed)
  // For FLAG action: conditions met = flagged
  if (config.action === 'PASS') {
    return { passed: allConditionsMet, action: config.action }
  }
  // For REJECT/FLAG: conditions matching means the project should be rejected/flagged
  return { passed: !allConditionsMet, action: config.action }
}

@@ -226,55 +251,173 @@ export function evaluateDocumentRule(

// ─── AI Screening ───────────────────────────────────────────────────────────

const AI_SCREENING_SYSTEM_PROMPT = `You are a project screening assistant. You evaluate projects against specific criteria.
You must return a JSON object with this structure:
{
  "projects": [
    {
      "project_id": "string",
      "meets_criteria": boolean,
      "confidence": number (0-1),
      "reasoning": "string",
      "quality_score": number (1-10),
      "spam_risk": boolean
    }
  ]
interface AIScreeningResult {
  meetsCriteria: boolean
  confidence: number
  reasoning: string
  qualityScore: number
  spamRisk: boolean
}

Be fair and objective. Base your evaluation only on the information provided.
Never include personal identifiers in your reasoning.`
/**
 * Convert project to enhanced format for anonymization
 */
function toProjectWithRelations(project: ProjectForFiltering): ProjectWithRelations {
  return {
    id: project.id,
    title: project.title,
    description: project.description,
    competitionCategory: project.competitionCategory as any,
    oceanIssue: project.oceanIssue as any,
    country: project.country,
    geographicZone: project.geographicZone,
    institution: project.institution,
    tags: project.tags,
    foundedAt: project.foundedAt,
    wantsMentorship: project.wantsMentorship ?? false,
    submissionSource: project.submissionSource ?? 'MANUAL',
    submittedAt: project.submittedAt,
    _count: {
      teamMembers: project._count?.teamMembers ?? 0,
      files: project.files?.length ?? 0,
    },
    files: project.files?.map(f => ({ fileType: f.fileType ?? null })) ?? [],
  }
}

/**
 * Execute AI screening on a batch of projects
 */
async function processAIBatch(
  openai: NonNullable<Awaited<ReturnType<typeof getOpenAI>>>,
  model: string,
  criteriaText: string,
  anonymized: AnonymizedProjectForAI[],
  mappings: ProjectAIMapping[],
  userId?: string,
  entityId?: string
): Promise<{
  results: Map<string, AIScreeningResult>
  tokensUsed: number
}> {
  const results = new Map<string, AIScreeningResult>()
  let tokensUsed = 0

  // Build optimized prompt
  const userPrompt = `CRITERIA: ${criteriaText}
PROJECTS: ${JSON.stringify(anonymized)}
Evaluate and return JSON.`

  try {
    const params = buildCompletionParams(model, {
      messages: [
        { role: 'system', content: AI_SCREENING_SYSTEM_PROMPT },
        { role: 'user', content: userPrompt },
      ],
      jsonMode: true,
      temperature: 0.3,
      maxTokens: 4000,
    })

    const response = await openai.chat.completions.create(params)
    const usage = extractTokenUsage(response)
    tokensUsed = usage.totalTokens

    // Log usage
    await logAIUsage({
      userId,
      action: 'FILTERING',
      entityType: 'Round',
      entityId,
      model,
      promptTokens: usage.promptTokens,
      completionTokens: usage.completionTokens,
      totalTokens: usage.totalTokens,
      batchSize: anonymized.length,
      itemsProcessed: anonymized.length,
      status: 'SUCCESS',
    })

    const content = response.choices[0]?.message?.content
    if (!content) {
      throw new Error('Empty response from AI')
    }

    const parsed = JSON.parse(content) as {
      projects: Array<{
        project_id: string
        meets_criteria: boolean
        confidence: number
        reasoning: string
        quality_score: number
        spam_risk: boolean
      }>
    }

    // Map results back to real IDs
    for (const result of parsed.projects || []) {
      const mapping = mappings.find((m) => m.anonymousId === result.project_id)
      if (mapping) {
        results.set(mapping.realId, {
          meetsCriteria: result.meets_criteria,
          confidence: result.confidence,
          reasoning: result.reasoning,
          qualityScore: result.quality_score,
          spamRisk: result.spam_risk,
        })
      }
    }
  } catch (error) {
    // Check if parse error
    if (error instanceof SyntaxError) {
      const parseError = createParseError(error.message)
      logAIError('Filtering', 'batch processing', parseError)

      await logAIUsage({
        userId,
        action: 'FILTERING',
        entityType: 'Round',
        entityId,
        model,
        promptTokens: 0,
        completionTokens: 0,
        totalTokens: tokensUsed,
        batchSize: anonymized.length,
        itemsProcessed: 0,
        status: 'ERROR',
        errorMessage: parseError.message,
      })

      // Flag all for manual review
      for (const mapping of mappings) {
        results.set(mapping.realId, {
          meetsCriteria: false,
          confidence: 0,
          reasoning: 'AI response parse error — flagged for manual review',
          qualityScore: 5,
          spamRisk: false,
        })
      }
    } else {
      throw error // Re-throw for outer catch
    }
  }

  return { results, tokensUsed }
}

export async function executeAIScreening(
  config: AIScreeningConfig,
  projects: ProjectForFiltering[]
): Promise<
  Map<
    string,
    {
      meetsCriteria: boolean
      confidence: number
      reasoning: string
      qualityScore: number
      spamRisk: boolean
    }
  >
> {
  const results = new Map<
    string,
    {
      meetsCriteria: boolean
      confidence: number
      reasoning: string
      qualityScore: number
      spamRisk: boolean
    }
  >()
  projects: ProjectForFiltering[],
  userId?: string,
  entityId?: string
): Promise<Map<string, AIScreeningResult>> {
  const results = new Map<string, AIScreeningResult>()

  try {
    const openai = await getOpenAI()
    if (!openai) {
      // No OpenAI configured — flag all for manual review
      console.warn('[AI Filtering] OpenAI client not available - API key may not be configured')
      console.warn('[AI Filtering] OpenAI not configured')
      for (const p of projects) {
        results.set(p.id, {
          meetsCriteria: false,

@@ -290,133 +433,71 @@ export async function executeAIScreening(
    const model = await getConfiguredModel()
    console.log(`[AI Filtering] Using model: ${model} for ${projects.length} projects`)

    // Anonymize project data — use numeric IDs
    const anonymizedProjects = projects.map((p, i) => ({
      project_id: `P${i + 1}`,
      real_id: p.id,
      title: p.title,
      description: p.description?.slice(0, 500) || '',
      category: p.competitionCategory || 'Unknown',
      ocean_issue: p.oceanIssue || 'Unknown',
      country: p.country || 'Unknown',
      tags: p.tags.join(', '),
      has_files: (p.files?.length || 0) > 0,
    }))
    // Convert and anonymize projects
    const projectsWithRelations = projects.map(toProjectWithRelations)
    const { anonymized, mappings } = anonymizeProjectsForAI(projectsWithRelations, 'FILTERING')

    // Process in batches of 20
    const batchSize = 20
    for (let i = 0; i < anonymizedProjects.length; i += batchSize) {
      const batch = anonymizedProjects.slice(i, i + batchSize)
    // Validate anonymization
    if (!validateAnonymizedProjects(anonymized)) {
      console.error('[AI Filtering] Anonymization validation failed')
      throw new Error('GDPR compliance check failed: PII detected in anonymized data')
    }

      const userPrompt = `Evaluate these projects against the following criteria:
    let totalTokens = 0

CRITERIA: ${config.criteriaText}
    // Process in batches
    for (let i = 0; i < anonymized.length; i += BATCH_SIZE) {
      const batchAnon = anonymized.slice(i, i + BATCH_SIZE)
      const batchMappings = mappings.slice(i, i + BATCH_SIZE)

PROJECTS:
${JSON.stringify(
  batch.map(({ real_id, ...rest }) => rest),
  null,
  2
)}
      console.log(`[AI Filtering] Processing batch ${Math.floor(i / BATCH_SIZE) + 1}/${Math.ceil(anonymized.length / BATCH_SIZE)}`)

Return your evaluation as JSON.`

      console.log(`[AI Filtering] Processing batch ${Math.floor(i / batchSize) + 1}, ${batch.length} projects`)

      const response = await openai.chat.completions.create({
      const { results: batchResults, tokensUsed } = await processAIBatch(
        openai,
        model,
        messages: [
          { role: 'system', content: AI_SCREENING_SYSTEM_PROMPT },
          { role: 'user', content: userPrompt },
        ],
        response_format: { type: 'json_object' },
        temperature: 0.3,
        max_tokens: 4000,
      })
        config.criteriaText,
        batchAnon,
        batchMappings,
        userId,
        entityId
      )

      console.log(`[AI Filtering] Batch completed, usage: ${response.usage?.total_tokens} tokens`)
      totalTokens += tokensUsed

      const content = response.choices[0]?.message?.content
      if (content) {
        try {
          const parsed = JSON.parse(content) as {
            projects: Array<{
              project_id: string
              meets_criteria: boolean
              confidence: number
              reasoning: string
              quality_score: number
              spam_risk: boolean
            }>
      // Merge batch results
      for (const [id, result] of batchResults) {
        results.set(id, result)
      }
    }

          console.log(`[AI Filtering] Parsed ${parsed.projects?.length || 0} results from response`)
    console.log(`[AI Filtering] Completed. Total tokens: ${totalTokens}`)

          for (const result of parsed.projects) {
            const anon = batch.find((b) => b.project_id === result.project_id)
            if (anon) {
              results.set(anon.real_id, {
                meetsCriteria: result.meets_criteria,
                confidence: result.confidence,
                reasoning: result.reasoning,
                qualityScore: result.quality_score,
                spamRisk: result.spam_risk,
              })
            }
          }
        } catch (parseError) {
          // Parse error — flag batch for manual review
          console.error('[AI Filtering] JSON parse error:', parseError)
          console.error('[AI Filtering] Raw response content:', content.slice(0, 500))
          for (const item of batch) {
            results.set(item.real_id, {
              meetsCriteria: false,
              confidence: 0,
              reasoning: 'AI response parse error — flagged for manual review',
              qualityScore: 5,
              spamRisk: false,
            })
          }
        }
      } else {
        console.error('[AI Filtering] Empty response content from API')
      }
    }
  } catch (error) {
    // OpenAI error — flag all for manual review with specific error info
    console.error('[AI Filtering] OpenAI API error:', error)
    const classified = classifyAIError(error)
    logAIError('Filtering', 'executeAIScreening', classified)

    // Extract meaningful error message
    let errorType = 'unknown_error'
    let errorDetail = 'Unknown error occurred'

    if (error instanceof Error) {
      const message = error.message.toLowerCase()
      if (message.includes('rate_limit') || message.includes('rate limit')) {
        errorType = 'rate_limit'
        errorDetail = 'OpenAI rate limit exceeded. Try again in a few minutes.'
      } else if (message.includes('model') && (message.includes('not found') || message.includes('does not exist'))) {
|
||||
errorType = 'model_not_found'
|
||||
errorDetail = 'The configured AI model is not available. Check Settings → AI.'
|
||||
} else if (message.includes('insufficient_quota') || message.includes('quota')) {
|
||||
errorType = 'quota_exceeded'
|
||||
errorDetail = 'OpenAI API quota exceeded. Check your billing settings.'
|
||||
} else if (message.includes('invalid_api_key') || message.includes('unauthorized')) {
|
||||
errorType = 'invalid_api_key'
|
||||
errorDetail = 'Invalid OpenAI API key. Check Settings → AI.'
|
||||
} else if (message.includes('context_length') || message.includes('token')) {
|
||||
errorType = 'context_length'
|
||||
errorDetail = 'Request too large. Try with fewer projects or shorter descriptions.'
|
||||
} else {
|
||||
errorDetail = error.message
|
||||
}
|
||||
}
|
||||
// Log failed attempt
|
||||
await logAIUsage({
|
||||
userId,
|
||||
action: 'FILTERING',
|
||||
entityType: 'Round',
|
||||
entityId,
|
||||
model: 'unknown',
|
||||
promptTokens: 0,
|
||||
completionTokens: 0,
|
||||
totalTokens: 0,
|
||||
batchSize: projects.length,
|
||||
itemsProcessed: 0,
|
||||
status: 'ERROR',
|
||||
errorMessage: classified.message,
|
||||
})
|
||||
|
||||
// Flag all for manual review with error info
|
||||
for (const p of projects) {
|
||||
results.set(p.id, {
|
||||
meetsCriteria: false,
|
||||
confidence: 0,
|
||||
reasoning: `AI screening error (${errorType}): ${errorDetail}`,
|
||||
reasoning: `AI screening error: ${classified.message}`,
|
||||
qualityScore: 5,
|
||||
spamRisk: false,
|
||||
})
|
||||
|
|
@@ -430,7 +511,9 @@ Return your evaluation as JSON.`

 export async function executeFilteringRules(
   rules: FilteringRuleInput[],
-  projects: ProjectForFiltering[]
+  projects: ProjectForFiltering[],
+  userId?: string,
+  roundId?: string
 ): Promise<ProjectFilteringResult[]> {
   const activeRules = rules
     .filter((r) => r.isActive)
@@ -441,23 +524,11 @@ export async function executeFilteringRules(
   const nonAiRules = activeRules.filter((r) => r.ruleType !== 'AI_SCREENING')

   // Pre-compute AI screening results if needed
-  const aiResults = new Map<
-    string,
-    Map<
-      string,
-      {
-        meetsCriteria: boolean
-        confidence: number
-        reasoning: string
-        qualityScore: number
-        spamRisk: boolean
-      }
-    >
-  >()
+  const aiResults = new Map<string, Map<string, AIScreeningResult>>()

   for (const aiRule of aiRules) {
     const config = aiRule.configJson as unknown as AIScreeningConfig
-    const screeningResults = await executeAIScreening(config, projects)
+    const screeningResults = await executeAIScreening(config, projects, userId, roundId)
     aiResults.set(aiRule.id, screeningResults)
   }
@@ -3,8 +3,44 @@
  *
  * Strips PII (names, emails, etc.) from data before sending to AI services.
  * Returns ID mappings for de-anonymization of results.
+ *
+ * GDPR Compliance:
+ * - All personal identifiers are stripped before AI processing
+ * - Project/user IDs are replaced with sequential anonymous IDs
+ * - Text content is sanitized to remove emails, phones, URLs
+ * - Validation ensures no PII leakage before each AI call
  */

+import type {
+  CompetitionCategory,
+  OceanIssue,
+  FileType,
+  SubmissionSource,
+} from '@prisma/client'
+
+// ─── Description Limits ──────────────────────────────────────────────────────
+
+export const DESCRIPTION_LIMITS = {
+  ASSIGNMENT: 300,
+  FILTERING: 500,
+  ELIGIBILITY: 400,
+  MENTOR: 350,
+} as const
+
+export type DescriptionContext = keyof typeof DESCRIPTION_LIMITS
+
+// ─── PII Patterns ────────────────────────────────────────────────────────────
+
+const PII_PATTERNS = {
+  email: /[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/g,
+  phone: /(\+?\d{1,3}[-.\s]?)?\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}/g,
+  url: /https?:\/\/[^\s]+/g,
+  ssn: /\d{3}-\d{2}-\d{4}/g,
+  ipv4: /\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b/g,
+} as const
+
+// ─── Basic Anonymization Types (Assignment Service) ──────────────────────────
+
 export interface AnonymizedJuror {
   anonymousId: string
   expertiseTags: string[]
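One detail worth noting about the `PII_PATTERNS` introduced above: every regex carries the `g` flag, which makes `RegExp.prototype.test()` stateful via `lastIndex`. That is why later validation code in this commit resets `lastIndex` before each `test()` call. A self-contained demonstration of the pitfall:

```typescript
// A /g regex remembers where its last match ended; test() resumes from there.
const email = /[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/g

const a = email.test('contact: jane@example.com') // true, lastIndex now past the match
const b = email.test('contact: jane@example.com') // false: resumed after the match
email.lastIndex = 0                               // the reset the validators perform
const c = email.test('contact: jane@example.com') // true again

console.log(a, b, c)
```

Without the reset, a PII check that runs the same pattern over several strings could silently skip matches and report clean data.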
@@ -37,9 +73,67 @@ export interface AnonymizationResult {
   projectMappings: ProjectMapping[]
 }

+// ─── Enhanced Project Types (Filtering/Awards) ───────────────────────────────
+
 /**
- * Juror data from database
+ * Comprehensive anonymized project data for AI filtering
+ * Includes all fields needed for flexible filtering criteria
  */
+export interface AnonymizedProjectForAI {
+  project_id: string // P1, P2, etc.
+  title: string // Sanitized
+  description: string // Truncated + PII stripped
+  category: CompetitionCategory | null // STARTUP | BUSINESS_CONCEPT
+  ocean_issue: OceanIssue | null // Enum value
+  country: string | null
+  region: string | null // geographicZone
+  institution: string | null
+  tags: string[]
+  founded_year: number | null // Just the year
+  team_size: number
+  has_description: boolean
+  file_count: number
+  file_types: string[] // FileType values
+  wants_mentorship: boolean
+  submission_source: SubmissionSource
+  submitted_date: string | null // YYYY-MM-DD only
+}
+
+/**
+ * Project input with all relations needed for comprehensive anonymization
+ */
+export interface ProjectWithRelations {
+  id: string
+  title: string
+  description?: string | null
+  teamName?: string | null
+  competitionCategory?: CompetitionCategory | null
+  oceanIssue?: OceanIssue | null
+  country?: string | null
+  geographicZone?: string | null
+  institution?: string | null
+  tags: string[]
+  foundedAt?: Date | null
+  wantsMentorship?: boolean
+  submissionSource: SubmissionSource
+  submittedAt?: Date | null
+  _count?: {
+    teamMembers?: number
+    files?: number
+  }
+  files?: Array<{ fileType: FileType | null }>
+}
+
+/**
+ * Mapping for de-anonymization
+ */
+export interface ProjectAIMapping {
+  anonymousId: string
+  realId: string
+}
+
+// ─── Basic Anonymization (Assignment Service) ────────────────────────────────
+
 interface JurorInput {
   id: string
   name?: string | null
@@ -51,9 +145,6 @@ interface JurorInput {
   }
 }

-/**
- * Project data from database
- */
 interface ProjectInput {
   id: string
   title: string
@@ -63,13 +154,7 @@ interface ProjectInput {
 }

 /**
- * Anonymize juror and project data for AI processing
- *
- * This function:
- * 1. Strips all PII (names, emails) from juror data
- * 2. Replaces real IDs with sequential anonymous IDs
- * 3. Keeps only expertise tags and assignment counts
- * 4. Returns mappings for de-anonymization
+ * Anonymize juror and project data for AI processing (Assignment service)
  */
 export function anonymizeForAI(
   jurors: JurorInput[],
@@ -78,7 +163,6 @@ export function anonymizeForAI(
   const jurorMappings: JurorMapping[] = []
   const projectMappings: ProjectMapping[] = []

-  // Anonymize jurors
   const anonymizedJurors: AnonymizedJuror[] = jurors.map((juror, index) => {
     const anonymousId = `juror_${(index + 1).toString().padStart(3, '0')}`
@@ -95,7 +179,6 @@ export function anonymizeForAI(
     }
   })

-  // Anonymize projects (keep content but replace IDs)
   const anonymizedProjects: AnonymizedProject[] = projects.map(
     (project, index) => {
       const anonymousId = `project_${(index + 1).toString().padStart(3, '0')}`
@@ -109,10 +192,9 @@ export function anonymizeForAI(
       anonymousId,
       title: sanitizeText(project.title),
       description: project.description
-        ? sanitizeText(project.description)
+        ? truncateAndSanitize(project.description, DESCRIPTION_LIMITS.ASSIGNMENT)
         : null,
       tags: project.tags,
-      // Replace specific team names with generic identifier
       teamName: project.teamName ? `Team ${index + 1}` : null,
     }
   }
@@ -126,10 +208,77 @@ export function anonymizeForAI(
   }
 }

+// ─── Enhanced Anonymization (Filtering/Awards) ───────────────────────────────
+
+/**
+ * Anonymize a single project with comprehensive data for AI filtering
+ *
+ * GDPR Compliance:
+ * - Strips team names, email references, phone numbers, URLs
+ * - Replaces IDs with sequential anonymous IDs
+ * - Truncates descriptions to limit data exposure
+ * - Keeps only necessary fields for filtering criteria
+ */
+export function anonymizeProjectForAI(
+  project: ProjectWithRelations,
+  index: number,
+  context: DescriptionContext = 'FILTERING'
+): AnonymizedProjectForAI {
+  const descriptionLimit = DESCRIPTION_LIMITS[context]
+
+  return {
+    project_id: `P${index + 1}`,
+    title: sanitizeText(project.title),
+    description: truncateAndSanitize(project.description, descriptionLimit),
+    category: project.competitionCategory ?? null,
+    ocean_issue: project.oceanIssue ?? null,
+    country: project.country ?? null,
+    region: project.geographicZone ?? null,
+    institution: project.institution ?? null,
+    tags: project.tags,
+    founded_year: project.foundedAt?.getFullYear() ?? null,
+    team_size: project._count?.teamMembers ?? 0,
+    has_description: !!project.description?.trim(),
+    file_count: project._count?.files ?? 0,
+    file_types: project.files
+      ?.map((f) => f.fileType)
+      .filter((ft): ft is FileType => ft !== null) ?? [],
+    wants_mentorship: project.wantsMentorship ?? false,
+    submission_source: project.submissionSource,
+    submitted_date: project.submittedAt?.toISOString().split('T')[0] ?? null,
+  }
+}
+
+/**
+ * Anonymize multiple projects and return mappings
+ */
+export function anonymizeProjectsForAI(
+  projects: ProjectWithRelations[],
+  context: DescriptionContext = 'FILTERING'
+): {
+  anonymized: AnonymizedProjectForAI[]
+  mappings: ProjectAIMapping[]
+} {
+  const mappings: ProjectAIMapping[] = []
+  const anonymized = projects.map((project, index) => {
+    mappings.push({
+      anonymousId: `P${index + 1}`,
+      realId: project.id,
+    })
+    return anonymizeProjectForAI(project, index, context)
+  })
+
+  return { anonymized, mappings }
+}
+
 // ─── De-anonymization ────────────────────────────────────────────────────────

 /**
  * De-anonymize AI results back to real IDs
  */
-export function deanonymizeResults<T extends { jurorId: string; projectId: string }>(
+export function deanonymizeResults<
+  T extends { jurorId: string; projectId: string }
+>(
   results: T[],
   jurorMappings: JurorMapping[],
   projectMappings: ProjectMapping[]
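The `anonymousId`/`realId` mappings introduced above exist so AI results keyed by `P1`, `P2`, ... can be translated back to database IDs. A compact sketch of that round trip, using illustrative names and made-up IDs rather than the repo's own helpers:

```typescript
// Sketch of the P{n} <-> real-ID round trip used for de-anonymization.
interface Mapping { anonymousId: string; realId: string }

function buildMappings(realIds: string[]): Mapping[] {
  // Sequential anonymous IDs: index 0 becomes P1, index 1 becomes P2, ...
  return realIds.map((realId, index) => ({ anonymousId: `P${index + 1}`, realId }))
}

function toReal(anonymousId: string, mappings: Mapping[]): string {
  const map = new Map(mappings.map((m) => [m.anonymousId, m.realId]))
  // Fall back to the anonymous ID if the AI echoes back an unknown key
  return map.get(anonymousId) ?? anonymousId
}

const mappings = buildMappings(['ckx123', 'ckx456'])
console.log(toReal('P2', mappings)) // 'ckx456'
```

The fallback mirrors the `projectMap.get(...) || result.project_id` behavior in the diff: an unrecognized ID degrades gracefully instead of throwing.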
@@ -149,50 +298,155 @@ export function deanonymizeResults<T extends { jurorId: string; projectId: strin
 }

 /**
- * Sanitize text to remove potential PII patterns
- * Removes emails, phone numbers, and URLs from text
+ * De-anonymize project-only results (for filtering/awards)
  */
-function sanitizeText(text: string): string {
+export function deanonymizeProjectResults<T extends { project_id: string }>(
+  results: T[],
+  mappings: ProjectAIMapping[]
+): (T & { realProjectId: string })[] {
+  const projectMap = new Map(mappings.map((m) => [m.anonymousId, m.realId]))
+
+  return results.map((result) => ({
+    ...result,
+    realProjectId: projectMap.get(result.project_id) || result.project_id,
+  }))
+}
+
+// ─── Text Sanitization ───────────────────────────────────────────────────────
+
+/**
+ * Sanitize text to remove potential PII patterns
+ * Removes emails, phone numbers, URLs, and other identifying information
+ */
+export function sanitizeText(text: string): string {
+  let sanitized = text
+
   // Remove email addresses
-  let sanitized = text.replace(
-    /[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/g,
-    '[email removed]'
-  )
+  sanitized = sanitized.replace(PII_PATTERNS.email, '[email removed]')

   // Remove phone numbers (various formats)
-  sanitized = sanitized.replace(
-    /(\+?\d{1,3}[-.\s]?)?\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}/g,
-    '[phone removed]'
-  )
+  sanitized = sanitized.replace(PII_PATTERNS.phone, '[phone removed]')

   // Remove URLs
-  sanitized = sanitized.replace(
-    /https?:\/\/[^\s]+/g,
-    '[url removed]'
-  )
+  sanitized = sanitized.replace(PII_PATTERNS.url, '[url removed]')
+
+  // Remove SSN-like patterns
+  sanitized = sanitized.replace(PII_PATTERNS.ssn, '[id removed]')

   return sanitized
 }

 /**
  * Truncate text to a maximum length and sanitize
  */
+export function truncateAndSanitize(
+  text: string | null | undefined,
+  maxLength: number
+): string {
+  if (!text) return ''
+
+  const sanitized = sanitizeText(text)
+
+  if (sanitized.length <= maxLength) {
+    return sanitized
+  }
+
+  return sanitized.slice(0, maxLength - 3) + '...'
+}
+
+// ─── GDPR Compliance Validation ──────────────────────────────────────────────
+
+export interface PIIValidationResult {
+  valid: boolean
+  violations: string[]
+}
+
+/**
+ * Validate that data contains no personal information
+ * Used for GDPR compliance before sending data to AI
+ */
+export function validateNoPersonalData(
+  data: Record<string, unknown>
+): PIIValidationResult {
+  const violations: string[] = []
+  const textContent = JSON.stringify(data)
+
+  // Check each PII pattern
+  for (const [type, pattern] of Object.entries(PII_PATTERNS)) {
+    // Reset regex state (global flag)
+    pattern.lastIndex = 0
+
+    if (pattern.test(textContent)) {
+      violations.push(`Potential ${type} detected in data`)
+    }
+  }
+
+  // Additional checks for common PII fields
+  const sensitiveFields = [
+    'email',
+    'phone',
+    'password',
+    'ssn',
+    'socialSecurity',
+    'creditCard',
+    'bankAccount',
+    'drivingLicense',
+  ]
+
+  const keys = Object.keys(data).map((k) => k.toLowerCase())
+  for (const field of sensitiveFields) {
+    if (keys.includes(field)) {
+      violations.push(`Sensitive field "${field}" present in data`)
+    }
+  }
+
+  return {
+    valid: violations.length === 0,
+    violations,
+  }
+}
+
+/**
+ * Enforce GDPR compliance before EVERY AI call
+ * Throws an error if PII is detected
+ */
+export function enforceGDPRCompliance(data: unknown[]): void {
+  for (let i = 0; i < data.length; i++) {
+    const item = data[i]
+    if (typeof item === 'object' && item !== null) {
+      const { valid, violations } = validateNoPersonalData(
+        item as Record<string, unknown>
+      )
+      if (!valid) {
+        console.error(
+          `[GDPR] PII validation failed for item ${i}:`,
+          violations
+        )
+        throw new Error(
+          `GDPR compliance check failed: ${violations.join(', ')}`
+        )
+      }
+    }
+  }
+}
+
 /**
  * Validate that data has been properly anonymized
  * Returns true if no PII patterns are detected
  */
 export function validateAnonymization(data: AnonymizationResult): boolean {
-  const piiPatterns = [
-    /[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/, // Email
-    /(\+?\d{1,3}[-.\s]?)?\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}/, // Phone
-  ]
-
   const checkText = (text: string | null | undefined): boolean => {
     if (!text) return true
-    return !piiPatterns.some((pattern) => pattern.test(text))
+    // Reset regex state for each check
+    for (const pattern of Object.values(PII_PATTERNS)) {
+      pattern.lastIndex = 0
+      if (pattern.test(text)) return false
+    }
+    return true
   }

-  // Check jurors (they should only have expertise tags)
+  // Check jurors
   for (const juror of data.jurors) {
-    // Jurors should not have any text fields that could contain PII
+    // Only check expertiseTags
     for (const tag of juror.expertiseTags) {
       if (!checkText(tag)) return false
     }
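The sanitize-then-validate flow added in this hunk can be exercised end to end in isolation. The sketch below copies two of the `PII_PATTERNS` regexes verbatim; the function names `scrub` and `hasPII` are illustrative stand-ins for `sanitizeText` and the validation helpers:

```typescript
// Illustrative sanitize-then-validate gate, mirroring the module's flow.
const EMAIL = /[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/g
const URL = /https?:\/\/[^\s]+/g

function scrub(text: string): string {
  // Replace matches with placeholders, as sanitizeText does
  return text.replace(EMAIL, '[email removed]').replace(URL, '[url removed]')
}

function hasPII(text: string): boolean {
  for (const pattern of [EMAIL, URL]) {
    pattern.lastIndex = 0 // reset /g state before test()
    if (pattern.test(text)) return true
  }
  return false
}

const raw = 'Reach us at team@ocean.org or https://ocean.org/apply'
const clean = scrub(raw)
console.log(hasPII(raw), hasPII(clean), clean)
```

Running the validator over already-scrubbed text is exactly the invariant `enforceGDPRCompliance` asserts before each AI call: sanitization must leave nothing the patterns can still match.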
@@ -209,3 +463,30 @@ export function validateAnonymization(data: AnonymizationResult): boolean {

   return true
 }
+
+/**
+ * Validate anonymized projects for AI (enhanced version)
+ */
+export function validateAnonymizedProjects(
+  projects: AnonymizedProjectForAI[]
+): boolean {
+  const checkText = (text: string | null | undefined): boolean => {
+    if (!text) return true
+    for (const pattern of Object.values(PII_PATTERNS)) {
+      pattern.lastIndex = 0
+      if (pattern.test(text)) return false
+    }
+    return true
+  }
+
+  for (const project of projects) {
+    if (!checkText(project.title)) return false
+    if (!checkText(project.description)) return false
+    if (!checkText(project.institution)) return false
+    for (const tag of project.tags) {
+      if (!checkText(tag)) return false
+    }
+  }
+
+  return true
+}
@@ -1,5 +1,33 @@
 /**
  * AI-Powered Mentor Matching Service
  *
  * Matches mentors to projects based on expertise alignment.
+ *
+ * Optimization:
+ * - Batched processing (15 projects per batch)
+ * - Token tracking and cost logging
+ * - Fallback to algorithmic matching
+ *
+ * GDPR Compliance:
+ * - All data anonymized before AI processing
+ * - No personal information sent to OpenAI
  */

 import { PrismaClient, OceanIssue, CompetitionCategory } from '@prisma/client'
-import { getOpenAI, getConfiguredModel } from '@/lib/openai'
+import { getOpenAI, getConfiguredModel, buildCompletionParams } from '@/lib/openai'
+import { logAIUsage, extractTokenUsage } from '@/server/utils/ai-usage'
+import { classifyAIError, createParseError, logAIError } from './ai-errors'
+
+// ─── Constants ───────────────────────────────────────────────────────────────
+
+const MENTOR_BATCH_SIZE = 15
+
+// Optimized system prompt
+const MENTOR_MATCHING_SYSTEM_PROMPT = `Match mentors to projects by expertise. Return JSON.
+Format for each project: {"matches": [{project_id, mentor_matches: [{mentor_index, confidence_score: 0-1, expertise_match_score: 0-1, reasoning: str}]}]}
+Rank by suitability. Consider expertise alignment and availability.`
+
+// ─── Types ───────────────────────────────────────────────────────────────────
+
 interface ProjectInfo {
   id: string
@@ -26,17 +54,162 @@ interface MentorMatch {
   reasoning: string
 }

+// ─── Batched AI Matching ─────────────────────────────────────────────────────
+
 /**
- * Get AI-suggested mentor matches for a project
+ * Process a batch of projects for mentor matching
  */
-export async function getAIMentorSuggestions(
+async function processMatchingBatch(
+  openai: NonNullable<Awaited<ReturnType<typeof getOpenAI>>>,
+  model: string,
+  projects: ProjectInfo[],
+  mentors: MentorInfo[],
+  limit: number,
+  userId?: string
+): Promise<{
+  results: Map<string, MentorMatch[]>
+  tokensUsed: number
+}> {
+  const results = new Map<string, MentorMatch[]>()
+  let tokensUsed = 0
+
+  // Anonymize project data
+  const anonymizedProjects = projects.map((p, index) => ({
+    project_id: `P${index + 1}`,
+    real_id: p.id,
+    description: p.description?.slice(0, 350) || 'No description',
+    category: p.competitionCategory,
+    oceanIssue: p.oceanIssue,
+    tags: p.tags,
+  }))
+
+  // Anonymize mentor data
+  const anonymizedMentors = mentors.map((m, index) => ({
+    index,
+    expertise: m.expertiseTags,
+    availability: m.maxAssignments
+      ? `${m.currentAssignments}/${m.maxAssignments}`
+      : 'unlimited',
+  }))
+
+  const userPrompt = `PROJECTS:
+${anonymizedProjects.map(p => `${p.project_id}: Category=${p.category || 'N/A'}, Issue=${p.oceanIssue || 'N/A'}, Tags=[${p.tags.join(', ')}], Desc=${p.description.slice(0, 200)}`).join('\n')}
+
+MENTORS:
+${anonymizedMentors.map(m => `${m.index}: Expertise=[${m.expertise.join(', ')}], Availability=${m.availability}`).join('\n')}
+
+For each project, rank top ${limit} mentors.`
+
+  try {
+    const params = buildCompletionParams(model, {
+      messages: [
+        { role: 'system', content: MENTOR_MATCHING_SYSTEM_PROMPT },
+        { role: 'user', content: userPrompt },
+      ],
+      jsonMode: true,
+      temperature: 0.3,
+      maxTokens: 4000,
+    })
+
+    const response = await openai.chat.completions.create(params)
+    const usage = extractTokenUsage(response)
+    tokensUsed = usage.totalTokens
+
+    // Log usage
+    await logAIUsage({
+      userId,
+      action: 'MENTOR_MATCHING',
+      entityType: 'Project',
+      model,
+      promptTokens: usage.promptTokens,
+      completionTokens: usage.completionTokens,
+      totalTokens: usage.totalTokens,
+      batchSize: projects.length,
+      itemsProcessed: projects.length,
+      status: 'SUCCESS',
+    })
+
+    const content = response.choices[0]?.message?.content
+    if (!content) {
+      throw new Error('No response from AI')
+    }
+
+    const parsed = JSON.parse(content) as {
+      matches: Array<{
+        project_id: string
+        mentor_matches: Array<{
+          mentor_index: number
+          confidence_score: number
+          expertise_match_score: number
+          reasoning: string
+        }>
+      }>
+    }
+
+    // Map results back to real IDs
+    for (const projectMatch of parsed.matches || []) {
+      const project = anonymizedProjects.find(p => p.project_id === projectMatch.project_id)
+      if (!project) continue
+
+      const mentorMatches: MentorMatch[] = []
+      for (const match of projectMatch.mentor_matches || []) {
+        if (match.mentor_index >= 0 && match.mentor_index < mentors.length) {
+          mentorMatches.push({
+            mentorId: mentors[match.mentor_index].id,
+            confidenceScore: Math.min(1, Math.max(0, match.confidence_score)),
+            expertiseMatchScore: Math.min(1, Math.max(0, match.expertise_match_score)),
+            reasoning: match.reasoning,
+          })
+        }
+      }
+      results.set(project.real_id, mentorMatches)
+    }
+
+  } catch (error) {
+    if (error instanceof SyntaxError) {
+      const parseError = createParseError(error.message)
+      logAIError('MentorMatching', 'batch processing', parseError)
+
+      await logAIUsage({
+        userId,
+        action: 'MENTOR_MATCHING',
+        entityType: 'Project',
+        model,
+        promptTokens: 0,
+        completionTokens: 0,
+        totalTokens: tokensUsed,
+        batchSize: projects.length,
+        itemsProcessed: 0,
+        status: 'ERROR',
+        errorMessage: parseError.message,
+      })
+
+      // Return empty results for batch (will fall back to algorithm)
+      for (const project of projects) {
+        results.set(project.id, [])
+      }
+    } else {
+      throw error
+    }
+  }
+
+  return { results, tokensUsed }
+}
+
+/**
+ * Get AI-suggested mentor matches for multiple projects (batched)
+ */
+export async function getAIMentorSuggestionsBatch(
   prisma: PrismaClient,
-  projectId: string,
-  limit: number = 5
-): Promise<MentorMatch[]> {
-  // Get project details
-  const project = await prisma.project.findUniqueOrThrow({
-    where: { id: projectId },
+  projectIds: string[],
+  limit: number = 5,
+  userId?: string
+): Promise<Map<string, MentorMatch[]>> {
+  const allResults = new Map<string, MentorMatch[]>()
+
+  // Get projects
+  const projects = await prisma.project.findMany({
+    where: { id: { in: projectIds } },
     select: {
       id: true,
       title: true,
@@ -47,14 +220,16 @@ export async function getAIMentorSuggestions(
     },
   })

-  // Get available mentors (users with expertise tags)
-  // In a full implementation, you'd have a MENTOR role
-  // For now, we use users with expertiseTags and consider them potential mentors
+  if (projects.length === 0) {
+    return allResults
+  }
+
+  // Get available mentors
   const mentors = await prisma.user.findMany({
     where: {
       OR: [
         { expertiseTags: { isEmpty: false } },
-        { role: 'JURY_MEMBER' }, // Jury members can also be mentors
+        { role: 'JURY_MEMBER' },
       ],
       status: 'ACTIVE',
     },
@ -86,118 +261,111 @@ export async function getAIMentorSuggestions(
|
|||
}))
|
||||
|
||||
if (availableMentors.length === 0) {
|
||||
return []
|
||||
return allResults
|
||||
}
|
||||
|
||||
// Try AI matching if API key is configured
|
||||
if (process.env.OPENAI_API_KEY) {
|
||||
// Try AI matching
|
||||
try {
|
||||
return await getAIMatches(project, availableMentors, limit)
|
||||
} catch (error) {
|
||||
console.error('AI mentor matching failed, falling back to algorithm:', error)
|
||||
}
|
||||
}
|
||||
|
||||
// Fallback to algorithmic matching
|
||||
return getAlgorithmicMatches(project, availableMentors, limit)
|
||||
}
|
||||
|
||||
/**
|
||||
* Use OpenAI to match mentors to projects
|
||||
*/
|
||||
async function getAIMatches(
|
||||
project: ProjectInfo,
|
||||
mentors: MentorInfo[],
|
||||
limit: number
|
||||
): Promise<MentorMatch[]> {
|
||||
// Anonymize data before sending to AI
|
||||
const anonymizedProject = {
|
||||
description: project.description?.slice(0, 500) || 'No description',
|
||||
category: project.competitionCategory,
|
||||
oceanIssue: project.oceanIssue,
|
||||
tags: project.tags,
|
||||
}
|
||||
|
||||
const anonymizedMentors = mentors.map((m, index) => ({
|
||||
index,
|
||||
expertise: m.expertiseTags,
|
||||
availability: m.maxAssignments
|
||||
? `${m.currentAssignments}/${m.maxAssignments}`
|
||||
: 'unlimited',
|
||||
}))
|
||||
|
||||
const prompt = `You are matching mentors to an ocean protection project.
|
||||
|
||||
PROJECT:
|
||||
- Category: ${anonymizedProject.category || 'Not specified'}
|
||||
- Ocean Issue: ${anonymizedProject.oceanIssue || 'Not specified'}
|
||||
- Tags: ${anonymizedProject.tags.join(', ') || 'None'}
|
||||
- Description: ${anonymizedProject.description}
|
||||
|
||||
AVAILABLE MENTORS:
|
||||
${anonymizedMentors.map((m) => `${m.index}: Expertise: [${m.expertise.join(', ')}], Availability: ${m.availability}`).join('\n')}
|
||||
|
||||
Rank the top ${limit} mentors by suitability. For each, provide:
|
||||
1. Mentor index (0-based)
|
||||
2. Confidence score (0-1)
|
||||
3. Expertise match score (0-1)
|
||||
4. Brief reasoning (1-2 sentences)
|
||||
|
||||
Respond in JSON format:
|
||||
{
|
||||
"matches": [
|
||||
{
|
||||
"mentorIndex": 0,
|
||||
"confidenceScore": 0.85,
|
||||
"expertiseMatchScore": 0.9,
|
||||
"reasoning": "Strong expertise alignment..."
|
||||
}
|
||||
]
|
||||
}`
|
||||
|
||||
    const openai = await getOpenAI()
    if (!openai) {
      console.log('[Mentor Matching] OpenAI not configured, using algorithm')
      return getAlgorithmicMatchesBatch(projects, availableMentors, limit)
    }

    const model = await getConfiguredModel()
    console.log(`[Mentor Matching] Using model: ${model} for ${projects.length} projects in batches of ${MENTOR_BATCH_SIZE}`)

    let totalTokens = 0

    // Process in batches
    for (let i = 0; i < projects.length; i += MENTOR_BATCH_SIZE) {
      const batchProjects = projects.slice(i, i + MENTOR_BATCH_SIZE)

      console.log(`[Mentor Matching] Processing batch ${Math.floor(i / MENTOR_BATCH_SIZE) + 1}/${Math.ceil(projects.length / MENTOR_BATCH_SIZE)}`)

      const { results, tokensUsed } = await processMatchingBatch(
        openai,
        model,
        batchProjects,
        availableMentors,
        limit,
        userId
      )

      totalTokens += tokensUsed

      // Merge results
      for (const [projectId, matches] of results) {
        allResults.set(projectId, matches)
      }
    }

    console.log(`[Mentor Matching] Completed. Total tokens: ${totalTokens}`)

    // Fill in any missing projects with algorithmic fallback
    for (const project of projects) {
      if (!allResults.has(project.id) || allResults.get(project.id)?.length === 0) {
        const fallbackMatches = getAlgorithmicMatches(project, availableMentors, limit)
        allResults.set(project.id, fallbackMatches)
      }
    }

    return allResults
  } catch (error) {
    const classified = classifyAIError(error)
    logAIError('MentorMatching', 'getAIMentorSuggestionsBatch', classified)

    // Log failed attempt
    await logAIUsage({
      userId,
      action: 'MENTOR_MATCHING',
      entityType: 'Project',
      model: 'unknown',
      promptTokens: 0,
      completionTokens: 0,
      totalTokens: 0,
      batchSize: projects.length,
      itemsProcessed: 0,
      status: 'ERROR',
      errorMessage: classified.message,
    })

    console.error('[Mentor Matching] AI failed, using algorithm:', classified.message)
    return getAlgorithmicMatchesBatch(projects, availableMentors, limit)
  }
}
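
The slice-based batching loop above is easy to verify in isolation. The sketch below assumes `MENTOR_BATCH_SIZE = 15` (taken from the commit message, not this file) and extracts just the slicing logic:

```typescript
// Standalone sketch of the batching loop; MENTOR_BATCH_SIZE = 15 is an
// assumption taken from the commit message.
const MENTOR_BATCH_SIZE = 15

function toBatches<T>(items: T[], size: number): T[][] {
  const batches: T[][] = []
  for (let i = 0; i < items.length; i += size) {
    // slice() clamps at the array end, so the last batch may be short
    batches.push(items.slice(i, i + size))
  }
  return batches
}

// 32 items → Math.ceil(32 / 15) = 3 batches of sizes 15, 15, 2
const batches = toBatches(Array.from({ length: 32 }, (_, i) => i), MENTOR_BATCH_SIZE)
console.log(batches.map((b) => b.length)) // [ 15, 15, 2 ]
```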

/**
 * Get AI-suggested mentor matches for a single project
 */
export async function getAIMentorSuggestions(
  prisma: PrismaClient,
  projectId: string,
  limit: number = 5,
  userId?: string
): Promise<MentorMatch[]> {
  const results = await getAIMentorSuggestionsBatch(prisma, [projectId], limit, userId)
  return results.get(projectId) || []
}

// ─── Algorithmic Fallback ────────────────────────────────────────────────────

/**
 * Algorithmic fallback for multiple projects
 */
function getAlgorithmicMatchesBatch(
  projects: ProjectInfo[],
  mentors: MentorInfo[],
  limit: number
): Map<string, MentorMatch[]> {
  const results = new Map<string, MentorMatch[]>()

  for (const project of projects) {
    results.set(project.id, getAlgorithmicMatches(project, mentors, limit))
  }

  return results
}

/**
@@ -226,7 +394,6 @@ function getAlgorithmicMatches(
    })

    if (project.description) {
      // Extract key words from description
      const words = project.description.toLowerCase().split(/\s+/)
      words.forEach((word) => {
        if (word.length > 4) projectKeywords.add(word.replace(/[^a-z]/g, ''))
@@ -267,7 +434,7 @@ function getAlgorithmicMatches(
       mentorId: mentor.id,
       confidenceScore: Math.round(confidenceScore * 100) / 100,
       expertiseMatchScore: Math.round(expertiseMatchScore * 100) / 100,
-      reasoning: `Matched ${matchCount} keyword(s) with mentor expertise. Availability: ${availabilityScore > 0.5 ? 'Good' : 'Limited'}.`,
+      reasoning: `Matched ${matchCount} keyword(s). Availability: ${availabilityScore > 0.5 ? 'Good' : 'Limited'}.`,
     }
   })

@@ -0,0 +1,323 @@
/**
 * AI Usage Tracking Utility
 *
 * Logs AI API usage to the database for cost tracking and monitoring.
 * Calculates estimated costs based on model pricing.
 */

import { prisma } from '@/lib/prisma'
import { Decimal } from '@prisma/client/runtime/library'
import type { Prisma } from '@prisma/client'

// ─── Types ───────────────────────────────────────────────────────────────────

export type AIAction =
  | 'ASSIGNMENT'
  | 'FILTERING'
  | 'AWARD_ELIGIBILITY'
  | 'MENTOR_MATCHING'

export type AIStatus = 'SUCCESS' | 'PARTIAL' | 'ERROR'

export interface LogAIUsageInput {
  userId?: string
  action: AIAction
  entityType?: string
  entityId?: string
  model: string
  promptTokens: number
  completionTokens: number
  totalTokens: number
  batchSize?: number
  itemsProcessed?: number
  status: AIStatus
  errorMessage?: string
  detailsJson?: Record<string, unknown>
}

export interface TokenUsageResult {
  promptTokens: number
  completionTokens: number
  totalTokens: number
}

// ─── Model Pricing (per 1M tokens) ───────────────────────────────────────────

interface ModelPricing {
  input: number // $ per 1M input tokens
  output: number // $ per 1M output tokens
}

/**
 * OpenAI model pricing as of 2024/2025
 * Prices in USD per 1 million tokens
 */
const MODEL_PRICING: Record<string, ModelPricing> = {
  // GPT-4o series
  'gpt-4o': { input: 2.5, output: 10.0 },
  'gpt-4o-2024-11-20': { input: 2.5, output: 10.0 },
  'gpt-4o-2024-08-06': { input: 2.5, output: 10.0 },
  'gpt-4o-2024-05-13': { input: 5.0, output: 15.0 },
  'gpt-4o-mini': { input: 0.15, output: 0.6 },
  'gpt-4o-mini-2024-07-18': { input: 0.15, output: 0.6 },

  // GPT-4 Turbo series
  'gpt-4-turbo': { input: 10.0, output: 30.0 },
  'gpt-4-turbo-2024-04-09': { input: 10.0, output: 30.0 },
  'gpt-4-turbo-preview': { input: 10.0, output: 30.0 },
  'gpt-4-1106-preview': { input: 10.0, output: 30.0 },
  'gpt-4-0125-preview': { input: 10.0, output: 30.0 },

  // GPT-4 (base)
  'gpt-4': { input: 30.0, output: 60.0 },
  'gpt-4-0613': { input: 30.0, output: 60.0 },
  'gpt-4-32k': { input: 60.0, output: 120.0 },
  'gpt-4-32k-0613': { input: 60.0, output: 120.0 },

  // GPT-3.5 Turbo series
  'gpt-3.5-turbo': { input: 0.5, output: 1.5 },
  'gpt-3.5-turbo-0125': { input: 0.5, output: 1.5 },
  'gpt-3.5-turbo-1106': { input: 1.0, output: 2.0 },
  'gpt-3.5-turbo-16k': { input: 3.0, output: 4.0 },

  // o1 reasoning models
  'o1': { input: 15.0, output: 60.0 },
  'o1-2024-12-17': { input: 15.0, output: 60.0 },
  'o1-preview': { input: 15.0, output: 60.0 },
  'o1-preview-2024-09-12': { input: 15.0, output: 60.0 },
  'o1-mini': { input: 3.0, output: 12.0 },
  'o1-mini-2024-09-12': { input: 3.0, output: 12.0 },

  // o3 reasoning models
  'o3-mini': { input: 1.1, output: 4.4 },
  'o3-mini-2025-01-31': { input: 1.1, output: 4.4 },

  // o4 reasoning models (future-proofing)
  'o4-mini': { input: 1.1, output: 4.4 },
}

// Default pricing for unknown models (conservative estimate)
const DEFAULT_PRICING: ModelPricing = { input: 5.0, output: 15.0 }

// ─── Cost Calculation ────────────────────────────────────────────────────────

/**
 * Get pricing for a model, with fallback for unknown models
 */
function getModelPricing(model: string): ModelPricing {
  // Exact match
  if (MODEL_PRICING[model]) {
    return MODEL_PRICING[model]
  }

  // Try to match by prefix, longest key first, so an unknown
  // 'gpt-4o-mini-…' variant matches 'gpt-4o-mini' rather than 'gpt-4o'
  const modelLower = model.toLowerCase()
  const keysByLength = Object.keys(MODEL_PRICING).sort((a, b) => b.length - a.length)
  for (const key of keysByLength) {
    if (modelLower.startsWith(key.toLowerCase())) {
      return MODEL_PRICING[key]
    }
  }

  // Fallback based on model family
  if (modelLower.startsWith('gpt-4o-mini')) {
    return MODEL_PRICING['gpt-4o-mini']
  }
  if (modelLower.startsWith('gpt-4o')) {
    return MODEL_PRICING['gpt-4o']
  }
  if (modelLower.startsWith('gpt-4')) {
    return MODEL_PRICING['gpt-4-turbo']
  }
  if (modelLower.startsWith('gpt-3.5')) {
    return MODEL_PRICING['gpt-3.5-turbo']
  }
  if (modelLower.startsWith('o1-mini')) {
    return MODEL_PRICING['o1-mini']
  }
  if (modelLower.startsWith('o1')) {
    return MODEL_PRICING['o1']
  }
  if (modelLower.startsWith('o3-mini')) {
    return MODEL_PRICING['o3-mini']
  }
  if (modelLower.startsWith('o3')) {
    return MODEL_PRICING['o3-mini'] // Conservative estimate
  }
  if (modelLower.startsWith('o4')) {
    return MODEL_PRICING['o4-mini'] || DEFAULT_PRICING
  }

  return DEFAULT_PRICING
}
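
The longest-prefix idea can be shown with a tiny, hypothetical two-entry table (input prices only; the table and function name below are illustrative, not part of this file):

```typescript
// Hypothetical two-entry table: input price per 1M tokens only.
const TABLE: Record<string, number> = { 'gpt-4o': 2.5, 'gpt-4o-mini': 0.15 }

function inputPrice(model: string): number {
  const match = Object.keys(TABLE)
    .filter((k) => model.toLowerCase().startsWith(k))
    .sort((a, b) => b.length - a.length)[0] // longest matching prefix wins
  return match !== undefined ? TABLE[match] : 5.0 // conservative default
}

console.log(inputPrice('gpt-4o-mini-2025-06-01')) // 0.15, not 2.5
console.log(inputPrice('some-unknown-model')) // 5
```

Without the length sort, iteration order would decide the winner, and a `gpt-4o-mini-…` variant could silently be priced as `gpt-4o`.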

/**
 * Calculate estimated cost in USD for a given model and token usage
 */
export function calculateCost(
  model: string,
  promptTokens: number,
  completionTokens: number
): number {
  const pricing = getModelPricing(model)

  const inputCost = (promptTokens / 1_000_000) * pricing.input
  const outputCost = (completionTokens / 1_000_000) * pricing.output

  return inputCost + outputCost
}

/**
 * Format cost for display
 */
export function formatCost(costUsd: number): string {
  if (costUsd < 0.01) {
    // Sub-cent amounts read better in cents; no dollar sign alongside ¢
    return `${(costUsd * 100).toFixed(3)}¢`
  }
  return `$${costUsd.toFixed(4)}`
}
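
The per-million-token arithmetic in `calculateCost` is easy to check by hand. A standalone sketch with the gpt-4o prices from the table above hard-coded:

```typescript
// Standalone version of the cost formula, gpt-4o prices hard-coded
// ($ per 1M tokens, from the MODEL_PRICING table above).
const GPT_4O = { input: 2.5, output: 10.0 }

function estimateCostUsd(promptTokens: number, completionTokens: number): number {
  return (
    (promptTokens / 1_000_000) * GPT_4O.input +
    (completionTokens / 1_000_000) * GPT_4O.output
  )
}

// 10,000 prompt + 2,000 completion tokens:
// 0.01 * 2.5 + 0.002 * 10.0 = 0.025 + 0.02 = 0.045 USD
console.log(estimateCostUsd(10_000, 2_000))
```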

// ─── Logging ─────────────────────────────────────────────────────────────────

/**
 * Log AI usage to the database
 */
export async function logAIUsage(input: LogAIUsageInput): Promise<void> {
  try {
    const estimatedCost = calculateCost(
      input.model,
      input.promptTokens,
      input.completionTokens
    )

    await prisma.aIUsageLog.create({
      data: {
        userId: input.userId,
        action: input.action,
        entityType: input.entityType,
        entityId: input.entityId,
        model: input.model,
        promptTokens: input.promptTokens,
        completionTokens: input.completionTokens,
        totalTokens: input.totalTokens,
        estimatedCostUsd: new Decimal(estimatedCost),
        batchSize: input.batchSize,
        itemsProcessed: input.itemsProcessed,
        status: input.status,
        errorMessage: input.errorMessage,
        detailsJson: input.detailsJson as Prisma.InputJsonValue | undefined,
      },
    })
  } catch (error) {
    // Don't let logging failures break the main operation
    console.error('[AI Usage] Failed to log usage:', error)
  }
}

/**
 * Extract token usage from an OpenAI API response
 */
export function extractTokenUsage(
  response: { usage?: { prompt_tokens?: number; completion_tokens?: number; total_tokens?: number } }
): TokenUsageResult {
  return {
    promptTokens: response.usage?.prompt_tokens ?? 0,
    completionTokens: response.usage?.completion_tokens ?? 0,
    totalTokens: response.usage?.total_tokens ?? 0,
  }
}

// ─── Statistics ──────────────────────────────────────────────────────────────

export interface AIUsageStats {
  totalTokens: number
  totalCost: number
  byAction: Record<string, { tokens: number; cost: number; count: number }>
  byModel: Record<string, { tokens: number; cost: number; count: number }>
}

/**
 * Get AI usage statistics for a date range
 */
export async function getAIUsageStats(
  startDate?: Date,
  endDate?: Date
): Promise<AIUsageStats> {
  const where: { createdAt?: { gte?: Date; lte?: Date } } = {}

  if (startDate || endDate) {
    where.createdAt = {}
    if (startDate) where.createdAt.gte = startDate
    if (endDate) where.createdAt.lte = endDate
  }

  const logs = await prisma.aIUsageLog.findMany({
    where,
    select: {
      action: true,
      model: true,
      totalTokens: true,
      estimatedCostUsd: true,
    },
  })

  const stats: AIUsageStats = {
    totalTokens: 0,
    totalCost: 0,
    byAction: {},
    byModel: {},
  }

  for (const log of logs) {
    const cost = log.estimatedCostUsd?.toNumber() ?? 0

    stats.totalTokens += log.totalTokens
    stats.totalCost += cost

    // By action
    if (!stats.byAction[log.action]) {
      stats.byAction[log.action] = { tokens: 0, cost: 0, count: 0 }
    }
    stats.byAction[log.action].tokens += log.totalTokens
    stats.byAction[log.action].cost += cost
    stats.byAction[log.action].count += 1

    // By model
    if (!stats.byModel[log.model]) {
      stats.byModel[log.model] = { tokens: 0, cost: 0, count: 0 }
    }
    stats.byModel[log.model].tokens += log.totalTokens
    stats.byModel[log.model].cost += cost
    stats.byModel[log.model].count += 1
  }

  return stats
}
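
The grouping logic in `getAIUsageStats` does not depend on Prisma; a minimal sketch over in-memory rows (the `Row` shape and `aggregate` name below are illustrative assumptions):

```typescript
// Prisma-free sketch of the per-action grouping above.
interface Row { action: string; model: string; totalTokens: number; cost: number }

function aggregate(rows: Row[]) {
  const byAction: Record<string, { tokens: number; cost: number; count: number }> = {}
  let totalTokens = 0
  let totalCost = 0
  for (const r of rows) {
    totalTokens += r.totalTokens
    totalCost += r.cost
    // create the bucket on first sight of an action, then accumulate
    const bucket = (byAction[r.action] ??= { tokens: 0, cost: 0, count: 0 })
    bucket.tokens += r.totalTokens
    bucket.cost += r.cost
    bucket.count += 1
  }
  return { totalTokens, totalCost, byAction }
}

const demo = aggregate([
  { action: 'FILTERING', model: 'gpt-4o', totalTokens: 1200, cost: 0.01 },
  { action: 'FILTERING', model: 'gpt-4o', totalTokens: 800, cost: 0.006 },
  { action: 'ASSIGNMENT', model: 'gpt-4o-mini', totalTokens: 500, cost: 0.001 },
])
console.log(demo.totalTokens, demo.byAction['FILTERING'].count) // 2500 2
```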

/**
 * Get the current month's AI usage cost
 */
export async function getCurrentMonthCost(): Promise<{
  cost: number
  tokens: number
  requestCount: number
}> {
  const startOfMonth = new Date()
  startOfMonth.setDate(1)
  startOfMonth.setHours(0, 0, 0, 0)

  const logs = await prisma.aIUsageLog.findMany({
    where: {
      createdAt: { gte: startOfMonth },
    },
    select: {
      totalTokens: true,
      estimatedCostUsd: true,
    },
  })

  return {
    cost: logs.reduce((sum, log) => sum + (log.estimatedCostUsd?.toNumber() ?? 0), 0),
    tokens: logs.reduce((sum, log) => sum + log.totalTokens, 0),
    requestCount: logs.length,
  }
}
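
The "start of current month" boundary used by `getCurrentMonthCost` can be sketched as a pure function; note that, like the original `setDate`/`setHours` calls, this uses server-local time rather than UTC:

```typescript
// Sketch of the month boundary; mutates a copy, never the input.
// Uses server-local time, matching the original code's behavior.
function startOfMonth(now: Date): Date {
  const d = new Date(now)
  d.setDate(1)
  d.setHours(0, 0, 0, 0)
  return d
}

const s = startOfMonth(new Date(2025, 2, 15, 10, 30)) // 15 Mar 2025, 10:30 local
console.log(s.getFullYear(), s.getMonth() + 1, s.getDate(), s.getHours()) // 2025 3 1 0
```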