MOPC-App/ai-data-processing.md at 928b1c65dce5c907be42ece82a057831c105e02f


Overview

This document describes how project data is processed by AI services in the MOPC Platform and the safeguards applied to comply with GDPR Articles 5, 6, 13-14, 25, and 32.

Legal Basis

Processing Activity	Legal Basis	GDPR Article
AI-powered project filtering	Legitimate interest	Art. 6(1)(f)
AI-powered jury assignment	Legitimate interest	Art. 6(1)(f)
AI-powered award eligibility	Legitimate interest	Art. 6(1)(f)
AI-powered mentor matching	Legitimate interest	Art. 6(1)(f)

Legitimate Interest Justification: AI processing is used to efficiently evaluate ocean conservation projects and match appropriate reviewers, directly serving the platform's purpose of managing the Monaco Ocean Protection Challenge.

Data Minimization (Article 5(1)(c))

The AI system applies strict data minimization:

Only necessary fields sent to AI (no names, emails, phone numbers)
Descriptions truncated to 300-500 characters maximum
Team size sent as count only (no member details)
Dates sent as year-only or ISO date (no timestamps)
IDs replaced with sequential anonymous identifiers (P1, P2, etc.)
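The rules above can be sketched as a single mapping step. This is a minimal illustration, not the platform's actual code: the `ProjectRecord` shape, the `minimizeProjects` name, and the choice of 500 as the truncation limit are assumptions made for the example.

```typescript
// Hypothetical input shape; the platform's real project model may differ.
interface ProjectRecord {
  id: string
  title: string
  description: string
  teamMembers: { name: string; email: string }[]
  submittedAt: Date
}

// Only the fields that leave the platform.
interface MinimizedProject {
  project_id: string
  title: string
  description: string
  team_size: number
  submitted_date: string
}

// Assumed limit: the upper bound of the 300-500 character range.
const MAX_DESCRIPTION_LENGTH = 500

function minimizeProjects(projects: ProjectRecord[]): MinimizedProject[] {
  return projects.map((project, index) => ({
    // Sequential alias instead of the internal ID
    project_id: `P${index + 1}`,
    // Title and description would still pass through PII stripping afterwards
    title: project.title,
    description: project.description.slice(0, MAX_DESCRIPTION_LENGTH),
    // Count only, no member details
    team_size: project.teamMembers.length,
    // Date only (UTC), no time component
    submitted_date: project.submittedAt.toISOString().slice(0, 10),
  }))
}
```

Building a new object from an explicit list of fields, rather than deleting fields from the original record, means a field added to the model later is not sent by accident.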

Anonymization Measures

Data NEVER Sent to AI

Data Type	Reason
Personal names	PII - identifying
Email addresses	PII - identifying
Phone numbers	PII - identifying
Physical addresses	PII - identifying
External URLs	Could identify individuals
Internal project/user IDs	Could be cross-referenced
Team member details	PII - identifying
Internal comments	May contain PII
File content	May contain PII

Data Sent to AI (Anonymized)

Field	Type	Purpose	Anonymization
project_id	String	Reference	Replaced with P1, P2, etc.
title	String	Spam detection	PII patterns removed
description	String	Criteria matching	Truncated, PII stripped
category	Enum	Filtering	As-is (no PII)
ocean_issue	Enum	Topic filtering	As-is (no PII)
country	String	Geographic eligibility	As-is (country name only)
region	String	Regional eligibility	As-is (zone name only)
institution	String	Student identification	As-is (institution name only)
tags	Array	Keyword matching	As-is (no PII expected)
founded_year	Number	Age filtering	Year only, not full date
team_size	Number	Team requirements	Count only
file_count	Number	Document checks	Count only
file_types	Array	File requirements	Type names only
wants_mentorship	Boolean	Mentorship filtering	As-is
submission_source	Enum	Source filtering	As-is
submitted_date	String	Deadline checks	Date only, no time

Technical Safeguards

PII Detection and Stripping

// Patterns detected and removed before AI processing
const PII_PATTERNS = {
  email: /[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/g,
  phone: /(\+?\d{1,3}[-.\s]?)?\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}/g,
  url: /https?:\/\/[^\s]+/g,
  ssn: /\d{3}-\d{2}-\d{4}/g,
  ipv4: /\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b/g,
}
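As an illustration of how these patterns might be applied, the sketch below replaces every match with a typed placeholder. The `stripPII` helper and the placeholder format are hypothetical; the patterns are copied unchanged from the block above.

```typescript
// Patterns copied from PII_PATTERNS above.
const PII_PATTERNS: Record<string, RegExp> = {
  email: /[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/g,
  phone: /(\+?\d{1,3}[-.\s]?)?\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}/g,
  url: /https?:\/\/[^\s]+/g,
  ssn: /\d{3}-\d{2}-\d{4}/g,
  ipv4: /\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b/g,
}

// Hypothetical helper: replaces each detected pattern with a placeholder
// such as "[email_removed]" so the surrounding text stays readable.
function stripPII(text: string): string {
  let result = text
  for (const [kind, pattern] of Object.entries(PII_PATTERNS)) {
    // String.prototype.replace with a global regex replaces all matches
    result = result.replace(pattern, `[${kind}_removed]`)
  }
  return result
}
```

Regex-based stripping only catches the listed formats. Free-text names, for example, would not be detected by these patterns and would need a separate safeguard.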

Validation Before Every AI Call

// GDPR compliance enforced before EVERY API call
export function enforceGDPRCompliance(data: unknown[]): void {
  for (const item of data) {
    const { valid, violations } = validateNoPersonalData(item)
    if (!valid) {
      throw new Error(`GDPR compliance check failed: ${violations.join(', ')}`)
    }
  }
}
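The `validateNoPersonalData` function is referenced above but not shown. One possible shape is sketched below, purely as an assumption about how such a check could work: the `FORBIDDEN_KEYS` list, the `DETECTORS` subset, and the violation messages are all hypothetical, and only top-level fields are inspected.

```typescript
interface ValidationResult {
  valid: boolean
  violations: string[]
}

// Assumed list of field names that must never appear in an AI payload.
const FORBIDDEN_KEYS = ['name', 'email', 'phone', 'address', 'userId']

// Subset of the PII patterns, written without the `g` flag so that
// RegExp.prototype.test() keeps no state between calls.
const DETECTORS: Record<string, RegExp> = {
  email: /[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/,
  url: /https?:\/\/[^\s]+/,
}

// Hypothetical implementation: checks top-level keys and string values only.
function validateNoPersonalData(item: unknown): ValidationResult {
  const violations: string[] = []
  if (typeof item !== 'object' || item === null) {
    return { valid: true, violations }
  }
  for (const [key, value] of Object.entries(item as Record<string, unknown>)) {
    if (FORBIDDEN_KEYS.includes(key)) {
      violations.push(`forbidden field "${key}"`)
    }
    if (typeof value === 'string') {
      for (const [kind, pattern] of Object.entries(DETECTORS)) {
        if (pattern.test(value)) {
          violations.push(`${kind} found in "${key}"`)
        }
      }
    }
  }
  return { valid: violations.length === 0, violations }
}
```

Reusing the global (`/g`) patterns with `test()` would make results depend on the previous call, because a global regex remembers `lastIndex`. That is why this sketch declares separate non-global detectors.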

ID Anonymization

Real IDs are never sent to AI. Instead:

Projects: cm1abc123... → P1, P2, P3
Jurors: cm2def456... → juror_001, juror_002
Results mapped back using secure mapping tables
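A minimal sketch of such a mapping is shown below. The `buildIdMapping` function and the use of in-memory `Map` objects are assumptions for illustration; the document does not specify how the mapping tables are stored or secured.

```typescript
interface IdMapping {
  toAlias: Map<string, string> // real ID -> anonymous alias
  toReal: Map<string, string>  // anonymous alias -> real ID
}

// Hypothetical helper: assigns sequential aliases in input order.
// Duplicate real IDs reuse the alias assigned on first sight.
function buildIdMapping(realIds: string[], makeAlias: (n: number) => string): IdMapping {
  const toAlias = new Map<string, string>()
  const toReal = new Map<string, string>()
  for (const realId of realIds) {
    if (toAlias.has(realId)) continue
    const alias = makeAlias(toAlias.size + 1)
    toAlias.set(realId, alias)
    toReal.set(alias, realId)
  }
  return { toAlias, toReal }
}

// Alias formats matching the examples above
const projectAlias = (n: number): string => `P${n}`
const jurorAlias = (n: number): string => `juror_${String(n).padStart(3, '0')}`
```

After the AI response is parsed, each alias in the result would be looked up in `toReal` to recover the internal ID. The mapping itself never leaves the platform.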

Data Retention

Data Type	Retention	Deletion Method
AI usage logs	12 months	Automatic deletion
Anonymized prompts	Not stored	Sent directly to API
AI responses	Not stored	Parsed and discarded

Note: OpenAI does not use API data for model training (per their API terms). By default, API data may be retained for up to 30 days for abuse monitoring; this can be reduced to 0 days where zero data retention has been arranged with OpenAI.
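The 12-month retention of AI usage logs could be enforced by a scheduled cleanup job. The sketch below shows only the cutoff logic; the `retentionCutoff` and `selectExpired` names, and the presence of a `createdAt` timestamp on each log, are assumptions.

```typescript
// Returns the instant before which logs are considered expired.
// Note: setUTCMonth rolls over on month-end dates (e.g. 29 Feb becomes 1 Mar
// in a non-leap year), which is acceptable for a retention window.
function retentionCutoff(now: Date, months: number = 12): Date {
  const cutoff = new Date(now.getTime())
  cutoff.setUTCMonth(cutoff.getUTCMonth() - months)
  return cutoff
}

// Hypothetical helper: picks the logs older than the retention window.
function selectExpired<T extends { createdAt: Date }>(logs: T[], now: Date): T[] {
  const cutoff = retentionCutoff(now)
  return logs.filter((log) => log.createdAt.getTime() < cutoff.getTime())
}

// With Prisma, the same cutoff might be passed to a bulk delete, assuming
// the AIUsageLog model has a createdAt column:
//   await prisma.aIUsageLog.deleteMany({
//     where: { createdAt: { lt: retentionCutoff(new Date()) } },
//   })
```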

Subprocessor: OpenAI

Aspect	Details
Subprocessor	OpenAI, Inc.
Location	United States
DPA Status	Data Processing Agreement in place
Safeguards	Standard Contractual Clauses (SCCs)
Compliance	SOC 2 Type II, GDPR-compliant
Data Use	API data NOT used for model training

OpenAI DPA: https://openai.com/policies/data-processing-agreement

Audit Trail

All AI processing is logged:

await prisma.aIUsageLog.create({
  data: {
    userId: ctx.user.id,      // Who initiated
    action: 'FILTERING',       // What type
    entityType: 'Round',       // What entity
    entityId: roundId,         // Which entity
    model: 'gpt-4o',          // What model
    totalTokens: 1500,        // Resource usage
    status: 'SUCCESS',        // Outcome
  },
})

Data Subject Rights

Right of Access (Article 15)

Users can request:

What data was processed by AI
When AI processing occurred
What decisions were made

Implementation: Export AI usage logs for user's projects.
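One way such an export might be assembled is sketched below. It assumes that logs are recorded per round (as in the audit-trail example), that the caller already knows which rounds contain the user's projects, and that each log carries a `createdAt` timestamp. The `buildAccessExport` name and the row format are hypothetical.

```typescript
// Hypothetical log shape; createdAt is an assumed column.
interface AiUsageLogEntry {
  action: string
  entityType: string
  entityId: string
  model: string
  status: string
  createdAt: Date
}

// One row per AI processing event, in a form suitable for a data subject.
interface AccessExportRow {
  action: string      // what kind of AI processing took place
  roundId: string     // which round it applied to
  processedAt: string // when it occurred (ISO 8601)
  model: string
  outcome: string
}

function buildAccessExport(logs: AiUsageLogEntry[], roundIds: Set<string>): AccessExportRow[] {
  return logs
    .filter((log) => log.entityType === 'Round' && roundIds.has(log.entityId))
    .sort((a, b) => a.createdAt.getTime() - b.createdAt.getTime())
    .map((log) => ({
      action: log.action,
      roundId: log.entityId,
      processedAt: log.createdAt.toISOString(),
      model: log.model,
      outcome: log.status,
    }))
}
```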

Right to Erasure (Article 17)

When a user requests deletion:

AI usage logs for their projects can be deleted
No data remains at OpenAI beyond its abuse-monitoring retention window (up to 30 days), and API data is not used for training

Note: Since only anonymized data is sent to AI, there is no personal data at OpenAI to delete.

Right to Object (Article 21)

Users can request to opt out of AI processing:

Admin can disable AI features per round
Manual review fallback available for all AI features
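The fallback could be expressed as a routing step that decides, per project, whether AI processing is allowed. This is a sketch under assumptions: the source describes a per-round switch, while the per-project `aiOptOut` flag and the `routeProjects` function are hypothetical additions for illustration.

```typescript
// Hypothetical per-project flag recording an objection to AI processing.
interface ProjectForReview {
  alias: string
  aiOptOut: boolean
}

interface ReviewQueues {
  ai: string[]     // projects that may be sent to AI features
  manual: string[] // projects handled by manual review only
}

function routeProjects(projects: ProjectForReview[], roundAiEnabled: boolean): ReviewQueues {
  const queues: ReviewQueues = { ai: [], manual: [] }
  for (const project of projects) {
    // AI is used only if the round allows it and the project has not opted out
    if (roundAiEnabled && !project.aiOptOut) {
      queues.ai.push(project.alias)
    } else {
      queues.manual.push(project.alias)
    }
  }
  return queues
}
```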

Risk Assessment

Risk: PII Leakage to AI Provider

Factor	Assessment
Likelihood	Very Low
Impact	Medium
Mitigation	Automated PII detection, validation before every call
Residual Risk	Very Low

Risk: AI Decision Bias

Factor	Assessment
Likelihood	Low
Impact	Low
Mitigation	Human review of all AI suggestions, algorithmic fallback
Residual Risk	Very Low

Risk: Data Breach at Subprocessor

Factor	Assessment
Likelihood	Very Low
Impact	Low (only anonymized data)
Mitigation	OpenAI SOC 2 compliance, no PII sent
Residual Risk	Very Low

Compliance Checklist

Data minimization applied (only necessary fields)
PII stripped before AI processing
Anonymization validated before every API call
DPA in place with OpenAI
Audit logging of all AI operations
Manual fallback available when AI processing is declined or disabled
Usage logs retained for 12 months only
No personal data stored at subprocessor

Contact

For questions about AI data processing:

Data Protection Officer: [DPO email]
Technical Contact: [Tech contact email]


AI Data Processing - GDPR Compliance Documentation