AI Classification

Automatically analyse uploaded images using Cloudflare Workers AI to generate labels and captions. Dominant-colour extraction is currently disabled (see Limitations). The results power semantic search and make your image library far more discoverable without any manual tagging.

Setup

Enable AI classification by setting the following environment variable:

bash

ENABLE_AI_CLASSIFICATION='true'

That's it. Images will now be automatically analysed after upload via the variant generation queue.

Optional: NSFW Detection

NSFW detection is not implemented out of the box, because Cloudflare Workers AI does not yet offer a native NSFW model. You can still set the environment variable now so that it is already in place when support arrives:

bash

ENABLE_AI_NSFW_DETECTION='true'

For now, you'll need to integrate an external service like Sightengine or Replicate if you want NSFW detection. See the configuration guide in the source code for implementation details.

Advanced Configuration

Fine-tune AI behaviour with these optional variables:

bash

# Downscale images before AI processing (saves costs)
AI_IMAGE_DOWNSCALE_WIDTH='384'  # Default: 384px

# Maximum labels to store per image
AI_MAX_LABELS='10'  # Default: 10

# Minimum confidence score (0.0-1.0)
AI_MIN_CLASSIFICATION_SCORE='0.1'  # Default: 0.1

# Processing timeout (milliseconds)
AI_PROCESSING_TIMEOUT_MS='30000'  # Default: 30s

# Queue batch size
AI_BATCH_SIZE='5'  # Default: 5 images per batch
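As a rough illustration of how these variables and their defaults might be read, here is a minimal sketch. The names `AiConfig`, `loadAiConfig`, and `numberOr` are hypothetical and not taken from the project source; only the variable names and default values come from the list above.

```typescript
// Hypothetical config shape; field names are illustrative only.
interface AiConfig {
  enabled: boolean;
  downscaleWidth: number;
  maxLabels: number;
  minScore: number;
  timeoutMs: number;
  batchSize: number;
}

type Env = Record<string, string | undefined>;

// Parse a numeric variable, falling back to the documented default
// when the value is missing, empty, or not a finite number.
function numberOr(value: string | undefined, fallback: number): number {
  if (value === undefined || value.trim() === "") return fallback;
  const parsed = Number(value);
  return Number.isFinite(parsed) ? parsed : fallback;
}

function loadAiConfig(env: Env): AiConfig {
  return {
    enabled: env.ENABLE_AI_CLASSIFICATION === "true",
    downscaleWidth: numberOr(env.AI_IMAGE_DOWNSCALE_WIDTH, 384),
    maxLabels: numberOr(env.AI_MAX_LABELS, 10),
    minScore: numberOr(env.AI_MIN_CLASSIFICATION_SCORE, 0.1),
    timeoutMs: numberOr(env.AI_PROCESSING_TIMEOUT_MS, 30000),
    batchSize: numberOr(env.AI_BATCH_SIZE, 5),
  };
}
```

Falling back to the default on an unparsable value is a design choice made for this sketch; the real service may instead reject invalid configuration.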

How It Works

When an image finishes uploading, the variant generation queue automatically triggers AI analysis:

mermaid

sequenceDiagram
    participant Upload as Upload Complete
    participant Queue as Variant Queue
    participant AI as Workers AI
    participant D1 as D1 Database

    Upload->>Queue: Image uploaded
    Queue->>Queue: Fetch original from R2
    Queue->>Queue: Downscale to 384px
    Queue->>AI: Classify (ResNet-50)
    AI-->>Queue: Labels + scores
    Queue->>AI: Generate caption (UForm Gen2)
    AI-->>Queue: Text description
    Queue->>D1: Store in image_ai table

    Note over Queue,D1: Results available via API

Models Used

- Classification: @cf/facebook/detr-resnet-50 (DETR object detection with a ResNet-50 backbone) - Identifies objects and scenes in images
- Captioning: @cf/unum/uform-gen2-qwen-500m - Generates natural language descriptions
- Embeddings: @cf/baai/bge-base-en-v1.5 - Creates 768-dimensional vectors for semantic search (if Vectorize is enabled)
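A minimal sketch of how the classification and captioning models might be invoked through a Workers AI binding. The `AiBinding` interface is a deliberately reduced stand-in for the real binding type, and the function name `analyseImage`, the prompt text, and the response shapes (`label`/`score` entries, a `description` field) are assumptions to be checked against the Workers AI model documentation.

```typescript
// Reduced stand-in for the Workers AI binding; the real type is richer.
interface AiBinding {
  run(model: string, inputs: Record<string, unknown>): Promise<unknown>;
}

// Assumed shape of one classification result entry.
interface Detection {
  label: string;
  score: number;
}

async function analyseImage(ai: AiBinding, imageBytes: Uint8Array) {
  // Assumption: both models accept the image as an array of byte values.
  const image = Array.from(imageBytes);

  const detections = (await ai.run("@cf/facebook/detr-resnet-50", {
    image,
  })) as Detection[];

  // Assumption: the captioning model returns an object with a description field.
  const captionResult = (await ai.run("@cf/unum/uform-gen2-qwen-500m", {
    image,
    prompt: "Describe this image in one sentence.",
  })) as { description?: string };

  return { detections, caption: captionResult.description ?? "" };
}
```

Passing the binding in as a parameter keeps the function easy to exercise with a fake in tests, without calling Workers AI.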

Processing Pipeline

1. The image is downscaled to 384px width to reduce processing costs.
2. The ResNet-50 model extracts classification labels (e.g., "cat", "outdoor", "tree").
3. The UForm model generates a natural language caption (e.g., "A ginger cat sitting on a wooden fence").
4. Labels are filtered by confidence score (default minimum: 0.1) and limited to the top 10.
5. Results are stored in the image_ai table with metadata:
   - labels_json - Array of label strings
   - label_scores_json - Corresponding confidence scores
   - caption - Generated description
   - model_version - Which models were used
   - processing_time_ms - Performance tracking
   - analyzed_at - Timestamp (used for re-analysis cooldown)
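The label filtering and serialisation described above can be sketched as follows. The function name `selectLabels` and the `ScoredLabel` type are hypothetical, and treating the minimum score as inclusive is an assumption; only the defaults (0.1 and 10) and the column names come from this page.

```typescript
interface ScoredLabel {
  label: string;
  score: number;
}

// Keep labels at or above the minimum score, highest score first,
// capped at maxLabels, then serialise them for the two JSON columns.
function selectLabels(candidates: ScoredLabel[], minScore = 0.1, maxLabels = 10) {
  const kept = candidates
    .filter((c) => c.score >= minScore) // filter() copies, so the input is not mutated by sort()
    .sort((a, b) => b.score - a.score)
    .slice(0, maxLabels);

  return {
    labels_json: JSON.stringify(kept.map((c) => c.label)),
    label_scores_json: JSON.stringify(kept.map((c) => c.score)),
  };
}
```

Storing labels and scores as two parallel arrays means index `i` in `labels_json` corresponds to index `i` in `label_scores_json`, so both must be produced from the same filtered list.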

Search Integration

Classification results power the search functionality:

- Full-text search matches queries against labels and captions
- Semantic search (if Vectorize is enabled) uses caption embeddings for similarity matching
- Hybrid search combines both approaches for better results
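This page does not say how hybrid search merges the two result lists. One common technique is reciprocal rank fusion, sketched here purely as an illustration; it is not necessarily what the project does. The function name `fuseRankings` and the constant `k = 60` are assumptions.

```typescript
// Reciprocal rank fusion: each list contributes 1 / (k + rank) per item,
// where rank starts at 1. Items ranked well in several lists rise to the top.
function fuseRankings(rankings: string[][], k = 60): string[] {
  const scores = new Map<string, number>();

  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const contribution = 1 / (k + index + 1);
      scores.set(id, (scores.get(id) ?? 0) + contribution);
    });
  }

  // Sort image IDs by fused score, highest first.
  return Array.from(scores.entries())
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}
```

For example, `fuseRankings([fullTextIds, semanticIds])` would favour images that appear near the top of both the full-text and the semantic results.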

Re-analysis Cooldown

To prevent excessive AI costs, images can only be re-analysed once every 28 days. This cooldown period is configurable in the source code (AI_PROCESSING.REANALYSIS_COOLDOWN_DAYS).
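The cooldown check amounts to comparing the stored analyzed_at timestamp with the current time. A minimal sketch, assuming a hypothetical helper named `canReanalyse` and treating a never-analysed image as always eligible:

```typescript
const REANALYSIS_COOLDOWN_DAYS = 28; // mirrors the documented default
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Returns true when the image has never been analysed, or when at least
// cooldownDays have elapsed since the last analysis.
function canReanalyse(
  analyzedAt: Date | null,
  now: Date,
  cooldownDays: number = REANALYSIS_COOLDOWN_DAYS,
): boolean {
  if (analyzedAt === null) return true;
  const elapsedMs = now.getTime() - analyzedAt.getTime();
  return elapsedMs >= cooldownDays * MS_PER_DAY;
}
```

Taking `now` as a parameter rather than calling `new Date()` inside the function keeps the check deterministic and easy to test.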

Cost Optimisation

AI classification uses Cloudflare Workers AI, which bills per request:

Classification: ~$0.011 per 1,000 images
Captioning: ~$0.006 per 1,000 images
Embeddings: ~$0.004 per 1,000 images

Total: Roughly $0.021 per 1,000 images (~2.1¢ per 1,000 images, or about 0.002¢ per image)
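To estimate spend for a given library size, the per-1,000 figures above can be combined as in this sketch. The function and option names are hypothetical, and the rates are the approximate figures quoted on this page rather than authoritative pricing.

```typescript
// Approximate USD cost per 1,000 images, as quoted above.
const COST_PER_1000_IMAGES = {
  classification: 0.011,
  captioning: 0.006,
  embeddings: 0.004,
};

interface CostOptions {
  captioning?: boolean; // defaults to true
  embeddings?: boolean; // defaults to true
}

// Estimate total cost in USD; classification always runs,
// captioning and embeddings can be switched off.
function estimateCostUsd(imageCount: number, options: CostOptions = {}): number {
  const { captioning = true, embeddings = true } = options;
  let perThousand = COST_PER_1000_IMAGES.classification;
  if (captioning) perThousand += COST_PER_1000_IMAGES.captioning;
  if (embeddings) perThousand += COST_PER_1000_IMAGES.embeddings;
  return (imageCount / 1000) * perThousand;
}
```

With all three steps enabled, `estimateCostUsd(50000)` comes to roughly $1.05; with captioning and embeddings disabled it drops to roughly $0.55.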

To reduce costs:

- Downscale images - Smaller inputs are processed faster. The default of 384px provides good accuracy.
- Increase AI_MIN_CLASSIFICATION_SCORE - Fewer labels are stored, giving a smaller database, though search recall may suffer.
- Disable captioning - Edit packages/api/src/services/ai-classification.ts and skip the runCaptioning() call if you don't need it.
- Disable semantic search - Set ENABLE_VECTORIZE='false' to skip embedding generation entirely.

Limitations

- Colour extraction is currently disabled (it requires an image decoder library that is not available in Workers)
- NSFW detection requires external integration (Cloudflare doesn't provide a native model yet)
- Local development doesn't support Vectorize embeddings (remote-only)
- Re-analysis has a 28-day cooldown to limit AI costs

Disabling

To disable AI classification entirely:

bash

ENABLE_AI_CLASSIFICATION='false'

Images will upload normally but won't be analysed. Existing AI data remains in the database.
