AI Classification
Automatically analyse uploaded images using Cloudflare Workers AI to generate labels, captions, and dominant colours. This powers semantic search and makes your image library far more discoverable without any manual tagging.
Setup
Enable AI classification by setting the following environment variable:
ENABLE_AI_CLASSIFICATION='true'
That's it. Images will now be automatically analysed after upload via the variant generation queue.
Optional: NSFW Detection
While not currently implemented out-of-the-box (Cloudflare Workers AI doesn't have a native NSFW model yet), you can enable the environment variable for when it becomes available:
ENABLE_AI_NSFW_DETECTION='true'
For now, you'll need to integrate an external service like Sightengine or Replicate if you want NSFW detection. See the configuration guide in the source code for implementation details.
Advanced Configuration
Fine-tune AI behaviour with these optional variables:
# Downscale images before AI processing (saves costs)
AI_IMAGE_DOWNSCALE_WIDTH='384' # Default: 384px
# Maximum labels to store per image
AI_MAX_LABELS='10' # Default: 10
# Minimum confidence score (0.0-1.0)
AI_MIN_CLASSIFICATION_SCORE='0.1' # Default: 0.1
# Processing timeout (milliseconds)
AI_PROCESSING_TIMEOUT_MS='30000' # Default: 30s
# Queue batch size
AI_BATCH_SIZE='5' # Default: 5 images per batch
How It Works
When an image finishes uploading, the variant generation queue automatically triggers AI analysis:
sequenceDiagram
participant Upload as Upload Complete
participant Queue as Variant Queue
participant AI as Workers AI
participant D1 as D1 Database
Upload->>Queue: Image uploaded
Queue->>Queue: Fetch original from R2
Queue->>Queue: Downscale to 384px
Queue->>AI: Classify (ResNet-50)
AI-->>Queue: Labels + scores
Queue->>AI: Generate caption (UForm Gen2)
AI-->>Queue: Text description
Queue->>D1: Store in image_ai table
Note over Queue,D1: Results available via API
Models Used
- Classification: @cf/facebook/detr-resnet-50 - Identifies objects and scenes in images
- Captioning: @cf/unum/uform-gen2-qwen-500m - Generates natural language descriptions
- Embeddings: @cf/baai/bge-base-en-v1.5 - Creates 768-dimensional vectors for semantic search (if Vectorize is enabled)
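As a rough illustration of how the first two models might be invoked in sequence, here is a minimal sketch. It is not the project's actual code: `AiBinding`, `analyseImage`, and the prompt text are hypothetical, and the input and response shapes (an `image` array of byte values, a `description` field on the caption response) are assumptions that should be checked against the Workers AI model documentation.

```typescript
// Minimal stand-in for the Workers AI binding (assumption: the real binding,
// usually exposed as `env.AI`, has a richer type than this).
interface AiBinding {
  run(model: string, inputs: Record<string, unknown>): Promise<unknown>;
}

interface Detection {
  label: string;
  score: number;
}

interface AnalysisResult {
  detections: Detection[];
  caption: string;
}

// Hypothetical helper: runs classification, then captioning, on one image.
async function analyseImage(ai: AiBinding, bytes: Uint8Array): Promise<AnalysisResult> {
  // Assumption: both image models accept the bytes as an array of 0-255 integers.
  const image = Array.from(bytes);

  // Assumption: the classification model returns an array of { label, score } objects.
  const detections = (await ai.run('@cf/facebook/detr-resnet-50', { image })) as Detection[];

  // Assumption: the captioning model returns an object with a `description` string.
  const captionResponse = (await ai.run('@cf/unum/uform-gen2-qwen-500m', {
    image,
    prompt: 'Describe this image in one sentence.',
  })) as { description?: string };

  return { detections, caption: captionResponse.description ?? '' };
}
```

Taking the binding as a parameter keeps the helper easy to exercise with a stub in local tests, where the real models are not available.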
Processing Pipeline
- Image is downscaled to 384px width to reduce processing costs
- ResNet-50 model extracts classification labels (e.g., "cat", "outdoor", "tree")
- UForm model generates a natural language caption (e.g., "A ginger cat sitting on a wooden fence")
- Labels are filtered by confidence score (default: minimum 0.1) and limited to top 10
- Results stored in the image_ai table with metadata:
  - labels_json - Array of label strings
  - label_scores_json - Corresponding confidence scores
  - caption - Generated description
  - model_version - Which models were used
  - processing_time_ms - Performance tracking
  - analyzed_at - Timestamp (used for re-analysis cooldown)
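The label filtering step can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's implementation: `selectLabels` and the `ScoredLabel` type are hypothetical names, the threshold is treated as inclusive, and the defaults mirror AI_MIN_CLASSIFICATION_SCORE and AI_MAX_LABELS.

```typescript
interface ScoredLabel {
  label: string;
  score: number;
}

interface StoredLabels {
  labels_json: string;
  label_scores_json: string;
}

// Hypothetical helper: keep labels at or above the minimum score,
// order them by confidence, and cap the list at maxLabels.
function selectLabels(raw: ScoredLabel[], minScore = 0.1, maxLabels = 10): StoredLabels {
  const kept = raw
    .filter((l) => l.score >= minScore) // filter() copies, so the input is not mutated
    .sort((a, b) => b.score - a.score)  // highest confidence first
    .slice(0, maxLabels);

  // Two parallel arrays, serialised for the labels_json / label_scores_json columns.
  return {
    labels_json: JSON.stringify(kept.map((l) => l.label)),
    label_scores_json: JSON.stringify(kept.map((l) => l.score)),
  };
}
```

Storing labels and scores as parallel arrays means index i in one column corresponds to index i in the other, so the ordering step matters for both.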
Search Integration
Classification results power the search functionality:
- Full-text search matches queries against labels and captions
- Semantic search (if Vectorize enabled) uses caption embeddings for similarity matching
- Hybrid search combines both approaches for better results
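This page does not say how hybrid search merges the two result lists. One common way to do it is reciprocal rank fusion, sketched below purely as an illustration; `fuseRankings`, the constant `k = 60`, and the use of image IDs as strings are all assumptions rather than the project's actual method.

```typescript
// Reciprocal rank fusion: every ranked list contributes 1 / (k + rank)
// to each image ID it contains; IDs found by both searches score higher.
function fuseRankings(rankings: string[][], k = 60): string[] {
  const scores = new Map<string, number>();

  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }

  // Sort IDs by fused score, highest first.
  return Array.from(scores.entries())
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}
```

For example, with full-text results `['a', 'b', 'c']` and semantic results `['b', 'd']`, image `b` ranks first because it appears in both lists.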
Re-analysis Cooldown
To prevent excessive AI costs, images can only be re-analysed once every 28 days. This cooldown period is configurable in the source code (AI_PROCESSING.REANALYSIS_COOLDOWN_DAYS).
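A cooldown check of this kind can be sketched as below. The function name `canReanalyse` is hypothetical, and the sketch assumes the `analyzed_at` value has already been parsed into a `Date` (or is `null` for images that have never been analysed).

```typescript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Hypothetical helper: returns true when the cooldown has fully elapsed
// since the last analysis, or when the image was never analysed.
function canReanalyse(analyzedAt: Date | null, now: Date, cooldownDays = 28): boolean {
  if (analyzedAt === null) return true;
  const elapsedMs = now.getTime() - analyzedAt.getTime();
  return elapsedMs >= cooldownDays * MS_PER_DAY;
}
```

Passing `now` in explicitly, rather than calling `new Date()` inside the function, keeps the check deterministic and easy to test.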
Cost Optimisation
AI classification uses Cloudflare Workers AI, which bills per request:
- Classification: ~$0.011 per 1,000 images
- Captioning: ~$0.006 per 1,000 images
- Embeddings: ~$0.004 per 1,000 images
Total: Roughly $0.021 per 1,000 images (~2.1¢ per 1,000 images, or about $0.000021 per image)
To reduce costs:
- Downscale images - Smaller input = faster processing. Default is 384px which provides good accuracy.
- Increase AI_MIN_CLASSIFICATION_SCORE - Fewer labels stored = smaller database, though search recall may suffer because correct but low-confidence labels are dropped.
- Disable captioning - Edit packages/api/src/services/ai-classification.ts and skip the runCaptioning() call if you don't need it.
- Disable semantic search - Set ENABLE_VECTORIZE='false' to skip embedding generation entirely.
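To see how the per-1,000 rates and the optional steps above combine, here is a small estimator. It is only a sketch: `estimateCostUsd` and `CostOptions` are hypothetical names, and the rates are the approximate figures listed in this section, not live pricing.

```typescript
// Approximate USD cost per 1,000 images, taken from the figures in this section.
const COST_PER_1000_USD = {
  classification: 0.011,
  captioning: 0.006,
  embeddings: 0.004,
};

interface CostOptions {
  captioning: boolean;
  embeddings: boolean;
}

// Hypothetical helper: classification always runs; captioning and
// embeddings are counted only when enabled.
function estimateCostUsd(
  imageCount: number,
  opts: CostOptions = { captioning: true, embeddings: true },
): number {
  let per1000 = COST_PER_1000_USD.classification;
  if (opts.captioning) per1000 += COST_PER_1000_USD.captioning;
  if (opts.embeddings) per1000 += COST_PER_1000_USD.embeddings;
  return (imageCount / 1000) * per1000;
}
```

At these rates, 100,000 images with every step enabled come to roughly $2.10, and skipping captioning brings that down to about $1.50.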
Limitations
- Colour extraction is currently disabled (requires image decoder library not available in Workers)
- NSFW detection requires external integration (Cloudflare doesn't provide a native model yet)
- Local development doesn't support Vectorize embeddings (remote-only)
- Re-analysis has a 28-day cooldown to prevent abuse
Disabling
To disable AI classification entirely:
ENABLE_AI_CLASSIFICATION='false'
Images will upload normally but won't be analysed. Existing AI data remains in the database.