S3OD: Synthetic Salient Object Detection

Upload an image to remove its background using S3OD!

S3OD is trained on a large-scale, fully synthetic dataset (140K+ images) generated with diffusion models. The model uses a DPT-based architecture with a DINOv3 vision transformer backbone for robust salient object detection.

Model Variants:

General (Synth + Real): Default model trained on synthetic data and fine-tuned on all real datasets (DUTS, DIS, HR-SOD)
Synthetic Only: Trained exclusively on S3OD synthetic dataset
DIS-tuned: Fine-tuned specifically for highly accurate dichotomous image segmentation (DIS)
SOD-tuned: Optimized for general salient object detection tasks

Key Features:

Single-step background removal with soft masks (smooth edges)
Multi-mask prediction with IoU scoring
Ambiguity detection for uncertain predictions
Works on any image resolution
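
To make the features above more concrete, here is a minimal NumPy sketch of how multi-mask output might be post-processed. It is not the S3OD API: the function names (`select_best_mask`, `apply_soft_mask`, `masks_disagree`), the array shapes, and the thresholds are all assumptions made for illustration. In particular, the pairwise-overlap check is only one plausible way to flag ambiguity and may differ from what the model actually does.

```python
import numpy as np


def select_best_mask(masks: np.ndarray, iou_scores: np.ndarray) -> tuple[np.ndarray, int]:
    """Pick the candidate mask with the highest predicted IoU score.

    masks: float array of shape (K, H, W), values in [0, 1] (assumed layout).
    iou_scores: float array of shape (K,), one score per candidate.
    """
    best = int(np.argmax(iou_scores))
    return masks[best], best


def apply_soft_mask(image: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Attach a soft mask to an RGB image as its alpha channel.

    image: uint8 array of shape (H, W, 3).
    mask: float array of shape (H, W), values in [0, 1].
    Returns a uint8 RGBA array of shape (H, W, 4); fractional mask values
    become partial transparency, which is what keeps edges smooth.
    """
    alpha = np.rint(np.clip(mask, 0.0, 1.0) * 255.0).astype(np.uint8)
    return np.dstack([image, alpha])


def masks_disagree(masks: np.ndarray, threshold: float = 0.5, min_iou: float = 0.8) -> bool:
    """Hypothetical ambiguity heuristic (an assumption, not S3OD's method).

    Binarizes each candidate at `threshold` and reports True if any pair of
    candidates overlaps with IoU below `min_iou`.
    """
    binary = masks > threshold
    for i in range(binary.shape[0]):
        for j in range(i + 1, binary.shape[0]):
            union = np.logical_or(binary[i], binary[j]).sum()
            inter = np.logical_and(binary[i], binary[j]).sum()
            if union > 0 and inter / union < min_iou:
                return True
    return False


# Toy example: a 2x2 gray image with two candidate masks.
image = np.full((2, 2, 3), 200, dtype=np.uint8)
masks = np.array([
    [[1.0, 0.25], [0.0, 0.0]],   # candidate 0: one pixel plus a soft edge
    [[1.0, 1.0], [1.0, 0.0]],    # candidate 1: a larger region
])
iou_scores = np.array([0.9, 0.6])

best_mask, best_index = select_best_mask(masks, iou_scores)
rgba = apply_soft_mask(image, best_mask)
ambiguous = masks_disagree(masks)
```

In this toy input the two candidates cover quite different regions, so the heuristic flags the prediction as ambiguous even though candidate 0 has the higher score. The resulting RGBA array can be saved as a PNG with any image library that accepts `(H, W, 4)` uint8 arrays.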

📄 Paper | 💻 GitHub | 🤗 Model | 🗂️ Dataset