S3OD: Synthetic Salient Object Detection

Upload an image to remove its background using S3OD!

S3OD is trained on a large-scale, fully synthetic dataset (140K+ images) generated with diffusion models. The model uses a DPT-based architecture with a DINOv3 vision transformer backbone for robust salient object detection.

Model Variants:

  • General (Synth + Real): Default model trained on synthetic data and fine-tuned on all real datasets (DUTS, DIS, HR-SOD)
  • Synthetic Only: Trained exclusively on the S3OD synthetic dataset
  • DIS-tuned: Fine-tuned specifically for highly accurate dichotomous image segmentation
  • SOD-tuned: Optimized for general salient object detection tasks

Key Features:

  • Single-step background removal with soft masks (smooth edges)
  • Multi-mask prediction with IoU scoring
  • Ambiguity detection for uncertain predictions
  • Works on any image resolution
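
To illustrate how the soft-mask and multi-mask features above might be consumed downstream, here is a minimal NumPy sketch. It is not the S3OD API: the function names (`apply_soft_mask`, `mask_iou`, `select_mask`), the 0.5 binarization threshold, and the ambiguity rule (flag the prediction when the two highest-scoring masks overlap poorly) are all assumptions made for this example. The model's actual ambiguity detection may work differently.

```python
import numpy as np


def apply_soft_mask(image: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Attach a soft mask in [0, 1] as the alpha channel of an RGB uint8 image.

    Intermediate mask values become partial transparency, which is what
    gives the cutout its smooth edges.
    """
    if image.shape[:2] != mask.shape:
        raise ValueError("image and mask must share height and width")
    alpha = np.round(np.clip(mask, 0.0, 1.0) * 255).astype(np.uint8)
    return np.dstack([image, alpha])  # shape (H, W, 4), RGBA


def mask_iou(a: np.ndarray, b: np.ndarray, threshold: float = 0.5) -> float:
    """Intersection over union of two soft masks after binarization."""
    a_bin, b_bin = a >= threshold, b >= threshold
    union = np.logical_or(a_bin, b_bin).sum()
    if union == 0:
        return 1.0  # two empty masks agree trivially
    return float(np.logical_and(a_bin, b_bin).sum() / union)


def select_mask(masks, scores, agree_iou: float = 0.5):
    """Pick the candidate with the highest predicted IoU score.

    Hypothetical ambiguity rule: the result is flagged as ambiguous when
    the two best-scoring candidates overlap by less than `agree_iou`.
    """
    order = np.argsort(scores)[::-1]  # indices sorted by descending score
    best = int(order[0])
    ambiguous = False
    if len(order) > 1:
        runner_up = int(order[1])
        ambiguous = mask_iou(masks[best], masks[runner_up]) < agree_iou
    return best, ambiguous
```

In practice, `masks` and `scores` would come from the model's multi-mask output, and the RGBA array returned by `apply_soft_mask` can be saved as a PNG to keep the transparency.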

📄 Paper | 💻 GitHub | 🤗 Model | 🗂️ Dataset
