Segment Anything in Light Fields for Real-Time Applications via Constrained Prompting

Segmented light field images can serve as a powerful representation in autonomous driving tasks such as object pose tracking. Segment Anything Model 2 (SAM 2) produces semantically meaningful segments for monocular images and videos. In this work, we introduce:
  • A novel light field segmentation method that adapts SAM 2 to the light field domain.
  • Segmentation refinement, a two-step method of light field segmentation: disparity-based mask propagation followed by reprompting of SAM 2.
  • Semantic occluding, a technique that uses latent semantic features from the SAM 2 image encoder to estimate the occluded regions of the segments and thereby refine the prompts provided to the model.
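
To make the disparity-based mask propagation step more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a per-pixel disparity map for the central view, expressed in pixels of shift per unit view offset, and a sign convention in which a pixel at x appears at x + d·du in the view offset by du. The function names `propagate_mask` and `mask_to_box` are hypothetical. Using a bounding box of the propagated mask as the new prompt is also an assumption made for illustration.

```python
import numpy as np

def propagate_mask(mask, disparity, du, dv):
    """Forward-warp a binary mask from the central view to an offset view.

    mask:      (H, W) bool array, segment in the central view.
    disparity: (H, W) float array, shift in pixels per unit view offset
               (assumed convention, see the lead-in).
    du, dv:    horizontal and vertical view offsets from the central view.
    """
    h, w = mask.shape
    ys, xs = np.nonzero(mask)
    d = disparity[ys, xs]
    # Each masked pixel shifts in proportion to its disparity and the view offset.
    xt = np.rint(xs + d * du).astype(int)
    yt = np.rint(ys + d * dv).astype(int)
    # Drop pixels that land outside the target view.
    inside = (xt >= 0) & (xt < w) & (yt >= 0) & (yt < h)
    out = np.zeros((h, w), dtype=bool)
    out[yt[inside], xt[inside]] = True
    return out

def mask_to_box(mask):
    """Return the (x_min, y_min, x_max, y_max) box of a mask, or None if empty."""
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        return None
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())

# Example: one masked pixel at row 2, column 1, uniform disparity of 1 pixel.
central_mask = np.zeros((5, 5), dtype=bool)
central_mask[2, 1] = True
disparity = np.ones((5, 5))
warped = propagate_mask(central_mask, disparity, du=2, dv=0)
box_prompt = mask_to_box(warped)
```

Forward warping of this kind leaves holes and ignores occlusions, which is one reason a second step that reprompts the segmentation model on each view is useful.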

We show that our method produces semantically accurate and spatio-angularly consistent segments, avoids excessive oversegmentation of objects, and achieves higher performance than SAM 2 video tracking while being 7 times faster.

Publications

•  N. Goncharov and D. G. Dansereau, “Segment anything in light fields for real-time applications via constrained prompting,” under review, 2024. Available here.

Citing

If you find this work useful, please cite:
@article{goncharov2024_arxiv,
  title = {Segment Anything in Light Fields for Real-Time Applications via Constrained Prompting},
  author = {Nikolai Goncharov and Donald G. Dansereau},
  journal = {arXiv preprint arXiv:2411.13840},
  year = {2024}
}

Acknowledgments

This research was supported in part by funding from Ford Motor Company.

Downloads

The code for the work is available here.