Projects: Learning to See

Segment Anything in Light Fields for Real-Time Applications via Constrained Prompting

  • An effective light field segmentation method
  • Combines epipolar constraints with the rich semantics learned by the pretrained SAM 2 foundation model for cross-view mask matching
  • Produces results of the same quality as the Segment Anything 2 (SAM 2) video tracking baseline, while being 7 times faster
  • Runs inference in real time for autonomous driving tasks such as object pose tracking
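
As a rough illustration of the epipolar constraint mentioned above, the sketch below assumes a rectified light field captured on a regular camera grid, where a scene point shifts between views in proportion to the view offset. The function names, the sign convention, and the idea of scoring a candidate by its distance from the epipolar line are illustrative assumptions, not the project's actual implementation.

```python
import math

def epipolar_candidates(u, v, ds, dt, disparities):
    """Candidate pixel positions in a neighbouring view.

    (u, v) is a pixel in the reference view, (ds, dt) is the offset of the
    neighbouring view on the camera grid, and each disparity is the pixel
    shift per unit of view offset (sign convention is an assumption).
    """
    return [(u + d * ds, v + d * dt) for d in disparities]

def epipolar_residual(ref, cand, ds, dt):
    """Perpendicular distance (in pixels) of cand from the epipolar line
    through ref with direction (ds, dt)."""
    norm = math.hypot(ds, dt)
    if norm == 0:
        raise ValueError("views coincide; no epipolar line is defined")
    dx, dy = cand[0] - ref[0], cand[1] - ref[1]
    return abs(dx * dt - dy * ds) / norm

# Hypothetical usage: sweep a few disparities for a horizontally adjacent view.
points = epipolar_candidates(100.0, 50.0, 1, 0, [0.0, 1.5, -2.0])
```

A candidate mask whose centre has a small residual is geometrically consistent with the prompt in the reference view; one with a large residual can be rejected before any semantic comparison is made.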

LBurst: Learned Burst Feature Finder

  • A learned feature detector and descriptor for bursts of images
  • Noise-tolerant features outperform the state of the art in low light
  • Enables 3D reconstruction from drone imagery in millilux conditions

TaCOS: Task-Specific Camera Optimization with Simulation

  • An end-to-end camera design method that co-designs cameras with perception tasks
  • We combine derivative-free and gradient-based optimizers and support continuous, discrete, and categorical parameters
  • A camera simulation including virtual environments and a physics-based noise model
  • Key step in simplifying the process of designing cameras for robots

Adapting CNNs for Fisheye Cameras without Retraining

  • RectConv adapts existing pretrained CNNs to work with fisheye images
  • Requires no additional data or training
  • Operates directly on the native fisheye image as captured from the camera
  • Works with multiple network architectures and tasks

NOCaL: Calibration-Free Odometry

  • Automatically interpreting new cameras by jointly learning novel view synthesis, odometry, and a camera model
  • A hypernetwork allows training with a wealth of existing cameras and datasets
  • A semi-supervised light field network adapts to newly introduced cameras
  • This work is a key step toward automated integration of emerging camera technologies

Multi-modal learning: semantically accurate super-resolution with GANs

  • Jointly learning to super-resolve and label improves performance on both tasks
  • Adversarial training enforces perceptual realism
  • A feature loss forces semantic accuracy
  • Demonstration on aerial imagery for remote sensing

Learning to See with Sparse Light Field Video Cameras

  • Unsupervised learning of odometry and depth from sparse 4D light fields
  • Encoding sparse LFs for consumption by 2D CNNs for odometry and shape estimation
  • Toward unsupervised interpretation of general LF cameras and new imaging devices

Light Stage Object Classifier

  • Fast classification of visually similar objects using multiplexed illumination
  • Using light stage capture and rendering to drive optimization of multiplexing codes
  • Outperforms naive and conventional multiplexing patterns in accuracy and speed
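
To illustrate why light stage capture can drive the search for multiplexing codes: light adds linearly, so an image under any multiplexed pattern can be rendered as a weighted sum of the images captured with each light on by itself. The sketch below is a minimal example under that assumption. The function name and the flat-list image representation are hypothetical, and it omits sensor noise and saturation.

```python
def render_multiplexed(basis_images, code):
    """Render the image seen under a multiplexed illumination pattern.

    basis_images: one image per light, each a flat list of pixel intensities
                  captured with only that light switched on.
    code:         one non-negative weight per light (the multiplexing code).
    """
    if len(basis_images) != len(code):
        raise ValueError("need exactly one weight per basis image")
    n_pixels = len(basis_images[0])
    if any(len(img) != n_pixels for img in basis_images):
        raise ValueError("all basis images must have the same size")
    # Superposition: each output pixel is the weighted sum over lights.
    return [
        sum(w * img[i] for w, img in zip(code, basis_images))
        for i in range(n_pixels)
    ]

# Hypothetical usage: three lights, two-pixel images.
basis = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
image = render_multiplexed(basis, [1.0, 0.5, 0.0])
```

Because rendering a candidate code is this cheap, an optimizer can evaluate many codes against a classifier without recapturing the scene each time.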