Surf-NeRF: Surface Regularised Neural Radiance Fields
Neural Radiance Fields (NeRFs) provide a high-fidelity, continuous scene representation that can realistically capture complex behaviour of light. Despite recent works like Ref-NeRF improving geometry through physics-inspired models, the ability of a NeRF to overcome shape-radiance ambiguity and converge to a representation consistent with real geometry remains limited. We demonstrate how curriculum learning of a surface light field model helps a NeRF converge towards a more geometrically accurate scene representation. We introduce four additional regularisation terms that impose geometric smoothness, consistency of normals and a separation of Lambertian and specular appearance at geometry in the scene, conforming to physical models. Our approach improves normals by 14.4% on positionally encoded NeRFs and by 9.2% on grid-based models compared to current reflection-based NeRF variants. It also yields a separated view-dependent appearance, conditioning a NeRF to have a geometric representation consistent with the captured scene. We demonstrate compatibility of our method with existing NeRF variants, a key step in enabling radiance-based representations for geometry-critical applications.
In this work:
- We devise a novel regularisation approach that uses the structure of a neural radiance field to sample density, normals and appearance in the vicinity of geometry in the scene, allowing additional representation-driven regularisation terms to be applied.
- We apply local regularisation consistent with a surface light field radiance model, including geometric smoothness of density, local consistency of normals and a physically correct separation of Lambertian and specular appearance using a light interaction model (see the sketch after this list).
- We leverage curriculum learning to guide a NeRF towards a more geometrically accurate scene representation that maintains visual fidelity whilst refining the density representation of the scene.
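As a rough illustration (not the paper's exact formulation), the following PyTorch sketch shows the style of these surface-light-field regularisers: a density-smoothness term over jittered neighbours, a normal-consistency term and an appearance-separation term. All function and variable names here are hypothetical.

```python
import torch
import torch.nn.functional as F

def surface_regularisers(sigma, sigma_nbr, normals, normals_nbr, specular_rgb):
    """Illustrative surface-light-field regularisation terms.

    sigma, sigma_nbr:     densities at surface samples and at jittered neighbours
    normals, normals_nbr: unit normals at the same locations, shape [..., 3]
    specular_rgb:         view-dependent colour head output, in [0, 1]
    """
    # Geometric smoothness: density should vary slowly in the vicinity of geometry.
    l_smooth = F.mse_loss(sigma, sigma_nbr)

    # Local consistency of normals: neighbouring normals should agree (cosine error).
    l_normal = (1.0 - torch.sum(normals * normals_nbr, dim=-1)).mean()

    # Appearance separation: discourage the specular head from absorbing
    # view-independent (Lambertian) colour, here via a simple sparsity penalty.
    l_sep = specular_rgb.abs().mean()

    return l_smooth, l_normal, l_sep
```

In practice, terms like these would be weighted and added to the photometric loss on a curriculum schedule.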
Whilst we benchmark our approach on state-of-the-art physics-based NeRF variants, our methodology may also be applied to other NeRF frameworks.
This work is a key step towards deploying NeRFs as a scene representation where both geometric and visual fidelity are critical, such as robotic manipulation and navigation in complex unstructured environments.
Publications
• J. Naylor, V. Ila, and D. G. Dansereau, “Surf-NeRF: Surface regularised neural radiance fields,” under review, 2024. Available here.
Citing
If you find this work useful, please cite:
Acknowledgments
We would like to thank our reviewers for their thoughtful comments in improving this work. This research was undertaken with the assistance of resources and services from the National Computational Infrastructure (NCI), which is supported by the Australian Government. Access to these resources was provided through the Sydney Informatics Hub, a Core Research Facility of the University of Sydney.
Downloads
The code will appear on GitHub here once ready for release.
Data will be available soon.
Gallery
We demonstrate that applying our surface-light-field-inspired regularisation in a curriculum learning fashion yields a more realistic separation of appearance and geometry whilst maintaining visual fidelity.
By applying regularisation more frequently, we trade off overall training time against a more complete separation of the diffuse and specular components of the scene with correct geometry.
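A minimal sketch of this kind of schedule, with stand-in losses and a hypothetical `reg_every` knob (the paper's parameter study explores this frequency):

```python
import torch

model = torch.nn.Linear(3, 4)                    # stand-in for a NeRF
optimiser = torch.optim.Adam(model.parameters(), lr=1e-3)
reg_every, reg_weight = 10, 0.01                 # assumed schedule knobs

for step in range(1000):
    x = torch.rand(128, 3)                       # stand-in ray batch
    out = model(x)
    loss = out[:, :3].pow(2).mean()              # stand-in photometric loss
    if step % reg_every == 0:                    # curriculum: surface terms fire every N steps
        loss = loss + reg_weight * out[:, 3].abs().mean()  # stand-in surface regulariser
    optimiser.zero_grad()
    loss.backward()
    optimiser.step()
```

A smaller `reg_every` applies the surface terms more often, improving separation at the cost of training time.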
We use a first-surface assumption to sample local regions and regularise depth and normals along with a smooth appearance model.
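One common way (an assumption, not necessarily the paper's exact procedure) to realise a first-surface assumption is to take the expected termination depth from the standard volume-rendering weights and jitter samples around the resulting surface point:

```python
import torch

def first_surface_depth(sigma, t, delta):
    """Expected ray-termination depth from densities `sigma` at sample
    depths `t` with spacings `delta` (standard volume-rendering weights)."""
    alpha = 1.0 - torch.exp(-sigma * delta)                    # per-sample opacity
    trans = torch.cumprod(
        torch.cat([torch.ones_like(alpha[..., :1]), 1.0 - alpha + 1e-10], dim=-1),
        dim=-1)[..., :-1]                                      # accumulated transmittance
    weights = alpha * trans
    return (weights * t).sum(dim=-1) / (weights.sum(dim=-1) + 1e-10)

def local_surface_samples(origins, dirs, depth, radius=0.01, k=8):
    """Jitter k points in a small ball around each first-surface point."""
    surface = origins + dirs * depth[..., None]
    noise = radius * torch.randn(*surface.shape[:-1], k, 3)
    return surface[..., None, :] + noise
```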
Where prior reflection-based models exhibit view-independent colour in the specular term, we regularise this Lambertian bias to represent the scene appearance more realistically.
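One plausible form of such a regulariser (our illustration of the idea, not the paper's exact loss) queries the specular head from several random view directions and penalises the view-averaged, i.e. view-independent, component:

```python
import torch

def lambertian_bias_loss(specular_fn, points, n_dirs=8):
    """`specular_fn(points, dirs) -> RGB` is the view-dependent colour head
    (hypothetical interface); `points` has shape [B, 3]."""
    dirs = torch.randn(points.shape[0], n_dirs, 3)
    dirs = dirs / dirs.norm(dim=-1, keepdim=True)              # random unit view directions
    pts = points[:, None, :].expand(-1, n_dirs, -1)
    rgb = specular_fn(pts.reshape(-1, 3), dirs.reshape(-1, 3))
    rgb = rgb.reshape(points.shape[0], n_dirs, 3)
    return rgb.mean(dim=1).abs().mean()                        # penalise the view-independent part
```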
Importantly, our approach makes no changes to the model architecture, but rather introduces a second, physically based regularisation for reflection-based models. We envisage its compatibility with similar volumetric NeRF models.
Our approach yields realistic separation of Lambertian scene elements from the view-dependent colour of the scene (pin stripes on the racer, toast).
The proposed approach also yields more accurate geometry in terms of depth and surface normals.
We also benchmark our approach on sinusoidally encoded NeRF variants, which approximate a continuous density field, in contrast to hash-based encodings.
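For reference, these variants use the standard sinusoidal positional encoding of NeRF (as opposed to hash-grid features); a minimal implementation:

```python
import torch

def positional_encoding(x, L=10):
    """Map x in R^3 to [sin(2^k * pi * x), cos(2^k * pi * x)] for k = 0..L-1."""
    freqs = (2.0 ** torch.arange(L)) * torch.pi                # frequency bands 2^k * pi
    angles = x[..., None] * freqs                              # shape [..., 3, L]
    enc = torch.cat([angles.sin(), angles.cos()], dim=-1)      # shape [..., 3, 2L]
    return enc.flatten(start_dim=-2)                           # shape [..., 6L]
```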
We demonstrate improved fidelity on these models as well, notably in flat, textureless regions.
To benchmark our approach on challenging scenes, we include results from a captured dataset we term the Koala dataset. It contains fewer views than other captured datasets, increasing shape-radiance ambiguity. We show better separation of the specular and diffuse appearance of the scene, and enhanced geometric fidelity in regions around these visually challenging objects.
We show improved qualitative results across all scenes in the Shiny Blender dataset from Ref-NeRF. Quantitatively, we also demonstrate improved surface geometry in terms of both disparity and surface normals. See the paper for full quantitative results.
We demonstrate continued improvement by increasing the rate at which regularisation is applied, and perform a parameter study over the curriculum learning frequency to quantify this effect.