Local Implicit Ray Function for Generalizable Radiance Field Representation

CVPR 2023


Xin Huang1, Qi Zhang2, Ying Feng2, Xiaoyu Li2, Xuan Wang2, Qing Wang1

1Northwestern Polytechnical University    2Tencent AI Lab

Abstract



We propose LIRF (Local Implicit Ray Function), a generalizable neural rendering approach for novel view rendering. Current generalizable neural radiance field (NeRF) methods sample a scene with a single ray per pixel and may therefore render blurred or aliased views when the input views and rendered views capture scene content at different resolutions. To solve this problem, we propose LIRF to aggregate the information from conical frustums to construct a ray. Given 3D positions within conical frustums, LIRF takes the 3D coordinates and the features of the conical frustums as inputs and predicts a local volumetric radiance field. Since the coordinates are continuous, LIRF renders high-quality novel views at continuously-valued scales via volume rendering. In addition, we predict visibility weights for each input view via transformer-based feature matching to improve performance in occluded areas. Experimental results on real-world scenes validate that our method outperforms state-of-the-art methods on novel view rendering of unseen scenes at arbitrary scales.
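As background for the cone-based sampling described above, the sketch below shows one way to collect 3D sample coordinates inside conical frustums along a per-pixel cone, in the spirit of mip-NeRF-style cone casting. All names, shapes, and the sampling scheme here are illustrative assumptions, not the paper's actual parameterization.

```python
import numpy as np

def frustum_samples(origin, direction, pixel_radius, t_near, t_far,
                    n_frustums=4, n_per_frustum=3, seed=0):
    """Sample 3D points inside conical frustums along one pixel cone.

    pixel_radius is the cone radius at unit distance from the apex
    (proportional to the pixel footprint). This is a hypothetical
    helper for illustration, not LIRF's actual API.
    """
    rng = np.random.default_rng(seed)
    direction = direction / np.linalg.norm(direction)
    # Frustum boundaries along the ray.
    t_edges = np.linspace(t_near, t_far, n_frustums + 1)
    # Crude perpendicular basis (fails if direction is parallel to z).
    u = np.cross(direction, [0.0, 0.0, 1.0])
    u = u / np.linalg.norm(u)
    v = np.cross(direction, u)
    samples = []
    for t0, t1 in zip(t_edges[:-1], t_edges[1:]):
        t = rng.uniform(t0, t1, size=n_per_frustum)
        # Cone radius grows linearly with distance from the apex.
        r = pixel_radius * t
        # Uniform random offsets in the disk perpendicular to the ray.
        angle = rng.uniform(0.0, 2.0 * np.pi, size=n_per_frustum)
        radius = r * np.sqrt(rng.uniform(0.0, 1.0, size=n_per_frustum))
        pts = (origin + t[:, None] * direction
               + radius[:, None] * (np.cos(angle)[:, None] * u
                                    + np.sin(angle)[:, None] * v))
        samples.append(pts)
    return np.concatenate(samples, axis=0)  # (n_frustums * n_per_frustum, 3)
```

Because the returned coordinates are continuous 3D positions (not fixed pixel centers), a local implicit function conditioned on them can be queried at any target scale.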


Pipeline Overview


Architecture

The overview of LIRF. Our goal is to predict volumetric radiance fields from a set of multi-view images captured at a consistent image scale (×1), and output novel views at continuous scales (×0.5 ∼ ×4). Our proposed framework is composed of five parts: 1) extracting 2D feature maps from source images, 2) obtaining the image feature for the samples on target rays via local implicit ray function, 3) predicting the visibility weights of each source view by matching feature patches, 4) aggregating local image features from different source views and mapping them into colors and densities, 5) rendering a target pixel via volume rendering.
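The final step of the pipeline (volume rendering) follows the standard NeRF quadrature: per-sample densities are converted to opacities, accumulated transmittance weights each sample's color, and the weighted colors sum to the pixel value. A minimal NumPy sketch, with names and shapes chosen for illustration rather than taken from the paper's code:

```python
import numpy as np

def volume_render(colors, densities, deltas):
    """Composite per-sample colors into one pixel color (pipeline step 5).

    colors:    (N, 3) radiance of each sample along the target ray
    densities: (N,)   predicted volume density sigma
    deltas:    (N,)   distance between adjacent samples
    """
    alpha = 1.0 - np.exp(-densities * deltas)          # opacity per segment
    # Transmittance: probability the ray reaches sample i unoccluded.
    trans = np.concatenate([[1.0], np.cumprod(1.0 - alpha)[:-1]])
    weights = trans * alpha                            # compositing weights
    return (weights[:, None] * colors).sum(axis=0)     # rendered RGB
```

A fully opaque first sample dominates the output (its weight approaches 1), while zero density everywhere renders black, matching the intended over-compositing behavior.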


Low Resolution Views and Close-up Shots

We train LIRF on datasets with input images at ×1 resolution and ground-truth images at varying resolutions (×0.5 ∼ ×4). LIRF is capable of rendering high-fidelity views at varying scales. Compared with IBRNet, NeuRay, and GeoNeRF, LIRF produces low-resolution views (×0.5) with fewer aliasing artifacts and close-up shots (×2) with fewer blurring artifacts.



Varying Resolutions

LIRF is a generalizable method for novel view synthesis of unseen scenes. It not only renders novel views with fewer blurring artifacts, but also produces novel views at arbitrary scales, even at higher scales than the input views.



Citation


@inproceedings{huang2023lirf,
    title={Local Implicit Ray Function for Generalizable Radiance Field Representation},
    author={Huang, Xin and Zhang, Qi and Feng, Ying and Li, Xiaoyu and Wang, Xuan and Wang, Qing},
    booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
    year={2023}
}

More


Xin Huang et al. Inverting the Imaging Process by Learning an Implicit Camera Model. CVPR 2023. [Project Page]
Xin Huang et al. HDR-NeRF: High Dynamic Range Neural Radiance Fields. CVPR 2022. [Project Page]