Inverting the Imaging Process by Learning an Implicit Camera Model

CVPR 2023


Xin Huang¹, Qi Zhang², Ying Feng², Hongdong Li³, Qing Wang¹

¹Northwestern Polytechnical University    ²Tencent AI Lab    ³Australian National University

Abstract



Representing visual signals with implicit coordinate-based neural networks, as an effective replacement of the traditional discrete signal representation, has gained considerable popularity in computer vision and graphics. In contrast to existing implicit neural representations which focus on modelling the scene only, this paper proposes a novel implicit camera model which represents the physical imaging process of a camera as a deep neural network. We demonstrate the power of this new implicit camera model on two inverse imaging tasks: i) generating all-in-focus photos, and ii) HDR imaging. Specifically, we devise an implicit blur generator and an implicit tone mapper to model the aperture and exposure of the camera's imaging process, respectively. Our implicit camera model is jointly learned together with implicit scene models under multi-focus stack and multi-exposure bracket supervision. We have demonstrated the effectiveness of our new model on a large number of test images and videos, producing accurate and visually appealing all-in-focus and high dynamic range images. In principle, our new implicit neural camera model has the potential to benefit a wide array of other inverse imaging tasks.


Pipeline Overview

In this paper, we propose a new component for implicit neural representations: an implicit camera model that simulates the physical imaging process. The camera model contains an implicit blur generator module and an implicit tone mapper module, which estimate the point spread function and the camera response function, respectively. It is jointly optimized with scene models to invert the imaging process under the supervision of visual signals captured with different focus and exposure settings.
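To make the forward imaging process concrete, here is a minimal NumPy sketch of the two stages described above: blurring a sharp scene with a point spread function, then mapping radiance to pixel values with a camera response function. It is only an illustration under simplifying assumptions. The Gaussian kernel, the gamma curve, and the names `gaussian_psf`, `blur`, `tone_map`, and `render` are hypothetical stand-ins for the learned blur generator and tone mapper, which are neural networks in the paper and are not reproduced here.

```python
import numpy as np

def gaussian_psf(size, sigma):
    """Normalized 2D Gaussian kernel (size must be odd).

    Assumed stand-in for a learned point spread function.
    """
    ax = np.arange(size) - (size - 1) / 2.0
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    kernel = np.outer(g, g)
    return kernel / kernel.sum()

def blur(image, psf):
    """Filter a single-channel image with the PSF, using edge padding.

    The kernel is symmetric, so correlation and convolution coincide.
    """
    r = psf.shape[0] // 2
    padded = np.pad(image, r, mode="edge")
    h, w = image.shape
    out = np.zeros((h, w), dtype=np.float64)
    for dy in range(psf.shape[0]):
        for dx in range(psf.shape[1]):
            out += psf[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out

def tone_map(radiance, exposure_time, gamma=2.2):
    """Toy camera response: scale by exposure, clip to [0, 1], apply gamma.

    Assumed stand-in for a learned camera response function.
    """
    return np.clip(radiance * exposure_time, 0.0, 1.0) ** (1.0 / gamma)

def render(sharp_hdr, sigma, exposure_time):
    """Forward model: sharp HDR radiance -> blurred, tone-mapped LDR image."""
    return tone_map(blur(sharp_hdr, gaussian_psf(5, sigma)), exposure_time)

# A flat scene stays flat after blurring; only the tone mapping changes it.
scene = np.full((4, 4), 0.5)
ldr = render(scene, sigma=1.0, exposure_time=1.0)
```

Inverting the imaging process then amounts to finding the sharp HDR scene (and the camera parameters) whose renderings match the observed focus stack or exposure bracket.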


Architecture

Video Deblurring

Our implicit camera model can be applied to video enhancement when combined with a video scene representation. We adopt the layered neural atlases representation, which decomposes a video into a set of layered 2D atlases to handle object and camera motion. We evaluate our model for video deblurring on the Deep Video Deblurring (DVD) dataset.



Video HDR Imaging

For the video HDR imaging task, the input is a video with alternating exposures. The HDR video reconstruction results show that our method recovers the texture of over-exposed areas using information from other frames captured with a lower exposure.
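The intuition behind recovering over-exposed regions from lower-exposure frames can be sketched with a classical multi-exposure merge. This is not the learned method of the paper: it assumes the frames are already aligned, that the response curve is a known gamma of 2.2, and that exposure times are given. The names `to_ldr` and `merge_exposures` are hypothetical.

```python
import numpy as np

GAMMA = 2.2  # assumed known response curve; the paper learns the response instead

def to_ldr(radiance, exposure_time):
    """Simulate capture: scale by exposure, clip at sensor saturation, apply gamma."""
    return np.clip(radiance * exposure_time, 0.0, 1.0) ** (1.0 / GAMMA)

def merge_exposures(ldr_frames, exposure_times, saturation=0.98):
    """Estimate radiance by averaging the unsaturated observations of each pixel."""
    numerator = np.zeros_like(ldr_frames[0], dtype=np.float64)
    weight_sum = np.zeros_like(numerator)
    for ldr, t in zip(ldr_frames, exposure_times):
        valid = (ldr < saturation).astype(np.float64)  # 0 where the pixel is clipped
        numerator += valid * (ldr ** GAMMA) / t         # invert response, undo exposure
        weight_sum += valid
    return numerator / np.maximum(weight_sum, 1e-8)

# Two pixels: one dim, one bright enough to saturate in the long exposure.
radiance = np.array([0.2, 3.0])
times = [1.0, 0.25]
frames = [to_ldr(radiance, t) for t in times]
recovered = merge_exposures(frames, times)
```

In this toy case the bright pixel is clipped in the long exposure, so its radiance is taken entirely from the short exposure, while the dim pixel is averaged over both frames.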



Exposure and Focus Editing

Another consequence of our implicit camera model is that it enables rendering images with modified camera settings. When we keep the blur generator and tone mapper during inference, our method can control the focus and exposure of the rendered images.



Citation


@inproceedings{huang2023inverting,
    title={Inverting the Imaging Process by Learning an Implicit Camera Model},
    author={Huang, Xin and Zhang, Qi and Feng, Ying and Li, Hongdong and Wang, Qing},
    booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
    year={2023}
}

More


Xin Huang et al. Local Implicit Ray Function for Generalizable Radiance Field Representation. CVPR 2023. [Project Page]
Xin Huang et al. HDR-NeRF: High Dynamic Range Neural Radiance Fields. CVPR 2022. [Project Page]