Equipped with the continuous representation capability of Multi-Layer Perceptrons (MLPs), Implicit Neural Representation (INR) has been successfully employed for Arbitrary-scale Super-Resolution (ASR). However, the limited receptive field of the linear layers in an MLP restricts the representation capability of INR, and rendering a high-resolution image is computationally expensive because the MLP must be queried once per output pixel. Recently, Gaussian Splatting (GS) has shown its advantages over INR in both visual quality and rendering speed on 3D tasks, which motivates us to explore whether GS can be employed for the ASR task. However, directly applying GS to ASR is exceptionally challenging because the original GS is an optimization-based method that overfits each single scene, while in ASR we aim to learn a single model that generalizes to different images and scaling factors. We overcome these challenges with two novel techniques. Firstly, to generalize GS to ASR, we elaborately design an architecture that predicts image-conditioned Gaussians for the input low-resolution image in a feed-forward manner. Secondly, we implement an efficient differentiable 2D GPU/CUDA-based scale-aware rasterization that renders super-resolved images by sampling discrete RGB values from the predicted continuous Gaussians. Via end-to-end training, our optimized network, namely GSASR, can perform ASR on any image at unseen scaling factors. Extensive experiments validate the effectiveness of our proposed method.
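To make the scale-aware rasterization step concrete, the following is a minimal NumPy sketch of rendering an image from 2D Gaussians, not the authors' differentiable CUDA rasterizer. It assumes (for illustration only) that each Gaussian is parameterized by a mean in normalized image coordinates, a 2x2 covariance, an RGB color, and an opacity; because the Gaussians are defined continuously over the image plane, the same set can be rasterized at any scaling factor simply by changing the output pixel grid.

```python
import numpy as np

def render_gaussians(means, covs, colors, opacities, h, w, scale):
    """Rasterize continuous 2D Gaussians to an (h*scale, w*scale) RGB image.

    means:     (N, 2) Gaussian centers in normalized [0, 1) coordinates
    covs:      (N, 2, 2) covariance matrices (also in normalized coordinates)
    colors:    (N, 3) RGB colors in [0, 1]
    opacities: (N,) per-Gaussian opacity weights
    """
    H, W = int(h * scale), int(w * scale)
    ys, xs = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    # Pixel-center coordinates, normalized so the Gaussians are scale-agnostic.
    coords = np.stack([(xs + 0.5) / W, (ys + 0.5) / H], axis=-1)  # (H, W, 2)
    img = np.zeros((H, W, 3))
    for mu, cov, color, alpha in zip(means, covs, colors, opacities):
        diff = coords - mu                       # (H, W, 2)
        inv = np.linalg.inv(cov)
        # Squared Mahalanobis distance of every pixel center to this Gaussian.
        m = np.einsum("hwi,ij,hwj->hw", diff, inv, diff)
        weight = alpha * np.exp(-0.5 * m)        # (H, W) Gaussian falloff
        img += weight[..., None] * color         # additive accumulation
    return np.clip(img, 0.0, 1.0)
```

A real implementation would tile the image and cull Gaussians per tile on the GPU rather than looping over them densely, and would use alpha compositing instead of the simple additive accumulation shown here; the sketch only illustrates why querying a continuous Gaussian field allows arbitrary output resolutions.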