Some questions about the CurveIntegrator code. #5

chenkang455 · 2022-10-21T07:29:42Z

    integral = 2 * coeffs[:, 1] / 3
    for i in range(3, n_deg, 2):
        integral = integral + 2 * coeffs[:, i] / (i + 2)
    baseline = (2 * blurry[:, 0] - integral) / 2
    baseline = baseline.unsqueeze(dim=1)
    coeffs = torch.cat([baseline, coeffs], dim=1) # [bs, n_deg+1, h, w]
    return coeffs, integrator_cache

The above code is from CurveIntegrator. I have no idea how it works. For example, for [integral = 2 * coeffs[:, 1] / 3], I cannot find the corresponding formula in your paper. I would appreciate it if you could answer my question. Thanks a lot!

chenkang455 · 2022-10-21T08:06:37Z

    import torch
    import torch.nn as nn


    class FrameConstructor(nn.Module):
        def __init__(self):
            super(FrameConstructor, self).__init__()

        def forward(self, coeffs, timestamps):
            # coeffs: [bs, n_deg+1, h, w]
            # timestamps: [bs, n_ts, h, w] or [bs, n_ts]
            n_deg = coeffs.shape[1] - 1
            n_ts = timestamps.shape[1]
            # torch.unsqueeze adds a dimension: [bs, n_ts] -> [bs, n_ts, 1, 1]
            if len(timestamps.shape) == 2:
                timestamps = timestamps.unsqueeze(-1).unsqueeze(-1)
            # bases: [bs, n_deg+1, n_ts, h, w]
            bases = torch.stack([timestamps ** i for i in range(n_deg + 1)], dim=1)
            recon = coeffs.unsqueeze(2) * bases
            recon = torch.sum(recon, dim=1)
            return recon

Besides, I do not understand the function of FrameConstructor, which is shown above. Thanks a lot!!!

chensong1995 · 2022-10-21T18:14:06Z

Hello chenkang455,

Thanks for your interest in our work!

We approximate the intensity of each pixel using a parametric polynomial function. Given an hxw pixel grid, the video is represented as hxw=180*240=43200 different polynomials. We use the symbol L_{xy}(t) in the paper to refer to the polynomial function associated with the pixel whose coordinates are (x, y). The function takes one single input: the timestamp. The function returns one single value: the intensity. To render a video frame at a particular timestamp, say t_0=0.03, we substitute t=t_0=0.03 in all 43200 polynomials. This gives 43200 different intensities, and we can assemble them into a grayscale frame with a resolution of 180x240. We can then render a few more frames at other timestamps, t_1, t_2, ..., and all these frames make up the video describing the motion of interest.
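To make the per-pixel picture concrete, here is a minimal sketch in plain Python (not the repository's PyTorch code). The function name `render_frame`, the tiny 2x3 grid, and the coefficient values are all made up for illustration, and I am assuming each pixel stores its coefficients lowest degree first.

```python
def render_frame(coeffs, t):
    """Evaluate every pixel's polynomial at timestamp t.

    coeffs[y][x] is a list [k_0, k_1, k_2, ...] (hypothetical layout,
    lowest degree first) so that L_xy(t) = k_0 + k_1 * t + k_2 * t**2 + ...
    """
    return [
        [sum(k * t ** p for p, k in enumerate(pixel)) for pixel in row]
        for row in coeffs
    ]


# Hypothetical 2x3 "video": each pixel has its own degree-2 polynomial.
coeffs = [
    [[0.5, 0.1, 0.0], [0.2, 0.0, 0.3], [0.9, -0.4, 0.1]],
    [[0.0, 1.0, 0.0], [0.7, 0.2, -0.2], [0.3, 0.0, 0.0]],
]

# Substituting t = t_0 = 0.03 into all polynomials gives one grayscale frame.
frame = render_frame(coeffs, t=0.03)
```

Calling `render_frame` again with other timestamps t_1, t_2, ... would give the remaining frames of the video.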

In CurveIntegrator, the forward method takes three positional arguments: derivative, blurry, and keypoints. blurry is the input blurry image. Its dimensions are (batch_size, 1, h, w). The dimensions of derivative and keypoints are both (batch_size, num_kpts, h, w). For the i^{th} example in the batch, the derivative of the intensity, dL_{xy}(t)/dt, must pass through num_kpts points on the 2D plane for every pixel (x, y). The coordinates of these points are (keypoints[i, j, x, y], derivative[i, j, x, y]) for 0 <= j < num_kpts. This corresponds to Equation (6) in the paper.

Next, we use a pre-calculated tensor called integrator_cache to transform the coefficients into the standard bases: dL_{xy}(t)/dt = c_0 + 2 * c_1 * t + 3 * c_2 * t^2 + 4 * c_3 * t^3 + ..., where c_j = coeffs[i, j, x, y] for the i^{th} example in the batch. Taking the indefinite integral, we have L_{xy}(t) = c_0 * t + c_1 * t^2 + c_2 * t^3 + ... + a, where a is the constant "baseline" created as a by-product of the indefinite integral.

Recall that in Equation (3), we state that the definite (not indefinite) integral over [-T/2, T/2], when divided by T, is equal to the input blurry pixel B_{xy}. This is a sufficient constraint for us to analytically solve for the value of a. In our experiments, we set T = 2, which means -T/2 = -1 and T/2 = 1. The definite integral is then given as \int_{-1}^{1} L_{xy}(t) dt = (1/2 * c_0 * t^2 + 1/3 * c_1 * t^3 + 1/4 * c_2 * t^4 + ... + a * t) |_{t=-1}^{t=1} = 2/3 * c_1 + 2/5 * c_3 + 2/7 * c_5 + ... + 2 * a. The even powers of t cancel between the two limits, so only the odd-indexed coefficients c_1, c_3, c_5, ... remain. Dividing by T = 2 gives 1/3 * c_1 + 1/5 * c_3 + 1/7 * c_5 + ... + a. Since this is equal to B_{xy}, we have a = B_{xy} - (1/3 * c_1 + 1/5 * c_3 + 1/7 * c_5 + ...). This is what the code computes: integral accumulates 2 * c_i / (i + 2) over odd i, and baseline = (2 * B_{xy} - integral) / 2.
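The baseline derivation can be checked for a single pixel with exact arithmetic. The following is only a sketch in plain Python, not the repository's implementation: `integrate_derivative`, `solve_baseline`, `mean_intensity`, and the numbers are hypothetical, and it assumes the derivative is already given by standard-basis coefficients d_j (of t^j), so that integrating term by term gives c_j = d_j / (j + 1).

```python
from fractions import Fraction


def integrate_derivative(d):
    """Map derivative coefficients d_j (of t**j) to c_j = d_j / (j + 1)."""
    return [Fraction(dj) / (j + 1) for j, dj in enumerate(d)]


def solve_baseline(c, blurry):
    """Return a such that the mean of L(t) over [-1, 1] equals `blurry`.

    L(t) = a + c[0]*t + c[1]*t**2 + c[2]*t**3 + ...
    Only odd-indexed c[i] (even powers of t) survive the symmetric integral.
    """
    integral = sum(2 * c[i] / (i + 2) for i in range(1, len(c), 2))
    return (2 * blurry - integral) / 2


def mean_intensity(a, c):
    """Exact mean of L(t) over [-1, 1], computed from the antiderivative."""
    def antiderivative(t):
        return a * t + sum(ci * t ** (i + 2) / (i + 2) for i, ci in enumerate(c))
    return (antiderivative(Fraction(1)) - antiderivative(Fraction(-1))) / 2


# Hypothetical pixel: dL/dt = 3 + 12 t - 6 t^2 + 20 t^3, blurry value B = 1/2.
d = [3, 12, -6, 20]
c = integrate_derivative(d)      # c = [3, 6, -2, 5]
blurry = Fraction(1, 2)
a = solve_baseline(c, blurry)    # a = 1/2 - (6/3 + 5/5) = -5/2
```

Here `mean_intensity(a, c)` returns exactly `blurry`, which is the constraint that the baseline is solved from.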

As for FrameConstructor, this class allows us to render frames from the polynomial coefficients (coeffs) at specified timestamps.
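As a rough single-pixel illustration of rendering at specified timestamps, the sketch below evaluates one polynomial at many timestamps in plain Python and averages the results. The name `render_pixel` and the coefficient values are made up; the values are chosen so that the exact mean of the polynomial over [-1, 1] is 0.5, which the average of the rendered intensities should approximate.

```python
def render_pixel(coeffs, timestamps):
    """Evaluate one pixel's polynomial at each timestamp.

    coeffs = [a, c_0, c_1, ...], lowest degree first, so that
    L(t) = sum(coeffs[k] * t**k), the same power basis as `timestamps ** i`.
    """
    return [sum(ck * t ** k for k, ck in enumerate(coeffs)) for t in timestamps]


# Hypothetical pixel: L(t) = -2.5 + 3 t + 6 t^2 - 2 t^3 + 5 t^4.
coeffs = [-2.5, 3.0, 6.0, -2.0, 5.0]

# Midpoints of n equal sub-intervals of the exposure window [-1, 1].
n = 2000
timestamps = [-1 + (2 * k + 1) / n for k in range(n)]

# One rendered intensity per timestamp; their average approximates the blurry pixel.
frames = render_pixel(coeffs, timestamps)
approx_blurry = sum(frames) / n
```

The average is a midpoint-rule estimate, so it matches 0.5 only up to a small discretization error that shrinks as `n` grows.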

I hope this helps! I encourage you to follow the derivation step by step with pen and paper and cross-check the results using a debugger in Python. Let me know if you have further questions.

Best,
Chen

chenkang455 · 2022-10-22T09:42:51Z

Thank you for your detailed answer, which is very helpful to me. I've understood how the code works. Thanks a lot!

weimengting · 2022-12-29T05:26:45Z

The explanation is very elaborate. Thank you very much!

chensong1995 added the "question" label on Oct 21, 2022
