Some questions about the CurveIntegrator code. #5

Open
chenkang455 opened this issue Oct 21, 2022 · 4 comments
Labels
question Further information is requested

Comments

@chenkang455

    integral = 2 * coeffs[:, 1] / 3
    for i in range(3, n_deg, 2):
        integral = integral + 2 * coeffs[:, i] / (i + 2)
    baseline = (2 * blurry[:, 0] - integral) / 2
    baseline = baseline.unsqueeze(dim=1)
    coeffs = torch.cat([baseline, coeffs], dim=1) # [bs, n_deg+1, h, w]
    return coeffs, integrator_cache

The above code is from CurveIntegrator. I do not understand how it works. For example, I cannot find the formula in your paper that corresponds to integral = 2 * coeffs[:, 1] / 3. I would appreciate it if you could answer my question. Thanks a lot!

@chenkang455
Author

    import torch
    import torch.nn as nn

    class FrameConstructor(nn.Module):
        def __init__(self):
            super(FrameConstructor, self).__init__()

        def forward(self, coeffs, timestamps):
            # coeffs: [bs, n_deg+1, h, w]
            # timestamps: [bs, n_ts, h, w] or [bs, n_ts]
            n_deg = coeffs.shape[1] - 1
            n_ts = timestamps.shape[1]
            # torch.unsqueeze adds a dimension, so that 2D timestamps
            # broadcast over the pixel grid
            if len(timestamps.shape) == 2:
                timestamps = timestamps.unsqueeze(-1).unsqueeze(-1)
            # bases: [bs, n_deg+1, n_ts, h, w]
            bases = torch.stack([timestamps ** i for i in range(n_deg + 1)], dim=1)
            # multiply each coefficient by its power of t, then sum over degrees
            recon = coeffs.unsqueeze(2) * bases
            recon = torch.sum(recon, dim=1)
            return recon

Besides, I do not understand the purpose of the FrameConstructor shown above. Thanks a lot!

@chensong1995
Owner

chensong1995 commented Oct 21, 2022

Hello chenkang455,

Thanks for your interest in our work!

We approximate the intensity of each pixel using a parametric polynomial function. Given an hxw pixel grid, the video is represented as hxw=180*240=43200 different polynomials. We use the symbol L_{xy}(t) in the paper to refer to the polynomial function associated with the pixel whose coordinates are (x, y). The function takes one single input: the timestamp. The function returns one single value: the intensity. To render a video frame at a particular timestamp, say t_0=0.03, we substitute t=t_0=0.03 in all 43200 polynomials. This gives 43200 different intensities, and we can assemble them into a grayscale frame with a resolution of 180x240. We can then render a few more frames at other timestamps, t_1, t_2, ..., and all these frames make up the video describing the motion of interest.

In CurveIntegrator, the forward method takes three positional arguments: derivative, blurry, and keypoints. blurry is the input blurry image, with dimensions (batch_size, 1, h, w). derivative and keypoints both have dimensions (batch_size, num_kpts, h, w). For the i^{th} example in the batch, the derivative of the intensity, dL_{xy}(t)/dt, must pass through num_kpts points on the 2D plane for every pixel (x, y). The coordinates of these points are (keypoints[i, j, x, y], derivative[i, j, x, y]) for 0 <= j < num_kpts. This corresponds to Equation (6) in the paper.

Next, we use a pre-calculated tensor called integrator_cache to transform the coefficients into the standard bases: dL_{xy}(t)/dt = c_0 + 2 * c_1 * t + 3 * c_2 * t^2 + 4 * c_3 * t^3 + ..., where c_j = coeffs[i, j, x, y] for the i^{th} example in the batch. Taking the indefinite integral, we have L_{xy}(t) = c_0 * t + c_1 * t^2 + c_2 * t^3 + ... + a, where a is the constant "baseline" created as a by-product of the indefinite integral.

Recall that Equation (3) states that the definite (not indefinite) integral over [-T/2, T/2], divided by T, is equal to the input blurry pixel B_{xy}. This constraint is sufficient to solve for a analytically. In our experiments, we set T = 2, which means -T/2 = -1 and T/2 = 1. The definite integral is then \int_{-1}^{1} L_{xy}(t) dt = (1/2 * c_0 * t^2 + 1/3 * c_1 * t^3 + 1/4 * c_2 * t^4 + ... + a * t) |_{t=-1}^{t=1} = 2/3 * c_1 + 2/5 * c_3 + 2/7 * c_5 + ... + 2 * a. The even powers of t in the antiderivative take the same value at t = -1 and t = 1, so the terms with c_0, c_2, c_4, ... cancel. One-half of the result is 1/3 * c_1 + 1/5 * c_3 + 1/7 * c_5 + ... + a. Since this is equal to B_{xy}, we have a = B_{xy} - (1/3 * c_1 + 1/5 * c_3 + 1/7 * c_5 + ...).

This is what the code computes: integral accumulates 2/3 * c_1 + 2/5 * c_3 + ..., and baseline = (2 * blurry - integral) / 2 is the value a.
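
The baseline formula can be checked numerically for a single pixel. The following is a minimal sketch in plain Python, not the repository's batched PyTorch code. The helper names solve_baseline, intensity, and mean_intensity, as well as the sample values, are made up for illustration. It assumes T = 2 and L(t) = a + c_0 * t + c_1 * t^2 + ..., as in the derivation above.

```python
def solve_baseline(blurry, coeffs):
    """Return a such that the mean of L(t) over [-1, 1] equals `blurry`.

    L(t) = a + coeffs[0]*t + coeffs[1]*t**2 + ... (single pixel, T = 2).
    """
    # coeffs[i] multiplies t**(i+1). Only even powers (odd i) survive the
    # integration over [-1, 1], each contributing 2 * coeffs[i] / (i + 2).
    integral = sum(2 * coeffs[i] / (i + 2) for i in range(1, len(coeffs), 2))
    return (2 * blurry - integral) / 2


def intensity(a, coeffs, t):
    """Evaluate L(t) for one pixel."""
    return a + sum(c * t ** (i + 1) for i, c in enumerate(coeffs))


def mean_intensity(a, coeffs, n=20000):
    """Midpoint-rule estimate of (1/2) * integral of L(t) over [-1, 1]."""
    dt = 2.0 / n
    total = sum(intensity(a, coeffs, -1.0 + (k + 0.5) * dt) for k in range(n))
    return total * dt / 2.0


coeffs = [0.3, -0.6, 0.1, 0.5, -0.2]  # hypothetical c_0 .. c_4
blurry = 0.4                          # hypothetical blurry pixel value B_xy
a = solve_baseline(blurry, coeffs)
print(a, mean_intensity(a, coeffs))
```

If the derivation holds, the second printed number should agree with blurry up to the numerical integration error, regardless of the values chosen for the odd-power coefficients c_0, c_2, c_4.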

As for FrameConstructor, this class allows us to render frames from the polynomial coefficients (coeffs) at specified timestamps.
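
As a rough illustration of what this rendering amounts to, here is a plain-Python sketch for a tiny 1x2 pixel grid, without batching or PyTorch. The function name render_frame and the coefficient values are hypothetical. Coefficient index k multiplies t**k, mirroring the timestamps ** i bases in FrameConstructor.forward, so index 0 holds the baseline.

```python
def render_frame(coeffs, t):
    """Evaluate every pixel's polynomial at timestamp t.

    coeffs[k][y][x] is the coefficient of t**k for the pixel in row y and
    column x, i.e. the [n_deg+1, h, w] layout with the batch dimension dropped.
    """
    h, w = len(coeffs[0]), len(coeffs[0][0])
    return [
        [sum(c[y][x] * t ** k for k, c in enumerate(coeffs)) for x in range(w)]
        for y in range(h)
    ]


# Hypothetical coefficients for a 1x2 image and degree-2 polynomials.
coeffs = [
    [[0.5, 0.25]],   # baseline a (coefficient of t**0)
    [[1.0, 0.0]],    # coefficient of t**1
    [[0.0, -1.0]],   # coefficient of t**2
]

# One frame per timestamp; together the frames form a short video.
frames = [render_frame(coeffs, t) for t in (-1.0, 0.0, 0.5)]
for frame in frames:
    print(frame)
```

The PyTorch version does the same computation for all pixels, timestamps, and batch elements at once by stacking the powers of t and summing over the degree dimension.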

I hope this helps! I encourage you to follow the derivation step by step with pen and paper and cross-check the results using a debugger in Python. Let me know if you have further questions.

Best,
Chen

@chensong1995 chensong1995 added the question Further information is requested label Oct 21, 2022
@chenkang455
Author

Thank you for your detailed answer, which is very helpful to me. I've understood how the code works. Thanks a lot!

@weimengting

The explanation is very elaborate. Thank you very much!
