dwelltime: add standard errors based on hessian approximation #690

Open
wants to merge 2 commits into
base: main

@JoepVanlier JoepVanlier commented Aug 19, 2024

Why this PR?
For some purposes, we need a way to compute an approximate standard error quickly.
Asymptotic estimates are especially useful when the problem is well constrained (lots of data and a well-suited model).

image

2-component fit based on 100 dwells. Profile likelihood compared to asymptotic intervals. Confidence intervals would be given by the locations where the curves meet the threshold (dashed horizontal line).

image

2-component fit based on 1000 dwells. Profile likelihood compared to asymptotic intervals. Confidence intervals would be given by the locations where the curves meet the threshold (dashed horizontal line). Note how for lots of data, these two approaches produce almost identical results.
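
As a rough illustration of what an asymptotic interval amounts to: under the usual large-sample approximation, the profile is replaced by a parabola centred on the estimate, with a width set by the standard error, so the interval follows directly from that standard error. The sketch below assumes a 95% level and a single parameter of interest; the function names and the numbers are hypothetical and are not taken from this PR.

```python
from statistics import NormalDist


def asymptotic_interval(estimate, std_err, confidence=0.95):
    """Wald-type interval: estimate +/- z * standard error."""
    # Two-sided critical value of the standard normal distribution
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return estimate - z * std_err, estimate + z * std_err


def quadratic_profile(value, estimate, std_err):
    """Parabolic approximation of a chi-squared profile around the estimate."""
    return ((value - estimate) / std_err) ** 2


# Hypothetical lifetime estimate (seconds) and its standard error
lower, upper = asymptotic_interval(1.5, 0.2)

# Chi-squared cutoff for one degree of freedom at 95% (square of the normal quantile)
threshold = NormalDist().inv_cdf(0.975) ** 2

print(f"interval: [{lower:.3f}, {upper:.3f}], threshold: {threshold:.3f}")
```

The interval bounds are exactly the points where the parabola reaches the threshold, which is why the two curves in the figures give matching intervals once the true profile is close to quadratic.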

I have added the ability to plot them alongside the profiles, as it can be instructive to see how these types of error estimates compare. In the future, I think it would also be useful to explore this on the FdFitting side.

Small note on the implementation:

It is important to consider the constraint on the amplitudes when doing this uncertainty analysis. Without it, you get huge confidence intervals for alpha (since the problem is underdetermined). There are two ways of doing this: either you add entries to the Hessian corresponding to the constraint, or you transform the derivatives into a subspace that fulfills the constraint, invert there, and convert back. I chose the latter, since otherwise we would have to choose a constant that sets how heavily the constraint is weighted. In the plot below you can see what varying that constant does: the result converges to the solution we have now.

image

Comparing the current approach (dash-dot) with an approach where we take into account the effect of the constraint by adding values to components of the Hessian manually. Note how the approaches agree for large values of C.
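
For readers who want to see the null-space idea concretely, here is a minimal sketch. It is not the code from this PR: the function name, the parameter ordering `[a1, a2, tau1, tau2]` and the Hessian values are made up for illustration, and the Hessian is assumed to be that of the negative log-likelihood at the optimum.

```python
import numpy as np


def constrained_covariance(hessian, constraint):
    """Covariance estimate subject to one linear equality constraint.

    hessian: (n, n) Hessian of the negative log-likelihood (assumed convention).
    constraint: (n,) gradient of the constraint, e.g. 1 for each amplitude.
    """
    hessian = np.asarray(hessian, dtype=float)
    c = np.asarray(constraint, dtype=float).reshape(1, -1)

    # Full SVD of the 1 x n constraint row: the first row of vt is parallel
    # to the constraint, the remaining rows span its null space.
    _, _, vt = np.linalg.svd(c)
    null_basis = vt[1:].T  # shape (n, n - 1)

    # Invert the Hessian inside the subspace, then map the result back.
    reduced = null_basis.T @ hessian @ null_basis
    return null_basis @ np.linalg.inv(reduced) @ null_basis.T


# Hypothetical 2-component example, parameters ordered [a1, a2, tau1, tau2]
hessian = np.array(
    [
        [40.0, 30.0, 4.0, -2.0],
        [30.0, 50.0, -3.0, 5.0],
        [4.0, -3.0, 25.0, 1.0],
        [-2.0, 5.0, 1.0, 10.0],
    ]
)
constraint = np.array([1.0, 1.0, 0.0, 0.0])  # gradient of sum(a_i) = 1

covariance = constrained_covariance(hessian, constraint)
std_errors = np.sqrt(np.diag(covariance))
print(std_errors)
```

Because the covariance has no component along the constraint direction, the two amplitudes in this example end up with identical standard errors and a perfectly negative correlation, as expected when they must sum to one.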

The risk with the constant-based approach is that if you choose the constant too large, it blows up numerically (see figure below), whereas if you choose it too small, the constraint is not taken into account sufficiently.

image

Comparing the current approach (dash-dot) with an approach where we take into account the effect of the constraint by adding values to components of the Hessian manually. Note how excessively large values result in numerical problems.
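
The trade-off with the penalty constant can be reproduced on a toy problem. The sketch below is only an illustration under stated assumptions: the Hessian values and the parameter ordering `[a1, a2, tau1, tau2]` are hypothetical, and the Hessian is taken to be that of the negative log-likelihood, so the penalty enters with a positive sign (with the opposite convention the sign flips). The reference value is the infinite-weight limit obtained from the Sherman-Morrison formula, which is what the null-space projection yields for a positive definite Hessian.

```python
import numpy as np

# Hypothetical Hessian of the negative log-likelihood, parameters [a1, a2, tau1, tau2]
hessian = np.array(
    [
        [40.0, 30.0, 4.0, -2.0],
        [30.0, 50.0, -3.0, 5.0],
        [4.0, -3.0, 25.0, 1.0],
        [-2.0, 5.0, 1.0, 10.0],
    ]
)
constraint = np.array([1.0, 1.0, 0.0, 0.0])  # gradient of sum(a_i) = 1


def penalty_std_errors(weight):
    """Standard errors when the constraint is imposed as a quadratic penalty."""
    penalized = hessian + weight * np.outer(constraint, constraint)
    return np.sqrt(np.diag(np.linalg.inv(penalized)))


def condition_number(weight):
    """Condition number of the penalized Hessian."""
    return np.linalg.cond(hessian + weight * np.outer(constraint, constraint))


# Infinite-weight limit of the penalized inverse (Sherman-Morrison)
h_inv = np.linalg.inv(hessian)
h_inv_c = h_inv @ constraint
limit_cov = h_inv - np.outer(h_inv_c, h_inv_c) / (constraint @ h_inv_c)
limit_std = np.sqrt(np.diag(limit_cov))

for weight in (1e0, 1e2, 1e4, 1e8):
    deviation = np.max(np.abs(penalty_std_errors(weight) - limit_std))
    print(f"C = {weight:.0e}: max deviation = {deviation:.2e}, "
          f"cond = {condition_number(weight):.2e}")

# Only the conditioning is evaluated here; inverting at this weight is unreliable.
print(f"C = 1e+16: cond = {condition_number(1e16):.2e}")
```

Small weights overestimate the amplitude uncertainties, moderate weights approach the limit, and the condition number keeps growing with the weight, which is the source of the numerical problems at excessive values.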

Since @rpauszek is currently working on the dwell time documentation, I have deferred writing documentation for this feature for now.

When calculating the asymptotic uncertainty interval, we should take into account that we actually impose a linear sum constraint (otherwise, the amplitudes will have indeterminate confidence intervals).

One could add the constraint explicitly to the Hessian by adding a large penalty term to the relevant derivatives. Since the sum constraint is of the form (1 - sum(a_i)) ** 2, this amounts to adding a constant term d^2f/(da_i da_j) = -c, with c large, to all amplitude terms.

The downside is that we would need to choose this constant as large as possible without incurring numerical issues. This is why it is preferable to project onto the null space of the constraint, invert there, and then transform the result back instead.