Adopt AbstractMCMC.jl interface #259
Conversation
```diff
@@ -211,7 +213,7 @@ nsteps(τ::Trajectory{TS, I, TC}) where {TS, I, TC<:FixedIntegrationTime} =
 ## Kernel interface
 ##
 
-struct HMCKernel{R, T<:Trajectory} <: AbstractMCMCKernel 
+struct HMCKernel{R, T<:Trajectory} <: AbstractMCMCKernel
```
Just to double check: this is just the removal of a space?
Yep. But also, `AbstractMCMCKernel` is now `AbstractMCMC.AbstractSampler` (I'll remove this alias and make it explicit).
Is there any side effect it might cause?
Don't think so. Anything particular in mind, or just a general question?

The only thing is that if we remove `AbstractMCMCKernel` completely, not even aliasing, then e.g. CoupledHMC.jl won't work. Maybe the best approach is to just make `AbstractMCMCKernel <: AbstractMCMC.AbstractSampler`?
I think it's fine as long as CoupledHMC is fine (with stuff like mixture kernels still working).
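For concreteness, the subtyping being discussed might look like the following sketch. This is an assumption about the final design, not code from the PR; `HMCKernel`'s fields are illustrative and `Trajectory` is assumed to be in scope from the diff above.

```julia
using AbstractMCMC

# Keep the abstract kernel type in the hierarchy rather than removing it,
# so downstream packages such as CoupledHMC.jl that dispatch on
# `AbstractMCMCKernel` (e.g. for mixture kernels) keep working.
abstract type AbstractMCMCKernel <: AbstractMCMC.AbstractSampler end

# `HMCKernel` then automatically qualifies as an `AbstractMCMC.AbstractSampler`
# and can be passed anywhere AbstractMCMC.jl expects a sampler.
struct HMCKernel{R, T<:Trajectory} <: AbstractMCMCKernel
    refreshment::R  # illustrative field
    τ::T            # illustrative field
end
```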
src/sampler.jl (Outdated)

```julia
    return AbstractMCMC.step(rng, model, spl, state; kwargs...)
end

function AbstractMCMC.step(
```
I think we should keep the old signature, which would construct `HamiltonianModel` and `HMCState` and then call this function.
Hmm, maybe. We have already introduced breaking changes on master with the introduction of `HMCKernel`, etc., so it seems like it's as good a time as any to also transition completely over to the AbstractMCMC.jl interface?
And it's not like there's anything in the current impl that isn't supported by the new implementation:)
There weren't actually breaking changes to the outermost interfaces (e.g. those that need to call `sample`).
Hmm, what do you mean? E.g. the example in the README doesn't work now since `proposal` isn't constructed in the same way (there might also be other issues, but that's the immediate one that came to mind).
Sure, I'm cool with adding the old `sample` back and deprecating it:) That seems like a clean approach 👍

Also, it's a bit cumbersome to write `sample(..., HamiltonianModel(hamiltonian), ..)` instead of `sample(..., hamiltonian, ..)`, no?
I'm also cool with either of the following:
- Making `AbstractHamiltonian <: AbstractMCMC.AbstractModel`, which IMO seems sensible.
- An additional overload that will automagically wrap `hamiltonian` in `HamiltonianModel`.
It seems that option 1 is more sensible, as we are already making the kernel a subtype of the sampler from AbstractMCMC.
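A sketch of the two options, with type names taken from the thread; the concrete signatures are my assumptions, not code from the PR:

```julia
using AbstractMCMC

# Option 1: make the Hamiltonian hierarchy part of the model hierarchy, so
# `sample(hamiltonian, kernel, N)` works without an explicit wrapper.
abstract type AbstractHamiltonian <: AbstractMCMC.AbstractModel end

# Option 2: keep a wrapper model around the Hamiltonian...
struct HamiltonianModel{H} <: AbstractMCMC.AbstractModel
    hamiltonian::H
end

# ...and add an overload that automagically wraps a raw `Hamiltonian`
# (assuming `Hamiltonian` is AdvancedHMC's concrete Hamiltonian type):
AbstractMCMC.sample(h::Hamiltonian, spl::AbstractMCMC.AbstractSampler, N::Integer; kwargs...) =
    AbstractMCMC.sample(HamiltonianModel(h), spl, N; kwargs...)
```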
Yeaah I've realized I need to make quite a few changes. Because everything is mutated somehow, we essentially need to put everything in the sampler state.

Sooo the question is what we actually use for the model, etc. As of right now they would just be "fake" in the sense that they only contain the initial `Hamiltonian` and the initial `AbstractHMCKernel` 😕 We could mutate them, but that seems very much against the AbstractMCMC.jl interface...
Hmm. Do you mean it actually only makes sense to treat the log density and gradient as the model, while treating the metric differently?
"Only" is maybe strong, but yeah, I think it makes more sense to keep the `hamiltonian` and the `metric` separate. Potentially do away completely with the `hamiltonian` and only keep the `metric`, or, alternatively, make `hamiltonian` nothing but a wrapper around `metric` (in case there are future cases we might want to cover) and just make a `DifferentiableDensityModel <: AbstractModel` (we could even move this + `DensityModel` from AdvancedMH.jl into AbstractMCMC.jl at some point, so these can be re-used, e.g. for MALA, which is present in AdvancedHMC.jl).

EDIT: I think it's best to keep `hamiltonian`, but just reconstruct it from a `DifferentiableDensityModel` at every call to `step`, and we just pass the "state" of `hamiltonian` around, e.g. the `metric`.
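A rough sketch of that EDIT, where the model holds the differentiable density and the state carries the metric. Everything here except `Hamiltonian` (AdvancedHMC's type) is a hypothetical name, and the transition itself is elided:

```julia
using Random
using AbstractMCMC

# Hypothetical model carrying only the (differentiable) target density.
struct DifferentiableDensityModel{L, G} <: AbstractMCMC.AbstractModel
    ℓπ::L      # log density function
    ∂ℓπ∂θ::G   # gradient of the log density
end

function AbstractMCMC.step(
    rng::Random.AbstractRNG,
    model::DifferentiableDensityModel,
    spl::HMCKernel,
    state;  # assumed to carry `state.metric`, the "state" of the Hamiltonian
    kwargs...,
)
    # Reconstruct the Hamiltonian on every call instead of mutating a stored one.
    h = Hamiltonian(state.metric, model.ℓπ, model.∂ℓπ∂θ)
    # Placeholder return; the real implementation would use `h` to make the
    # transition and return `(transition, new_state)`.
    return h, state
end
```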
src/sampler.jl (Outdated)

```julia
    rng::AbstractRNG,
    model::HamiltonianModel,
    spl::HMCKernel;
    init_params, # TODO: implement this. Just do `rand`? Need dimensionality though.
```
I can see multiple ways to do this, but it's unclear which is best. Can you file a PR/discussion to chat about this?
Will do! But for now I'll just extract the dimensionality from the metric and use `rand`. But yeah, I'll open an issue after this PR so we can figure out potentially better defaults.
I think we should do this so as to keep the functionality unchanged.
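The interim default described above might look like this sketch. It assumes the metric exposes its dimensionality via `size`, as AdvancedHMC's Euclidean metrics do; the function name is hypothetical:

```julia
using Random

# Fallback initialisation: draw `init_params` using the metric's dimensionality.
function default_init_params(rng::Random.AbstractRNG, metric)
    d = size(metric, 1)  # dimensionality of the target
    return rand(rng, d)  # simple default; better choices to be tracked in a follow-up issue
end
```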
Does it make sense to add this functionality in AbstractMCMC first? Seems to be a useful thing to have in general.
Yes, that's the intended use case. 👍
Yeah, I think the AHMC logging stuff should be moved to a callback.
Not needed; with TuringLang/AbstractMCMC.jl#56 merged we can implement this as a callback very easily, and just make this callback the default.
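A sketch of what moving the logging to a callback could look like. The `callback(rng, model, sampler, sample, state, iteration; kwargs...)` shape is my assumption based on the AbstractMCMC.jl callback support referenced above:

```julia
# A minimal logging callback to stand in for the built-in AHMC progress logging.
function logging_callback(rng, model, sampler, sample, state, iteration; kwargs...)
    iteration % 100 == 0 && @info "Sampling" iteration
    return nothing
end

# Hypothetical usage: disable the default progress meter and log via the callback.
# chain = sample(model, sampler, 1_000; progress = false, callback = logging_callback)
```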
Project.toml (Outdated)

```diff
@@ -3,6 +3,7 @@ uuid = "0bf59076-c3b1-5ca4-86bd-e02cd72cde3d"
 version = "0.2.28"
```
Please also increment the version here.
Should we make a minor version bump? Seems like quite a significant change given the introduction of `HMCKernel` too.
Yeah a minor version bump sounds suitable.
Btw, waiting for TuringLang/Turing.jl#1579 since tests depend on Turing.
Is there anything missing from this PR? If there is nothing major, let's try to merge this PR since it will benefit several downstream tasks.
I've been working off the branch for this PR to support tempering on AHMC via the new MCMCTempering package and noticed that upon finishing sampling, even if I have
A couple of things:
I'm also using this branch heavily in some work, and have an example of such a method implemented locally. But my implementation is using information from

How are you using it? Are you using it with Turing or nah?
I see, yes, that is a fair assessment. I am using just AHMC; I haven't tried with Turing, but I agree it perhaps makes less sense to have something specific here for working with AHMC. On another note, I agree regarding Tor's comments on the state of the

What is the minimal expectation for the return from
Glad you brought this up! I'm not certain tbh 😕 @xukai92 @yebai What do you think? Should it return
I think we should not change the return, at least for this PR.
Both are OK, but I don't think we have to make it perfect within this PR as long as it's used behind the scenes (i.e. users don't need to create it explicitly).
What's the potential bug?
The
It might just be that the settings I have for the sampler are bad, because it passes most of the time.
They were fine in the other PR, so I suppose you mean it's due to randomness? Seems like there are also conflicts.
Btw, one annoying thing is that
Doing it in another PR sounds more reasonable. Can you create an issue for this along with other missing things from
How could
Will do 👍
Because this is a new minor version for AHMC, which won't be compatible with Turing.jl since it has compat bounds on AHMC; it also indicates it's a breaking release, so Turing.jl should be expected to have to make some changes to be compatible with the new version. If you look at the integration tests you see

So it's just a question of whether we make the tests pass or fail in this case; I went with letting it pass since this is also what we do in other Turing.jl packages, e.g. DPPL.
Okay, this is just strange. It fails on macOS but seems to do fine otherwise.
I see. Makes sense to me now.
Could be randomness again, but I feel using the same parameters as the non-AMCMC version is enough. Otherwise it might indicate some issues. If I give the same initial position and the same random seed, do I expect to get exactly the same samples from the interface implementation? If so, maybe we can add a test to check that the samples are exactly the same.
I realized what was causing the issue: I was including the adapted samples in the mean computation:) Now that those are dropped, it works. But now there's a different issue? This one seems completely unrelated to the changes in this PR though. Got any clue, @xukai92?
Looks like the tests are fine now. Is turning off the progress (the last commit) the fix?
No, that was just to make the logs nicer. It seems like one of the adaptation tests can fail if we're unlucky with the random seed, but as I said, this has nothing to do with this PR, so IMO we merge and I'll make an issue reminding us that the adaptation test is fickle.
Things to discuss:
- Pass `progress=false` to the underlying `AbstractMCMC.mcmcsample` and then use the `callback` keyword argument to log the progress. So the question is: should we do this so as to preserve the current logging functionality?
- `StatsBase.sample` and then perform this step after the call to `AbstractMCMC.mcmcsample`. Should we do this?