"failed to find valid initial parameters" without the use of truncated
#2476
Comments
Just to add that after updating Turing I also noticed this error in some models that were previously fine, and it kept erroring even if I manually set (good) initial parameters :/
Thanks for reporting @bspanoghe. This is actually not an AD issue, but rather a case where the way we sample initial guesses for HMC/NUTS will never pick a valid value. @yebai, is there a reason we don't sample from the prior for our initial guesses?

We should improve our automatic initial parameter finding, but note that you can also get around this by giving a good initial parameter yourself. The only thing that changed in the linked PR, which was first included in v0.35.2, is that this now errors after 1000 attempts rather than getting into an infinite loop. If you have a case where some model worked fine but stopped working after v0.35.2, or doesn't work even if you give it good initial parameters by hand, please do send us an MWE (@DominiqueMakowski).
The [-2, 2] thing comes from Stan: https://mc-stan.org/rstan/reference/stan.html (see the 'init' section). There is some discussion about the rationale for this vs. sampling from the prior here: https://discourse.mc-stan.org/t/initialization-using-prior/12512
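To make the failure mode concrete, here is an illustrative pure-Julia sketch (hypothetical, not Turing's actual implementation) of this kind of retry loop: draw unconstrained values uniformly from [-2, 2], retry while the log-density is non-finite, and give up after a fixed number of tries. A log-density that is only finite far outside [-2, 2] can never succeed, no matter how many attempts are made.

```julia
# Illustrative sketch (hypothetical, not Turing's actual code): retry
# uniform draws on [-2, 2]^dim until the log-density is finite.
function find_initial_params(logdensity, dim; ntries=1000)
    for _ in 1:ntries
        θ = 4 .* rand(dim) .- 2          # uniform draw on [-2, 2]^dim
        isfinite(logdensity(θ)) && return θ
    end
    error("failed to find valid initial parameters in $ntries tries")
end

# Succeeds: the log-density is finite everywhere.
find_initial_params(θ -> -sum(abs2, θ) / 2, 2)

# Never succeeds: the support lies entirely outside [-2, 2].
# find_initial_params(θ -> θ[1] > 10 ? 0.0 : -Inf, 1)  # throws after 1000 tries
```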
In light of this, it seems that maybe we should have a configurable option to initialise parameters using the prior, a uniform distribution, or manually supplied values.
In the meantime, a quick fix could be to generate initial parameters from the prior:

```julia
julia> using DynamicPPL

julia> # instantiating a VarInfo samples from the prior; the [:] turns it into a vector of parameter values
       # it would definitely be useful to have a convenience method for this
       initial_params = DynamicPPL.VarInfo(condmodel)[:]
1-element Vector{Float64}:
 112.45927127783834

julia> sample(condmodel, NUTS(), 1000; initial_params=initial_params)
┌ Info: Found initial step size
└ ϵ = 3.2
Sampling 100%|███████████████████████████████████████████████████████████████████| Time: 0:00:00
Chains MCMC chain (1000×13×1 Array{Float64, 3}):

Iterations        = 501:1:1500
Number of chains  = 1
Samples per chain = 1000
Wall duration     = 0.07 seconds
Compute duration  = 0.07 seconds
parameters        = X
internals         = lp, n_steps, is_accept, acceptance_rate, log_density, hamiltonian_energy, hamiltonian_energy_error, max_hamiltonian_energy_error, tree_depth, numerical_error, step_size, nom_step_size

Summary Statistics
  parameters       mean       std      mcse   ess_bulk   ess_tail     rhat   ess_per_sec
      Symbol    Float64   Float64   Float64    Float64    Float64  Float64       Float64

           X   107.7615   69.9560    5.3643   170.6348   270.6350   1.0059     2337.4624

Quantiles
  parameters      2.5%     25.0%     50.0%      75.0%      97.5%
      Symbol   Float64   Float64   Float64    Float64    Float64

           X   50.8721   62.5282   83.6638   127.2002   306.6304
```
I wonder if any option that requires a conscious action from the user is much better than just asking them to provide initial parameters themselves.
My two cents as a user: I think adding the option might be overkill and not worth it, especially if there is a convenient and visibly documented way of doing this outside the sampler call. I would personally love to see a convenience function to extract the mean of the priors.
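A minimal sketch of what such a convenience could look like, assuming Distributions.jl; `prior_means` is an illustrative name and not an existing Turing/DynamicPPL API:

```julia
using Distributions

# Hypothetical helper: given the prior distributions of a model's
# parameters, build an initial-parameter vector from their means.
prior_means(priors) = [mean(d) for d in priors]

prior_means([Normal(100, 10), Gamma(2, 3)])  # [100.0, 6.0]
```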
It's quite cheap to add an option (*mutters about the lack of sum types in Julia*):

```julia
abstract type InitialisationStrategy end

struct Prior <: InitialisationStrategy end

struct Uniform <: InitialisationStrategy
    a::Real
    b::Real
end

struct Manual <: InitialisationStrategy
    params::AbstractVector{<:Real}
end
```

Then document this, and add a link to the docs in the error message; that's how you can 'force' (or teach) people to think about initialisation. And I think it'd be a huge improvement over making users instantiate a VarInfo: having a type like this carries semantic information that makes code easier to read and understand, so it's not just a matter of convenience / saving characters, it's also about making code written with Turing clearer.
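The strategy types could then be consumed via dispatch. A self-contained sketch (restating the types; `init_values` is a hypothetical name, not Turing's API, and `randn` stands in for real prior sampling):

```julia
abstract type InitialisationStrategy end
struct Prior <: InitialisationStrategy end
struct Uniform <: InitialisationStrategy
    a::Real
    b::Real
end
struct Manual <: InitialisationStrategy
    params::AbstractVector{<:Real}
end

# Hypothetical dispatch: map a strategy and a parameter dimension to an
# initial-parameter vector.
init_values(::Prior, dim::Int) = randn(dim)                    # stand-in for prior sampling
init_values(s::Uniform, dim::Int) = s.a .+ (s.b - s.a) .* rand(dim)
init_values(s::Manual, ::Int) = s.params
```

One appeal of this design is that the default strategy becomes explicit and documentable, rather than being buried in the sampler internals.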
I think we could generalise this approach. In the longer term, we could add this upstream.
Minimal working example
Description
Throws:

```
ERROR: failed to find valid initial parameters in 1000 tries. This may indicate an error with the model or AD backend; please open an issue at https://github.com/TuringLang/Turing.jl/issues
```

I assume this is related to #2389; sampling works fine with gradientless samplers. It might not be relevant since it seems like a known issue caused by the AD backends, but I've only seen the error mentioned along with a `truncated` distribution, so since that's not the case here I thought it might be interesting to someone. As an additional point of interest, sampling does work for low enough values of `Y`, e.g. `(Y = 5.0,)` works just fine.

Julia version info
versioninfo()
Manifest
Presumably the most relevant parts:
And full:
]st --manifest