Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pathfinder #848

Merged
merged 25 commits into from
Nov 8, 2023
Merged

Pathfinder #848

merged 25 commits into from
Nov 8, 2023

Conversation

SteveBronder
Copy link
Collaborator

@SteveBronder SteveBronder commented Sep 11, 2023

Submission Checklist

  • Run unit tests
  • Declare copyright holder and agree to license (see below)

Summary

This implements pathfinder in cmdstanr. Opening as a draft as I'm getting an error on the tests I'm not sure how to fix. Where is the validation for each of the arguments done at? I put them all in the validation function and while the model shows up with errors such as the below they do not actually throw an error.

Chain 1 test_bad_val is not a valid value for "max_lbfgs_iters"
Chain 1   Valid values: 0 < max_lbfgs_iters

I also still need to fix the docs. Once the above is done it should be ready for review

Copyright and Licensing

Simons Foundation

By submitting this pull request, the copyright holder is agreeing to
license the submitted work under the following licenses:

@jgabry
Copy link
Member

jgabry commented Sep 11, 2023

Thanks for working on this!

Regarding the argument validation, it looks like in PathfinderArgs you're using max_lbfgs_iters and but in the validation function you have max_lbfgs_iter (singular). Could that be the issue? (Although I thought $ allowed partial matching which should have worked here, but worth checking out)

Also (unrelated), some of the errors in the GHA checks are due to #843. So you'll probably just need to run tests locally until that PR is merged. I think @andrjohns is working on it.

@WardBrian
Copy link
Member

We should make sure that the names and behavior for the draws (num_psis_draws in cmdstan) argument match up between this and stan-dev/cmdstanpy#686

In that PR:
arguments are named draws and single_path_draws. If you don't specifically request single_path_draws, it is set equal to draws.
If paths > 1, draws is cmdstan's num_psis_draws and single_path_draws is cmdstan's num_draws
if paths is exactly 1, both of these are equivalent to cmdstan's num_draws

@andrjohns
Copy link
Collaborator

@SteveBronder I've merged in master with the fixes for the failing tests, so now anything that's left should be pathfinder-specific (i.e., git blame Steve)

@andrjohns
Copy link
Collaborator

@SteveBronder I'm happy to bring this PR over the finish line and sort out any cmdstanr-trickiness, what's left to be done/fixed?

@jgabry
Copy link
Member

jgabry commented Sep 25, 2023

Just a heads up that we just merged #800 (adding laplace method), which might result in a few conflicts

@SteveBronder SteveBronder marked this pull request as ready for review September 25, 2023 20:05
@SteveBronder
Copy link
Collaborator Author

@andrjohns I think this is ready for a first review! I pulled from the sampling method for tests but if you can think of other things I should add I'd be happy to add them

@andrjohns
Copy link
Collaborator

@andrjohns I think this is ready for a first review! I pulled from the sampling method for tests but if you can think of other things I should add I'd be happy to add them

Amazing, thanks for this! The current check failures are just from the docs needing to be re-generated and a stray test file being included. I'll fixup those and then review later today. Thanks again!

@codecov-commenter
Copy link

codecov-commenter commented Sep 26, 2023

Codecov Report

Merging #848 (436eae5) into master (021049b) will increase coverage by 0.15%.
Report is 11 commits behind head on master.
The diff coverage is 92.35%.

❗ Current head 436eae5 differs from pull request most recent head 094d797. Consider uploading reports for the commit 094d797 to get more accurate results

@@            Coverage Diff             @@
##           master     #848      +/-   ##
==========================================
+ Coverage   88.30%   88.46%   +0.15%     
==========================================
  Files          12       12              
  Lines        4362     4517     +155     
==========================================
+ Hits         3852     3996     +144     
- Misses        510      521      +11     
Files Coverage Δ
R/example.R 97.56% <ø> (ø)
R/model.R 92.69% <100.00%> (+0.37%) ⬆️
R/run.R 94.23% <100.00%> (+0.33%) ⬆️
R/utils.R 72.79% <100.00%> (+0.26%) ⬆️
R/args.R 97.95% <98.59%> (+0.07%) ⬆️
R/install.R 63.14% <88.88%> (+0.12%) ⬆️
R/fit.R 95.49% <62.50%> (-0.58%) ⬇️
R/csv.R 95.40% <65.00%> (-1.22%) ⬇️

📣 Codecov offers a browser extension for seamless coverage viewing on GitHub. Try it in Chrome or Firefox today!

Copy link
Collaborator

@andrjohns andrjohns left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a couple of cleanups and some very minor q's, but otherwise this all looks good to me! Thanks for putting it together!

R/csv.R Outdated
@@ -254,7 +254,7 @@ read_cmdstan_csv <- function(files,
"\""
)
} else {
fread_cmd <- paste0("grep -v '^#' --color=never '", output_file, "'")
fread_cmd <- paste0("grep -v '^#' --color=never '", path.expand(output_file), "'")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What was this change needed for?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I'll revert this change

R/model.R Show resolved Hide resolved
Comment on lines 94 to 95


Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trim pls

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you drop this from the PR as well?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. Odd, idk where this came from?

Copy link
Member

@jgabry jgabry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @SteveBronder! And thanks @andrjohns for reviewing. In addition to Andrew's comments I pointed out in one comment that the arguments need to match the cmdstanpy implementation. We're close but there are a few discrepancies. See comment.

R/model.R Outdated
tol_rel_grad = NULL,
tol_param = NULL,
history_size = NULL,
num_draws = NULL,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also need to make sure the arguments match with the cmdstanpy implementation. I think currently there are still discrepancies (e.g. num_draws was changed in cmdstanpy). See @WardBrian's comment here: #848 (comment)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll look those over again

@SteveBronder
Copy link
Collaborator Author

@WardBrian for

arguments are named draws and single_path_draws. If you don't specifically request single_path_draws, it is set equal to draws.

Is single_path_draws a new argument for the python/R interface or does that overshadow another argument?

@SteveBronder
Copy link
Collaborator Author

@jgabry do you want to look at this again or is that a stale review?

@jgabry
Copy link
Member

jgabry commented Oct 24, 2023

Sorry for the delay! I’ll try to take a look tomorrow

Copy link
Member

@jgabry jgabry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @SteveBronder for implementing this and @andrjohns for reviewing. This looks great, just a few minor things. First, I added a few comments/suggestions about minor doc things that I failed to notice in my last review (sorry).

The bigger thing I noticed (although maybe I overlooked some discussion about this) is that the number of draws returned seems to be draws - 1 instead of draws:

file <- file.path(cmdstan_path(), "examples/bernoulli/bernoulli.stan")
mod <- cmdstan_model(file)
stan_data <- list(N = 10, y = c(0,1,0,0,0,0,0,0,0,1))
fit <- mod$pathfinder(data = stan_data, draws = 100)
nrow(fit$draws()) # 99 not 100

R/model.R Outdated Show resolved Hide resolved
R/model.R Show resolved Hide resolved
R/model.R Outdated Show resolved Hide resolved
@jgabry
Copy link
Member

jgabry commented Oct 25, 2023

In addition to my previous comments, we should also add something about pathfinder to the vignette, but that doesn't need to be part of this PR. I'll open a separate issue for that.

@jgabry jgabry linked an issue Oct 25, 2023 that may be closed by this pull request
@LuZhangstat
Copy link

LuZhangstat commented Oct 25, 2023 via email

@jgabry
Copy link
Member

jgabry commented Oct 25, 2023

Thank you for keeping me updated. Please don't forget to add me to the other issue and let me know if I can provide any help. Best, lu

Hi Lu! Have you tried out the implementation in this PR? No obligation, but let us know if you have any feedback if you test it. Also which other issue were you referencing that you should be added to? Happy to include you in any discussion you're interested in. Or was that a comment directed at @SteveBronder or someone else other than me who already knows which issue?

@LuZhangstat
Copy link

LuZhangstat commented Oct 29, 2023 via email

@jgabry
Copy link
Member

jgabry commented Nov 1, 2023

@LuZhangstat because this hasn't been merged yet you would need to install from this branch:

# install from feature/pathfinder branch
remotes::install_github("stan-dev/cmdstanr", ref = "feature/pathfinder")

# after that finishes
# simplest example model to test but you can try any model
library("cmdstanr")
file <- file.path(cmdstan_path(), "examples/bernoulli/bernoulli.stan")
mod <- cmdstan_model(file)
stan_data <- list(N = 10, y = c(0,1,0,0,0,0,0,0,0,1))
fit <- mod$pathfinder(data = stan_data) # you can test arguments like num_paths and others

Does that work?

@LuZhangstat
Copy link

LuZhangstat commented Nov 1, 2023 via email

@jgabry
Copy link
Member

jgabry commented Nov 1, 2023

Sounds good thanks Lu!

@jgabry
Copy link
Member

jgabry commented Nov 1, 2023

@SteveBronder I have a sec now so I can go ahead and take care of the doc-related comments in my recent review. Can you just see what's up with pathfinder returning draws-1 draws?

@jgabry
Copy link
Member

jgabry commented Nov 1, 2023

@SteveBronder I have a sec now so I can go ahead and take care of the doc-related comments in my recent review. Can you just see what's up with pathfinder returning draws-1 draws?

Ok I just fixed the remaining doc issues so I think the only things left to do before merging this are:

  • Figure out why there are only draws-1 draws returned (@SteveBronder can you take a look at this?)
  • Add Steve to the DESCRIPTION file
  • If @LuZhangstat notices any problems address those

Otherwise I think we're good to go!

@SteveBronder
Copy link
Collaborator Author

Just fixed!

@SteveBronder
Copy link
Collaborator Author

It was a silly error in csv.R

@SteveBronder SteveBronder changed the title [WIP] Pathfinder Pathfinder Nov 3, 2023
@jgabry
Copy link
Member

jgabry commented Nov 3, 2023

Thanks Steve!

@SteveBronder
Copy link
Collaborator Author

@jgabry good to go?

@jgabry
Copy link
Member

jgabry commented Nov 8, 2023

Good to go! Merging now.

@jgabry jgabry merged commit 357a07e into master Nov 8, 2023
12 checks passed
@jgabry jgabry deleted the feature/pathfinder branch November 8, 2023 17:13
@jgabry
Copy link
Member

jgabry commented Nov 8, 2023

Thanks again for working on this!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Stan 2.33] Pathfinder algorithm
7 participants