Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use Eigen Maps for data and transformed data #865

Merged
merged 14 commits into from
Apr 15, 2021

Conversation

SteveBronder
Copy link
Contributor

@SteveBronder SteveBronder commented Mar 29, 2021

Summary

Eigen matrices and vectors used by the model are now Eigen::Map types. std::vector<Eigen> types will still continue to be the same as they are now. This PR differs from the original PR in that now instead of using a local stack allocator we just use the memory made by the original data.

Description

So this PR requires a couple little explainers.

First, the models class members now includes Eigen::Map<EigenType>s of each of the user specified matrices. For example in the diamond.stan model the members are now

   int N;
  Eigen::Matrix<double, -1, 1> Y__;
  int K;
  Eigen::Matrix<double, -1, -1> X__;
  int prior_only;
  int Kc;
  Eigen::Matrix<double, -1, -1> Xc__;
  Eigen::Matrix<double, -1, 1> means_X__; 
  Eigen::Map<Eigen::Matrix<double, -1, 1>> Y{nullptr, 0};
  Eigen::Map<Eigen::Matrix<double, -1, -1>> X{nullptr, 0, 0};
  Eigen::Map<Eigen::Matrix<double, -1, -1>> Xc{nullptr, 0, 0};
  Eigen::Map<Eigen::Matrix<double, -1, 1>> means_X{nullptr, 0};

So all the data now has __ after it and the user specified data names are now the names of the Eigen::Map<> member types. That {nullptr, 0} probably seem very odd, but Eigen's rules around maps require them to be initialized either by a default or in the initializer list of the class constructor. For our purposes, we only need these members to be initialized and have the correct memory size. An Eigen map is pretty simple, it has a pointer and some integers in it like the following pseudoclass.

template <typename T>
struct Map {
 T* m_data;
 Eigen::Index rows; 
 Eigen::Index cols;  
};

To make these Eigen maps actually reference the memory we want we'll use placement new in the model class's constructor body. By calling new (*EigenMap) Map(RealData...) we can make a new Map, overwriting our initially default constructed maps memory, that has access to the actual memory we want! That looks like the below

// In the model class's constructor body
// Allocate the memory we want
Y__ = Eigen::Matrix<double, -1, 1>(N);
// Use placement new to overwrite the default initialized map.
new (&Y) Eigen::Map<Eigen::Matrix<double, -1, 1>>(Y__.data(), N);
// For rest of program treat `Y` as if it's `Y__`
stan::math::fill(Y, std::numeric_limits<double>::quiet_NaN());

Huh! So all that's going on here is that we are allocating memory for Y__ on the first line, then we use placement new, passing it the pointer to our member map Y to construct a new map who knows about that Y__'s memory + size, then for the rest of the program we treat Y as if it was the original matrix that is now Y__.

The results from the comment here hold which is nice and I tested compiling a few models locally which seemed to work.

@rok-cesnovar
Copy link
Member

It seems that its having issues with the huge distribution models, causes the AWS instances to run out of memory when compiling.

@SteveBronder
Copy link
Contributor Author

Oh really? I'm only seeing

QueuedWaiting for run to start

Where are you seeing the distribution model thing?

@rok-cesnovar
Copy link
Member

You can see it in the previous run (I restarted the runs to check if its a instance/one-off issue) - https://jenkins.mc-stan.org/blue/organizations/jenkins/stanc3/detail/PR-865/5/pipeline/188

and in the old view for the currently running job (https://jenkins.mc-stan.org/job/stanc3/job/PR-865/6/execution/node/215/log/). You can find this under pipeline steps. Its been running for 6 hours now.

@rok-cesnovar
Copy link
Member

This "Waiting for run to start" seems to be a bug in the Blue Ocean view as its not actually waiting for an instance.

@SteveBronder
Copy link
Contributor Author

Ahh icic hmm, it seems to compile all of the example models in performance-tests-cmdstan. And it looks like it only took 5 minutes to compile those. Can we limit the number of parallel jobs running in the compile tests node?

@rok-cesnovar
Copy link
Member

You can just set -j for this job in the Jenkinsfile if we want to see if it works.

@SteveBronder
Copy link
Contributor Author

Alright so compile tests is passing in 1h 23m 11s with -j6 compared to 26m 47s with -j15 which I think is right in the ballpark accounting for running these in parallel. I can try to bump it up to j10 or so to see how fast that goes

@SteveBronder
Copy link
Contributor Author

Alrighty so I think I got this all up and working, @t4c1 had a look over the placement new scheme and gave it the thumbs up so if the ocaml looks good I think we are good!

Copy link
Collaborator

@nhuurre nhuurre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a crazy thought: Eigen::Map can map std::vector too, right? Would it be possible/reasonable to use X_flat__ in place of X__ to avoid copying? X_flat__ is represented in the MIR and would be a class member if its Decl wasn't wrapped in a Block.

./runPerformanceTests.py -j${env.PARALLEL} --runs=0 ../test/integration/good
./runPerformanceTests.py -j7 --runs=0 ../test/integration/good
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this need adjustment?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is because with the default parallel setting the instances run out of RAM, with -j7 the tests pass but they run for 3 hours instead of 20-30mins! We need to get to the bottom of this.

Does this effect "typical" models or is it just a factor with these huge test models ala this one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think this is just with the huge models. Though @rok-cesnovar I think you and I have both tried these locally without issue. It could be the compiler we use for running these tests? I'm really not sure. My guess as to what's happening is that we are now generating signatures for both Eigen::Map<> types and Eigen::Matrix<> types

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I havent explored it for this PR yet, but have it on my list for the next few days.

Comment on lines 449 to 452
match (UnsizedType.is_eigen_type ut, Transform_Mir.is_opencl_var name) with
| true, false -> Some (name, ut)
| false, _ -> None
| true, _ -> None
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
match (UnsizedType.is_eigen_type ut, Transform_Mir.is_opencl_var name) with
| true, false -> Some (name, ut)
| false, _ -> None
| true, _ -> None
if UnsizedType.is_eigen_type ut && not (Transform_Mir.is_opencl_var name)
then Some (name, ut)
else None

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Totally cool with this change, though one thing I don't get yet. When is it better to use an if vs. a match? Is it just a style choice? I feel like when I see ifs I usually see us do a ; at the end I get kind of confused about execution stuff so I'm like "Oh idt I like that" but are ifs totally fine?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dunno, I think it's just a style choice. Personally I prefer if/when for booleans and match for anything more complex. Also if can omit else so it's a bit more compact if you only need one branch.

If you want to keep match maybe merge | true, _ -> None | false, _ -> None into a single | _, _ -> None?
Actually, thinking more about this, it's more natural to use List.filter here instead of List.filter_map.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense! Switched this to use filter and if

let nan_type (st, adtype) =
match (adtype, st) with
| UnsizedType.AutoDiffable, _ -> "DUMMY_VAR__"
| DataOnly, SizedType.SInt -> "std::numeric_limits<int>::quiet_NaN()"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not exist; only floating point types have quiet_NaN. https://en.cppreference.com/w/cpp/types/numeric_limits/has_quiet_NaN
pp_initialize ignores this value anyway so this never makes it to the generated code.

Comment on lines 197 to 200
let bad_type = Fmt.strf "%a" pp_ut (x, DataOnly) in
raise_s
[%message
"Error during Map data construction for " vident " of type " bad_type
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
let bad_type = Fmt.strf "%a" pp_ut (x, DataOnly) in
raise_s
[%message
"Error during Map data construction for " vident " of type " bad_type
raise_s
[%message
"Error during Map data construction for " vident " of type "
(x : UnsizedType.t)

@SteveBronder
Copy link
Contributor Author

Just a crazy thought: Eigen::Map can map std::vector too, right? Would it be possible/reasonable to use X_flat__ in place of X__ to avoid copying? X_flat__ is represented in the MIR and would be a class member if its Decl wasn't wrapped in a Block.

A reasonable thought and good idea! We would move the placement new till after we've assigned X_flat__ and then do

new (&X) Eigen::Map<Eigen::Matrix<double, -1, 1>>(X_flat__.data(), N, M);

Though I think I'd like to do that in a separate PR as I haven't really investigated the transform mir stage (my ocaml-fu is not up to parsing it yet).

… decls, fix error message and removed quiet_NaN for int
@SteveBronder
Copy link
Contributor Author

This is ready for another review!

Copy link
Collaborator

@nhuurre nhuurre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ocaml looks good.

@SteveBronder SteveBronder merged commit 5408944 into stan-dev:master Apr 15, 2021
@SteveBronder
Copy link
Contributor Author

Rad thanks!

@hsbadr
Copy link
Member

hsbadr commented Apr 19, 2021

This can give a 2x++ speedup for models that do big vectorized ops with data.

That's great! Can we exclude the vectors/matrices in the data block? or what do we need to make it work? I think cmdstan passes the data to Stan via CSV files not in memory like rstan, which might be the culprit.

@hsbadr
Copy link
Member

hsbadr commented Apr 19, 2021

Since (R)cmdStan is not affected, the worst case is to make it optional, via a stanc argument, defaulting to FALSE. But, let's try to fix it first.

@SteveBronder
Copy link
Contributor Author

Can we exclude the vectors/matrices in the data block? or what do we need to make it work?

Those are the two types this works over so if we remove it for those it's just removing it. If it fails for data I'd guess it would also for for transformed data.

Error in new_CppObject_xp(fields$.module, fields$.pointer, ...) :
  Exception: variable does not exist; processing stage=data initialization; variable name=x; base type=double (in 'anon_model', line 3, column 2 to column 25)
failed to create the sampler; sampling not done

This is kind of a weird error because it implies x does not exist when it's trying to assign x in the model from the var_context, I think? Can you confirm your branch is on the stanc3 nightly build? I can run this tmrw morning to get a better idea of what's going on. My fear is that new_CppObject_xp is actually making a copy somewhere, or are you not allowed to use new in R C++ files? Also are you on windows/linux/mac?

@SteveBronder
Copy link
Contributor Author

Also is this the right branch for the rstan example above?

https://github.com/hsbadr/rstan/tree/StanHeaders_2.26

@hsbadr
Copy link
Member

hsbadr commented Apr 19, 2021

This is kind of a weird error because it implies x does not exist when it's trying to assign x in the model from the var_context, I think?

I can run this tmrw morning to get a better idea of what's going on. My fear is that new_CppObject_xp is actually making a copy somewhere, or are you not allowed to use new in R C++ files?

I think so, the variables do exist but likely rejected in the case of Eigen::Map (applying the workaround fixes it). it could be var_context, memory access denied, or data type issue (the variable does exist but rejected due to wrong type). the first thing we should do is to apply sanity checks before returning this error.

Can you confirm your branch is on the stanc3 nightly build?

rstan falls back to the nightly build if the JS doesn't exist. So, remove it after rstan installation:

file.remove(system.file("stanc.js", package = "rstan"))

Here's the version stamp in my code:

// Code generated by stanc v2.26.1-223-g5408944d

Also are you on windows/linux/mac?

I'm on Linux.

Also is this the right branch for the rstan example above?

No, use the develop branch: https://github.com/hsbadr/rstan, or download the packages from here.

@SteveBronder
Copy link
Contributor Author

Yeah that's the most up to date version of stanc3, let me tinker around with this tmrw I think I can figure this out without breaking upstream things

@SteveBronder
Copy link
Contributor Author

Actually I think the error message is coming from here

// This context.validate_dims before the actual making of the map
  96 :       context__.validate_dims("data initialization","x","double",
  97 :            std::vector<size_t>{static_cast<size_t>(t)});
// Error thrown before getting to map
  98 :       x__ = Eigen::Matrix<double, -1, 1>(t);
  99 :       new (&x) Eigen::Map<Eigen::Matrix<double, -1, 1>>(x__.data(), t);

Since the message is constructed here

https://github.com/stan-dev/stan/blob/develop/src/stan/io/validate_dims.hpp

    if (!context.contains_r(name)) {
      std::stringstream msg;
      msg << "variable does not exist"
          << "; processing stage=" << stage << "; variable name=" << name
          << "; base type=" << base_type;
      throw std::runtime_error(msg.str());
    }

Still looking into this, could be #871 did something, though that seems odd that this would mess up the var_context somehow

@hsbadr
Copy link
Member

hsbadr commented Apr 20, 2021

That makes sense. validate_dims has been implemented only in the development version of rstan. So, it wasn't never tested in the context of Rcpp module (@bgoodri any idea?). Changing context__.to_vec(N) to std::vector<size_t>{static_cast<size_t>(N) in #871 could also be related. Have you been able to reproduce the issue?

@SteveBronder
Copy link
Contributor Author

Yes I think I got it, it has to do with this PR but in a very weird way lol. rstan parses the model C++ code to check the model data and the user's data matches up here

parsed_data <- with(data, parse_data(get_cppcode(object)))

But with this PR that just returns back t and not y or x. I gotta figure out how parse_data() works and then I can open up a patch for this on your fork

@hsbadr
Copy link
Member

hsbadr commented Apr 20, 2021

Gotcha! This PR changes the objects parsed from the C++ code. For the example above, it should include "t" "x" "y", but I now get "t" "x__" "y__" "0" "0". Note the trailing __ at the end of the variable names, from this PR. Would it be safe to trim them?

@SteveBronder
Copy link
Contributor Author

Yep! Almost finished with a patch I'll send over to you

@hsbadr
Copy link
Member

hsbadr commented Apr 20, 2021

Yep! Almost finished with a patch I'll send over to you

Great! The following should work:

diff --git a/rstan/rstan/R/misc.R b/rstan/rstan/R/misc.R
index 0038aa8a..13495e00 100644
--- a/rstan/rstan/R/misc.R
+++ b/rstan/rstan/R/misc.R
@@ -1604,6 +1604,8 @@ parse_data <- function(cppcode) {
                   cppcode[private:public])
   # get them from the calling environment
   objects <- objects[nzchar(trimws(objects))]
+  objects <- setdiff(objects, "0")
+  objects <- gsub("__$", "", objects)
   stuff <- list()
   for (int in seq_along(objects)) {
    stuff[[objects[int]]] <- dynGet(objects[int], inherits = FALSE, ifnotfound = NULL)

@hsbadr
Copy link
Member

hsbadr commented Apr 20, 2021

Yep! Almost finished with a patch I'll send over to you

@SteveBronder Target the stan_2.25 branch for stan-dev/rstan#887.

@SteveBronder
Copy link
Contributor Author

Word! I posted up a PR but your solution also looks like it would work if that's easier for you

@hsbadr
Copy link
Member

hsbadr commented May 6, 2021

@SteveBronder rstanarm fails on load after successful build with the development versions of StanHeaders/rstan/stanc3:

** testing if installed package can be loaded from temporary location
Error: package or namespace load failed forrstanarmin dyn.load(file, DLLpath = DLLpath, ...):
 unable to load shared object 'rstanarm.so':
  rstanarm.so: undefined symbol: _ZN24model_binomial_namespace24csr_matrix_times_vector2IN5Eigen3MapINS1_6MatrixIdLin1ELi1ELi0ELin1ELi1EEELi0ENS1_6StrideILi0ELi0EEEEES4_EENS3_IN5boost4math5tools12promote_argsIN4stan10value_typeIT_vE4typeENSD_IT0_vE4typeEffffE4typeELin1ELi1ELi0ELin1ELi1EEERKiSO_RKSE_RKSt6vectorIiSaIiEESV_RKSH_PSo
Error: loading failed

I see Eigen::Map in the undefined symbol; do you think it's related to this PR?

@SteveBronder
Copy link
Contributor Author

Are you on the latest version of math?

@hsbadr
Copy link
Member

hsbadr commented May 6, 2021

Are you on the latest version of math?

Yes.

@SteveBronder
Copy link
Contributor Author

Can you send me a script for trying to build what you have locally? My guess is that an old dll is not being recompiled with the newest stan math version. It probably has something to do with stan-dev/math#2462 where I added an autodiff specialization for csr_matrix_times_vector

@hsbadr
Copy link
Member

hsbadr commented May 6, 2021

Can you send me a script for trying to build what you have locally? My guess is that an old dll is not being recompiled with the newest stan math version. It probably has something to do with stan-dev/math#2462 where I added an autodiff specialization for csr_matrix_times_vector

Just install rstanarm from source code with the development version of rstan and remove stanc3 JS after rstan installation, to use the nightly version that has this PR merged:

file.remove(system.file("stanc.js", package = "rstan"))

The compilation is successful, but it fails on loading.

@hsbadr
Copy link
Member

hsbadr commented May 6, 2021

It probably has something to do with stan-dev/math#2462 where I added an autodiff specialization for csr_matrix_times_vector

I'll test it after reverting this Math PR and will let you know.

@hsbadr
Copy link
Member

hsbadr commented May 6, 2021

It probably has something to do with stan-dev/math#2462 where I added an autodiff specialization for csr_matrix_times_vector

I'll test it after reverting this Math PR and will let you know.

This doesn't fix the problem. I think it's related to Eigen::Map and could be the linked version of RcppEigen. It happens with rstanarm only though; all my models aren't affected.

@SteveBronder
Copy link
Contributor Author

Okay I'm trying to fix my home desktop rn but I'll look at this after I get that all done

@SteveBronder
Copy link
Contributor Author

Ah! So I'm also getting

Error: package or namespace load failed for ‘rstanarm’ in dyn.load(file, DLLpath = DLLpath, ...):
unable to load shared object '/home/steve/R/x86_64-pc-linux-gnu-library/4.0/00LOCK-rstanarm/00new/rstanarm/libs/rstanarm.so':
/home/steve/R/x86_64-pc-linux-gnu-library/4.0/00LOCK-rstanarm/00new/rstanarm/libs/rstanarm.so: undefined symbol: _ZN24model_binomial_namespace24csr_matrix_times_vector2IN5Eigen3MapINS1_6MatrixIdLin1ELi1ELi0ELin1ELi1EEELi0ENS1_6StrideILi0ELi0EEEEES4_EENS3_IN5boost4math5tools12promote_argsIN4stan10value_typeIT_vE4typeENSD_IT0_vE4typeEffffE4typeELin1ELi1ELi0ELin1ELi1EEERKiSO_RKSE_RKSt6vectorIiSaIiEESV_RKSH_PSo

The 2 at the end of csr_matrix_times_vector is actually not part of the mangled name! It comes from csr_matrix_times_vector2 in the rstanarm repo. And that function is failing to work because it only accepts Eigen matrices and not Eigen maps. I can put in a patch to fix that function up. I think it can actually be removed now since we have the reverse mode specialization for csr_matrix_times_vector

@SteveBronder
Copy link
Contributor Author

@hsbadr I put up a patch in stan-dev/rstanarm#514 that should fix this

@hsbadr
Copy link
Member

hsbadr commented May 7, 2021

The 2 at the end of csr_matrix_times_vector is actually not part of the mangled name! It comes from csr_matrix_times_vector2 in the rstanarm repo. And that function is failing to work because it only accepts Eigen matrices and not Eigen maps. I can put in a patch to fix that function up. I think it can actually be removed now since we have the reverse mode specialization for csr_matrix_times_vector

That makes sense.

@hsbadr I put up a patch in stan-dev/rstanarm#514 that should fix this

Thanks! I'll test your patch.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants