
Optimization #59 (Open)

RoverVan wants to merge 68 commits into master.
Conversation

RoverVan (Collaborator)

No description provided.

@tdhock (Collaborator) commented Jun 14, 2017

The Travis build is failing because the linker on the Travis VM cannot find libopenblas: https://travis-ci.org/anujkhare/iregnet/builds/242848673#L735

It is not installed by default, but it is included in the "libopenblas-dev" Debian/Ubuntu package, which you can tell Travis to install via https://docs.travis-ci.com/user/languages/r/#APT-packages
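For reference, a minimal sketch of that .travis.yml addition, assuming the repository uses the Travis R language build described in the linked docs:

```yaml
# Sketch only: install the OpenBLAS development package so the linker
# can find -lopenblas when building the package on the Travis VM.
language: r
apt_packages:
  - libopenblas-dev
```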

@RoverVan (Collaborator, Author)

Hey @anujkhare, I wrote a blog post with benchmark figures for the new iregnet. According to the benchmarks, the new iregnet performs much better than before. I think it is time to merge the pull request.

@tdhock (Collaborator) commented Aug 26, 2017

please resolve conflicts first

@RoverVan (Collaborator, Author)

OK @tdhock, I just resolved the conflicts. However, some errors appeared on Travis because RcppArmadillo was updated a few days ago and now requires a newer C++ compiler. I changed the C++ compiler version by updating .travis.yml, which resolved the problem.
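For context, a hedged sketch of what such a .travis.yml change can look like; the exact edit in this PR may differ, and for an R package the compiler R actually uses can also be set via ~/.R/Makevars:

```yaml
# Sketch only: one common way to get a newer g++ on the older Travis Ubuntu
# images was the ubuntu-toolchain-r-test PPA plus pointing CXX at it.
addons:
  apt:
    sources:
      - ubuntu-toolchain-r-test
    packages:
      - g++-6
env:
  - CXX=g++-6
```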

@tdhock (Collaborator) commented Aug 29, 2017

@anujkhare is it OK to merge with master? Looks fine to me.

@anujkhare (Owner)

@RoverVan great work with this! Just give me a bit, I am looking through it.

@tdhock (Collaborator) commented May 16, 2019

Hi @anujkhare, can you please review, and approve/merge if it is OK?

@anujkhare (Owner)

Well, my review is 2 years too late, but I'll go through it regardless. We can decide whether or not to pick anything up based on how much time we have.

case 0: target_compute_grad_response = compute_grad_response_logistic_right; break;
case 1: target_compute_grad_response = compute_grad_response_logistic_none; sort = false; break;
case 2: target_compute_grad_response = compute_grad_response_logistic_left; break;
case 3: target_compute_grad_response = compute_grad_response; sort = false; break;
Owner

why isn't this using compute_grad_response_logistic_interval?


bool sort = true;

if(transformed_dist == IREG_DIST_GAUSSIAN){
Owner

Overall, I'm unsure about why we needed to split these functions for each distribution.

On the one hand, this causes the code to bloat up. Since the gradient equations are the same, I'm wondering if we could have reused more code.

On the other, maybe the optimizations were not possible without doing so? @tdhock do you have any thoughts about this?

Collaborator

I'm not sure it was necessary. IMO it is better to have slower, more maintainable code than bloated, duplicated, fast code. We should think about how to remove the duplication before merging.
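To illustrate one possible direction (a hypothetical sketch under assumed names, not the package's actual code): make the distribution a template/policy parameter, so the compiler still generates specialized, branch-free inner loops while the gradient loop itself is written only once.

```cpp
// Hypothetical sketch, not iregnet's API: one generic gradient loop,
// specialized per distribution at compile time via a policy type.
#include <algorithm>
#include <cmath>

struct LogisticDist {                       // assumed policy interface
  static double pdf(double z) { double e = std::exp(-z); return e / ((1 + e) * (1 + e)); }
  static double cdf(double z) { return 1.0 / (1.0 + std::exp(-z)); }
};

struct GaussianDist {
  static double pdf(double z) { return 0.3989422804014327 * std::exp(-0.5 * z * z); }
  static double cdf(double z) { return 0.5 * std::erfc(-z * 0.7071067811865475); }
};

// Rows are assumed pre-sorted by censoring type (as in this PR), so each block
// below is a tight loop with no per-row branching. The score expressions are
// schematic placeholders, not the exact iregnet formulas.
template <class Dist>
void compute_grad_response_generic(double *grad, const double *z,
                                   int n_none, int n_right) {
  for (int i = 0; i < n_none; ++i)                 // uncensored block
    grad[i] = -z[i] * Dist::pdf(z[i]);
  for (int i = n_none; i < n_none + n_right; ++i)  // right-censored block
    grad[i] = Dist::pdf(z[i]) / std::max(1.0 - Dist::cdf(z[i]), 1e-12);
  // ... left- and interval-censored blocks would follow the same pattern ...
}

// Explicit instantiations replace the hand-written per-distribution copies.
template void compute_grad_response_generic<LogisticDist>(double *, const double *, int, int);
template void compute_grad_response_generic<GaussianDist>(double *, const double *, int, int);
```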

/* Sort the whole matrix by censoring type, get the sorted X and y,
 * and switch to the right method for target_compute_grad_response.
 */
if(transformed_dist != IREG_DIST_EXTREME_VALUE){
Owner

Does this mean that we are using the old code for the extreme value distribution?

@@ -538,3 +659,141 @@ compute_lambda_max(Rcpp::NumericMatrix X, double *w, double *z, double *eta,

return lambda_max;
}

static int *
sort_X_and_y(mat &sorted_X, mat &sorted_y, IREG_CENSORING *status, mat X, mat y, int function_type)
Owner

missing docstring. Also, perhaps rename this to sort_x_and_y_by_censoring_status or something like that?
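To illustrate what such a docstring could look like (parameter descriptions below are inferred from the signature alone, and the return value is an assumption to be checked against the implementation):

```cpp
/* sort_X_and_y_by_censoring_status (suggested name)
 *
 * Reorder the rows of X and y so that observations with the same censoring
 * status are stored contiguously, writing the results into sorted_X and
 * sorted_y.
 *
 *   sorted_X, sorted_y  [out] row-reordered copies of X and y
 *   status              [in]  censoring status of each observation
 *   X, y                [in]  original design matrix and response
 *   function_type       [in]  selects the distribution / censoring variant
 *
 * Returns: an int array (assumed here to map sorted rows back to their
 * original indices; verify against the implementation).
 */
static int *
sort_X_and_y_by_censoring_status(mat &sorted_X, mat &sorted_y, IREG_CENSORING *status,
                                 mat X, mat y, int function_type);
```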

}
}

none_censoring_X_mat.resize(none_censoring_number, X.n_cols);
Owner

Is there a better way to concatenate these different matrices?

Owner

One thing we could do is count the number of rows of each censoring type up front, create a single new matrix, and insert rows from the original matrix into the new one by maintaining per-type write pointers, as sketched below.
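A rough sketch of that suggestion using Armadillo (function name and censoring codes here are illustrative assumptions, not the package's actual definitions):

```cpp
// Illustrative sketch of the counting idea above (not the package's code):
// one pass to count rows per censoring type, one preallocated output matrix,
// and a second pass that writes each row at the next free slot of its group.
#include <armadillo>
using arma::mat;
using arma::uword;

// Censoring codes assumed for this sketch; the package's actual IREG_CENSORING
// enum may use different names and values.
enum IREG_CENSORING { IREG_CENSOR_NONE = 0, IREG_CENSOR_RIGHT, IREG_CENSOR_LEFT,
                      IREG_CENSOR_INTERVAL, IREG_CENSOR_TYPES };

void sort_rows_by_censoring(const mat &X, const mat &y,
                            const IREG_CENSORING *status,
                            mat &sorted_X, mat &sorted_y) {
  const uword n = X.n_rows;
  uword count[IREG_CENSOR_TYPES] = {0};
  for (uword i = 0; i < n; ++i)               // pass 1: count each censoring type
    ++count[status[i]];

  uword offset[IREG_CENSOR_TYPES];             // first output row of each group
  offset[0] = 0;
  for (int t = 1; t < IREG_CENSOR_TYPES; ++t)
    offset[t] = offset[t - 1] + count[t - 1];

  sorted_X.set_size(n, X.n_cols);              // single allocation, no resize loop
  sorted_y.set_size(n, y.n_cols);
  for (uword i = 0; i < n; ++i) {              // pass 2: stable copy into place
    uword dst = offset[status[i]]++;
    sorted_X.row(dst) = X.row(i);
    sorted_y.row(dst) = y.row(i);
  }
}
```

Two linear passes and a single allocation would replace the per-type resize-and-copy blocks.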
