Descale feature contribution for Linear Regression & Logistic Regression #345
You don't need the inner function, and you can do it faster. Simply do:
Another question: since `domain` is a `Seq[String]`, how can we be sure that `d.toDouble` won't throw an error? Should we handle it gracefully?
For regression problems where there are too few unique labels (so they get treated as categoricals), `d.toDouble` should work fine. I'm not sure how it will behave for classification. For classification, I assume that `domain = Array("0", "1")`. Is this correct?
Why did we decide to define `domain` as `Seq[String]` in the first place? @Jauntbox @leahmcguire
If you have the raw label rather than the indexed label, you will get strings. This was designed to support that.
Is it safe to call `toDouble` in this case?
Any value that is not numeric will throw an exception, e.g. `"".toDouble`.
But will there ever be a non-numeric string in the label field? I thought they should be filtered out.
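For reference, one way the `toDouble` concern in this thread could be handled gracefully is to wrap the conversion in `scala.util.Try`. This is only a minimal sketch, not the code in this PR: `SafeLabelParsing` and `parseDomain` are hypothetical names, and the sample `domain` values are made up for illustration.

```scala
import scala.util.Try

// Hypothetical helper, not part of the PR: shows one way to avoid a
// NumberFormatException when converting string labels to doubles.
object SafeLabelParsing {

  // Convert each domain entry to Some(double), or None if it is not numeric.
  def parseDomain(domain: Seq[String]): Seq[Option[Double]] =
    domain.map(d => Try(d.trim.toDouble).toOption)

  def main(args: Array[String]): Unit = {
    // Assumed example values: two indexed labels plus two bad entries.
    val domain = Seq("0", "1", "", "abc")
    val parsed = parseDomain(domain)

    // Keep only the entries that parsed successfully.
    val numeric = parsed.flatten
    println(s"parsed = $parsed, numeric = $numeric")
  }
}
```

Whether unparseable entries should be dropped, mapped to a default, or reported as an error depends on whether non-numeric labels can actually reach this code path, which is the open question above. Note that `toDouble` also accepts strings such as `"NaN"` and `"Infinity"`, so a stricter check may be needed if those should be rejected.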