fix: Tanhscaler nan output for constant feature #153

Merged
merged 1 commit into from
Mar 27, 2023


ab93
Member

@ab93 ab93 commented Mar 27, 2023

When the variance of an input time series is zero, the scaler divides by zero, which produces NaN outputs.

This change checks whether a feature is indistinguishable from a constant feature. If it is, the std for that feature is set to 1.0, so the feature is no longer divided by its zero standard deviation.
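
The guard described above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: the function names `fit_mean_std` and `tanh_scale`, the `coeff` default, the exact tanh formula, and the use of `np.isclose` to detect a constant feature are all assumptions made for the example.

```python
import numpy as np


def fit_mean_std(x):
    """Return per-feature mean and std for a 2D array of shape (samples, features).

    Hypothetical helper: the std of any (near-)constant feature is replaced by 1.0.
    """
    mean = np.mean(x, axis=0)
    std = np.std(x, axis=0)
    # Assumption: a feature counts as constant when every value is close to its first value.
    is_constant = np.all(np.isclose(x, x[0, :]), axis=0)
    # Dividing by 1.0 leaves (x - mean) at zero for constant features instead of producing 0/0.
    std = np.where(is_constant, 1.0, std)
    return mean, std


def tanh_scale(x, mean, std, coeff=0.1):
    """Assumed tanh scaling: squash the standardized values into the range (0, 1)."""
    return 0.5 * (np.tanh(coeff * (x - mean) / std) + 1.0)


if __name__ == "__main__":
    # Second column is constant, so its raw std is 0.
    data = np.array([[1.0, 5.0], [2.0, 5.0], [3.0, 5.0]])
    mean, std = fit_mean_std(data)
    print(tanh_scale(data, mean, std))
```

With this guard, the constant column is mapped to 0.5 for every row rather than NaN, because its centered values are zero and tanh(0) is 0.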

Signed-off-by: Avik Basu <ab93@users.noreply.github.com>
@ab93 ab93 added the bug Something isn't working label Mar 27, 2023
@ab93 ab93 self-assigned this Mar 27, 2023
@ab93 ab93 requested a review from mboussarov March 27, 2023 23:16
@ab93 ab93 marked this pull request as ready for review March 27, 2023 23:16
@codecov

codecov bot commented Mar 27, 2023

Codecov Report

Merging #153 (6882d08) into main (69006eb) will increase coverage by 0.25%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main     #153      +/-   ##
==========================================
+ Coverage   97.19%   97.45%   +0.25%     
==========================================
  Files          33       33              
  Lines        1213     1218       +5     
  Branches       89       89              
==========================================
+ Hits         1179     1187       +8     
+ Misses         26       24       -2     
+ Partials        8        7       -1     
Impacted Files Coverage Δ
numalogic/preprocess/transformer.py 100.00% <100.00%> (ø)

... and 1 file with indirect coverage changes


@ab93 ab93 merged commit b61ac1f into main Mar 27, 2023
@ab93 ab93 deleted the fix-tanhscaler-varzero branch March 27, 2023 23:37
@s0nicboOm
Contributor

Can we use Standard Scaler instead?

@ab93
Member Author

ab93 commented Mar 28, 2023

> Can we use Standard Scaler instead?

@s0nicboOm as discussed, we should try extending the StandardScaler class to cover more edge cases. Please open an issue so that we can keep track of it.
