Best options choice for classification of small and unbalanced dataset #46
There is no way to tell in advance which training percentage (80% or 90%) is best. Depending on the sample size, you want enough training data (which helps improve performance) while keeping the test set reasonably large. If the test set is too small, the violin plots will show large variance. So pick accordingly.
Hi Pradeep, I tried both, and the violin plots do indeed show a large variance compared to the ones in the neuropredict documentation (I have 75 CN and 15 AD). [Figures: violin plots with -t 0.9 and with -t 0.8]
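To make the test-set-size point concrete, here is a rough sketch of how many test samples each class gets with the class sizes quoted above (75 CN, 15 AD). It assumes a plain stratified split in which every class contributes the same training percentage. That is an assumption for illustration only; neuropredict may partition the data differently. The function name stratified_test_counts is hypothetical.

```python
def stratified_test_counts(class_sizes, train_percent):
    """Per-class test-set sizes under a plain stratified split.

    Assumption: each class sends `train_percent` percent of its samples
    (rounded down) to training and the rest to testing.
    """
    counts = {}
    for label, n in class_sizes.items():
        n_train = n * train_percent // 100  # integer arithmetic, no float rounding
        counts[label] = n - n_train
    return counts


class_sizes = {"CN": 75, "AD": 15}  # class sizes reported in this thread

for train_percent in (80, 90):
    test_counts = stratified_test_counts(class_sizes, train_percent)
    for label, n_test in test_counts.items():
        # With n_test samples, per-class accuracy can only move in steps of 1/n_test.
        step = 100.0 / n_test
        print(f"-t {train_percent / 100:.1f}  {label}: {n_test} test samples, "
              f"accuracy granularity {step:.1f}%")
```

Under this assumption the AD class has only 2 or 3 test samples per repetition, so a single misclassified subject shifts its per-class accuracy by a third or a half. That alone is enough to produce wide violin plots.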
Use the confusion matrices and the misclassification rate plots to deduce alternative performance metrics.
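As one way to follow this suggestion, the sketch below derives per-class recall and balanced accuracy from a confusion matrix. It assumes rows are true classes and columns are predicted classes, which may not match the layout neuropredict exports. The counts are made up for illustration and are not results from this dataset, and metrics_from_confusion is a hypothetical helper, not part of neuropredict.

```python
def metrics_from_confusion(cm, labels):
    """Derive simple metrics from a confusion matrix.

    Assumption: cm[i][j] is the number of samples whose true class is
    labels[i] and whose predicted class is labels[j].
    """
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(labels)))

    # Recall per class: fraction of each true class that was predicted correctly.
    recall = {}
    for i, label in enumerate(labels):
        row_total = sum(cm[i])
        recall[label] = cm[i][i] / row_total if row_total else float("nan")

    # Balanced accuracy: unweighted mean of per-class recall, so the
    # majority class cannot dominate the score.
    balanced_accuracy = sum(recall.values()) / len(labels)

    return {
        "accuracy": correct / total,
        "recall": recall,
        "balanced_accuracy": balanced_accuracy,
    }


# Made-up counts for illustration only (15 CN and 3 AD test samples).
labels = ["CN", "AD"]
cm = [[14, 1],   # true CN: 14 predicted CN, 1 predicted AD
      [2, 1]]    # true AD: 2 predicted CN, 1 predicted AD

result = metrics_from_confusion(cm, labels)
print(result)
```

With counts like these, plain accuracy looks high because the majority class is mostly correct, while balanced accuracy and the AD recall expose how poorly the minority class is classified.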
Hi Pradeep,
For a small and unbalanced dataset, do you recommend using -t 0.8 or -t 0.9?
Is it possible to disable feature selection in the implemented pipeline? If not, what is the advantage of always using feature selection when the dataset has only a small number of features?
Best,
Matthieu