Predictive Techniques for Astrophysics

Primarily applies pandas and scikit-learn tools to JINAbase for two tasks:

regression/imputation - predict missing microturbulence values
classification - classify stars by stellar type
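
The regression/imputation task can be illustrated with a minimal sketch. It assumes synthetic data and hypothetical column names (`Teff`, `logg`, `Fe_H`, `Vmic`); the actual JINAbase columns and the model used in the notebook may differ, and `RandomForestRegressor` is chosen here only for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in for a stellar-parameter table.
# All column names are hypothetical, not taken from JINAbase.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "Teff": rng.uniform(4000, 6500, n),
    "logg": rng.uniform(0.5, 4.5, n),
    "Fe_H": rng.uniform(-4.0, -1.0, n),
})

# Made-up toy relation so the regressor has something to learn.
df["Vmic"] = 2.5 - 0.35 * df["logg"] + rng.normal(0, 0.1, n)

# Hide 20% of the values to imitate missing microturbulence entries.
missing_idx = df.sample(frac=0.2, random_state=0).index
df.loc[missing_idx, "Vmic"] = np.nan

features = ["Teff", "logg", "Fe_H"]
known = df[df["Vmic"].notna()]
unknown = df[df["Vmic"].isna()]

# Fit on rows where the target is observed, then predict the missing rows.
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(known[features], known["Vmic"])

df_imputed = df.copy()
df_imputed.loc[unknown.index, "Vmic"] = model.predict(unknown[features])
```

Observed values are left untouched; only rows with a missing target receive a prediction.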

The project report is in EP4130_Project.pdf. Requirements are listed in the YAML file ep4130.yml.

Exploratory Data Analysis

Performance

Regression

Classification

Random Forest Classifier

Class	Precision	Recall	F1-Score	Support
HB	1.00	1.00	1.00	23
MS	1.00	1.00	1.00	55
RG	1.00	0.99	1.00	227
SG	0.97	1.00	0.98	63
Accuracy			0.99	368
Macro Avg	0.99	1.00	0.99	368
Weighted Avg	0.99	0.99	0.99	368
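
As a worked check of the two average rows, the sketch below recomputes them from the per-class precision and support values in the Random Forest table. It assumes the tables follow the scikit-learn `classification_report` convention, where the macro average is the unweighted mean over classes and the weighted average weights each class by its support.

```python
# Per-class precision and support copied from the Random Forest table above.
precision = {"HB": 1.00, "MS": 1.00, "RG": 1.00, "SG": 0.97}
support = {"HB": 23, "MS": 55, "RG": 227, "SG": 63}

total = sum(support.values())

# Macro average: every class counts equally.
macro_avg = sum(precision.values()) / len(precision)

# Weighted average: every class counts in proportion to its support.
weighted_avg = sum(precision[c] * support[c] for c in precision) / total

print(total)                   # 368
print(round(macro_avg, 2))     # 0.99
print(round(weighted_avg, 2))  # 0.99
```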

Support Vector Classifier

Class	Precision	Recall	F1-Score	Support
HB	0.95	0.91	0.93	23
MS	1.00	0.96	0.98	55
RG	0.99	0.99	0.99	227
SG	0.94	1.00	0.97	63
Accuracy			0.98	368
Macro Avg	0.97	0.97	0.97	368
Weighted Avg	0.98	0.98	0.98	368

KNN Classifier

Class	Precision	Recall	F1-Score	Support
HB	1.00	0.83	0.90	23
MS	1.00	0.87	0.93	55
RG	0.99	0.99	0.99	227
SG	0.86	1.00	0.93	63
Accuracy			0.96	368
Macro Avg	0.96	0.92	0.94	368
Weighted Avg	0.97	0.96	0.96	368

XGBoost Classifier

Class	Precision	Recall	F1-Score	Support
HB	1.00	1.00	1.00	23
MS	1.00	0.96	0.98	55
RG	1.00	0.99	1.00	227
SG	0.94	1.00	0.97	63
Accuracy			0.99	368
Macro Avg	0.99	0.99	0.99	368
Weighted Avg	0.99	0.99	0.99	368

MLP Classifier

Class	Precision	Recall	F1-Score	Support
HB	0.96	0.96	0.96	23
MS	1.00	1.00	1.00	55
RG	1.00	0.99	0.99	227
SG	0.97	1.00	0.98	63
Accuracy			0.99	368
Macro Avg	0.98	0.99	0.98	368
Weighted Avg	0.99	0.99	0.99	368
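
Per-class reports like those above could be produced along the following lines. This is a sketch on synthetic data, not the notebook's code: the features come from `make_classification`, the integer labels are mapped to the four class names only for display, and the hyperparameters are arbitrary.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# Synthetic four-class problem standing in for the stellar-type labels.
X, y = make_classification(
    n_samples=1000, n_features=6, n_informative=4, n_redundant=0,
    n_classes=4, n_clusters_per_class=1, random_state=0,
)

# Stratify so each class keeps its share in the held-out set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

# Precision, recall, F1-score and support per class, plus the average rows.
report = classification_report(
    y_test, y_pred, target_names=["HB", "MS", "RG", "SG"]
)
print(report)
```

Any estimator with the same `fit`/`predict` interface can be swapped in for `clf`, for example scikit-learn's `SVC`, `KNeighborsClassifier`, or `MLPClassifier`, to compare models on an identical split.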

Predictions

An interactive 3D plot can be found here.

References

[1] Trevor Hastie, Robert Tibshirani, and Jerome Friedman. Random Forests, pages 587–604. Springer New York, New York, NY, 2009.

[2] Support vector machines — scikit-learn 1.2.2 documentation. https://scikit-learn.org/stable/modules/svm.html, 2023. Accessed: 17-04-2023.

[3] Cosma Shalizi. CMU Statistics 36-462/662: Methods of Statistical Learning. Lecture Notes 11: k-Nearest Neighbors, 2022. URL: https://www.stat.cmu.edu/~cshalizi/dm/22/lectures/11/lecture-11.pdf.

[4] Introduction to boosted trees — XGBoost documentation. https://xgboost.readthedocs.io/en/latest/tutorials/model.html#introduction-to-boosted-trees, 2022. Accessed: 15-04-2023.

[5] Roger Grosse. University of Toronto CSC 411: Machine Learning and Data Mining. Lecture 5: Multilayer Perceptrons, 2019. URL: https://www.cs.toronto.edu/~mren/teach/csc411_19s/lec/lec10_notes1.pdf.

[6] Abdu Abohalima and Anna Frebel. JINAbase—a database for chemical abundances of metal-poor stars. The Astrophysical Journal Supplement Series, 238(2):36, Oct 2018.

[7] Laurens van der Maaten and Geoffrey Hinton. Visualizing data using t-SNE. Journal of Machine Learning Research, 9(86):2579–2605, 2008.

[8] C. Chatfield. Problem Solving: A statistician’s guide, Second edition. Chapman & Hall/CRC Texts in Statistical Science. Taylor & Francis, 1995.
