Anton Markov et al. (Credit scoring methods: Latest trends and points to consider, 2022) suggest that the datasets hosted by the University of California, Irvine (UCI) are among the most popular public sources for credit score modeling. I have chosen the UCI Statlog (German Credit Data) dataset to begin with.
This dataset contains information about 1000 loan applications, including personal and financial data, credit history, and loan characteristics.
The goal is to train models that predict whether a loan is beneficial or not; in other words, to predict its creditability for the financial institution.
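As a rough illustration of the prediction task, here is a minimal, dependency-free sketch of a logistic regression classifier trained by batch gradient descent. It is not the project's actual training code: in practice a library such as scikit-learn would normally be used. The toy features (scaled loan duration and amount), the label encoding (1 = "Bad", 0 = "Good"), and the function names `train_logreg` and `predict` are all assumptions made for this example.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def train_logreg(X, y, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_samples, n_features = len(X), len(X[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            # Prediction error for this sample: p(Bad | x) - true label
            error = sigmoid(sum(w * v for w, v in zip(weights, xi)) + bias) - yi
            for j, v in enumerate(xi):
                grad_w[j] += error * v
            grad_b += error
        weights = [w - lr * g / n_samples for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n_samples
    return weights, bias


def predict(X, weights, bias, threshold=0.5):
    """Return 1 ("Bad") when the predicted probability reaches the threshold."""
    return [
        int(sigmoid(sum(w * v for w, v in zip(weights, xi)) + bias) >= threshold)
        for xi in X
    ]


# Toy data (assumed): [scaled duration, scaled amount]; label 1 = "Bad", 0 = "Good"
X_toy = [[0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.8, 0.9], [0.9, 0.7], [0.7, 0.8]]
y_toy = [0, 0, 0, 1, 1, 1]

weights, bias = train_logreg(X_toy, y_toy)
print(predict(X_toy, weights, bias))
```

On the real dataset the categorical columns would first need to be encoded numerically, and the model should be evaluated on a held-out split rather than on the training rows.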
Because some columns are imbalanced, the logistic regression (logreg) model has difficulty predicting "Bad" loans. To overcome this limitation, we might consider oversampling the underrepresented categories in these columns.
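One simple way to try the oversampling idea is random oversampling: rows belonging to rare categories are duplicated at random until every category in a chosen column is as frequent as the largest one. The sketch below uses only the standard library; the helper name `oversample_column`, the column names, and the toy rows are hypothetical and only serve to illustrate the technique.

```python
import random
from collections import defaultdict


def oversample_column(rows, column, seed=0):
    """Duplicate rows at random until all categories in `column` are equally frequent."""
    if not rows:
        return []
    rng = random.Random(seed)

    # Group rows by their category in the chosen column
    groups = defaultdict(list)
    for row in rows:
        groups[row[column]].append(row)

    # Every category is brought up to the size of the most frequent one
    target_size = max(len(group) for group in groups.values())
    balanced = []
    for group in groups.values():
        balanced.extend(group)
        # Sample with replacement to fill the gap (no-op for the largest group)
        balanced.extend(rng.choices(group, k=target_size - len(group)))

    rng.shuffle(balanced)
    return balanced


# Toy rows (assumed column names): "education" is the rare category here
toy_rows = (
    [{"purpose": "car", "label": "Good"}] * 6
    + [{"purpose": "education", "label": "Bad"}] * 2
)
balanced_rows = oversample_column(toy_rows, "purpose")
print(len(balanced_rows))
```

Oversampling should be applied to the training split only, after the train/test split, so that duplicated rows do not leak into the evaluation data. The same helper could be pointed at the target column to balance "Good" and "Bad" loans directly.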