Predictive Text Analysis project! This repository contains code for predicting answers to science exam questions using advanced natural language processing techniques. Check out the code and results!

Vidhi1290/ScienceQA-Insights-Exploring-with-LLMs

🚀 Predictive Text Analysis for Science Exams 🚀

Welcome to our Predictive Text Analysis project! This repository contains code for predicting answers to multiple-choice science exam questions using TF-IDF text features and a Random Forest classifier.

📚 Dataset Used

We used a dataset of science exam questions, each with a question text (prompt) and five answer choices (A, B, C, D, E). The dataset was curated to cover a diverse set of questions.
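As a rough illustration of the table layout described above, the following sketch loads a few rows with pandas and checks the columns. The sample rows, the `answer` column name, and the `load_questions` helper are assumptions made for this example; only `prompt` and the choices A to E are named in this README.

```python
import io

import pandas as pd

# Hypothetical sample rows. The "answer" column name is an assumption;
# the README only names "prompt" and the choices A-E.
SAMPLE_CSV = """prompt,A,B,C,D,E,answer
Which gas do plants absorb during photosynthesis?,Oxygen,Nitrogen,Carbon dioxide,Helium,Argon,C
What is the SI unit of force?,Joule,Newton,Watt,Pascal,Volt,B
"""

EXPECTED_COLUMNS = ["prompt", "A", "B", "C", "D", "E", "answer"]


def load_questions(source):
    """Read the question table and verify that the expected columns exist."""
    df = pd.read_csv(source)
    missing = [col for col in EXPECTED_COLUMNS if col not in df.columns]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return df


# A file path could be passed instead of the in-memory sample.
questions = load_questions(io.StringIO(SAMPLE_CSV))
print(questions[["prompt", "answer"]])
```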

🔍 Features

  • Prompt Analysis: We analyze the question prompts, exploring word frequencies, prompt lengths, and semantic patterns.
  • Text Vectorization: TF-IDF vectorization converts the text into numerical features for model training.
  • Machine Learning Model: A Random Forest Classifier predicts the answer, with a reported accuracy of over 90% on the test set.
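The word-frequency and length part of the prompt analysis can be sketched with the standard library alone. The tokenizer and the sample prompts below are assumptions for illustration, not the repository's code, and the sketch does not cover semantic patterns.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def prompt_stats(prompts):
    """Return overall word counts and each prompt's length in words."""
    word_counts = Counter()
    lengths = []
    for prompt in prompts:
        tokens = tokenize(prompt)
        word_counts.update(tokens)
        lengths.append(len(tokens))
    return word_counts, lengths


# Hypothetical prompts, used only to exercise the functions.
sample_prompts = [
    "Which gas do plants absorb during photosynthesis?",
    "Which planet is closest to the Sun?",
]
word_counts, lengths = prompt_stats(sample_prompts)
print(word_counts.most_common(3))
print(lengths)
```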

🧠 Model Architecture

Our model is a Random Forest Classifier, an ensemble of decision trees suited to multi-class classification. It takes TF-IDF vectorized features as input and learns to map them to the correct answer choice.
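A minimal sketch of this architecture with scikit-learn might look as follows. How the prompt and the answer choices are combined into one input text is not stated in this README, so joining them into a single string is an assumption, as are the toy data and the hyperparameters.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline

# Hypothetical toy data: each text joins a prompt with its five choices.
texts = [
    "Which gas do plants absorb? Oxygen Nitrogen Carbon dioxide Helium Argon",
    "What is the SI unit of force? Joule Newton Watt Pascal Volt",
    "Which planet is closest to the Sun? Mercury Venus Earth Mars Jupiter",
    "What is the chemical symbol for water? CO2 O2 NaCl H2O N2",
    "Which organ pumps blood? Lung Liver Kidney Brain Heart",
]
labels = ["C", "B", "A", "D", "E"]  # correct choice for each question

# TF-IDF turns each text into a sparse numeric vector; the forest
# then learns a multi-class mapping from vectors to answer letters.
model = make_pipeline(
    TfidfVectorizer(),
    RandomForestClassifier(n_estimators=100, random_state=0),
)
model.fit(texts, labels)

# Predicting on the training texts only shows that the pipeline runs;
# it says nothing about accuracy on unseen questions.
predictions = model.predict(texts)
print(list(predictions))
```

Wrapping both steps in one pipeline keeps the vectorizer's vocabulary fitted on the training data only, which avoids leaking test-set words into the features.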

🌟 Visualizations

  • Interactive Visualizations: Explore interactive charts and visualizations, including bar charts representing class distributions and dynamic word clouds showcasing frequently occurring words in questions.
  • 3D Scatter Plots: Dive into 3D scatter plots to uncover correlations between question difficulty, length, and correct answer frequencies.
  • Confusion Matrix: Visualize the model's performance through an intuitive confusion matrix, providing insights into prediction accuracy.
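The numbers behind a confusion matrix can be computed with scikit-learn before any plotting. The true and predicted answers below are made up for illustration and are not results from this project.

```python
from sklearn.metrics import confusion_matrix

ANSWER_LABELS = ["A", "B", "C", "D", "E"]

# Hypothetical true and predicted answers for seven questions.
y_true = ["A", "B", "C", "C", "D", "E", "E"]
y_pred = ["A", "B", "C", "D", "D", "E", "A"]

# Rows are true answers, columns are predicted answers, both in the
# order given by ANSWER_LABELS. Diagonal cells count correct predictions.
matrix = confusion_matrix(y_true, y_pred, labels=ANSWER_LABELS)
print(matrix)
```

The resulting array could then be passed to a heatmap function such as `seaborn.heatmap` with `annot=True` to draw the matrix.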

🚀 Usage

  1. Data Preprocessing: Explore Jupyter Notebooks for in-depth data preprocessing and exploratory data analysis.
  2. Model Training: Utilize the provided Python scripts to train the Random Forest Classifier and obtain predictions.
  3. Interactive Visualizations: Run interactive Python scripts for dynamic visualizations of the dataset and model performance.

🛠️ Dependencies

  • Python 3.7+
  • Pandas
  • NumPy
  • Scikit-Learn
  • Matplotlib
  • Seaborn
  • Plotly
  • WordCloud

📊 Results

Our trained model achieved an accuracy of over 90% on the test dataset, demonstrating its effectiveness in predicting correct answers to science exam questions.
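For context, accuracy is the fraction of questions answered correctly. The sketch below computes it on made-up predictions and compares it with the chance level for five answer choices, assuming uniformly random guessing. The `accuracy` helper and the sample answers are illustrative and are not the project's evaluation code.

```python
def accuracy(y_true, y_pred):
    """Return the fraction of predictions that match the true answers."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one prediction is required")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


NUM_CHOICES = 5                # choices A-E
chance_level = 1 / NUM_CHOICES  # expected accuracy of uniform random guessing

# Hypothetical answers: four of five predictions are correct.
y_true = ["A", "C", "E", "B", "D"]
y_pred = ["A", "C", "E", "B", "A"]
score = accuracy(y_true, y_pred)
print(f"accuracy={score:.2f}, chance={chance_level:.2f}")
```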

🌐 Connect with Me

Let's connect and collaborate!

I'm always open to discussions, collaborations, and learning new things together. Don't hesitate to drop me a message or explore my other projects on GitHub. Happy coding! 🚀

Feel free to dive into the code, experiment with the features, and explore how well the model predicts answers to science exam questions! 🕵️‍♂️💬

