App store data analysis
Presentation (Google Slides)
We are a software startup researching our next product and looking for potentially lucrative niches. To better understand how mobile apps are used across the iPhone and Android platforms, we want to analyze available app store data.
Which categories have the highest number of available apps?
Which categories have the highest number of installs?
What is the average price per category (excluding free apps)?
What is the distribution of free vs. paid apps across categories?
How do ratings correlate with installs?
How do reviews correlate with installs?
How does size correlate with installs?
What is the correlation between days since last update and rating?
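Most of the questions above reduce to a group-by or a correlation in Pandas. The following is a minimal sketch, not the final analysis: the column names (`Category`, `Installs`, `Price`, `Rating`, `Reviews`) are assumptions about the cleaned dataset, and the rows are made-up toy values used only to show the shape of each computation.

```python
import pandas as pd

# Toy rows for illustration only; column names and values are assumptions,
# not taken from the real app store data.
apps = pd.DataFrame({
    "Category": ["GAME", "GAME", "TOOLS", "TOOLS", "FINANCE"],
    "Installs": [1_000_000, 50_000, 10_000, 500_000, 100_000],
    "Price": [0.0, 2.99, 0.0, 4.99, 0.0],
    "Rating": [4.5, 4.1, 3.8, 4.4, 4.0],
    "Reviews": [25_000, 900, 150, 8_000, 2_000],
})

# Number of available apps per category
apps_per_category = apps["Category"].value_counts()

# Total installs per category, largest first
installs_per_category = (
    apps.groupby("Category")["Installs"].sum().sort_values(ascending=False)
)

# Average price per category, excluding free apps
paid = apps[apps["Price"] > 0]
avg_paid_price = paid.groupby("Category")["Price"].mean()

# Free vs. paid counts per category
apps["Type"] = apps["Price"].gt(0).map({True: "Paid", False: "Free"})
free_vs_paid = pd.crosstab(apps["Category"], apps["Type"])

# Pearson correlation of rating and reviews with installs
corr_with_installs = apps[["Rating", "Reviews", "Installs"]].corr()["Installs"]
```

Categories with no paid apps simply drop out of `avg_paid_price`, which is one way to handle the "excluding free apps" condition. Install counts tend to be heavily skewed, so a rank-based correlation (`.corr(method="spearman")`) may be worth comparing against the default Pearson value.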
Google Play Store (priority)
Apple App Store (stretch goal)
In-app purchase data (if available)
Refine questions
Access initial data set
Format dataset
Use Pandas to clean and format your dataset(s).
Create a Jupyter Notebook describing the data exploration and cleanup process.
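A cleanup step along these lines could go in the exploration notebook. This is a sketch under stated assumptions: the raw string formats (`"10,000+"`, `"$4.99"`, `"January 7, 2018"`), the column names, the `clean_apps` helper, and the `as_of` reference date are all hypothetical and would need to be adjusted to whatever the actual export looks like.

```python
import pandas as pd

# Hypothetical raw rows; the string formats are assumptions about the export.
raw = pd.DataFrame({
    "App": ["Alpha", "Beta", "Beta", "Gamma"],
    "Installs": ["10,000+", "1,000,000+", "1,000,000+", "Free"],
    "Price": ["0", "$4.99", "$4.99", "0"],
    "Last Updated": ["January 7, 2018", "June 20, 2018",
                     "June 20, 2018", "August 1, 2018"],
})


def clean_apps(df: pd.DataFrame, as_of: pd.Timestamp) -> pd.DataFrame:
    """Return a de-duplicated copy with numeric and datetime columns."""
    out = df.drop_duplicates().copy()
    # Strip "," and "+" from install counts; unparseable values become NaN
    out["Installs"] = pd.to_numeric(
        out["Installs"].str.replace(r"[,+]", "", regex=True), errors="coerce"
    )
    # Strip the currency symbol from prices
    out["Price"] = pd.to_numeric(
        out["Price"].str.replace("$", "", regex=False), errors="coerce"
    )
    # Parse dates and derive days since the last update
    out["Last Updated"] = pd.to_datetime(
        out["Last Updated"], format="%B %d, %Y", errors="coerce"
    )
    out["Days Since Update"] = (as_of - out["Last Updated"]).dt.days
    # Drop rows whose install count could not be parsed
    return out.dropna(subset=["Installs"]).reset_index(drop=True)


cleaned = clean_apps(raw, as_of=pd.Timestamp("2018-07-20"))
```

Using `errors="coerce"` turns malformed values into `NaN`/`NaT` rather than raising, so the notebook can report how many rows were dropped and why.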
Illustrate analysis
Create a Jupyter Notebook illustrating the final data analysis.
Use Matplotlib to create a total of 6–8 visualizations of your data (ideally, at least 2 per “question” you ask of your data).
Save PNG images of your visualizations to distribute to the class and instructional team, and for inclusion in your presentation.
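Each visualization can follow the same plot-then-save pattern. The sketch below uses made-up totals and a hypothetical output filename purely to illustrate it; the real figures would be built from the cleaned dataset.

```python
from pathlib import Path

import matplotlib.pyplot as plt
import pandas as pd

# Toy totals for illustration only; not real results.
installs = pd.Series(
    {"GAME": 1_050_000, "TOOLS": 510_000, "FINANCE": 100_000}, name="Installs"
)

fig, ax = plt.subplots(figsize=(8, 5))
# Horizontal bars, sorted so the largest category ends up on top
installs.sort_values().plot.barh(ax=ax, color="steelblue")
ax.set_title("Total installs by category")
ax.set_xlabel("Installs")
ax.set_ylabel("Category")
fig.tight_layout()

# Hypothetical output path; save as PNG for the slides and write-up
output_path = Path("installs_by_category.png")
fig.savefig(output_path, dpi=150)
plt.close(fig)
```

Calling `fig.savefig` before any `plt.show()` and closing the figure afterwards avoids blank images and keeps memory use down when generating several plots in one notebook.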
(Optional) Determine potential API strategy
Use at least one API, if you can find one with data pertinent to your primary research questions.
Create a write-up summarizing your major findings. This should include a heading for each “question” you asked of your data and a short description of your findings and any relevant plots.