Analyzing a dataset from Aadhaar, the unique identity number issued to residents of India, using Spark SQL in Python.
## Dependencies
- Spark 2.0 (http://spark.apache.org/)
- Python 2.7 (https://www.python.org/)
## Usage
This project uses Spark SQL in Python to answer the questions below against the Aadhaar dataset, which is available from the Aadhaar public data portal.
## Queries
- Count the number of cards approved by States.
- Count the number of cards approved by Enrolment Agency.
- Count the number of cards rejected by States.
- Count the number of Aadhaar applicants by gender split by States.

Run the queries with `spark-submit AadharAnalysis.py`.
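
The four queries listed above could be expressed in Spark SQL roughly as follows. This is a minimal sketch, not the contents of `AadharAnalysis.py`. The column names (`state`, `enrolment_agency`, `gender`, `generated`, `rejected`), the view name `aadhaar`, and the helper `run_queries` are all hypothetical; the header of the downloaded CSV may differ, so columns may need renaming first. It also assumes that "approved" means the generated count and that "applicants" means generated plus rejected.

```python
# Sketch only: column names and the "aadhaar" view name are assumptions,
# not taken from the actual dataset or from AadharAnalysis.py.
QUERIES = {
    # Cards approved, per state
    "approved_by_state": (
        "SELECT state, SUM(generated) AS approved "
        "FROM aadhaar GROUP BY state ORDER BY approved DESC"
    ),
    # Cards approved, per enrolment agency
    "approved_by_agency": (
        "SELECT enrolment_agency, SUM(generated) AS approved "
        "FROM aadhaar GROUP BY enrolment_agency ORDER BY approved DESC"
    ),
    # Cards rejected, per state
    "rejected_by_state": (
        "SELECT state, SUM(rejected) AS rejected_total "
        "FROM aadhaar GROUP BY state ORDER BY rejected_total DESC"
    ),
    # Applicants (assumed: generated + rejected), per state and gender
    "applicants_by_state_and_gender": (
        "SELECT state, gender, SUM(generated + rejected) AS applicants "
        "FROM aadhaar GROUP BY state, gender ORDER BY state, gender"
    ),
}


def run_queries(csv_path):
    """Load the CSV at csv_path and print the result of each query."""
    # Imported here so the query strings can be inspected without Spark.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("AadhaarAnalysis").getOrCreate()
    # Assumes the file has a header row that matches the column names above.
    df = spark.read.csv(csv_path, header=True, inferSchema=True)
    df.createOrReplaceTempView("aadhaar")
    for name in sorted(QUERIES):
        print(name)
        spark.sql(QUERIES[name]).show()
    spark.stop()
```

Calling `run_queries("aadhaar.csv")` from a script launched with `spark-submit` would print one result table per query; the file name here is a placeholder. Keeping the SQL in a dictionary makes it easy to add or adjust a query without touching the loading code.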