Data mining techniques applied on census data for obtaining recommendations

vpinnaka/US-census-data

US-census-data

Data mining techniques applied to US census data to obtain recommendations. The dataset is stored in the Hadoop Distributed File System (HDFS), and the analysis runs as MapReduce-style jobs on a high-performance computing cluster using the Apache Spark machine learning library (MLlib). Core algorithm concepts are used for data preprocessing and wrangling. Refer to the project document for more details.
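
The preprocessing step is not spelled out here, so the following is only a minimal sketch of one common way to prepare tabular records for frequent pattern mining: each row becomes a list of unique `column=value` items. The column names, sample rows, the `?` missing-value marker, and the helper names `row_to_items` and `load_transactions` are all assumptions made for illustration, not taken from the project.

```python
import csv
import io

# Hypothetical columns and rows, invented for illustration only.
SAMPLE_CSV = """age,workclass,education,sex
39,Private,HS-grad,Male
50,Self-emp,Bachelors,Female
28,Private,?,Female
"""


def row_to_items(row, skip_columns=("age",), missing="?"):
    """Turn one record into a sorted list of unique 'column=value' items.

    Continuous columns (here: age) are skipped in this sketch; in practice
    they would usually be binned into ranges first.
    """
    items = set()
    for column, value in row.items():
        value = value.strip()
        # Drop skipped columns, missing markers, and empty cells.
        if column in skip_columns or value == missing or not value:
            continue
        items.add(f"{column}={value}")
    return sorted(items)


def load_transactions(text):
    """Parse CSV text and return one item list per row."""
    reader = csv.DictReader(io.StringIO(text))
    return [row_to_items(row) for row in reader]


transactions = load_transactions(SAMPLE_CSV)
```

Prefixing each value with its column name keeps items distinct across columns, which matters because frequent pattern miners generally expect the items within a transaction to be unique.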

Benefits derived from Association rule mining

Association rule mining on the census data can provide insight into:

  • Employment status of the population of the United States.
  • Education levels of the typical US citizen.
  • Taxable income range for individuals.
  • Female entrepreneurs in the United States.
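
To make the idea of an association rule concrete, here is a small sketch of the two standard measures, support and confidence, computed by brute force in plain Python. It is not the project's Spark job (the References section points to Spark MLlib's frequent pattern mining, which scales far better); the toy transactions, thresholds, and function names below are assumptions chosen purely for illustration and do not reflect the census results.

```python
from itertools import combinations

# Toy transactions invented for illustration; not taken from the census data.
transactions = [
    {"workclass=Private", "sex=Male", "income<=50K"},
    {"workclass=Private", "sex=Female", "income<=50K"},
    {"workclass=Self-emp", "sex=Female", "income<=50K"},
    {"workclass=Private", "sex=Male", "income>50K"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimate P(consequent | antecedent) as a ratio of supports."""
    base = support(antecedent, transactions)
    if base == 0:
        return 0.0
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / base


def pair_rules(transactions, min_support=0.5, min_confidence=0.6):
    """Enumerate single-item rules lhs -> rhs that pass both thresholds."""
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in combinations(items, 2):
        if support({a, b}, transactions) < min_support:
            continue
        # Each frequent pair yields two candidate rules, one per direction.
        for lhs, rhs in ((a, b), (b, a)):
            conf = confidence({lhs}, {rhs}, transactions)
            if conf >= min_confidence:
                rules.append((lhs, rhs, conf))
    return rules


rules = pair_rules(transactions)
```

A rule such as `sex=Female -> income<=50K` is read as "records containing the left-hand item tend to contain the right-hand item"; support says how often the pair occurs at all, and confidence says how reliable the implication is.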

Findings and Recommendations

  • Most of the United States population works in the private sector, and most of the working population is male.
  • Most US citizens are high school graduates.
  • Taxable income is generally less than $50,000.
  • Female entrepreneurs generally have a taxable income of less than $50,000.

Based on our findings, we recommend that the US government focus on education for citizens and reduce taxes to encourage more women in business.

Dataset obtained from

References

http://spark.apache.org/docs/latest/mllib-frequent-pattern-mining.html
