ECON 403 Introduction to Data Science (for Economists)

etaymaz/econ413

ECON 413 INTRODUCTION TO DATA SCIENCE

Spring 2017

Instructor

Erol Taymaz

Room A216

etaymaz@metu.edu.tr

Lecture hours: Wed 14:40-17:30

Office hours: Wed 10:00-12:00

Course prerequisites: IS 100, ECON 206

Course credit: 3

Course description

Data science is an interdisciplinary field concerned with the scientific processes and systems used to extract knowledge or insights from data in various forms. With substantial amounts of data now available in many forms and from many sources, it has become essential for economists to have the skills needed to collect, process, analyze, and present data. The course will be taught as a series of workshops. The main topics and methods will be summarized and discussed in each lecture, and the students will write code to perform the task assigned to them during the lecture. The students will learn how to write basic programs in R, which is one of the most popular open-source programming languages currently in use by data scientists.

Course objectives

By the end of the course, the students will know how to use data science for economic analysis and will have learned the basic tools they need for data analysis. At the end of the course, the students will apply these tools and techniques to analyze a real-world problem, using R in all stages of the research process.

Learning outcomes

By the end of the semester, the students will be able to:

  • Write basic programs in R
  • Access data from various sources and in various formats
  • Reshape and clean data for reporting and further analysis
  • Explore and visualize data
  • Conduct statistical analysis by using R
  • Perform reproducible research

Grading

The course consists of lectures, practices (assignments) and a project. Each practice involves performing a specific task in collecting, cleaning, analyzing, visualizing and presenting data on a topic relevant to economists. Practices will be submitted individually. The project will involve all components of the data analysis process, and students are encouraged to work in teams of two or three. The project will seek to answer an important real-world problem. The students will collect and clean the data, model the problem, visualize the data and their analysis, and present their findings by using R.

Course grades will be based on six practices (10 % each, 60 % in total) and a (group) project (40 %).

Textbooks

Everitt, Brian S. and Hothorn, Torsten (2009), A Handbook of Statistical Analyses Using R, Chapman and Hall/CRC.

Peng, Roger D. (2015), R Programming for Data Science, Leanpub.

Venables, W. N., Smith, D. M. and the R Core Team (2015), An Introduction to R, R Core Team.

Wickham, Hadley (2014), Advanced R, Chapman & Hall/CRC.

Wickham, Hadley (2016), ggplot2: Elegant Graphics for Data Analysis, Springer.

Zumel, N. and Mount, J. (2014), Practical Data Science with R, Manning Publications.

Outline of topics

  1. Introduction to Data Science
    • What is data science?
    • Why is data science important?
    • The data science process
  2. Introduction to R
    • What is R?
    • R language basics
    • RStudio basics
  3. Data visualization
    • Data structures
    • File types
    • ggplot basics
    • Animation
    • Data exploration
    • Data presentation
  4. Data structures
    • Data structures in R
    • Matrix
    • Data frame
    • Data table
  5. Functions
    • Function components
    • Function arguments
    • Special functions
  6. Loops and loop functions
    • Looping in R
    • Loop functions
  7. Transforming and cleaning data
    • Data transformation
    • Data cleaning
    • Data merging
    • Creating new variables
    • Missing observations
  8. Reading and collecting data
    • Reading data files
    • Web sources
  9. Descriptive statistics
    • Univariate descriptive statistics
    • Bivariate descriptive statistics
  10. Statistical modeling
    • Statistical models
    • Linear regression models
    • Panel data models
  11. Text mining
    • Reading text data
    • Analyzing text data
  12. Maps
    • Map functions in R
    • Map visualizations
  13. Networks
    • Network analysis basics
    • Network visualizations
    • International trade networks
  14. Reproducible research
    • R Markdown basics
    • Presentation basics
