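source('travel_survey_analysis_functions.R')
# dplyr supplies %>%, glimpse(), group_by(), summarise() and filter() used below;
# attaching it again is harmless if the sourced file already loads it
library(dplyr)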
# Code Examples -----------------------------------------------------------
# Read the data from Elmer
# Use the read.dt function, which takes two arguments:
#   1. a SQL query or a table name (as shown in Elmer)
#   2. a string stating what the first argument is: 'sqlquery' or 'table_name'
# Example with a SQL query: first store the query in a variable,
# then pass that variable to read.dt
sql.query = "SELECT * FROM HHSurvey.v_persons_2017_2019_in_house"
person = read.dt(sql.query, 'sqlquery')
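# To read by table name instead of by query, use the following code.
# It returns the same data only if it points to the same view as the query;
# note that the query above reads v_persons_2017_2019_in_house, while this call reads v_persons_2017_2019
person = read.dt("HHSurvey.v_persons_2017_2019", 'table_name')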
# Check the data
# This step helps you understand the table and the variables you are analyzing:
# you can check for missing values, categories, etc.
# glimpse() lists every variable in the table with its data type
# and its first few values
glimpse(person)
# To check the distribution of a specific variable, group by it and count the rows.
# Here, for example, we look at the mode_freq_5 categories
person %>% group_by(mode_freq_5) %>% summarise(n=n())
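# For a numerical variable, describe() shows the range and other summary statistics.
# describe() is not base R; it is assumed to come from a package attached by the sourced file (e.g. psych or Hmisc).
# hh_wt_2019 (the weight column used below) serves here as an illustrative numerical variable
describe(person$hh_wt_2019)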
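# To drop rows where a variable is NA, filter on that variable.
# Best practice is to store the filtered table in a new variable rather than overwrite the original
person_no_na = person %>% filter(!is.na(mode_freq_3))
# Weighted number of people in each mode_freq_3 category (sum of the hh_wt_2019 weights)
walk <- person_no_na %>% group_by(mode_freq_3) %>% summarise(tot_people=sum(hh_wt_2019))
# Copy the result to the clipboard, tab-separated, for pasting into a spreadsheet.
# Writing to "clipboard" this way works on Windows
write.table(walk, "clipboard", sep="\t", row.names=FALSE)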
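# A minimal sketch, not part of the original workflow: turning weighted totals into shares.
# The toy data frame below is hypothetical (made-up category labels and weights); it only
# mimics the two columns used above, mode_freq_3 and hh_wt_2019, so that the example
# runs without a connection to Elmer. The names toy_person and toy_shares are illustrative.
library(dplyr)
toy_person <- data.frame(
  mode_freq_3 = c("Never", "Never", "1 day/week", NA),
  hh_wt_2019  = c(10, 30, 60, 5)
)
toy_shares <- toy_person %>%
  filter(!is.na(mode_freq_3)) %>%                   # drop rows with a missing category
  group_by(mode_freq_3) %>%
  summarise(tot_people = sum(hh_wt_2019)) %>%       # weighted total per category
  mutate(share = tot_people / sum(tot_people))      # each category's share of the weighted total
# With the survey data, the same mutate() step could be appended to the walk pipeline above.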