Homework 1
Download hw1.py.
Write regular expressions matching the following cases (one expression per case):
- Numbers (e.g., two thousands eighteen).
- Dates (e.g., Dec. 25, 2018).
Use the template in hw1.py to write your expressions:
RE_NUMBER = re.compile('your expression')
RE_DATE = re.compile('your expression')
Try to cover as many patterns as possible. Be aware that these cases can occur anywhere in a document. In the report, describe what kind of groups you make in your expressions.
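As an illustration of what "groups" can look like, here is a minimal sketch of a date pattern with named groups. The name RE_DATE_SKETCH and the choice of groups are assumptions made for this example, not part of hw1.py, and the pattern is deliberately incomplete: it only handles the "month day, year" shape and will over-match words that merely start like a month name.

```python
import re

# Hypothetical, incomplete pattern (not the hw1.py solution):
# month name (full or abbreviated), day with optional ordinal suffix,
# optional comma, four-digit year.
MONTHS = r'Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec'
RE_DATE_SKETCH = re.compile(
    r'\b(?P<month>(?:' + MONTHS + r')[a-z]*)\.?\s+'   # e.g. "Dec." or "April"
    r'(?P<day>\d{1,2})(?:st|nd|rd|th)?,?\s+'          # e.g. "25," or "1st,"
    r'(?P<year>\d{4})\b'                              # e.g. "2018"
)

# search() finds the pattern anywhere in the document, not only at the start.
m = RE_DATE_SKETCH.search('The party is on Dec. 25, 2018 at noon.')
if m:
    print(m.group('month'), m.group('day'), m.group('year'))
```

Named groups make the later normalization step easier to read, since each part of the date can be retrieved by name rather than by position.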
Normalize numbers into digits:
- two thousands eighteen → 2018
- 3 hundreds twenty one → 321
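One possible way to compute the digit form of an already-matched number phrase is sketched below. The helper words_to_int and the lookup tables are hypothetical names introduced for this example. The sketch assumes the phrase contains only digits, number words, and scale words (optionally pluralized, as in the examples above), and it does not cover ordinals, decimals, or "and".

```python
import re

# Hypothetical lookup tables for this sketch.
UNITS = {w: i for i, w in enumerate(
    'zero one two three four five six seven eight nine ten eleven twelve '
    'thirteen fourteen fifteen sixteen seventeen eighteen nineteen'.split())}
TENS = {w: 10 * i for i, w in enumerate(
    'twenty thirty forty fifty sixty seventy eighty ninety'.split(), start=2)}
SCALES = {'hundred': 100, 'thousand': 1000, 'million': 1000000}

def words_to_int(phrase):
    """Convert a phrase such as 'two thousands eighteen' into an integer."""
    total, current = 0, 0
    for tok in re.split(r'[\s-]+', phrase.strip().lower()):
        if tok.isdigit():                 # mixed input such as '3 hundreds'
            current += int(tok)
        elif tok in UNITS:
            current += UNITS[tok]
        elif tok in TENS:
            current += TENS[tok]
        else:
            # Accept both 'thousand' and 'thousands'.
            scale = SCALES.get(tok[:-1] if tok.endswith('s') else tok)
            if scale is None:
                raise ValueError('unknown number word: ' + tok)
            if scale == 100:
                # 'hundred' multiplies the running group only.
                current = max(current, 1) * 100
            else:
                # Larger scales close the current group.
                total += max(current, 1) * scale
                current = 0
    return total + current

print(words_to_int('two thousands eighteen'))
print(words_to_int('3 hundreds twenty one'))
```

A converter like this could be called from the replacement function passed to RE_NUMBER.sub, so that only the matched spans of the input string are rewritten.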
Normalize dates into a standardized format:
- April 1st, 2018 → 2018/04/01
- Dec. 25 2018 → 2018/12/25
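The two conversions above can be sketched with re.sub and a replacement function. The names RE_DATE_DEMO, to_standard, and norm_date_sketch are assumptions for this example, and the pattern covers only the "month day year" shape. A full solution would need many more patterns.

```python
import re

# Map the first three letters of each month name to its number.
MONTH_NUM = {name: i for i, name in enumerate(
    ['jan', 'feb', 'mar', 'apr', 'may', 'jun',
     'jul', 'aug', 'sep', 'oct', 'nov', 'dec'], start=1)}

# Hypothetical pattern: capitalized word, day with optional ordinal suffix,
# optional comma, four-digit year.
RE_DATE_DEMO = re.compile(
    r'\b([A-Z][a-z]{2})[a-z]*\.?\s+(\d{1,2})(?:st|nd|rd|th)?,?\s+(\d{4})\b')

def to_standard(match):
    """Replacement callback: return YYYY/MM/DD, or the text unchanged."""
    month = MONTH_NUM.get(match.group(1).lower())
    if month is None:
        # The capitalized word was not a month name, so leave it alone.
        return match.group(0)
    return '{}/{:02d}/{:02d}'.format(match.group(3), month, int(match.group(2)))

def norm_date_sketch(s):
    # sub() rewrites every match and keeps the rest of the string intact.
    return RE_DATE_DEMO.sub(to_standard, s)

print(norm_date_sketch('April 1st, 2018'))
print(norm_date_sketch('Dec. 25 2018'))
```

Using a callback rather than a fixed replacement string lets the function look up the month number and zero-pad the day, and lets it decline to rewrite a false match.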
Complete the norm_number and norm_date functions in hw1.py, which take a string and convert all numbers and dates in the string into the standardized formats, respectively:
def norm_number(s):
    """
    :param s: the input string
    :return: the input string where all numbers are converted into their digit-forms.
    """
    return s

def norm_date(s):
    """
    :param s: the input string
    :return: the input string where all dates are standardized.
    """
    return s
In the report, provide interesting test cases for these functions.
Use any code in data_aggregation.py to extract information for each course with its final exam schedule and save the result to a JSON file called course_exam_spring_2018.json. Make sure to include information from all departments in the Course Atlas. In the report, describe any challenges that were not discussed in class.
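The saving step might look like the sketch below. The record layout (keys such as 'course' and 'exam') and the placeholder values are purely hypothetical, since the assignment does not prescribe a schema, and nothing here reflects the actual contents of data_aggregation.py. Only the standard json module is used.

```python
import json

# Placeholder records with a hypothetical schema; real entries would come
# from the extraction step and cover every department.
SAMPLE_COURSES = [
    {'course': 'DEPT 101', 'section': '1', 'exam': 'PLACEHOLDER DATE AND TIME'},
    {'course': 'DEPT 202', 'section': '2', 'exam': 'PLACEHOLDER DATE AND TIME'},
]

def serialize_courses(courses):
    """Return the course list as an indented JSON string."""
    return json.dumps(courses, indent=2, ensure_ascii=False)

def save_courses(courses, path='course_exam_spring_2018.json'):
    """Write the course list to the JSON file named in the assignment."""
    with open(path, 'w', encoding='utf-8') as fout:
        fout.write(serialize_courses(courses))

print(serialize_courses(SAMPLE_COURSES))
```

Loading the file back with json.load and comparing it to the in-memory list is a quick way to confirm that nothing was lost in serialization.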
Submit the following to: https://canvas.emory.edu/courses/41979/assignments/115326
- hw1.py: including the code assigned above.
- hw1.pdf: the report describing your approaches.
- course_exam_spring_2018.json: the output of the schedule extraction.
Copyright © 2018 Emory University - All Rights Reserved.