Management commands to retrieve course data from BuscaCursosUC and Catálogo UC and save it to the database.
To run the scraper directly from a terminal, use the management command:
python manage.py scrape <action> [YYYY-S]
Scraping can also be scheduled from the /scraper URL (staff users only). See the scheduler docs for more details on how scheduling works.
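As a rough illustration of the command-line shape above, the following stand-alone sketch mirrors the documented arguments with `argparse`. It is not the project's actual command class: `build_parser` is a hypothetical helper, and the period `2021-1` and the initials `PSI5005` are sample values only.

```python
import argparse

# Actions documented in this README.
ACTIONS = ("collect", "update", "delete", "search", "banner")


def build_parser() -> argparse.ArgumentParser:
    """Stand-alone approximation of the `scrape` command's arguments (hypothetical)."""
    parser = argparse.ArgumentParser(prog="manage.py scrape")
    parser.add_argument("action", choices=ACTIONS)
    # Optional period in YYYY-S form; None means it was not given.
    parser.add_argument("period", nargs="?", default=None)
    # Extra arguments used by the `search` and `banner` actions.
    parser.add_argument("--initials", default=None)
    parser.add_argument("--banner", default=None)
    return parser


if __name__ == "__main__":
    # Sample invocation: search one course's initials in a given period.
    args = build_parser().parse_args(["search", "2021-1", "--initials", "PSI5005"])
    print(args.action, args.period, args.initials)
```

In a real Django project the same arguments would normally be declared in the command's `add_arguments` method rather than in a free-standing parser.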
Available actions:
- collect -> Runs a full search in BC and Catálogo. Inserts or updates all BC content. Does NOT delete removed courses.
- update -> Updates data for courses present in a simple BC search. Creates a delete log with the courses not found.
- delete -> For every course in the delete log, retries the search in BC and deletes the course if it does not exist. The delete log must be cleared manually.
- search -> Retrieves the BC results for the given course initials, which must be provided as an extra argument (--initials <initials>).
- banner -> Scrapes the available quota (cupos) for all sections in a given period. Accepts a banner name parameter (--banner <banner_name>) that is added to the database.
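The two-step removal flow described for update and delete can be sketched as follows. This is a minimal in-memory model, not the project's code: the function names, the use of sets and lists, and the sample initials are all assumptions made for illustration.

```python
from typing import Callable, List, Set


def build_delete_log(known: Set[str], found: Set[str]) -> List[str]:
    """update step: log every known course that the simple search did not return."""
    return sorted(known - found)


def confirm_deletions(delete_log: List[str], exists_in_bc: Callable[[str], bool]) -> List[str]:
    """delete step: retry each logged course and keep only those still missing.

    The log itself is left untouched, matching the note that it must be
    cleared manually.
    """
    return [initials for initials in delete_log if not exists_in_bc(initials)]


if __name__ == "__main__":
    known = {"AAA100", "BBB200", "CCC300"}   # hypothetical courses in the database
    found = {"AAA100"}                       # hypothetical simple-search results
    log = build_delete_log(known, found)
    # Pretend that a retried search finds BBB200 after all.
    to_delete = confirm_deletions(log, lambda initials: initials == "BBB200")
    print(log, to_delete)
```

Splitting the work this way means a course that is only temporarily missing from one search is not deleted until a second lookup confirms it is gone.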
The scraper uses the same schema defined in the courses models:
- courses_course (id, initials, name, credits, req, con, restr, program, school, area, category)
- courses_section (id, course_id, period, section, nrc, teachers, schedule, format, campus, is_english, is_removable, is_special, available_quota, total_quota)
- courses_quota (section_id, date, category, quota, banner)
- courses_fullschedule (section_id, LMWJVS x 12345678)
- courses_scheduleinfo (section_id, total, ayu, clas, lab, pra, sup, tal, ter, tes)
- courses_category (id, name)
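One reading of the `LMWJVS x 12345678` notation in courses_fullschedule is one slot per (day, module) pair, which gives 48 slots per section. The sketch below only enumerates those pairs. The slot labels it produces (such as `L1`) are an assumption and may not match the real column names.

```python
from itertools import product
from typing import List

DAYS = "LMWJVS"       # day letters, as written in the schema summary
MODULES = "12345678"  # module numbers, as written in the schema summary


def schedule_slots() -> List[str]:
    """Enumerate every day/module combination, day-major (L1..L8, M1..M8, ...)."""
    return [day + module for day, module in product(DAYS, MODULES)]


if __name__ == "__main__":
    slots = schedule_slots()
    print(len(slots), slots[:3], slots[-1])
```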
The data is scraped from the following URLs:
- Courses main data: http://buscacursos.uc.cl/?cxml_semestre={PERIOD}&cxml_sigla={SIGLA} and http://buscacursos.uc.cl/?cxml_semestre={PERIOD}&cxml_nrc={NRC}
- Quota details: http://buscacursos.uc.cl/informacionVacReserva.ajax.php?nrc={NRC}&termcode={YEAR}-{SEMESTER}
- Programs: http://catalogo.uc.cl/index.php?tmpl=component&view=programa&sigla={SIGLA}
- Requirements and restrictions: http://catalogo.uc.cl/index.php?tmpl=component&view=requisitos&sigla=PSI5005
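The URL templates above can be filled in with small helpers such as the ones below. This is a sketch using only the standard library. The function names are hypothetical, the sample NRC and period are made up, and no request is actually sent.

```python
from urllib.parse import urlencode

BC_URL = "http://buscacursos.uc.cl/"
QUOTA_URL = "http://buscacursos.uc.cl/informacionVacReserva.ajax.php"
CATALOGO_URL = "http://catalogo.uc.cl/index.php"


def course_url_by_sigla(period: str, sigla: str) -> str:
    """Courses main data, looked up by course initials (sigla)."""
    return BC_URL + "?" + urlencode({"cxml_semestre": period, "cxml_sigla": sigla})


def course_url_by_nrc(period: str, nrc: int) -> str:
    """Courses main data, looked up by section NRC."""
    return BC_URL + "?" + urlencode({"cxml_semestre": period, "cxml_nrc": nrc})


def quota_url(nrc: int, year: int, semester: int) -> str:
    """Quota details for one section; termcode is {YEAR}-{SEMESTER}."""
    return QUOTA_URL + "?" + urlencode({"nrc": nrc, "termcode": f"{year}-{semester}"})


def catalogo_url(view: str, sigla: str) -> str:
    """Catálogo page; view is 'programa' or 'requisitos'."""
    return CATALOGO_URL + "?" + urlencode({"tmpl": "component", "view": view, "sigla": sigla})


if __name__ == "__main__":
    print(course_url_by_sigla("2021-1", "PSI5005"))
    print(quota_url(12345, 2021, 1))          # 12345 is a made-up NRC
    print(catalogo_url("requisitos", "PSI5005"))
```

Building the query string with `urlencode` rather than string formatting keeps values safely escaped if initials or banner-related parameters ever contain unusual characters.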