Some data scrapers from USOS via the official API and other methods made in Python for the promoCHATor project in the ML section.

Solvro/script-promochator-usos-scraper

Solvro ML - USOS scraper for PromoCHATor project

This repository contains scripts that scrape data from USOS for the PromoCHATor project in the Solvro ML section. The scripts collect data about university teachers, their scientific achievements, and abstracts of students' scientific papers. The project will be developed as needed.

Table of contents

  1. Description
  2. Technologies
  3. Development
    1. Quick start
    2. Github workflow
  4. Current team

Description

This repository contains scripts that scrape data from USOS. The scripts collect data about university teachers, their scientific achievements, and abstracts of students' scientific papers.

Technologies

The project uses the following languages and technologies:

  • Python 3.9.13

Development

Quick start

Set up the project locally

  1. Clone the repository:

    git clone https://github.com/Solvro/script-promochator-usos-scraper.git
    
  2. Change directory:

    cd script-promochator-usos-scraper
    
  3. Create a new virtual environment:

    python -m venv <your_env_name>
    
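  4. Activate the environment (replace <your_env_name> with the name chosen in step 3):

    ./<your_env_name>/Scripts/activate

    On Linux or macOS, use instead:

    source <your_env_name>/bin/activate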
    
  5. Install the required modules:

    python -m pip install -r requirements.txt
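
Before or after installing the modules, it can help to confirm that the virtual environment is really active and that the interpreter matches the version listed under Technologies. The following is a minimal sketch using only the standard library; the helper names in_virtualenv and matches_version are hypothetical and not part of this repository.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-created environment."""
    # Inside a venv, sys.prefix points at the environment directory, while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


def matches_version(major: int, minor: int) -> bool:
    """Return True when the running interpreter is exactly major.minor."""
    return sys.version_info[:2] == (major, minor)


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    print("Python 3.9 interpreter:", matches_version(3, 9))
```

If the first line prints False, the activation step most likely did not take effect in the current shell.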
    

Run usos-teachers-scraper

  1. Sign up for an API key:

    https://apps.usos.pwr.edu.pl/developers/
    
  2. Change directory:

    cd usos-teachers-scraper
    
  3. Create a config.json file. Choose a secret key of your own, then paste the Consumer Key and Consumer Secret generated in step 1:

    {
     "secret_key": "<your-secret-key>",
     "consumer_key": "<generated-consumer-key>",
     "consumer_secret": "<generated-consumer-secret>"
    }
    
  4. Run the script:

    python ./usos-teachers-scraper.py
    
  5. Visit the USOS authorization page:

    http://127.0.0.1:5000/start_oauth
    
  6. Fetch the teachers' data by visiting:

    http://127.0.0.1:5000/fetch_staff
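
A malformed config.json is an easy mistake to make in step 3, so it may be worth checking the file before running the script. The sketch below is a standalone check, not the scraper's own loading code: load_config and REQUIRED_KEYS are hypothetical names, and the rule that a value starting with "<" is an unfilled placeholder is an assumption based on the example above.

```python
import json
from pathlib import Path

# Key names taken from the config.json example in step 3.
REQUIRED_KEYS = ("secret_key", "consumer_key", "consumer_secret")


def load_config(path="config.json"):
    """Read the config file and fail early on missing or placeholder values."""
    config = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(config, dict):
        raise ValueError("config.json must contain a JSON object")
    for key in REQUIRED_KEYS:
        value = config.get(key, "")
        # Assumption: empty strings and untouched "<...>" placeholders
        # both mean that the key has not been filled in yet.
        if not isinstance(value, str) or not value or value.startswith("<"):
            raise ValueError(f"config.json: '{key}' is missing or still a placeholder")
    return config
```

Calling load_config() from inside the usos-teachers-scraper directory either returns the three values as a dictionary or raises a ValueError that names the offending key.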
    

Run usos-abstracts-scraper

  1. Change directory (from the repository root):

    cd usos-abstracts-scraper
    
  2. Run the script:

    python ./usos-abstracts-scraper.py
    
  3. Enter the initial thesis id and the final thesis id when prompted.
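
The two ids delimit the range of theses to visit. The sketch below shows one way such input could be validated and expanded into a sequence of ids. It is an illustration only: parse_id and thesis_ids are hypothetical helpers, and it assumes that ids are positive integers and that the final id is included in the range, neither of which is confirmed by the script itself.

```python
def parse_id(text: str) -> int:
    """Convert user input to an int, rejecting non-numeric text."""
    text = text.strip()
    if not text.isdecimal():
        raise ValueError(f"not a valid thesis id: {text!r}")
    return int(text)


def thesis_ids(first_id: int, last_id: int) -> range:
    """Return the thesis ids from first_id to last_id, both included."""
    # Assumption: ids start at 1 and the final id belongs to the range.
    if first_id < 1 or last_id < first_id:
        raise ValueError("expected 1 <= first_id <= last_id")
    # range() excludes its stop value, so add 1 to include last_id.
    return range(first_id, last_id + 1)
```

For example, thesis_ids(parse_id("3"), parse_id("5")) yields the ids 3, 4 and 5, while a reversed pair such as (5, 3) raises a ValueError instead of silently producing an empty run.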

Github workflow

After assigning yourself to a new task, follow these steps:

  1. git checkout main (check out the main branch)
  2. git pull origin main (pull the current changes from the main branch)
  3. git fetch (get up to date with the remote branches)
  4. git checkout -b type/task (create a new task branch)
  5. git add . (stage all the changes you have made)
  6. git commit -m "My changes description" (commit the changes with a proper description)
  7. git push origin type/task (push the changes to the remote branch)
  8. On GitHub, open a Pull Request (PR) from the remote branch

Warning

Do not push changes directly to the main branch.

For further information, read the Solvro handbook:

Github Solvro Handbook 🔥 - https://docs.google.com/document/d/1Sb5lYqYLnYuecS1Essn3YwietsbuLPCTsTuW0EMpG5o/edit?usp

Current team

This is our current team
