Problem statements, discussions, and prototypes
This repo will host problem statements and inception discussions. Baselined requirements and context will be in the project-specific folder, and accompanying discussions will be in the discussion section.
- Clone the repository (if applicable) or navigate to the project directory.
- Create a virtual environment:
    python3 -m venv venv
- Activate the virtual environment:
  - On Windows:
      venv\Scripts\activate
  - On macOS/Linux:
      source venv/bin/activate
- Install the required packages:
    pip install -r requirements.txt
- Ensure your data files are placed in the data/ directory.
- Run the script:
    python school_mapping.py
- Output: The script will generate a school_mapping_results.csv file containing the matched schools along with their district information.
- The script transliterates school names from Devanagari to Roman script and matches them with the names in English.
- District information is used to filter potential matches to ensure they are from the same district.
- Fuzzy matching is applied to find the best match based on the transliterated school names.
- Matches with a score above a specified threshold (default: 80) are included in the final output.
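The district filter, best-match selection, and threshold described in the notes above can be sketched roughly as follows. This is a minimal illustration, not the code in school_mapping.py: it assumes the names are already transliterated, uses the standard library's difflib as a stand-in for a rapidfuzz scorer such as `fuzz.ratio` so it runs without extra packages, and the record layout (`id`, `name`, `district`), the `similarity` and `match_schools` helpers, and the sample schools are all hypothetical.

```python
from difflib import SequenceMatcher

THRESHOLD = 80  # default threshold mentioned in the notes above


def similarity(a: str, b: str) -> float:
    """Return a 0-100 similarity score (stand-in for a rapidfuzz scorer)."""
    return 100 * SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_schools(source_a, source_b, threshold=THRESHOLD):
    """For each Source A school, keep the best same-district Source B match."""
    # Group Source B by district so only same-district candidates are compared.
    by_district = {}
    for school in source_b:
        by_district.setdefault(school["district"], []).append(school)

    matches = []
    for a in source_a:
        candidates = by_district.get(a["district"], [])
        if not candidates:
            continue  # no school from the same district to compare against
        best = max(candidates, key=lambda b: similarity(a["name"], b["name"]))
        score = similarity(a["name"], best["name"])
        if score > threshold:  # only scores above the threshold are kept
            matches.append({
                "school_id_a": a["id"],
                "school_id_b": best["id"],
                "match_score": round(score, 1),
            })
    return matches


# Made-up sample records, already transliterated to Roman script.
source_a = [
    {"id": "A1", "name": "Adarsh Vidyalaya", "district": "D1"},
    {"id": "A2", "name": "Saraswati Vidya Mandir", "district": "D2"},
]
source_b = [
    {"id": "B1", "name": "Adarsh Vidyalay", "district": "D1"},
    {"id": "B2", "name": "Adarsh Vidyalaya", "district": "D3"},
    {"id": "B3", "name": "Government High School", "district": "D2"},
]

results = match_schools(source_a, source_b)
print(results)
```

In this sample, B2 is an exact name match for A1 but is never considered because it sits in a different district. That is the point of filtering by district before scoring.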
Columns in school_mapping_results.csv:
- school_id_a: School ID from Source A
- school_id_b: School ID from Source B
- match_score: Fuzzy match score
- school_name_a: School name from Source A
- school_name_b: School name from Source B
- district_id_a: District ID from Source A
- district_a: District name from Source A
- district_b: District name from Source B
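As a rough illustration of the columns listed above, the sketch below writes one row with the standard library's csv module and reads it back. The column order simply follows the list, which may not be the order the script actually uses. The `write_results` helper and the sample values are hypothetical, and an in-memory buffer stands in for the real file.

```python
import csv
import io

# Column names taken from the list above; the order is an assumption.
COLUMNS = [
    "school_id_a", "school_id_b", "match_score",
    "school_name_a", "school_name_b",
    "district_id_a", "district_a", "district_b",
]


def write_results(rows, handle):
    """Write match rows to an open text handle with a header line."""
    writer = csv.DictWriter(handle, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)


# Made-up example row.
row = {
    "school_id_a": "A1", "school_id_b": "B1", "match_score": 96.8,
    "school_name_a": "Adarsh Vidyalaya", "school_name_b": "Adarsh Vidyalay",
    "district_id_a": "D1", "district_a": "District One", "district_b": "District One",
}

buffer = io.StringIO()  # stands in for open("school_mapping_results.csv", "w", newline="")
write_results([row], buffer)

buffer.seek(0)
loaded = list(csv.DictReader(buffer))
print(loaded[0]["school_id_a"], loaded[0]["school_id_b"], loaded[0]["match_score"])
```

Values come back from `csv.DictReader` as strings, so `match_score` needs converting with `float()` before any numeric filtering.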
Required packages:
- pandas
- rapidfuzz