Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: improved location of activities and outputs analysis #252

Merged
merged 24 commits into from
Sep 24, 2024

Conversation

MarieMcLaurent
Copy link
Contributor

This PR includes improvements in location of activities by using new data source and a better use of existing data. The PR also include results analysis used for validating our developments.

Parameters

exclude_no_employee

With this new parameter in data.sirene.cleaned file, it exclude all enterprises without employee ("NN", "00", and NaN values for "trancheEffectifsEtablissement").

home_location_source

In synthesis.locations.home.locations, with parameter home_location_source at "tiles" home location can be retrieved from centroid of INSEE tiles. Downloaded from INSEE, tiles are processed in a new script file data.tiles.raw in the same model as data.bpe.raw. Then tuiles_id column replace building_id column in return of synthesis.locations.home.locations and synthesis.spatial.home.locations. (No use of this column found in other files)

education_location_source

With this new parameter, person with education trips are send in education location corresponding to their age ranges : primary school for 0 to 10 years old, middle school for 11 to 14, high school for 15 to 17 and higher education for more than 18 years old. This parameter has 3 possible values: bpe, adresses & weighted. bpe is default value that represent actual education affectation without any restriction. weighted enable this affectation with given student count as weight for each type of education location. And adresses allows user to give education location with student count from custom geojson or geopackage file.

This new assignment is achieved thanks to two major changes in the processing of education trips: taking into account age ranges for OD choices, and distinguishing between different types of destination education locations.
For OD with AGEREV10 variable from MOBSCO data, age_range column is added to return of data.od.cleaned and data.od.weighted. This enables to have new weight sum in function of ["origin_id", "destination_id","age_range"] in place of ["origin_id", "destination_id"] for the new assignment with possibility of re-aggregation without age_range to find the current processing (as work).
Distinction of different education locations comes with changes in synthesis.locations.education, synthesis.population.spatial.primary.locations & synthesis.population.spatial.primary.candidates files with add of "TYPEQU" column. With this new column and use of age variable from person data-frame, the distinction between educational locations is made by dividing people, OD and locations into age range using EDUCATION_MAPPING dictionary. To compensate for the mismatch between locations and ODs, false locations are added in synthesis.locations.education for each location category missing from the communes.

Analysis output

documentation.flow_output

This is an output file for analyzing eqasim results in comparison with other user-supplied results. The analysis is presented as a difference in flow destination volume between these results on the INSEE 1km grid.
Theses outputs were used in order to validate our new code implementations.

Example of result:

image
Output contains one graph per age range x destination purpose. Age range are ["Yrs:0-10", "Yrs:11-14", "Yrs:15-18", "Yrs:18-25", "Yrs:25-50", "Yrs:50-65", "Yrs:65-75", "Yrs:75+"].

PR compatibility

This also included non filter of entd from other issue. #249

@sebhoerl
Copy link
Contributor

Hi, I just had a look and everything looks quite nice, thanks a lot! I think merging in the recent changes (department filtering, ...) will reduce the change set a bit, once the other PR is merged.

Do you think it would be too much of effort to split this one in individual PRs for the individual changes? Because it would be quite a bit of changes in one go (= four major new features). Otherwise, we can go with this one.

The case of education_file is a bit different to the existing philosophy that all data that can be used is available somewhere, I assume this is something that you created yourself? I'm fine with merging it as a feature, and it clearly shows that our plans make sense to further modularize the pipeline, so everybody can plug together the steps that they need and for which they have the data :)

I'll have a closer look on the changes in the coming days, let's first merge the other PR.

@sebhoerl
Copy link
Contributor

Ok the other PR is merged, you can now merge back in here.

@MarieMcLaurent
Copy link
Contributor Author

Ok I merged the other PR back here with resolution of conflicts.
Even if the first commits separate well the different features, corrections following make it complicate for me to split changes into individual PR. So if it's good for you, I think it would be easier to go with this one PR.

Copy link
Contributor

@sebhoerl sebhoerl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Marie, thanks a lot for all of those changes, I now had a more detailed look. There are a couple of request / recommendations to keep the code structure a bit more simple and disentangle certain things that are repetitive. I hope everything is clear, otherwise also don't hesitate to contact me directly.

data/hts/entd/cleaned.py Outdated Show resolved Hide resolved
data/od/weighted.py Outdated Show resolved Hide resolved
data/od/weighted.py Show resolved Hide resolved
data/od/weighted.py Show resolved Hide resolved
data/od/weighted.py Show resolved Hide resolved
synthesis/population/spatial/primary/candidates.py Outdated Show resolved Hide resolved
synthesis/population/spatial/primary/candidates.py Outdated Show resolved Hide resolved
synthesis/population/spatial/primary/candidates.py Outdated Show resolved Hide resolved
synthesis/population/spatial/primary/locations.py Outdated Show resolved Hide resolved
synthesis/population/spatial/primary/locations.py Outdated Show resolved Hide resolved
@MarieMcLaurent
Copy link
Contributor Author

Hi Sebastian, firstly thanks for taking your time to make all those recommendations/ requests that help to understand the pipeline logic and simplify changes. I made most of changes as recommend to respect pipeline logic and nomenclature.
For od as in my comment on review as use of fillna only arises because lines with NaN weight are set to 0 in od.cleaned, it might be possible to delete them to avoid having to do the same thing for weighted. However, as the current code keeps these lines, is there any particular reason for this, or is it something that could be changed to avoid use of fillna?
Also with custom education location in an other file, I didn't included the given weight, as it was before, while keeping mix of custom locations and locations from bpe to avoid creation of multiple fake locations. Yet I don't know if this approch could beneficy more people or it would be better to have given weight to complete custom data with bpe.weighted file ?

I don't really know how work github review, do I need to mark as resolve all requests to validate them ?

synthesis/population/spatial/primary/candidates.py Outdated Show resolved Hide resolved
synthesis/locations/education.py Outdated Show resolved Hide resolved
@sebhoerl
Copy link
Contributor

Hi @MarieMcLaurent, just two small fixes where you left some debugging output. Otherwise it is good to merge :)

@sebhoerl
Copy link
Contributor

Ok looks good, just need to merge in develop and we can close the PR

@sebhoerl sebhoerl merged commit c2224ec into eqasim-org:develop Sep 24, 2024
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants