-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add/Remove Services and Routes in groups #112
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I love this performance win!
There are a couple of tests that need some minor tweaking (they currently cannot fail no matter what), so I'm not approving, but they will be quick fixes 👍
self._graph.graph['change_log'].remove(object_type='route', object_id=route_id, object_attributes=route_data) | ||
if isinstance(route_ids, set): | ||
route_ids = list(route_ids) | ||
missing_ids = [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code looks to be duplicated - if so, it could be refactored into a helper method
tests/test_core_schedule.py
Outdated
|
||
def test_adding_multiple_services_updates_changelog(schedule, services_to_add): | ||
schedule.add_services(services_to_add) | ||
list(schedule.change_log().iloc[-2:][['change_event', 'new_id']].to_records()) == [(0, 'add', 'new_service_0'), (1, 'add', 'new_service_1')] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you forget to make this an assertion? If I change ==
to !=
, I am pretty sure this test will still pass 😲
Also, it would be good to create the expectation directly from services_to_add
if possible, rather than hardcoding it like this using implicit knowledge of that fixture.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤦♀️
tests/test_core_schedule.py
Outdated
@@ -851,6 +881,22 @@ def test_removing_service_updates_vehicles(schedule): | |||
assert schedule.vehicles == {} | |||
|
|||
|
|||
def test_multiple_services_are_no_longer_present_in_schedule_after_removing(schedule_graph): | |||
s = Schedule(_graph=schedule_graph) | |||
assert set(s.service_ids()) == {'service1', 'service2'} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here we are relying on implicit knowledge of the content of the fixture, and it makes the test a little bit less clear than it could be. It feels nicer to me to query the fixture here for the service names we're going to pass to remove_services
(and maybe assert that we don't have an empty list), rather than asserting that these magic values are in the schedule we built from the fixture.
Also, I don't think we really care what the route IDs are, so we don't need to pre-assert on those, and in doing so we have added another reason that this test could fail that isn't really the thing we want to test (it could fail if our assumption about the route IDs in the fixture was wrong, even if the code under test is correct).
tests/test_core_schedule.py
Outdated
def test_removing_multiple_services_updates_changelog(schedule_graph): | ||
s = Schedule(_graph=schedule_graph) | ||
s.remove_services(['service1', 'service2']) | ||
list(s.change_log().iloc[-2:][['change_event', 'old_id']].to_records()) == [(0, 'remove', 'service1'), (1, 'remove', 'service2')] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Send out the assertion search squad! 😄
I had a mess about with this test, or at least the other one like it, and when I turned this statement into an assertion it failed. I think it was to do with the types of the objects in the two lists we're comparing; one was holding tuples and the other was holding numpy record objects, so they looked the same, but failed equality comparison. I'm guessing there is a quick and easy "pandonic" way to make sure our expectation list has records, not tuples (or the actual list has tuples rather than records)?
tests/test_core_schedule.py
Outdated
assert set(s.route_ids()) == set() | ||
|
||
|
||
def test_removing_multiple_services_updates_changelog(schedule_graph): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like these test names 👍
tests/test_core_schedule.py
Outdated
|
||
def test_adding_multiple_routes_updates_changelog(schedule, services_to_add): | ||
schedule.add_services(services_to_add) | ||
list(schedule.change_log().iloc[-2:][['change_event', 'new_id']].to_records()) == [(0, 'add', 'route_to_add1'), (1, 'add', 'route_to_add2')] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another missing assert, I think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Smashing!
@@ -11,6 +12,24 @@ def ensure_dir(direc): | |||
logging.warning(e) | |||
|
|||
|
|||
def setify(value: Union[str, list, set]): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These don't seem like persistence-related helper functions - maybe we should move them later on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah I have a really hard time naming things 🥲 what name would you give them?
@@ -783,14 +783,21 @@ def services_to_add(): | |||
|
|||
|
|||
def test_multiple_services_are_present_in_schedule_after_adding(schedule, services_to_add): | |||
existing_routes = set(schedule.route_ids()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test is much nicer now 😄
|
||
|
||
def test_adding_multiple_services_updates_changelog(schedule, services_to_add): | ||
schedule.add_services(services_to_add) | ||
list(schedule.change_log().iloc[-2:][['change_event', 'new_id']].to_records()) == [(0, 'add', 'new_service_0'), (1, 'add', 'new_service_1')] | ||
assert list(schedule.change_log().iloc[-len(services_to_add):][['change_event', 'new_id']].itertuples( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now with added asserts!
|
||
services_to_remove = list(s.service_ids()) | ||
s.remove_services(services_to_remove) | ||
assert list(s.change_log().iloc[-len(services_to_remove):][['change_event', 'old_id']].itertuples( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
list(s.change_log().iloc[-2:][['change_event', 'old_id']].to_records()) == [(0, 'remove', '1'), (1, 'remove', '2')] | ||
route_to_remove=['1', '2'] | ||
s.remove_routes(route_to_remove) | ||
assert list(s.change_log().iloc[-len(route_to_remove):][['change_event', 'old_id']].itertuples(index=False,name=None)) == [('remove', r) for r in route_to_remove] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
assert persistence.setify('ABC') == {'ABC'} | ||
|
||
def test_setifying_set_makes_no_difference(): | ||
assert persistence.setify({'ABC', 2}) == {'ABC', 2} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we expect the original set, or a semantically equivalent copy of that set?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah it won't be the original set, but I only care about the contents 😄
40 minutes to 2 minutes is massive, well done @KasiaKoz !! |
Before, each add/remove of a Service or Route object to/from Schedule was processed individually. When modifying PT Schedules on a larger scale this takes a long time to process. This change is not some huge smart re-write of existing methods, rather grabbing some low hanging fruit and wrapping it in a few loops that led to some major improvements. The biggest slow-down was (I think) due to updating the change log individually---these changes now take advantage of the big batch logging for changelog.
A test on a small test network (adding 10 services each with 3 routes inside)
![Screenshot 2022-04-01 at 17 00 49](https://user-images.githubusercontent.com/36536946/161301619-57a10189-e7e9-47f4-a6c8-e33c35f71b5c.png)
![Screenshot 2022-04-01 at 17 00 59](https://user-images.githubusercontent.com/36536946/161301635-f8fd8f67-1431-4545-a6ad-9f8621a78692.png)
![Screenshot 2022-04-01 at 17 01 06](https://user-images.githubusercontent.com/36536946/161301682-52ef6d5f-0ef0-42da-9f99-2b034c614fd2.png)
![Screenshot 2022-04-01 at 17 01 13](https://user-images.githubusercontent.com/36536946/161301693-95c301af-a73d-4e5b-b7e2-9b6aa3a739b9.png)
Before:
After:
Removing services, Before:
After:
I attempted this to speed things up in NZ. Adding ~300 services used to take 40mins, now takes ~2mins 💪