Nav 47 refactor raptor algorithm #68

munterfi · 2024-06-22T14:25:40Z

A PR for the refactoring of the Raptor algorithm we started together.

Summary:

- Rename Raptor components to be more consistent.
- Factor out classes according to SRP:
  - Route scanner
  - Footpath relaxer
  - Label postprocessor
  - Objective class to hold the progress of the Raptor
- Extract interfaces for the Raptor algorithm.
- Remove the redundant Same Stop Transfer Generator.
- Bugfix: departure time was doubled when starting with a location.

This PR goes across the project, so avoid starting new branches at the moment. Let's finish the refactoring first. I think it would make sense to discuss the changes in a short meeting.

…on is only favored because same stop transfer time is subtracted. Additionally, add a method that reduces travel time by departing later (DEPARTURE) or arriving earlier (ARRIVAL), combining the last labels when possible.

munterfi · 2024-06-22T14:30:37Z

Eventually we could also move the marked stops into the Objective class?

clukas1

Very nice work, now I can follow my own code better :)

I don't see any deal-breaking issues that would prevent approval of this PR, but I have some comments to discuss:

- Connection - Comparable: I would not implement Comparable for connections. I believe comparing them by arrival time does not make much sense, or at least comparisons by departure time, travel time or number of transfers are just as plausible.
- Raptor / Objective: I like having this objective container for handling query-specific attributes, and I agree that the markedStops should be managed by the spawnFromStops method. However, maybe we should rename the Raptor class to Router or RaptorRouter (which implements the RaptorAlgorithm) and the Objective to Request (because in my opinion it's not only an objective container). The idea is that we accept requests via the RaptorAlgorithm interface and keep all constant Raptor-relevant data in this class, but move all of the request processing to the Request instance (which only keeps data relevant to the request). This could reduce the argument passing even more. As a result, spawnFromStops would also be moved to Request.
- Impl: I now know why I like having interface names starting with I (or some other convention). Can we rename the Raptor interfaces to RaptorLeg / RaptorConnection and the implementations to Leg / Connection?

clukas1 · 2024-06-23T10:40:56Z

src/main/java/ch/naviqore/raptor/RaptorAlgorithm.java

+
+public interface RaptorAlgorithm {
+
+    static RaptorBuilder builder(int sameStopTransferTime) {


This doesn't really work going forward. I think the RaptorBuilder is too implementation-specific to keep in the interface. In any case, at the moment the RaptorBuilder lives inside the impl package.

Agreed, I think we will have to implement a factory for the Raptor.
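
To make the factory idea concrete, here is a minimal sketch of how the interface could stay free of construction details. Only the name RaptorAlgorithm and the sameStopTransferTime parameter come from this PR; the method signature, `FactoryBuiltRouter` and `RaptorFactory` are assumptions made up for illustration.

```java
import java.util.List;

// Hypothetical, trimmed-down public interface: no builder, no implementation details.
interface RaptorAlgorithm {
    List<String> routeEarliestArrival(String sourceStopId, String targetStopId, int departureTime);
}

// Hypothetical implementation that would live in the implementation package.
final class FactoryBuiltRouter implements RaptorAlgorithm {
    private final int sameStopTransferTime;

    FactoryBuiltRouter(int sameStopTransferTime) {
        this.sameStopTransferTime = sameStopTransferTime;
    }

    int sameStopTransferTime() {
        return sameStopTransferTime;
    }

    @Override
    public List<String> routeEarliestArrival(String sourceStopId, String targetStopId, int departureTime) {
        // Placeholder result; a real router would scan routes and relax footpaths here.
        return List.of(sourceStopId + " -> " + targetStopId + " @" + departureTime);
    }
}

// Hypothetical factory: the only place that knows which implementation is created.
final class RaptorFactory {
    private RaptorFactory() {
    }

    static RaptorAlgorithm create(int sameStopTransferTime) {
        if (sameStopTransferTime < 0) {
            throw new IllegalArgumentException("sameStopTransferTime must not be negative");
        }
        return new FactoryBuiltRouter(sameStopTransferTime);
    }
}

class FactoryDemo {
    public static void main(String[] args) {
        // Callers depend on the interface only.
        RaptorAlgorithm raptor = RaptorFactory.create(120);
        System.out.println(raptor.routeEarliestArrival("A", "B", 28800));
    }
}
```

With this shape, swapping in another Raptor variant later would only touch the factory, not the callers.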

munterfi · 2024-06-24T10:00:45Z

> Very nice work, now I can follow my own code better :)

@clukas1, thanks for the review and feedback!

> I don't see any deal-breaking issues that would prevent approval of this PR, but I have some comments to discuss:
>
> Connection - Comparable: I would not implement Comparable for connections. I believe comparing them by arrival time does not make much sense, or at least comparisons by departure time, travel time or number of transfers are just as plausible.

My rationale behind using Comparable is to enforce a consistent order, rather than relying on potentially non-deterministic or algorithm-specific internals. However, I agree that comparing by arrival time might not be the most logical choice. We could consider alternatives like comparing by departure time, travel time, or number of transfers. But I think enforcing Comparable in the interface is good, because it makes sure the ordering is thought through.

> Raptor / Objective: I like having this objective container for handling query-specific attributes, and I agree that the markedStops should be managed by the spawnFromStops method. However, maybe we should rename the Raptor class to Router or RaptorRouter (which implements the RaptorAlgorithm) and the Objective to Request (because in my opinion it's not only an objective container). The idea is that we accept requests via the RaptorAlgorithm interface and keep all constant Raptor-relevant data in this class, but move all of the request processing to the Request instance (which only keeps data relevant to the request). This could reduce the argument passing even more. As a result, spawnFromStops would also be moved to Request.

Agree to renaming Raptor to RaptorRouter, see 91b0772.

However, I vote against renaming Objective to Request, since the Objective is something internal that is not even used outside of the spawnFromStops method. I borrowed the term from algorithmics: the objective (function) is what should be minimized, in our case the arrival time and the number of transfers. I think this fits the terminology quite well, but perhaps there is a better term? From my point of view, a Request is something that is sent to the RaptorRouter from outside and answered with a Response object.

I may not fully understand the suggestion to move the spawnFromStops method, but I like the separation between the main loop in spawnFromStops and the objective that is optimized, so the main logic of the process stays in the RaptorRouter. This allows us to separate the control flow cleanly from the complex details, which are then handled by the RouteScanner, FootpathRelaxer and the Objective. To implement the range query, we can just reuse the same Objective and spawn from the stops again at the next later departures at the given stops.

> Impl: I now know why I like having interface names starting with I (or some other convention). Can we rename the Raptor interfaces to RaptorLeg / RaptorConnection and the implementations to Leg / Connection?

Can you further elaborate on this? LegImpl should never be visible outside of the Raptor package. So only ch.naviqore.raptor.Leg should be visible from the outside, which is quite clear that it belongs to the Raptor. The user does not need to know if it is an interface or a concrete class. If I use a class with Impl postfix outside the package, then it triggers a weird feeling. That's why I like it somehow 😄

But you are also right about the Impl postfix (removed in 5dccb64), see this article: https://www.baeldung.com/java-interface-naming-conventions

> Another widespread pattern in enterprise Java applications comes from the Hungarian Notation. In this section, we’ll briefly talk about the preceding I and the Impl suffix patterns and why we should avoid them.
>
> The preceding I pattern suggests that any interface name starts with the capital letter I, an abbreviation for Interface. This is more common in C# code, where we don’t have a keyword to differentiate interface implementation from inheritance. However, this wouldn’t be necessary in Java since we can differentiate implementation from inheritance by looking at the keywords implements and extends.
>
> The Impl suffix pattern suggests naming the interface’s implementations instead of the interface itself. Hence, all implementation names end with Impl. This usually appears when we create an interface with a single implementation and we can’t find a name that truly represents its specialization. However, the word Impl doesn’t add anything since the class signature shows it’s implementing something.
>
> Therefore, we must avoid naming interfaces and classes using the I or Impl patterns, for example, UserImpl, IUser, IIdentifiable, and IdentifiableImpl.

We should discuss these points in more detail at our meeting on Wednesday.

clukas1 · 2024-06-24T15:17:25Z

> My rationale behind using Comparable is to enforce a consistent order, rather than relying on potentially non-deterministic or algorithm-specific internals. However, I agree that comparing by arrival time might not be the most logical choice. We could consider alternatives like comparing by departure time, travel time, or number of transfers. But I think enforcing Comparable in the interface is good, because it makes sure the ordering is thought through.

I would still prefer not to implement it. In my opinion, if we implement it, it has to be 100% clear why and how it is intended, or at least easy to explain. In this case, I think there are many logical ways to implement it and none outweighs the others, so I think it's cleaner not to implement it, which makes it explicit that callers have to define how they want to sort / compare the connections.
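
As an illustration of what "explicit" could mean here, the following sketch offers named comparators instead of a single natural ordering. `SketchConnection` and `ConnectionOrderings` are hypothetical names that are not part of this PR, and times are assumed to be seconds after midnight.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for a connection; times are assumed to be seconds after midnight.
record SketchConnection(int departureTime, int arrivalTime, int numTransfers) {
    int travelTime() {
        return arrivalTime - departureTime;
    }
}

// Named orderings: the caller has to pick one, nothing is implied by the type itself.
final class ConnectionOrderings {
    static final Comparator<SketchConnection> BY_ARRIVAL =
            Comparator.comparingInt(SketchConnection::arrivalTime);
    static final Comparator<SketchConnection> BY_TRAVEL_TIME =
            Comparator.comparingInt(SketchConnection::travelTime);
    static final Comparator<SketchConnection> BY_TRANSFERS_THEN_ARRIVAL =
            Comparator.comparingInt(SketchConnection::numTransfers).thenComparing(BY_ARRIVAL);

    private ConnectionOrderings() {
    }
}

class OrderingDemo {
    public static void main(String[] args) {
        List<SketchConnection> connections = new ArrayList<>(List.of(
                new SketchConnection(100, 500, 0),
                new SketchConnection(300, 450, 2),
                new SketchConnection(200, 480, 1)));

        // The chosen ordering is visible at the call site.
        connections.sort(ConnectionOrderings.BY_TRAVEL_TIME);
        System.out.println(connections);
    }
}
```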

> Agree to renaming Raptor to RaptorRouter, see 91b0772.

Thanks

> However, I vote against renaming Objective to Request, since the Objective is something internal that is not even used outside of the spawnFromStops method. I borrowed the term from algorithmics: the objective (function) is what should be minimized, in our case the arrival time and the number of transfers. I think this fits the terminology quite well, but perhaps there is a better term? From my point of view, a Request is something that is sent to the RaptorRouter from outside and answered with a Response object.
>
> I may not fully understand the suggestion to move the spawnFromStops method, but I like the separation between the main loop in spawnFromStops and the objective that is optimized, so the main logic of the process stays in the RaptorRouter. This allows us to separate the control flow cleanly from the complex details, which are then handled by the RouteScanner, FootpathRelaxer and the Objective. To implement the range query, we can just reuse the same Objective and spawn from the stops again at the next later departures at the given stops.

I get why you want to name it Objective. However, I disagree, since the objective is implemented in the algorithm (minimize arrivalTime / maximize departureTime) and cannot really be changed by replacing the Objective, except for the TimeType (which by this logic would have to carry the name objective). Therefore, I would name the object containing all the route-request-specific details something like Request or Query, since it will hold all details relevant to the specific request. This is also the reason why I would move spawnFromStops to this object. I regard the RaptorRouter as something like a service that holds all schedule-relevant data and accepts requests. The processing of a request can then be done by a temporary processing instance (Query, Request, or at the moment Objective).
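
The split described here could look roughly like the sketch below: a long-lived service that keeps the constant data, and a short-lived object per request that owns all working state. `ServiceRouter` and `SketchQuery` are hypothetical names, and the routing itself is reduced to seeding the source stop.

```java
import java.util.Arrays;

// Hypothetical long-lived service: holds only data that is constant across requests.
final class ServiceRouter {
    private final int numStops;

    ServiceRouter(int numStops) {
        this.numStops = numStops;
    }

    // Every call creates a fresh query object, so no request state leaks between calls.
    int[] route(int sourceStop, int departureTime) {
        return new SketchQuery(numStops, sourceStop, departureTime).run();
    }
}

// Hypothetical per-request object: owns the request parameters and the working variables.
final class SketchQuery {
    static final int INFINITY = Integer.MAX_VALUE;

    private final int[] bestTimes;
    private final int sourceStop;
    private final int departureTime;

    SketchQuery(int numStops, int sourceStop, int departureTime) {
        this.bestTimes = new int[numStops];
        Arrays.fill(bestTimes, INFINITY);
        this.sourceStop = sourceStop;
        this.departureTime = departureTime;
    }

    int[] run() {
        // The round loop (spawnFromStops in this PR) would live here;
        // this sketch only seeds the source stop.
        bestTimes[sourceStop] = departureTime;
        return bestTimes.clone();
    }
}

class QueryDemo {
    public static void main(String[] args) {
        ServiceRouter router = new ServiceRouter(3);
        System.out.println(Arrays.toString(router.route(0, 28800)));
        System.out.println(Arrays.toString(router.route(2, 30000)));
    }
}
```

Because the working arrays belong to the query, two requests against the same router cannot see each other's state.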

> Can you further elaborate on this? LegImpl should never be visible outside of the Raptor package. So only ch.naviqore.raptor.Leg should be visible from the outside, which is quite clear that it belongs to the Raptor. The user does not need to know if it is an interface or a concrete class. If I use a class with Impl postfix outside the package, then it triggers a weird feeling. That's why I like it somehow 😄

I agree that I wouldn't want to see an Impl postfix outside of the package. But I also get a cold shiver down my back if I see Impl inside the package, so thanks for renaming. The reason I would also like to name the interfaces RaptorLeg / RaptorConnection instead of Leg / Connection is that it avoids having to work with two types named Leg / Connection at the service level, which results in type declarations spelling out the fully qualified name (e.g. ch.naviqore.raptor.Leg instead of simply RaptorLeg). Java should introduce alias imports, that would also resolve this issue...

clukas1 · 2024-06-24T19:01:54Z

src/main/java/ch/naviqore/raptor/router/Query.java

@@ -29,7 +29,7 @@ class Query {
    private final TimeType timeType;

    private final int[] targetStops;
-    private final int cutOffTime;


Sorry to be picky, but cut off is not one word 😅

Sorry again, my bad. Turns out American English uses cutoff, just feels wrong (and I prefer British English).

We should define what dialect to use and then apply this consistently throughout our codebase.

Based on some quick research (https://stackoverflow.com/questions/157807/gb-english-or-us-english), American English seems to be the preferred choice in programming.

There are also some British English spellings that look rather unusual to me:

| American English | British English |
|------------------|-----------------|
| Color            | Colour          |
| Initialize       | Initialise      |
| Optimize         | Optimise        |
| Center           | Centre          |
| License          | Licence         |
| Traveler         | Traveller       |
| Program          | Programme       |

clukas1 · 2024-06-24T19:06:01Z

src/main/java/ch/naviqore/raptor/router/Query.java

@@ -29,7 +29,7 @@ class Query {
    private final TimeType timeType;

    private final int[] targetStops;
-    private final int cutOffTime;


Sorry again, my bad. Turns out American English uses cutoff, just feels wrong (and I prefer British English).

clukas1 · 2024-06-24T19:11:59Z

src/main/java/ch/naviqore/raptor/router/RaptorRouter.java

@@ -131,7 +131,7 @@ private List<Query.Label[]> spawnFromStops(int[] sourceStopIndices, int[] target
        // initially relax all source stops and add the newly improved stops by relaxation to the marked stops
        Set<Integer> markedStops = query.initialize();
        markedStops.addAll(footpathRelaxer.relaxInitial(sourceStopIndices));
-        markedStops = query.removeSubOptimalLabelsForRound(0, markedStops);


Although not orthographically correct, camel casing in our company treats prefixes (sub, super, pre, post, etc.) as independent words. I generally prefer this because it improves readability but know that it's strictly speaking incorrect.

clukas1 · 2024-06-24T19:14:58Z

src/main/java/ch/naviqore/raptor/router/FootpathRelaxer.java

@@ -31,18 +31,18 @@ class FootpathRelaxer {

    /**
     * @param raptorRouter the current raptor instance for access to the data structures.
-     * @param objective    the best time per stop and label per stop and round.
+     * @param query        the best time per stop and label per stop and round.


The param description for query should be something like "object containing the query configuration and intermediate working variables".

clukas1 · 2024-06-24T19:16:57Z

src/main/java/ch/naviqore/raptor/router/Query.java

@@ -9,10 +9,10 @@
 import java.util.*;

 /**
- * The objective stores the progress of the raptor algorithm. Each request needs a new objective instance.
+ * The query stores the progress of the raptor algorithm. Each request needs a new query instance.


The query stores the configuration of the raptor routing request and holds working variables to store the progress of the raptor algorithm.

clukas1 · 2024-06-24T19:17:48Z

src/main/java/ch/naviqore/raptor/router/RouteScanner.java

@@ -31,20 +31,20 @@ class RouteScanner {

    /**
     * @param raptorRouter the current raptor instance for access to the data structures.
-     * @param objective    the best time per stop and label per stop and round.
+     * @param query        the best time per stop and label per stop and round.


same as above

clukas1 · 2024-06-24T19:19:59Z

src/main/java/ch/naviqore/raptor/impl/RaptorConnection.java

@@ -15,7 +15,7 @@
 @NoArgsConstructor(access = AccessLevel.PACKAGE)
 @Getter
 @ToString
-class RaptorConnection implements Connection {
+class RaptorConnection implements Connection, Comparable<Connection> {


Why implement Comparable? I still don't see a reason to keep this.

This is exactly the problem I am trying to explain: we don't use it, but we should.

We should sort all our returned connections in RaptorAlgorithm / Router by calling sort before returning in the LabelProcessor:

return connections.stream().sorted().toList();

This will then implicitly call the compareTo method required by the Comparable interface.

If we do this now, most of our tests fail. This is not because the Raptor is not working, but simply due to sorting issues. The problem is that the way we currently implement the Raptor influences the order of the results, which is not very stable. Additionally, if we implement further versions (native or mcRaptor), this instability will cause further issues.

In my opinion, a defined sort order for connections is an essential property. Otherwise, the order is not transparent to the user.
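
A small sketch of the effect described above: with a natural ordering on the connection type, the returned list no longer depends on the order in which the algorithm happened to produce the results. `OrderedConnection` and the chosen ordering (arrival time, then number of transfers) are assumptions for illustration only.

```java
import java.util.List;

// Hypothetical connection with a natural ordering: arrival time first, then number of transfers.
record OrderedConnection(int arrivalTime, int numTransfers) implements Comparable<OrderedConnection> {
    @Override
    public int compareTo(OrderedConnection other) {
        int byArrival = Integer.compare(arrivalTime, other.arrivalTime);
        return byArrival != 0 ? byArrival : Integer.compare(numTransfers, other.numTransfers);
    }
}

class SortedResultDemo {
    // sorted() without arguments uses compareTo, so the result order is independent of the input order.
    static List<OrderedConnection> stableResult(List<OrderedConnection> connections) {
        return connections.stream().sorted().toList();
    }

    public static void main(String[] args) {
        List<OrderedConnection> oneOrder = List.of(
                new OrderedConnection(500, 0),
                new OrderedConnection(450, 2),
                new OrderedConnection(450, 1));
        List<OrderedConnection> anotherOrder = List.of(
                new OrderedConnection(450, 1),
                new OrderedConnection(500, 0),
                new OrderedConnection(450, 2));

        // Both calls yield the same list, whatever order the results were produced in.
        System.out.println(stableResult(oneOrder).equals(stableResult(anotherOrder)));
    }
}
```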

munterfi · 2024-06-24T20:44:44Z

@clukas1, thanks for the feedback! I tried to address most of it, feel free to continue. I will not work on the refactoring until our meeting on Wednesday to avoid conflicts.

> I get why you want to name it Objective. However, I disagree, since the objective is implemented in the algorithm (minimize arrivalTime / maximize departureTime) and cannot really be changed by replacing the Objective, except for the TimeType (which by this logic would have to carry the name objective). Therefore, I would name the object containing all the route-request-specific details something like Request or Query, since it will hold all details relevant to the specific request. This is also the reason why I would move spawnFromStops to this object. I regard the RaptorRouter as something like a service that holds all schedule-relevant data and accepts requests. The processing of a request can then be done by a temporary processing instance (Query, Request, or at the moment Objective).

I did my best to move the main routing logic from the RaptorRouter to the Query object and to encapsulate access to the labels in the Objective object (you can rename this if you don't like the name; what matters to me is that access to the labels and best times is encapsulated, which makes it a lot easier to follow the routing algorithm in the debugger). I also introduced an internal RaptorData interface for access to the Raptor data structures (interface segregation).

Additionally, I have discovered a strange issue in the getBestTime method, where I suspect that we are not updating the global best time when relaxing footpaths:

public-transit-service/src/main/java/ch/naviqore/raptor/router/Objective.java

Line 65 in 92e2ab0

    // TODO: Strangely if this method is changed with the method above in 'getBestTimeForAllTargetStops' in the Query

To reproduce the issue, remove the _REMOVE suffix here:

public-transit-service/src/main/java/ch/naviqore/raptor/router/Query.java

Line 189 in 92e2ab0

int bestTimeForStop = objective.getBestTime_REMOVE(targetStopIdx);

and run all tests.

My interpretation is that we can remove this method and directly use getBestTime if we track the global best times correctly?

clukas1 · 2024-06-24T21:57:06Z

@munterfi some final touches from my side. Good to merge in my opinion.

Summary of changes:

> My interpretation is that we can remove this method and directly use getBestTime if we track the global best times correctly?

Actually, both methods are needed (one of the reasons I introduced the transfer-only test). The problem is that if you use the comparableBestTime as the cut-off value, all transfer legs will be removed at the end of the round. While writing this, I just realized that the best way would have been to add/subtract the same stop transfer time to route arrivals/departures instead of subtracting/adding it to transfers, which would have solved this (an improvement for a later issue)... In any case, I've renamed the methods and added a docstring to clarify when each one should be used in e2ce8f4.
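
To illustrate why the two values differ, here is a rough sketch for the earliest-arrival case at a single stop. It assumes that the comparable value of a transfer arrival is the arrival time minus the same stop transfer time, as described in this thread; `SketchBestTimes` and its methods are hypothetical and do not mirror the actual class.

```java
// Hypothetical tracker for one stop, earliest-arrival case only.
final class SketchBestTimes {
    private final int sameStopTransferTime;
    private int actualBestTime = Integer.MAX_VALUE;
    private int comparableBestTime = Integer.MAX_VALUE;

    SketchBestTimes(int sameStopTransferTime) {
        this.sameStopTransferTime = sameStopTransferTime;
    }

    void arriveByRoute(int arrivalTime) {
        actualBestTime = Math.min(actualBestTime, arrivalTime);
        comparableBestTime = Math.min(comparableBestTime, arrivalTime);
    }

    void arriveByTransfer(int arrivalTime) {
        actualBestTime = Math.min(actualBestTime, arrivalTime);
        // Assumption: transfer arrivals are shifted earlier by the same stop transfer time
        // so that they can be compared with route arrivals.
        comparableBestTime = Math.min(comparableBestTime, arrivalTime - sameStopTransferTime);
    }

    int actualBestTime() {
        return actualBestTime;
    }

    int comparableBestTime() {
        return comparableBestTime;
    }

    // A label is dropped when it arrives later than the cut-off value.
    static boolean isPruned(int labelArrivalTime, int cutOff) {
        return labelArrivalTime > cutOff;
    }
}

class BestTimesDemo {
    public static void main(String[] args) {
        SketchBestTimes times = new SketchBestTimes(120);
        times.arriveByTransfer(1000);

        // With the comparable value as cut-off, the transfer label prunes itself.
        System.out.println(SketchBestTimes.isPruned(1000, times.comparableBestTime()));
        // With the actual value as cut-off, the same label is kept.
        System.out.println(SketchBestTimes.isPruned(1000, times.actualBestTime()));
    }
}
```

In this sketch the transfer label arriving at 1000 has a comparable value of 880, so using that value as the cut-off would discard the very label that produced it.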

> This is exactly the problem I am trying to explain: we don't use it, but we should.
> We should sort all our returned connections in RaptorAlgorithm / Router by calling sort before returning in the LabelProcessor:
> return connections.stream().sorted().toList();
> This will then implicitly call the compareTo method required by the Comparable interface.
> If we do this now, most of our tests fail. This is not because the Raptor is not working, but simply due to sorting issues. The problem is that the way we currently implement the Raptor influences the order of the results, which is not very stable. Additionally, if we implement further versions (native or mcRaptor), this instability will cause further issues.
> In my opinion, a defined sort order for connections is an essential property. Otherwise, the order is not transparent to the user.

Actually, they are sorted by default by the number of rounds (route trip legs) and arrival/departure time in descending/ascending order. I've reworded the docstring return value in the RaptorAlgorithm interface to be more explicit about this in c17df10.

> I did my best to move the main routing logic from the RaptorRouter to the Query object and to encapsulate access to the labels in the Objective object (you can rename this if you don't like the name; what matters to me is that access to the labels and best times is encapsulated, which makes it a lot easier to follow the routing algorithm in the debugger). I also introduced an internal RaptorData interface for access to the Raptor data structures (interface segregation).

As you might have guessed, I disagreed with the Objective name. I've renamed it to StopLabelsAndTimes to be explicit about what it contains (44c6a2f). I hope you can live with that name. To end with a positive comment: I like the RaptorData interface.

clukas1

@munterfi I cannot add you as a reviewer since it's your PR. In my opinion this PR is good to merge. Please review my final touches and approve / merge or re-open the discussion.

Brunner246 and others added 22 commits June 21, 2024 12:05

REFACTOR: NAV-47 - Improved routing method signatures in Raptor algor…

d022e72

…ithm

REFACTOR: NAV-47 - Adjust all usages of raptor routing functions outs…

ac6326a

…ide of raptor.

REFACTOR: NAV-47 - Consistently rename raptor components

7c5a378

- Legs --> Labels - Departure / Arrival --> Source / Target

FIX: NAV-47 - Fix bug where departure time was doubled when starting …

0a02558

…with location.

ENH: NAV-47 - Remove redundant Same Stop Transfer Generator.

773934b

REFACTOR: NAV-47 - Extract route scanner from raptor

da67efb

REFACTOR: NAV-47 - Extract footpath relaxer from raptor

38c6cde

STYLE: NAV-47 - Format

d976ada

REFACTOR: NAV-47 - Also pass config into scanner

85b3ad6

REFACTOR: NAV-47 - Name initial relaxing of footpaths explicitly

f3d42b5

REFACTOR: NAV-47 - Remove todo

29a4ea5

REFACTOR: NAV-47 - Introduce an objective object to keep track of the…

4e09ffa

… labels and best times

REFACTOR: NAV-47 - Reduce arguments that are passed around, by storin…

5dd3967

…g them on the objective

REFACTOR: NAV-47 - Move label to Objective class

06e45c2

REFACTOR: NAV-47 - Remove direct access to label lists and best times…

f6d2e8c

… in route scanner

REFACTOR: NAV-47 - Move new round creation to main loop in raptor

9f3ac84

REFACTOR: NAV-47 - Improve readability of setup in objective

c5c5b1d

REFACTOR: NAV-47 - Extract a post-processor from raptor to reconstruc…

d5183a2

…t isolines and connections

REFACTOR: NAV-47 - Bind relaxer and scanner directly to raptor

145c75f

STYLE: NAV-47 - Format

400d4dd

REFACTOR: NAV-47 - Extract raptor algorithm interfaces

428f9d5

munterfi requested review from clukas1 and Brunner246 June 22, 2024 14:25

munterfi self-assigned this Jun 22, 2024

DOC: NAV-47 - Correct comments in relaxer and scanner

15945d0

munterfi added 3 commits June 22, 2024 16:37

REFACTOR: NAV-47 - Remove dependency on stop context of objective

21c4764

STYLE: NAV-47 - Rename initial relaxation

50e8e90

DOC: NAV-47 - Small javadoc update

3bcf459

clukas1 added 2 commits June 23, 2024 11:40

FIX: NAV-47 Fix get best label for stop when reconstructing isolines …

24b77c4

…for arival time type.

REFACTOR: NAV-47 - Remove unnecessary HashSet instantiation.

06ee847

clukas1 reviewed Jun 23, 2024

munterfi added 2 commits June 24, 2024 11:15

REFACTOR: NAV-47 - Event clearer set comparison

e3ad8ee

REFACTOR: NAV-47 - Rename Raptor to RaptorRouter

91b0772

- Update license descriptions accordingly.

REFACTOR: NAV-47 - Remove Impl postfix on interface implementations i…

5dccb64

…n raptor package

munterfi added 4 commits June 24, 2024 19:30

REFACTOR: NAV-47 - Remove Comparable implementation from connection i…

9538562

…nterface

REFACTOR: NAV-47 - Rename impl package to router

0ccee1c

REFACTOR: NAV-47 - Rename Objective to Query

eba6126

REFACTOR: NAV-47 - Rename camel case

719791b

clukas1 reviewed Jun 24, 2024

REFACTOR: NAV-47 - Refactor Query and Objective

92e2ab0

- Query holds now the complete routing logic. - Objective stores the labels and best times and serves as single point for their modification. - Introduce raptor data interface.

clukas1 added 5 commits June 24, 2024 23:15

DOC: NAV-47 - Clarify intent of two best time methods.

e2ce8f4

REFACTOR: NAV-47 - Rename Objective to StopLabelsAndTimes

44c6a2f

REFACTOR: NAV-47 - Format project.

ba9c31b

DOC: NAV-47 - Be more explicit about what the raptor algorithm method…

c17df10

…s return.

ENH: NAV-47 - Make getting bestLabel for stop more efficient by loopi…

d60814c

…ng in reverse order over rounds.

clukas1 self-requested a review June 24, 2024 21:57

clukas1 approved these changes Jun 24, 2024

munterfi merged commit 5e4c58b into main Jun 24, 2024
2 checks passed

munterfi deleted the NAV-47-Refactor-Raptor-Algorithm branch June 24, 2024 22:05
