This program was written to automatically generate school bus routes for the Los Angeles Unified School District (LAUSD). We are grateful to the University of California Institute of Transportation Studies (ITS) for funding this project and to LAUSD for their extensive work with us to help us understand the problem instance and to refine our results. This project was led by Professor Mason Porter with students Cu Hauw Hung and David Spencer. This readme was written by me (David), and any errors herein are solely my own. Once our detailed report to ITS is made public, I will add a link to it here.
This program seeks solutions to the School Bus Routing Problem (SBRP). The problem size is too large to find optimal solutions (as many formulations of the SBRP are NP-hard), so we only seek approximate solutions. A recent review [1] of the problem is available. The LAUSD problem case has several characteristics which should be considered when designing an algorithm:
- Routes are constrained by a ride-time limit.
- LAUSD employs mixed-load routing, in which students from multiple schools ride the same bus. The practice of mixed-load routing is particularly beneficial when students for different schools need to traverse similar paths. Mixed-load routing does make the problem more challenging than non-mixed-load routing, as it prevents one from reducing the district-wide problem to many single-school problems and it necessitates consideration of bell times.
- LAUSD uses a heterogeneous fleet, meaning that buses have varying capacities. In addition, the capacities depend on the ages of the students assigned to a bus, since older students require more space on average.
- Some students are more likely than others to ride the bus on a given day. Because of this, LAUSD overbooks buses, sometimes assigning more than double or even more than triple the legal capacity of a bus.
- LAUSD prefers that elementary school students and high school students not ride together.
Additionally, we need a way to measure success of an algorithm. We decided to use a mixed objective function, minimizing a linear combination of the number of routes and the mean ride time for students. The number of routes is a proxy for cost of service and the mean ride time is a proxy for quality of service.
Note: Throughout this readme, when we say "stop," we mean a nonempty and maximal set of students who go to the same school and are all picked up at the same physical location. Therefore, there may be multiple stops at one physical location. This language differs from the language in our upcoming report, in which we use the phrase "student group," but is consistent with the code in this repository.
We tried several algorithms and many small modifications of each algorithm. A detailed report on our work is forthcoming; herein, we will only discuss the algorithm we ended up using in the program in the repository. Some elements of our work are shared between all of the algorithms we tried: the need to assign buses and the use of post-improvement procedures.
For each algorithm, it was necessary to decide how to perform bus assignment; we opted to do so after constructing routes rather than during the route construction process. Initially, the program constructs routes without considering bus capacities. Then, for each route (in increasing order of the number of students on the route), the program assigns a smallest possible bus to the route if any bus that can be assigned to the route exists.
For each route such that no bus can handle the entire route, the program seeks a partition of the stops on the route such that one bus can handle each part. The program runs a search algorithm to find the fewest number of buses that can pick up all of the students on a route without having students at the same stop ride multiple buses. This algorithm is brute-force, but takes advantage of some symmetries and also prunes some branches of the search tree when a solution is provably impossible. If the time taken to run this brute-force algorithm exceeds a second, the program uses a greedy algorithm that does not guarantee optimality but does find a valid assignment of buses to a partition of the routes. In our tests, the optimal assignment is found in over 99% of cases; in those remaining cases, the greedy algorithm is needed to produce a valid solution quickly.
Doing this bus assignment procedure after the initial route construction phase was more successful in our tests than approaches that assign buses during the route construction process.
We applied three procedures to the routes generated by each algorithm to improve them. These procedures were the same for each algorithm we tried. We used the following procedures:
- A procedure that attempts to reduce the number of routes by, for each route R, attempting to reassign each of its stops to other routes if possible. If such reassignment succeeds for every stop on R, then R is deleted; otherwise, all of the stops reassigned from R to another route are moved back to R. This procedure was introduced in [2].
- The two-opt procedure, a classical procedure in the Traveling Salesperson Problem and related problems which seeks to minimize the length of a path through a set of points by reversing the order of a subpath. In this application, we keep two-opt moves which reduce the mean ride time.
- A procedure which, for each stop, determines whether moving it to any other route would reduce the mean ride time without violating any problem constraints. If so, the move is made. (This procedure is turned off during special education routing, as many special education students are picked up at their homes, and the resulting large number of stops causes this procedure to become excessively slow.)
Performing one of these procedures may cause another procedure to be capable of improvements it could not make before. Therefore, we run these procedures in order repeatedly until no improvement is found by any procedure.
When comparing initial route construction algorithms to decide which one to use in the program, we used the following procedure and compared the resulting numbers of routes and mean ride times:
- Run the initial route constructon algorithm.
- Run the post-improvement procedures on the output of the initial route construction algorithm.
- Run the bus assignment procedure on the improved routes.
- Run the post-improvement procedures on the routes with assigned buses (ensuring while doing so that the changes made by the post-improvement procedures do not violate bus capacity constraints).
This procedure is also run in the program itself, potentially multiple times.
We settled on an initial route construction algorithm which constructs routes one at a time, adding stops to the currently-worked-on route until no stop is seen as worth adding to the route any more. Each route is initialized with one stop a, and then a set of candidate stops is generated: all stops for a's school and all stops for schools near a's school (to allow for creation of mixed-load routes). The choice of which candidate stop (if any) to add to a route is based on a balance of the cost of a candidate stop and the value of a candidate stop.
- The cost c(b) of a candidate stop b is defined to be the amount of time that the route's total time would increase by if the candidate stop was added to the route, or infinite if adding the stop to the route is infeasible (for instance, because of the ride-time limit).
- The value v(b) of a candidate stop b is defined to be a linear combination of a) the travel time from the stop to its school and b) the minimum travel time from the stop to its school or to any other unrouted stop for that school. The idea is that it is helpful when possible to pick up stops which are far from their schools and from other stops for their schools, since these stops may not easily be added to other routes.
Each candidate stop maintains its value, and the cost for each candidate stop is recomputed each time a stop is added. When it is time to add a stop, a candidate stop b which maximizes f(b) = v(b) - c(b) is chosen provided f(b) exceeds a parameter m. If f(b) does not exceed m for any b, the route is saved and a new route is initialized.
It is necessary to define how to choose the first stop which is added to the route. We found that a procedure which worked well was, at the beginning, sorting the list of all stops in decreasing order of travel time to their schools and, whenever a new route was initialized, initializing it with the first unrouted stop in this list.
There are some parameters in the inital route construction procedure: the weights for the linear combination and the minimum allowed value of f when choosing a stop to add. To determine these parameters, we run only the initial route construction algorithm (with no post-improvement procedures or bus assignment for the sake of speed) with different parameter values varying randomly within an interval we chose. We keep the best route plan produced among these and return it as the result of the initial route construction algorithm.
In the user interface provided, users specify an amount of time they wish the program to run. The program always runs one iteration of steps 1-4 in the Overall Execution section, and if enough time remains to run additional iterations, it does so with a perturbed list of stops with which to initialize the routes.
The initial route construction algorithm and the associated parts of the program are defined more formally in our report to ITS.
The program in this repository has a rudimentary user interface which allows for generating, viewing, and editing routes. Run main.py to start the program. Instructions for the use of the program are available here. Note that due to privacy concerns, we are unable to make any input data available. The specifications for the input data files are explained within the program.
[1] Ellegood, W. A., Solomon, S., North, J., & Campbell, J. F. (2019). School bus routing problem: Contemporary trends and research directions. Omega.
[2] Park, J., Tae, H., & Kim, B. I. (2012). A post-improvement procedure for the mixed load school bus routing problem. European Journal of Operational Research, 217(1), 204-213.