Inference Engine Sequence Discussion
The following is a sequence-based walkthrough of the Inference Engine, abbreviated as IE in the rest of this page. An image representing this sequence is a TODO.
As mentioned on the Inference Engine page, the IE is an implementation of a particle filter. If you are new to particle filters, please take a moment to view this YouTube video for a non-mathematical introduction.
The entry point into the IE is currently `PartitionedInputQueueListenerTask.processMessage(...)`. This class is responsible for subscribing to the queue and deserializing data off of it. The other important aspect of this class is its implementation of partitioning. The IE may not be able to process the entire data set of an agency, and because its implementation is stateful, it does not lend itself well to load balancing. The solution is to divide (partition) the dataset by vehicle: a vehicle is processed only if it belongs to a depot that has been assigned to the configured IE. See the `acceptMessage(...)` method for the details.
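The depot-based partitioning check can be sketched as follows. This is a minimal illustration of the idea, not the project's actual `acceptMessage(...)` implementation; the class name and the way the assigned-depot set is obtained are assumptions.

```java
import java.util.Set;

// Illustrative sketch of depot-based partitioning. In the real system the
// assigned depots would come from the IE's configuration.
public class DepotPartitionSketch {
    private final Set<String> assignedDepots;

    public DepotPartitionSketch(Set<String> assignedDepots) {
        this.assignedDepots = assignedDepots;
    }

    // Accept a vehicle's message only if its depot belongs to this partition.
    public boolean acceptMessage(String vehicleDepotId) {
        return vehicleDepotId != null && assignedDepots.contains(vehicleDepotId);
    }

    public static void main(String[] args) {
        DepotPartitionSketch p = new DepotPartitionSketch(Set.of("CAST", "JG"));
        System.out.println(p.acceptMessage("CAST")); // this IE's partition: true
        System.out.println(p.acceptMessage("OS"));   // another IE's depot: false
    }
}
```

Because partitioning happens at message-accept time, an IE instance never builds state for vehicles outside its partition.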
Further details of the implementation follow.
The above `processMessage(...)` calls into `VehicleLocationService.handleRealtimeEnvelopeRecord(...)`, which takes the deserialized data structure of the queue message and translates it to a `NycRawLocationRecord`. Although `NycRawLocationRecord` is rather poorly named, it is a key data model, allowing observation data to flow into the IE. `VehicleLocationService` provides the IE capabilities as a service, with `handleRealtimeEnvelopeRecord(...)` taking care of the necessary synchronization over an internal thread pool. The synchronization concern is this: each vehicle is independent of the others, so vehicles simply compete for server resources (the thread pool). However, with realtime data you cannot guarantee that messages are spaced such that a vehicle will not compete against itself, hence synchronization is applied at the vehicle level as well. The size of the thread pool is calculated as a function of the CPUs available, and has been tuned based on real-world testing.
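The per-vehicle synchronization described above can be sketched as a shared thread pool plus one lock per vehicle. This is an illustration of the concept under stated assumptions; the class and method names are not the actual `VehicleLocationService` API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sketch: vehicles compete for a CPU-sized pool, but each vehicle's updates
// are serialized by its own lock so a vehicle never races against itself.
public class PerVehicleDispatchSketch {
    private final ExecutorService pool =
        Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
    private final Map<String, Object> vehicleLocks = new ConcurrentHashMap<>();

    public Future<?> submit(String vehicleId, Runnable work) {
        Object lock = vehicleLocks.computeIfAbsent(vehicleId, id -> new Object());
        return pool.submit(() -> {
            synchronized (lock) { // serialize updates for this vehicle only
                work.run();
            }
        });
    }

    public void shutdown() {
        pool.shutdown();
    }
}
```

The design choice to note: the pool gives cross-vehicle parallelism while the per-vehicle lock gives per-vehicle ordering, matching the two-level competition described in the text.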
Once an instance of `VehicleInferenceInstance` has been secured, it is passed to a worker thread named `ProcessingTask`, which runs in the thread pool.
`ProcessingTask` contains an instance of `VehicleInferenceInstance` and updates it with the latest observation from the `NycRawLocationRecord`. The methods that query the state of the `VehicleInferenceInstance` are not thread safe, so `handleUpdate` attempts to synchronize the smallest possible block of contentious data. Once complete, it returns a `NycQueuedInferredLocationBean`, the second key data model inside the IE. With `NycRawLocationRecord` populating the input side of the observation, the equally poorly named `NycQueuedInferredLocationBean` represents the output of the IE as it will be placed on the queue. Returning this data represents completion of the `PartitionedInputQueueListenerTask.processMessage(...)` method.
`VehicleInferenceInstance` is colloquially known as the "wrapper" for the IE. It does a large amount of setup to create an `Observation` that the particle filter requires, and calls directly into `ParticleFilter.updateFilter(...)` with that `Observation`.
An `Observation` wraps the `NycRawLocationRecord` along with additional state about the observation. It is just a container that includes the record plus several state variables (all set externally), as well as references to the previous `Observation`, the `RunResults`, and a comparator.
A `BlockStateObservation` wraps an `Observation` and a `BlockState`. From the doc: "Specifically, it contains information about the BlockState that is conditional on when it was observed."
`ParticleFilter.runSingleStep(...)` appropriately delegates to `updateFilter(...)`, which updates the particle filter based on the single observation and then potentially resamples based on that observation. I say potentially, as some particles are dropped for performance reasons. `runSingleStep(...)` invokes `MotionModel.move(...)`, which incorporates the current observation into the existing state of the IE, after which it invokes `computeBestState(...)`. `computeBestState(...)` computes the probability of each particle and selects the most likely state to return.
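The selection step can be illustrated as picking the highest-weight particle. This is a deliberately minimal sketch: the real `computeBestState(...)` is more involved, and the `Particle` record and example states here are hypothetical.

```java
import java.util.List;

// Illustrative sketch of selecting the most likely particle by weight.
public class BestStateSketch {
    public record Particle(String state, double weight) {}

    public static Particle computeBestState(List<Particle> particles) {
        Particle best = null;
        for (Particle p : particles) {
            if (best == null || p.weight() > best.weight()) {
                best = p; // keep the highest-weight particle seen so far
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<Particle> particles = List.of(
            new Particle("in-progress", 0.7),
            new Particle("deadhead", 0.2),
            new Particle("layover", 0.1));
        System.out.println(computeBestState(particles).state()); // prints "in-progress"
    }
}
```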
`MotionModel` is an implementation of a model representing the bus's behaviour based on the dimensions or measures listed below. This is achieved by calculating the probability of each `Likelihood` and summing those probabilities to create the overall probability. Next, this particle becomes the parent of the new particles created by sampling from the parent. Thus the term *parent*, used throughout the IE, refers to the instance of a particle before the current observation.
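One common way to combine per-measurement likelihoods into a particle weight is to sum them in log space, which is equivalent to multiplying the individual probabilities. That interpretation is an assumption on my part, not the project's confirmed arithmetic, and the `Likelihood` interface and `Context` record below are illustrative names only.

```java
import java.util.List;

// Hedged sketch: combine per-measurement likelihoods into one particle weight
// by summing log-probabilities (i.e., multiplying the probabilities).
public class ParticleWeightSketch {
    // Hypothetical observation context; fields are illustrative.
    public record Context(double gpsDistanceMeters, double scheduleDeviationSec) {}

    // A likelihood maps an observation context to a probability in (0, 1].
    public interface Likelihood {
        double probability(Context c);
    }

    public static double weight(Context c, List<Likelihood> likelihoods) {
        double logSum = 0.0;
        for (Likelihood l : likelihoods) {
            logSum += Math.log(l.probability(c)); // sum of log-probabilities
        }
        return Math.exp(logSum); // product of the individual probabilities
    }

    public static void main(String[] args) {
        List<Likelihood> ls = List.of(c -> 0.9, c -> 0.5);
        // 0.9 * 0.5, up to floating-point rounding
        System.out.println(weight(new Context(12.0, 30.0), ls));
    }
}
```

Working in log space avoids numerical underflow when many small probabilities are multiplied together.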
`Likelihood`s represent 9 measurements about the vehicle. A `Likelihood` is a model of the behaviour of a particular measurement as it applies to a particle's weight. Some are really simple, like `MovedLikelihood`; others are more complex, like `SchedLikelihood`.
Likelihoods rarely deal in absolute truth; all rules are formulated so that the set does not completely converge, as you want multiple possibilities to remain under consideration. That said, Likelihoods do often consider the impossibility of a measurement so that, for performance reasons, a particle can be dropped. The weightings of these probabilities are adjusted by hand based on experience, observation, and the established Integration Test Traces.
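The "rarely absolute truth" principle can be sketched as a probability floor: a likelihood never returns a value so small that a plausible hypothesis vanishes, but it returns exactly zero for a genuinely impossible measurement so the particle can be dropped. The method name, floor value, and speed thresholds below are all illustrative, not taken from the real likelihoods.

```java
// Sketch of a hedged likelihood: clamp plausible outcomes to a floor so the
// particle set never fully converges, but return 0.0 for impossible input
// so the particle can be dropped for performance.
public class LikelihoodFloorSketch {
    static final double FLOOR = 0.05; // illustrative hand-tuned floor

    // Hypothetical "moved" measurement: how plausible is this reported speed?
    public static double movedProbability(double speedMetersPerSec) {
        if (speedMetersPerSec < 0.0) {
            return 0.0; // genuinely impossible: allow the particle to be dropped
        }
        double p = speedMetersPerSec > 1.0 ? 0.9 : 0.3;
        return Math.max(p, FLOOR); // never converge completely on one hypothesis
    }
}
```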
If you have read to this point and are still interested, look into the implementation of the Likelihoods, and the services responsible for sampling choices. It is in these details of the overall motion model that the accuracy of the system lives.