This Spring AI based project is meant to demonstrate how Retrieval Augmented Generation (RAG) can be used to generate a JSON formatted request to the GTFS API as provided by the project neo4j-gtfs using Natural Language.
The computer readable response from the API can then be narrated back to the user in Natural Language by leveraging OpenAI GPT-4.
The database and API used for performing retrievals rely on the project hosted at neo4j-gtfs originally developed by @tguless in 2016 to help calculate travel plans using the open GTFS transit plan file format by leveraging a Neo4j graph database.
Initial GPT-4 Prompt:
Today is Mon May 27 2024 19:24:08 GMT-0400 (EDT). Given the prompt: "I am going to leave from Westwood train station in NJ to go to Hoboken NJ between 7 and 8 am tomorrow what are my train options?" I need to convert any relevant dates in the prompt (there may be multiple) to Javascript Date format. To do it, give me the JS commands to execute without using variables in the format: { '{datetyperequested}' : '{js command to run to set a js variable to the date value as Date object}', '{2nd datetyperequested}' : '{js command to run to set a js variable to the date value as Date object}' } Make sure to only return properly formatted json and nothing else in your response. Make sure all the required dates are included in the JSON object.
Initial GPT-4 Prompt Response:
{ "departureTimeStart": "new Date(\"2024-05-28T07:00:00-04:00\")", "departureTimeEnd": "new Date(\"2024-05-28T08:00:00-04:00\")" }
Response processed through the Mozilla Rhino for Java JS evaluator, and subsequent GTFS compliant date formatting.
{ "departureTimeStart": "20240528", "departureTimeEnd": "20240528" }
Program then generates a follow up GPT-4 Prompt:
Given the following API @RequestMapping(value = "/planTrip", method = RequestMethod.POST, produces = MediaType.APPLICATION_JSON_VALUE) @ResponseBody public ArrayList<ArrayList<ArrayList<Stoptime>>> planTrip(@RequestBody TripPlan plan) Where TripPlan has the following structure: @Data public class TripPlan { private String travelDate; private String origStation; private String origArrivalTimeLow; private String origArrivalTimeHigh; private String destStation; private String destArrivalTimeLow; private String destArrivalTimeHigh; } and where the relevant dates in the format the API expects are as follows: { "departureTimeStart": "20240528", "departureTimeEnd": "20240528" } and all the time fields must have the format '06:46:00' and if a destination time window is not provided, either or both destArrivalTimeLow and destArrivalTimeHigh should be set to an empty string. Similarly if an origin station departure time window is not provided, either or both origArrivalTimeLow and origArrivalTimeHigh should be set to an empty string for the origStation and destStation. Only assign a string with the city name and nothing else can you give me the JSON payload to pass to this API to get the trip plan for the following user request "I am going to leave from Westwood train station in NJ to go to Hoboken NJ between 7 and 8 am tomorrow what are my train options?"?
Machine readable response from GPT-4 to pass to the API provided by the /plantTrip endpoint of the neo4j-gtfs API:
{ "travelDate": "20240528", "origStation": "Westwood", "origArrivalTimeLow": "07:00:00", "origArrivalTimeHigh": "08:00:00", "destStation": "Hoboken", "destArrivalTimeLow": "", "destArrivalTimeHigh": "" }
Effectively so far we have accomplished the following:
Now our program will post the above payload to localhost:8080/customrest/planTripNoTransfer which will be served up by the project neo4j-gtfs.
The endpoint will respond with the payload:
[ [ [ { "arrivalTime": "07:47:00", "departureTime": "07:47:00", "stopName": "WESTWOOD", "stopSequence": 8, "tripId": "4225" }, { "arrivalTime": "07:51:00", "departureTime": "07:51:00", "stopName": "ORADELL", "stopSequence": 9, "tripId": "4225" }, { "arrivalTime": "07:55:00", "departureTime": "07:55:00", "stopName": "RIVER EDGE", "stopSequence": 10, "tripId": "4225" }, { "arrivalTime": "07:59:00", "departureTime": "07:59:00", "stopName": "NEW BRIDGE LANDING", "stopSequence": 11, "tripId": "4225" }, { "arrivalTime": "08:03:00", "departureTime": "08:03:00", "stopName": "ANDERSON STREET", "stopSequence": 12, "tripId": "4225" }, { "arrivalTime": "08:06:00", "departureTime": "08:06:00", "stopName": "ESSEX STREET", "stopSequence": 13, "tripId": "4225" }, { "arrivalTime": "08:09:00", "departureTime": "08:09:00", "stopName": "TETERBORO", "stopSequence": 14, "tripId": "4225" }, { "arrivalTime": "08:13:00", "departureTime": "08:13:00", "stopName": "WOOD-RIDGE", "stopSequence": 15, "tripId": "4225" }, { "arrivalTime": "08:23:00", "departureTime": "08:23:00", "stopName": "FRANK R LAUTENBERG SECAUCUS LOWER LEVEL", "stopSequence": 16, "tripId": "4225" }, { "arrivalTime": "08:35:00", "departureTime": "08:35:00", "stopName": "HOBOKEN", "stopSequence": 17, "tripId": "4225" } ] ], [ [ { "arrivalTime": "07:31:00", "departureTime": "07:31:00", "stopName": "WESTWOOD", "stopSequence": 8, "tripId": "4213" }, { "arrivalTime": "07:34:00", "departureTime": "07:34:00", "stopName": "EMERSON", "stopSequence": 9, "tripId": "4213" }, { "arrivalTime": "07:38:00", "departureTime": "07:38:00", "stopName": "ORADELL", "stopSequence": 10, "tripId": "4213" }, { "arrivalTime": "07:42:00", "departureTime": "07:42:00", "stopName": "RIVER EDGE", "stopSequence": 11, "tripId": "4213" }, { "arrivalTime": "07:47:00", "departureTime": "07:47:00", "stopName": "NEW BRIDGE LANDING", "stopSequence": 12, "tripId": "4213" }, { "arrivalTime": "08:04:00", "departureTime": "08:04:00", "stopName": "FRANK R LAUTENBERG SECAUCUS LOWER LEVEL", "stopSequence": 13, "tripId": "4213" }, { "arrivalTime": "08:16:00", "departureTime": "08:16:00", "stopName": "HOBOKEN", "stopSequence": 14, "tripId": "4213" } ] ], [ [ { "arrivalTime": "07:14:00", "departureTime": "07:14:00", "stopName": "WESTWOOD", "stopSequence": 7, "tripId": "4219" }, { "arrivalTime": "07:17:00", "departureTime": "07:17:00", "stopName": "EMERSON", "stopSequence": 8, "tripId": "4219" }, { "arrivalTime": "07:21:00", "departureTime": "07:21:00", "stopName": "ORADELL", "stopSequence": 9, "tripId": "4219" }, { "arrivalTime": "07:25:00", "departureTime": "07:25:00", "stopName": "RIVER EDGE", "stopSequence": 10, "tripId": "4219" }, { "arrivalTime": "07:29:00", "departureTime": "07:29:00", "stopName": "NEW BRIDGE LANDING", "stopSequence": 11, "tripId": "4219" }, { "arrivalTime": "07:33:00", "departureTime": "07:33:00", "stopName": "ANDERSON STREET", "stopSequence": 12, "tripId": "4219" }, { "arrivalTime": "07:36:00", "departureTime": "07:36:00", "stopName": "ESSEX STREET", "stopSequence": 13, "tripId": "4219" }, { "arrivalTime": "07:39:00", "departureTime": "07:39:00", "stopName": "TETERBORO", "stopSequence": 14, "tripId": "4219" }, { "arrivalTime": "07:43:00", "departureTime": "07:43:00", "stopName": "WOOD-RIDGE", "stopSequence": 15, "tripId": "4219" }, { "arrivalTime": "07:54:00", "departureTime": "07:54:00", "stopName": "FRANK R LAUTENBERG SECAUCUS LOWER LEVEL", "stopSequence": 16, "tripId": "4219" }, { "arrivalTime": "08:06:00", "departureTime": "08:06:00", "stopName": "HOBOKEN", "stopSequence": 17, "tripId": "4219" } ] ] ]
And then our program invokes another ChatGPT API call to translate the JSON back to english:
This is a travel itinerary for three different trips. The first trip, with trip ID 4225, starts at Westwood at 07:47. It then stops at Oradell at 07:51, River Edge at 07:55, New Bridge Landing at 07:59, Anderson Street at 08:03, Essex Street at 08:06, Teterboro at 08:09, Wood-Ridge at 08:13, Frank R Lautenberg Secaucus Lower Level at 08:23, and finally arrives at Hoboken at 08:35. The second trip, with trip ID 4213, begins at Westwood at 07:31. It then moves to Emerson at 07:34, Oradell at 07:38, River Edge at 07:42, New Bridge Landing at 07:47, Frank R Lautenberg Secaucus Lower Level at 08:04, and finally arrives at Hoboken at 08:16. The third trip, with trip ID 4219, starts at Westwood at 07:14. It then stops at Emerson at 07:17, Oradell at 07:21, River Edge at 07:25, New Bridge Landing at 07:29, Anderson Street at 07:33, Essex Street at 07:36, Teterboro at 07:39, Wood-Ridge at 07:43, Frank R Lautenberg Secaucus Lower Level at 07:54, and finally arrives at Hoboken at 08:06.
Note, we really don’t need to utilize the OpenAI GPT-4 API to translate the last JSON payload back into english. It should be easy enough to use it to populate a templated text meant to be consumed by humans.