Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for HTTPS datasource for Graph Building #4482

Merged

Conversation

vpaturet
Copy link
Contributor

@vpaturet vpaturet commented Sep 29, 2022

Summary

This PR implements support for reading data sources from an HTTPS resource.
The data source provides only read operations.

Issue

Implements #4481

Unit tests

Added unit tests

Documentation

Updated BuildConfiguration.md

/* private methods */

private static boolean skipUri(URI uri) {
return !"https".equals(uri.getScheme());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I doesn't happen a lot but do we not want to support HTTP without TLS?

Copy link
Contributor Author

@vpaturet vpaturet Sep 29, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would rather not implement it right away (for security reasons). This could be easily added later if someone asks for it.

@t2gran t2gran self-requested a review September 29, 2022 13:50
@t2gran t2gran added this to the 2.2 milestone Sep 29, 2022
@t2gran t2gran added New Feature bump serialization id Add this label if you want the serialization id automatically bumped after merging the PR and removed bump serialization id Add this label if you want the serialization id automatically bumped after merging the PR labels Sep 29, 2022
@t2gran t2gran changed the title Add support for HTTPS datasource Add support for HTTPS datasource for Graph Building Sep 29, 2022
@codecov-commenter
Copy link

codecov-commenter commented Sep 29, 2022

Codecov Report

Base: 58.48% // Head: 58.62% // Increases project coverage by +0.13% 🎉

Coverage data is based on head (d396094) compared to base (1fb4d2f).
Patch coverage: 76.20% of modified lines in pull request are covered.

Additional details and impacted files
@@              Coverage Diff              @@
##             dev-2.x    #4482      +/-   ##
=============================================
+ Coverage      58.48%   58.62%   +0.13%     
- Complexity     11431    11548     +117     
=============================================
  Files           1512     1520       +8     
  Lines          60102    60611     +509     
  Branches        6829     6890      +61     
=============================================
+ Hits           35152    35534     +382     
- Misses         22887    22986      +99     
- Partials        2063     2091      +28     
Impacted Files Coverage Δ
...anner/ext/reportapi/model/BicycleSafetyReport.java 0.00% <0.00%> (ø)
...planner/ext/reportapi/resource/ReportResource.java 0.00% <0.00%> (ø)
.../org/opentripplanner/datastore/api/DataSource.java 9.09% <ø> (ø)
...lanner/graph_builder/module/osm/WayProperties.java 75.00% <ø> (-1.48%) ⬇️
...ipplanner/routing/graph/SerializedGraphObject.java 72.22% <ø> (-0.31%) ⬇️
...lanner/transit/model/basic/NonLocalizedString.java 71.42% <0.00%> (-1.91%) ⬇️
.../main/java/org/opentripplanner/util/HttpUtils.java 12.19% <0.00%> (-3.43%) ⬇️
...ipplanner/datastore/https/HttpsFileDataSource.java 42.10% <42.10%> (ø)
...opentripplanner/standalone/config/BuildConfig.java 87.50% <50.00%> (-1.14%) ⬇️
...g/opentripplanner/standalone/config/OtpConfig.java 88.88% <50.00%> (ø)
... and 106 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

@vpaturet vpaturet marked this pull request as ready for review September 30, 2022 10:03
@vpaturet vpaturet requested a review from a team as a code owner September 30, 2022 10:03
Comment on lines 67 to 73
if (lastModifiedHeader != null) {
Date lastModifiedDate = DateUtils.parseDate(lastModifiedHeader);
if (lastModifiedDate != null) {
return lastModifiedDate.getTime();
}
}
return 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's very much a matter of personal taste but I dislike repeated null checks (an null checks in general).

How to you like this?

Suggested change
if (lastModifiedHeader != null) {
Date lastModifiedDate = DateUtils.parseDate(lastModifiedHeader);
if (lastModifiedDate != null) {
return lastModifiedDate.getTime();
}
}
return 0;
return Optional
.ofNullable(lastModifiedHeader)
.map(DateUtils::parseDate)
.map(Date::getTime)
.orElse(0L);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no strong opinion about this, but it seems @t2gran has :-)
#4480 (comment)

Copy link
Contributor

@hannesj hannesj Sep 30, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would just hope that there would be a clean syntax for this, similar to how optional chaining in JavaScript is. I really dislike how the repeated null-checks look

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe you could also chime in here: #4480 (comment)

Copy link
Member

@t2gran t2gran Oct 3, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Optional is not made for this, and it reduces readability - I wish Java had support for a safe operator like Kotlin(?.) as well, but until then, use the language feature, not a library feature intended for something else. I would probably written the above code like this, to avoid nested ifs - but this is just personal taste:

if (lastModifiedHeader == null) {
  return -1;
}
Date lastModifiedDate = DateUtils.parseDate(lastModifiedHeader);
if (lastModifiedDate != null) {
  return -1;
}
return lastModifiedDate.getTime();

PS ! The method should return -1 not 0 if the value does not exist

* @return last modified timestamp in ms, if unknown returns {@code -1}

@@ -85,4 +98,28 @@ public static InputStream openInputStream(URI uri, Map<String, String> headers)
return downloadUrl.openStream();
}
}

private static HttpResponse getResponse(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has nothing to do with your PR but now that we have this method we can actually use it in SiriHttpUtils.

Copy link
Member

@leonardehrenfried leonardehrenfried left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've ran the code and it works great but I have a suggestion about the null checks.

What I want to add in the future is progress bar and local caching of files (if the ETag/Last-Modified) hasn't changed.

Copy link
Member

@leonardehrenfried leonardehrenfried left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would love to see the (in my view ugly) null checks go but the code is solid otherwise.

try {
return Long.parseLong(header);
} catch (Exception e) {
return 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

* @return size in bytes, if unknown returns {@code -1}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This 0 end up as the returned value for the HttpDataSource#size(). The JavaDoc is either wrong or this is.

Copy link
Member

@t2gran t2gran Oct 3, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The nice fix to this would be to change the DataSource interface and return an OptionalLong for size and lastModified. If you don´t want to do that in this PR, I can fix it in a separate PR later.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated the two methods to be consistent with the Javadoc.
Updating the DataSource interface should probably be done in a separate PR.

Google Cloud Storage files: `"gs://otp-test-bucket/a/b/graph.obj"`
Local files: `"file:///Users/kelvin/otp/streetGraph.obj"`
HTTPS resources: `"https://download.geofabrik.de/europe/norway-latest.osm.pbf"`
Google Cloud Storage files: `"gs://otp-test-bucket/a/b/graph.obj"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we add a mention of osmCacheDataInMem somewhere, or even enable it by default, if it is a https source?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll do this later

/**
* HTTPS data source metadata returned by the HTTP server (HTTP headers).
*/
public class HttpsDataSourceMetadata {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably have a toString, as it is being logged

hannesj
hannesj previously approved these changes Oct 3, 2022
@vpaturet vpaturet merged commit 495f1f5 into opentripplanner:dev-2.x Oct 4, 2022
t2gran pushed a commit that referenced this pull request Oct 4, 2022
@t2gran t2gran deleted the otp2_add_support_for_https_datasource branch March 31, 2023 09:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants