-
Notifications
You must be signed in to change notification settings - Fork 30
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Define how to expose version through the collection metadata API #1028
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Several stylistic changes and complements.
decision-records/0004-data-api.md
Outdated
"content": " | ||
Privacy Policy | ||
============== | ||
|
||
Last updated: September 26, 2023 | ||
|
||
This privacy policy explains how, why and under what conditions the Open Terms Archive (OTA) site collects personal information and how it is used. Our privacy policy will change over time. And of course, we also record the changes of [our own documents](https://github.com/OpenTermsArchive/demo-versions/tree/main/Open%20Terms%20Archive) 😉 | ||
|
||
… | ||
" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This multi-line example is invalid. We would need to transform newlines to \n
. The example should also provide an example with JSON data, to demonstrate how it would be encoded.
Co-authored-by: Matti Schneider <matti@opentermsarchive.org>
Solutions A and C are interesting, but I find that while it's still possible to pass metadata via HTTP headers, this approach is less common and less discoverable than solution B. So, I vote for the solution B as it aligns with the common and anticipated conventions of API responses. It provides a straightforward and highly extensible method for passing metadata in a standardized manner. @MattiSG @clementbiron @madoleary @michielbdejong What do you think ? Note: The deadline elapsed a week ago, and I am starting the implementation this afternoon. The decision on the chosen solution will be confirmed tomorrow morning on the basis of the votes taken between now and then. |
Co-authored-by: Matti Schneider <matti@opentermsarchive.org>
Thanks @Ndpnt! Reading through solutions A, B and C, I came up with solution D that aims at keeping the best parts of each:
It has the added benefit of being possibly split in implementation in two parts, which are made clear in the proposal. Indeed, I feel that we have two orthogonal choices to make rather than 3 full-fledged solutions:
…and that choice 2 can be delayed: for sure, there is relevance in JSON extensibility. Let's start with implementing that one and, while we're at it, defining in this RFC how to implement the Markdown version, without committing to implementing it at this stage as we first collect feedback from the JSOn version 🙂 I thus vote for solution D, with only implementation of the JSON version of the endpoint! |
Thanks @MattiSG for this relevant contribution! I really like the idea of mixing solutions to keep the best parts of each 👍. At this stage I still find that using redirections to leverage caching is a premature optimization. And I prefer to wait until the need arises before choosing how to manage the cache. In addition, this caching optimisation also has the cost of introducing an extra round trip between client and server, which I still don't find desirable. So for now I'm in favor of the solution D without the redirection mechanism. It means where solution D returns HTTP 302, I suggest to returns HTTP 200 with the version (either in markdown format or in JSON format according to the Content-Type). I added it in the document in solution E. I thus vote for solution E (and still with only implementation of the JSON version). |
I'm in favour of solution E, which I feel is sufficient for the uses identified. Well done Nico for your work! |
I, too, am in favor of solution E. I think the implementation is solid, particularly returning HTTP 200 with the version, and I agree that optimization can come later. For posterity, I'd like to recap a conversation that happened outside of GitHub regarding this RFC and certain questions of mine. Generally, I was confused by the term "version" and the function of the date. I also didn't understand the difference between a version and a dataset. @MattiSG explained to me that a “version” is indeed “the content" of the term text, in Markdown format, and that a dataset is the consolidation of all those versions. Here is an example of a version: https://github.com/OpenTermsArchive/contrib-versions/blob/46d10a6c2ef50a11eaa39d4fa5ae3913e9258831/Atlassian/Privacy%20Policy.md Here is an example of a dataset: https://github.com/OpenTermsArchive/contrib-versions/releases/tag/dataset-contrib-2023-12-04 Regarding the date, I wondered whether it simply represents the day the terms were added or updated, and whether there could be multiple dates for the same terms type. I also questioned how handling the dates on the ToS;DR end would work, i.e., how we'd know which date to use for the API request. Matti answered these questions in kind. There is an infinity of valid dates between their first declaration in Open Terms Archive and the current time. This enables clients such as ToS;DR to request any date and let the API retrieve the version (terms text) that was valid at that date. In our use case, it's likely that we'll always use the current date and time to get the latest version, but this is something we should verify. Thank you! |
Solution E received the most votes and will thus be selected. Thanks for your participation :). |
| date | URL-encoded ISO 8601 datetime string | The date and time for which the version is requested | | ||
|
||
Note about the date: | ||
- A full date and time is required, and not a simple date (such as `2023-10-24`), to avoid ambiguities on days where a version changed, and timezone differences between client and server. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe pin the timezone to UTC? So "A full UTC data and time"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the timezone is defined as an offset from UTC (ex: 2023-12-06T12:34:56+02:00
) is still ISO 8601 valid.
So, why forcing the timezone to be UTC?
| --------- | ------ | ---------------------- | | ||
| serviceId | URL-encoded string | The ID of the service | | ||
| termsType | URL-encoded string | The name of terms type | | ||
| date | URL-encoded ISO 8601 datetime string | The date and time for which the version is requested | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And require that the timezone be UTC?
Ah, looks like I came to this discussion just a day after it finished :) I think one way we could use this API is by setting the date far into the future, to get the latest known version, right? For instance:
And that would be cacheable! :) |
👍
We could also use a more elegant alias (
|
In the current proposal, this would not work and would reply with a I believe that this spontaneous usage, instead of using |
decision-records/0004-data-api.md
Outdated
|
||
- HTTP 200 with the Markdown content as body when a version is applicable at this date | ||
- HTTP 404 Not Found with Markdown content `# Error\n\n_No version found for date <requested-date>_` for a date that is anterior to the first available version | ||
- HTTP 400 Bad Request with JSON content `# Error\n\n_Requested date <requested-date> is in the future, no version can exist there_` if the date is in the future. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is an inconsistency with the Markdown / JSON content in this proposal here.
- HTTP 400 Bad Request with JSON content `# Error\n\n_Requested date <requested-date> is in the future, no version can exist there_` if the date is in the future. | |
- HTTP 400 Bad Request with Markdown content `# Error\n\n_Requested date <requested-date> is in the future, no version can exist there_` if the date is in the future. |
Co-authored-by: Matti Schneider <matti@opentermsarchive.org>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks everyone! That works for me 🙂 👍
My only concern is the lack of a /latest
route for implementers, but since we introduced a 416 for far-future dates, the /9999
trick suggested will not work and will thus enforce the use of Date.now
. This might not be comfortable but is at least forward-compatible, and we could thus introduce the /latest
route as an improvement after shipping the first version.
This PR is an experiment to create RFC that will lead to a decision record delivery.
The idea is to add comments right onto the document, much like you would with code. I will update the document for each consensus reached.
The deadline for this RFC is 27/11 end of day AoE (Anywhere on Earth).