Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

countResponse and booleanResponse not splitted per ResultSet #129

Open
costero-e opened this issue Jun 7, 2024 · 7 comments
Open

countResponse and booleanResponse not splitted per ResultSet #129

costero-e opened this issue Jun 7, 2024 · 7 comments

Comments

@costero-e
Copy link
Collaborator

costero-e commented Jun 7, 2024

Now that we are implementing beacons in different projects, we are facing the problem that count and boolean responses are always a result of the aggregation of the resultSets, but they never show the results splitted by resultSet. This makes impossible to know, for a client requesting a beacon that only allows boolean/count granularity, which are the datasets that are responding positively or negatively. As we can see in the resultSet response from the standards:

        "ResultsetInstance": {
            "additionalProperties": true,
            "properties": {
                "exists": {
                    "type": "boolean"
                },
                "id": {
                    "description": "id of the resultset",
                    "example": "datasetA",
                    "type": "string"
                },
                "info": {
                    "description": "Additional details that could be of interest about the Resultset. Provided to clearly enclose any attribute that is not part of the Beacon specification.",
                    "type": "object"
                },
                "results": {
                    "items": {
                        "type": "object"
                    },
                    "minItems": 0,
                    "type": "array"
                },
                "resultsCount": {
                    "description": "Number of results in this Resultset.",
                    "type": "integer"
                },
                "resultsHandovers": {
                    "$ref": "../../common/beaconCommonComponents.json#/definitions/ListOfHandovers",
                    "description": "List of handovers that apply to this resultset, not to the whole Beacon or to a result in particular."
                },
                "setType": {
                    "default": "dataset",
                    "description": "Entry type of resultSet. It SHOULD MATCH an entry type declared as collection in the Beacon configuration.",
                    "type": "string"
                }
            },

We believe this resultSet response should be naturally returned by any type of granularity, with results being exclusively for the record granularity and resultsCount for the count and record granularity.

I will work on a solution in the next days to have this implemented unless the opinion on this is mostly the contrary and I wanted to know what do you think.

Thanks,
Oriol

@mbaudis
Copy link
Member

mbaudis commented Jun 9, 2024

This has been in discussion for a while & also is in line with the need to add an optional beacon identifier to each ResultsetInstance - the following from the presentation at ELIXIR AHM 2023 (this was not "formal" and the resultsCount etc. just demonstrate what needs to be implemented):

resultSets:
  - id
    beaconId
    infoUrl
    setType
    exists
    resultsCount
    returnedGranularity
    ...
    results []
  - id
    beaconId
    infoUrl
    exists
    ...

So, yes; and I had also called before for (especially in the network context but in general, too) for ant response being a ResultSetRespones (w/ an optional response part) to allow "single Beacon, different dataset permissions" responses.

@mbaudis
Copy link
Member

mbaudis commented Jun 11, 2024

Back to this statement:

We believe this resultSet response should be naturally returned by any type of granularity, with results being exclusively for the record granularity and resultsCount for the count and record granularity.

Yes, this has come up before - granularity does not explicitly say one should not report per dataset (resultSet); however, currently AFAIK the results element is required per resultSet. I had somewhere documented that sjust the removal of this requirement would enable boolean or count resulsetResponses.

@costero-e
Copy link
Collaborator Author

HI @mbaudis,
you are right that the removal of the results plus the removal of resultsCount required condition would be enough (if we think in boolean resultSets, for example).
My solution is aligned with what you are saying @mbaudis. In addition, I would have resultSets splitted by count and boolean, take a look at my work here a629568. These new two responses would be like a resultSets but in the case of countSets there would be no results element, and in the case of booleanSets, no results and no resultsCount element. This would allow to still have the three types of responses differentiated.

Let me know what you think.
Thanks,
Oriol

@mbaudis
Copy link
Member

mbaudis commented Jun 11, 2024

@costero-e Well, this looks1 per se correct - but

  • it doesn't solve mixed granularity responses since always the response type specific parameters are required elements
  • I don't know how those responses can be requested, in contrast to the non-resultset count/boolean responses.
  • a booleanResultsetResponse still is different from a basic one regarding it security aspects (not necessarily relevant IMO but still...)

So since we have to take it as a given that we need an option for mixed responses, at least for larger network queries, we either need then another response type for those which is more permissive. And that can realistically be used for any response...

I understand that it is good for explanations to have tightly defined schemas; "this beacon only provides up to beaconCountResultsetResponse and does not implement record level delivery" etc. IMO the only argument.

Footnotes

  1. It would even look better in YAML ¯\_(ツ)_/¯

@costero-e
Copy link
Collaborator Author

Hi @mbaudis, ok, I have just added the YAML solution, here 3f9edbb. A bit more correct now!
About your answer, I see what you mean, but I think a Beacon Network should decide which response it is returning (record, count or boolean). A Beacon Network can't respond with a mixed granularity, can it? Maybe this has its own limitation, for example, a response will be always limited by the max granularity accepted by each beacon and dataset, but IMO then a user should refine the query per beacon/dataset (knowing what is the max granularity the user has per dataset/beacon) if the user wants more complete responses.

@mbaudis
Copy link
Member

mbaudis commented Jun 11, 2024

A Beacon Network can't respond with a mixed granularity, can it? Maybe this has its own limitation, for example, a response will be always limited by the max granularity accepted by each beacon and dataset, but IMO then a user should refine the query per beacon/dataset (knowing what is the max granularity the user has per dataset/beacon) if the user wants more complete responses.

Network is not the same as Aggregator

In reality you want to create aggregators with many different beacons, w/ different status of maintenance and content etc. And there are many explorative query scenarios where one just wants to fire off at least count queries etc. Yes, one could iterate through query-response-requery ... scenarios but that's e.g. not really nice over a UI and certainly against the "discovery" aspect. So if every beacon not responding with the requested resultSet type just gives an error this is not helpful, even for an overview.

@gsfk
Copy link
Contributor

gsfk commented Aug 7, 2024

Hi @mbaudis, ok, I have just added the YAML solution, here 3f9edbb.

+1 for this or a similar solution, I've been writing network code for our own count beacons, but I don't have a way to serve a standard count response that shows a breakdown of counts for each beacon. So this is still an issue even without mixing granularities.

We are also working on beacons with multiple datasets where a count-per-dataset response would probably be useful.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants