Should all the requests be identifiable with a single HEI? #3
Comments
I would vote that we stay with the current design. Perhaps it makes the implementation a bit more complicated, but it also scales better for centralized (multi-HEI) institutions. Also, I wouldn't be required to rewrite stuff ;) |
The problem occurred to me while working on the Java libraries. I had to decide how to represent the entity which is requesting (the actor). Methods will grant access to the data based on that entity. It could also be used to log requests for auditing. It seemed appropriate to see the HEI as such an actor and not the list of HEIs (as it isn't an entity on its own). |
After a quick summary of the problem presented by @Awerin in Warsaw, most developers seemed to prefer the As indicated in the current structure of the manifest file (4.0.0-rc2), I have designed the architecture to allow communication between computer systems (EWP Hosts), and each of those would represent a group of HEIs (as opposed to just a single HEI):
Monday, we have kind of decided to migrate toward a different approach:
Tuesday, one more approach was proposed:
If we really don't want to support A1, then I think that A3 would be more understandable than A2. Regardless of what we choose, this is a big change in the core architecture and most specifications and APIs will need to be updated. |
A2 does seem to cover the main need (most or all use cases), and if a requesting host should need information for more than one of its own HEIs, it can always send multiple requests. We live in 2016, so I'm not worried about a few possible extra HTTP requests and responses. So I'm fine with A2. |
A3 seems like another good step. @mpuzar, could you give an example of the complications that A3 causes for those hosts which handle multiple HEIs? |
Having responsibility for 60 HEIs, we would need to make 60 different connectors (probably virtual, but still), one for each of them, 60 certificates, etc. If made virtually, they would need to analyse which HEI was called. These are the first things that come to my mind. |
We would handle somewhat over 100 HEIs on our SAAS hosts. General host to host communication is a bit troublesome
in case of in this case I think we also wouldn't go for any batch synchronisations, since these could stack up to tens of thousands of requests (even in 2016 we brought down a federal database with a similar scenario in Austria), and these batch requests should probably not be done during working hours. Just saying, but maybe I am missing something, please correct me. i think for usual business |
In the original architecture (A1), you would need only a single HTTP request for that. Of course, the performance of this single request depends on the implementation of the server endpoint (but A1 allows it to be implemented in a maximally efficient way, with only BTW, in order to understand each other properly we should try to use clearly defined terms. EWP Host is a provider of services for a group of HEIs. So, when you're saying that "host to host communication is a bit troublesome", you mean the A1 architecture? |
It is very important to stress the fact that we are now talking about changing the architecture itself - not individual APIs. Our original architecture (A1) was designed precisely with this in mind - to allow multi-HEI providers (such as Norway and SOP) to design complex APIs in the most efficient way, and be able to publish such APIs via our Registry. It is a bit more difficult to grasp, that's true... But it allows for "maximum possible performance", regardless of the APIs we may think of in the future. None of the other proposals (A2, A3) do. |
In other words, if we really intend for our architecture to support such APIs in the future, then I still feel that we should stick with A1, as we have originally proposed. If we select another architecture only because "it seems to be enough for us now", then this may mean that it will need to be extended in a follow-up project - and - we will still eventually end up with the complexities of A1... |
yes by host to host i meant EWP host to EWP host or |
If I understand correctly, your performance issues stem from the fact that your SAAS doesn't have a central database. That is, if your API implementation receives a query about multiple HEIs, you will need to - internally - query all of them, and this could take too much time. Is this correct? If so, won't this link be a better solution for this particular problem? E.g. you would be able to declare in your manifest file that you are willing to query at most 10 HEIs in one request. |
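As a rough illustration of the limit idea above: a client could split its list of HEIs into batches no larger than whatever limit the server declares. This is only a sketch. The parameter name `max_heis_per_request` and the example HEI IDs are assumptions made for illustration, not part of any EWP specification.

```python
from typing import Iterator, List


def batch_hei_ids(hei_ids: List[str], max_heis_per_request: int) -> Iterator[List[str]]:
    """Yield consecutive batches of HEI IDs, each no longer than the declared limit."""
    if max_heis_per_request < 1:
        raise ValueError("limit must be at least 1")
    for start in range(0, len(hei_ids), max_heis_per_request):
        yield hei_ids[start:start + max_heis_per_request]


# Hypothetical HEI IDs, for illustration only.
hei_ids = [f"hei{n}.example.edu" for n in range(25)]

# With a declared limit of 10, 25 HEIs need three requests.
batches = list(batch_hei_ids(hei_ids, max_heis_per_request=10))
```

With a limit of 1, the same helper produces one request per HEI.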
I re-read the related things to this. In the case that one HEI requests its 5 agreements from another HEI, the answer could at worst contain 100+ agreements of the other HEIs represented by us, which would actually not have been requested in the first place. But again, if we know which HEI(s) is calling which HEI(s), we can perfectly handle these requests efficiently |
A1 architecture itself would just give you "a list of the HEIs covered by the requesting EWP Host". However, you may require more information in the particular APIs. For example, see this version of the Outgoing Mobility Search API. Both
I think I understand your problem now. And I think you can prevent that with a simple trick. As you know, it is you who will generate the IDs for such agreements. So, you are allowed to include the HEI ID in your generated agreement IDs. Then, whenever you see an ID of an agreement, you will know which HEI you should ask. So, I'm just saying, that the A1 architecture does not seem to prevent you from doing things efficiently. |
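A minimal sketch of the ID trick described above, assuming the host is free to choose its own agreement ID format. The separator and the function names are hypothetical, and the scheme only works if the separator never appears in a HEI ID.

```python
from typing import Tuple

# Hypothetical separator; it must not occur inside any HEI ID.
SEPARATOR = ":"


def make_agreement_id(hei_id: str, local_id: str) -> str:
    """Build an agreement ID that embeds the owning HEI's ID as a prefix."""
    if SEPARATOR in hei_id:
        raise ValueError("HEI ID must not contain the separator")
    return f"{hei_id}{SEPARATOR}{local_id}"


def split_agreement_id(agreement_id: str) -> Tuple[str, str]:
    """Recover (hei_id, local_id) so the host knows which HEI to query."""
    hei_id, sep, local_id = agreement_id.partition(SEPARATOR)
    if not sep or not hei_id or not local_id:
        raise ValueError(f"not a prefixed agreement ID: {agreement_id!r}")
    return hei_id, local_id
```

The prefix is only a routing hint. If a HEI ID ever changes, previously issued agreement IDs would still carry the old prefix, so a host relying on this would need some mapping from old to new IDs.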
And in some (rare?) cases A1 also allows you to do things more efficiently than A2 or A3. For example, when external institutions will ask you for the list of all agreements (probably via IIA Search API), then you will be allowed to return a cached response (whenever you are quite sure that no new agreement was created in any of your HEIs). Whereas in A3, you would get 100 requests, instead of just 1 (you would need to serve 100 different, though probably also cached, responses). |
I don't think we should rely on tricks. For some methods they may not be even possible. They may also be problematic if, for example, a HEI ID changes. What that example shows is that adding information about caller and receiver to the context of a method may be important. Solution A1 hides that information.
What would be the use case for that? As it was already stated, we probably shouldn't be afraid of multiple requests if that only means more headers in the network layer. If it would also mean more database queries, then it would be more of an issue. Even if we decide that it might be an issue to send one request for every HEI we want to ask for, then we should remember that in many cases we won't have other options. For example, in Poland there will be a host for every HEI that is using USOS. |
I am not against migrating to A3. I just want existing SAAS providers to understand what they're giving up and make this decision knowingly. They could make use of A1.
It does not include that information by itself, but it still can be included on the API-level.
Periodical full synchronization. (We cannot trust CNR notifications to always be sent and delivered.)
There will be now. But who knows how it will look in 10 years time? |
If affected HEIs can be clearly identified in each request, I think A1 would be the one to go, since it would be internally the same as A3 (iteration through entries) but with a single request. Most probably the way to go is A1 with parameters for calling and target HEIs and the possibility to specify in the manifest the limit for calling/target institutions, even down to 1, which leads to a degradation to A3 |
In other words, you would vote to do the following:
Did I understand you correctly? |
Yes, sounds like the most feasible approach to me. What's your opinion, @mpuzar @mikesteez? |
My opinion is that we should choose a
|
Note, that - even in A3 - some requests in the network will not be bound to specific HEIs. So it will be However, we can make all requests bound to "some institutions" (and HEI would be a "subclass" of such). E.g. the Registry Service and the Central IIA Host could be bound to such non-HEI institutions. Then, it would indeed be
This seems unrelated. Could you post this comment in the related thread? |
I would point to the same arguments as @mikesteez. |
@georgschermann, would you be against using A3? If I understood your earlier arguments properly, both proposals are okay for you? @mpuzar, @erasmus-without-paper/wp3, @erasmus-without-paper/all-members? Will all of you be able to live with A3? If not, we will stay with A1. It will be a lot of work to change all the specs. So I will need to have some kind of approval from each of you, even if half-hearted. |
A 1:1 mapping, as described by @mikesteez, can be achieved with A1/A2 as well (by specifying the requesting and target HEIs). There is absolutely no reason for a radical change such as A3 to get it working. Please don't do it. |
Norway does not agree with the change in solution for the following reasons: First of all, the intention of this network was to provide as simple a way as possible to communicate between two peers without putting too much implementation strain on the single peer. Making each peer handle the whole of the communication means more work for each and the additional hassle of maintaining a certificate. We do not see how A3 is easier than A1. If we are talking about just a couple of peers communicating, then it really does not matter, but here we are potentially talking about thousands. We feel that it will be much harder to get new partners to join this network if they have to handle the cost of certificate maintenance on top of having to understand and implement the whole communications protocol. I am quite confident that very few universities would have joined our EMREX project if we had told them that they had to serve their own certificate and host their own node. Remember that we want to "sell" EWP to more and more partners. The lowest possible cost for them, both in money and work, is key. The solution we have today is flexible in that it allows several HEIs to cooperate on implementation of a host. The host doesn't have to be on the national level, but for some countries it will. We think @wrygiel has a good solution with the use of requesting_hei_id and responding_hei_id and see no need to change the whole architecture. |
A1 with HEI identification and A3 are both perfectly fine for us.
this would allow host implementers to force 1:1 communication, if they are not comfortable with implementing |
I think we shouldn't discuss using separate client certificates in this issue. For me A2 is a more important step than A3, as it seems more crucial to be able to always tell which HEI is asking for the data. A1 means that a request is handled as if it was composed of multiple requests, but there is no information about which HEI is asking for what. But that information may be important. Handling such requests adds complexity to validation, querying and merging of results. |
In A1, you will still be able to tell that, but on the API-level (not the network-level). This way, we are able to introduce both:
In A1, each request can be bound to multiple requesting and multiple responding HEIs. But as requested here - for the majority of APIs - it will be bound to a single requesting and a single responding HEI, exactly as you would expect. |
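To make the API-level binding concrete, here is a hedged sketch of how a server endpoint might validate a request bound to one requesting and one responding HEI. The parameter names requesting_hei_id and responding_hei_id come from this thread; the function, the exception class, and the way the two sets of covered HEIs are obtained are assumptions for illustration only.

```python
from typing import Set


class RequestRejected(Exception):
    """Raised when a request is not bound to HEIs the two hosts actually cover."""


def check_hei_binding(
    requesting_hei_id: str,
    responding_hei_id: str,
    heis_covered_by_client: Set[str],
    heis_covered_by_server: Set[str],
) -> None:
    """Accept the request only if both HEI IDs belong to the right host.

    heis_covered_by_client: HEIs the requesting host is known to cover
        (assumed here to be looked up from its manifest by the caller).
    heis_covered_by_server: HEIs this host serves.
    """
    if requesting_hei_id not in heis_covered_by_client:
        raise RequestRejected(
            f"client host does not cover requesting HEI {requesting_hei_id!r}"
        )
    if responding_hei_id not in heis_covered_by_server:
        raise RequestRejected(
            f"this host does not cover responding HEI {responding_hei_id!r}"
        )
```

A host that serves a single HEI would simply pass a one-element set as `heis_covered_by_server`, which gives it strictly 1:1 behaviour without a different architecture.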
@richared I'm afraid I don't understand how it can be simpler to have a very complex |
with A1 small institutions would still have the chance to limit all the requests to one caller and one target and implement the APIs as if they were |
Ok, then! We will stick with A1 for now. Probably nobody would be 100% happy, regardless of what we chose, but we need to pick one way and stick with it, at least for the following months (I have lots of specifications to update, and I will base the changes on A1 and this comment). We can return to this discussion after the model is completely designed and specific APIs start to be implemented. |
I have spent a day preparing a new version of the Architecture and Security document. The new version attempts to describe the "peculiarities" of A1 architecture in more detail.
|
This change is related to the conclusion of this topic: erasmus-without-paper/ewp-specs-api-echo#3
@erasmus-without-paper/all-members
To all developers,
I would like to bring your attention to an important specification detail which you might have missed. @Awerin just started implementing Echo API and, while he admitted it can be done this way, he still was quite surprised with this design decision. I would like for all developers to discuss this detail now, because the more we go into it, the harder it will get to change it.
A1. Current 0..* -> 0..* design
This detail is: In the current architecture specification, each EWP request can be bound with multiple (0..*) requesting HEIs.
Pros:
Neither pro nor con:
Cons:
A2. Alternate 0..1 -> 0..* design (@Awerin's proposal)
An alternative would look like this: Each request would be made in the name of a single HEI (or no HEIs). Many requests would be accompanied with a requesting_hei_id header/parameter (a single HEI who is asking about the data) which would replace parameters like sending_hei_id described above.
Pros:
Cons:
A3... 0..1 -> 0..1 design
(The "A3" proposal was introduced later in the conversation. Please read the comments below.)
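The three cardinalities can be summarised in a small sketch. It assumes the left side of each arrow counts requesting HEIs and the right side counts responding HEIs, which is how the notation reads in this thread; the table and function names are made up for illustration.

```python
from typing import Dict, Optional, Tuple

# (max requesting HEIs, max responding HEIs); None stands for "*" (unbounded).
# The lower bound is 0 in every design.
DESIGNS: Dict[str, Tuple[Optional[int], Optional[int]]] = {
    "A1": (None, None),  # 0..* -> 0..*
    "A2": (1, None),     # 0..1 -> 0..*
    "A3": (1, 1),        # 0..1 -> 0..1
}


def is_allowed(design: str, n_requesting: int, n_responding: int) -> bool:
    """Check whether a request with the given HEI counts fits the design."""
    if n_requesting < 0 or n_responding < 0:
        raise ValueError("counts must be non-negative")
    max_requesting, max_responding = DESIGNS[design]
    fits_requesting = max_requesting is None or n_requesting <= max_requesting
    fits_responding = max_responding is None or n_responding <= max_responding
    return fits_requesting and fits_responding
```

Every request shape allowed under A3 is also allowed under A2, and every shape allowed under A2 is also allowed under A1.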