Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Separation of natural and surrogate keys? #9

Closed
kaiqu opened this issue Jan 2, 2017 · 24 comments
Closed

Separation of natural and surrogate keys? #9

kaiqu opened this issue Jan 2, 2017 · 24 comments
Assignees

Comments

@kaiqu
Copy link

kaiqu commented Jan 2, 2017

It seems a bit overkill to me to have both "los-id", "loi-id" and "component-code". In component-studied, los-id seems to be = component-code plus a type prefix (for Course, I presume), while loi-id seems to be some other human-readable code. In component-recognized, los-id and loi-id seem to be global, artificial ID's.
In the model, LOS Code is meant to be a human-readable code (like component-code), and LOI is identified by that plus Academic Term. Are artificial ID's necessary at all?

@kaiqu
Copy link
Author

kaiqu commented Jan 3, 2017

Never mind - I see this is explained in the Courses README...

@kaiqu
Copy link
Author

kaiqu commented Jan 3, 2017

Ok - I see the need for the extra identifiers, but I don't think "component-code" is a good name: LA Component is the combination of a LOI and a LA, but the code identifies a LOS, and should be called los-code, in my opinion. Or is it important to conform to the EUC template on this, @wrygiel ?

@wrygiel
Copy link
Collaborator

wrygiel commented Jan 4, 2017

Or is it important to conform to the EUC template on this, @wrygiel ?

Not necessarily. We could call it los-code too.

We will have a hard time explaining the difference between los-id and los-code to all the partners... I'm still trying to come up with a better solution for this problem. One of the alternatives to consider is including the LOS type in the LOS ID, e.g. Course/ABC where ABC would be the "component code" to be displayed to the user. XSD could verify that all LOS IDs begin with LOS type. I didn't think it through yet, but perhaps this would be better than the currently proposed solution. Do you think this could work?

@kaiqu
Copy link
Author

kaiqu commented Jan 4, 2017

We will have a hard time explaining the difference between los-id and los-code to all the partners...

How about a naming convention where "-id" means an artificial key (system-generated and not necessarily human-readable) and "-code" means a natural one (i.e. human-readable and possibly represented in a SIS)? From what I've seen, this seems to be a common pattern in both systems and literature...

I'm still trying to come up with a better solution for this problem. One of the alternatives to consider is including the LOS type in the LOS ID, e.g. Course/ABC where ABC would be the "component code" to be displayed to the user. XSD could verify that all LOS IDs begin with LOS type. I didn't think it through yet, but perhaps this would be better than the currently proposed solution. Do you think this could work?

How about structuring los-id with sub-elements: "type" (i.e. Course) and "los-code"?
From a modeling point of view, this is cleaner.

@wrygiel
Copy link
Collaborator

wrygiel commented Jan 4, 2017

How about structuring los-id with sub-elements: "type" (i.e. Course) and "los-code"?
From a modeling point of view, this is cleaner.

How to pass them in GET parameter then?

@kaiqu
Copy link
Author

kaiqu commented Jan 4, 2017

How about structuring los-id with sub-elements: "type" (i.e. Course) and "los-code"?
From a modeling point of view, this is cleaner.

How to pass them in GET parameter then?

Ah, there is that... You could split it into two parameters, but that would complicate the interface. I leave the decision to you.

@wrygiel
Copy link
Collaborator

wrygiel commented Jan 4, 2017

Another solution I think of is to convince partners to use UUIDs. This might be the easiest way to eliminate this issue and similar ones.

E.g. today one of my colleagues rightly noted, that some of our HEIs tend to change their component codes, or even Organization Unit IDs. I believe these are not so isolated cases. However, if we can convince partners to include UUID columns for each of their EWP entities, we will solve all these issues in one move. What do you think?

@kaiqu
Copy link
Author

kaiqu commented Jan 4, 2017

Another solution I think of is to convince partners to use UUIDs. This might be the easiest way to eliminate this issue and similar ones.
E.g. today one of my colleagues rightly noted, that some of our HEIs tend to change their component codes, or even Organization Unit IDs. I believe these are not so isolated cases. However, if we can convince partners to include UUID columns for each of their EWP entities, we will solve all these issues in one move. What do you think?

Yes - with the naming convention I suggested earlier, we would get:

  • los-id: UUID
  • loi-id: UUID
  • los-code: Optional natural key (if the system has such a thing)

Agree?

@wrygiel
Copy link
Collaborator

wrygiel commented Jan 4, 2017

Sounds okay. How about using UUIDs in ounit-id too?

@kaiqu
Copy link
Author

kaiqu commented Jan 5, 2017

How about using UUIDs in ounit-id too?

Sounds good (with an optional, natural ounit-code as well).

@wrygiel wrygiel changed the title Component structures Separation of natural and surrogate keys? Jan 5, 2017
@wrygiel
Copy link
Collaborator

wrygiel commented Jan 5, 2017

This proposal is a big change as it will probably require adding UUID columns in many tables in all partner databases. @erasmus-without-paper/all-members - what do you think?

UUIDs would fix two separate problems: 1. The problem of many partners not having a common unique ID for LOSes. 2. The problem of some partners not using surrogate keys in their internal IDs.

Still, there are other ways of fixing these problems, and some partners might be against introducing UUIDs.

@kaiqu
Copy link
Author

kaiqu commented Jan 5, 2017

This proposal is a big change as it will probably require adding UUID columns in many tables in all partner databases. @erasmus-without-paper/all-members - what do you think?
UUIDs would fix two separate problems: 1. The problem of many partners not having a common unique ID for LOSes. 2. The problem of some partners not using surrogate keys in their internal IDs.
Still, there are other ways of fixing these problems, and some partners might be against introducing UUIDs.

How about generalizing the ID elements to contain anything, including UUIDs or natural keys? The important thing is that the IDs don't change - if a HEI never changes its (natural) keys, it can send those?

@wrygiel
Copy link
Collaborator

wrygiel commented Jan 5, 2017

This doesn't fix problem 1 though. Partners would need to fix it themselves (e.g. by introducing UUIDs).

@wrygiel
Copy link
Collaborator

wrygiel commented Jan 5, 2017

In other words we're struggling between suggesting and enforcing safety measures.

@kaiqu
Copy link
Author

kaiqu commented Jan 5, 2017

All partners must present immutable keys (not just for LOS, but for all references for future use). If they have neither UUIDs nor common (natural or surrogate) keys across tables representing the same "thing", they must supply one or the other - either in extra columns, tables or views. I don't see any way around that.

I think your earlier suggestion to concate natural keys with a "table flag" (e.g. "Course/1234") is good for such cases. Generalized IDs allows for maximum flexibility, to minimize the impact on the partner system.

@wrygiel wrygiel self-assigned this Jan 9, 2017
@wrygiel
Copy link
Collaborator

wrygiel commented Jan 9, 2017

I think your earlier suggestion to concate natural keys with a "table flag" (e.g. "Course/1234") is good for such cases.

Ok! Let's try this then.

wrygiel added a commit to erasmus-without-paper/ewp-specs-api-courses that referenced this issue Jan 9, 2017
wrygiel added a commit to erasmus-without-paper/ewp-specs-api-course-replication that referenced this issue Jan 9, 2017
wrygiel added a commit that referenced this issue Jan 9, 2017
@wrygiel
Copy link
Collaborator

wrygiel commented Jan 9, 2017

I will also update the examples, so that they use surrogate IDs in all kinds of entities. Non-UUID IDs will still be allowed everywhere, but I guess that examples should be exemplary. :)

wrygiel added a commit to erasmus-without-paper/ewp-specs-api-courses that referenced this issue Jan 9, 2017
wrygiel added a commit to erasmus-without-paper/ewp-specs-api-course-replication that referenced this issue Jan 9, 2017
wrygiel added a commit to erasmus-without-paper/ewp-specs-api-institutions that referenced this issue Jan 9, 2017
wrygiel added a commit that referenced this issue Jan 9, 2017
wrygiel added a commit to erasmus-without-paper/ewp-specs-api-ounits that referenced this issue Jan 9, 2017
wrygiel added a commit to erasmus-without-paper/ewp-specs-api-imobility-tors that referenced this issue Jan 9, 2017
wrygiel added a commit to erasmus-without-paper/ewp-specs-api-courses that referenced this issue Jan 9, 2017
wrygiel added a commit that referenced this issue Jan 9, 2017
@wrygiel
Copy link
Collaborator

wrygiel commented Jan 9, 2017

I think it's done!

@kaiqu, please take a look at the commits above.

@wrygiel wrygiel closed this as completed Jan 9, 2017
wrygiel added a commit to erasmus-without-paper/ewp-specs-api-iias that referenced this issue Jan 9, 2017
@kaiqu
Copy link
Author

kaiqu commented Jan 9, 2017

Looks good (except see my comment about documenting that IDs must be stable over time).

@wrygiel
Copy link
Collaborator

wrygiel commented Jan 9, 2017

I believe you mean this comment.

Immutability is an integral part of being a surrogate key, and we already do advise usage of surrogate keys. But perhaps you are right, and we should state the reason (immutability) more clearly. Maybe it even deserves a separate section in Architecture and Security document.

@wrygiel
Copy link
Collaborator

wrygiel commented Jan 9, 2017

Added some more notes in key places. Also added iia-code and ounit-code (which will work in parallel with existing iia-id and ounit-id, as los-code works in parallel with los-id).

@kaiqu
Copy link
Author

kaiqu commented Jan 10, 2017

Immutability is an integral part of being a surrogate key, and we already do advise usage of surrogate keys. But perhaps you are right, and we should state the reason (immutability) more clearly.

Yes, especially since they can be a mixture of "classical" surrogate keys like serial number, UUIDs, concatenations of natural keys with type labels...

Maybe it even deserves a separate section in Architecture and Security document.

Yes, it should definitely be in an overarching description, as it is a general principle across APIs.

@kaiqu
Copy link
Author

kaiqu commented Jan 10, 2017

Also added iia-code and ounit-code (which will work in parallel with existing iia-id and ounit-id

I think the Institutions API should follow the same pattern, replacing other-id with hei-code.

@wrygiel
Copy link
Collaborator

wrygiel commented Jan 10, 2017

Yes, it should definitely be in an overarching description, as it is a general principle across APIs.

People tend to forget those when they get into the details. I put a reminder in each of the key places in XSDs instead (there are not so much of them).

I think the Institutions API should follow the same pattern, replacing other-id with hei-code.

This is a little different, because many of the other-id values are in fact immutable (or even surrogate) IDs (e.g. PIC codes). Let's leave it as it is for now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants