ID length normalization #112

Closed
Echsecutor opened this issue Jan 23, 2024 · 5 comments

@Echsecutor
Collaborator

Echsecutor commented Jan 23, 2024

Currently the DL normalization of GS1 IDs does not take length differences (leading zeros) into account.

I added an example for an event which uses some id, e.g.

<epcClass>https://id.gs1.org/01/4064074123453/10/245</epcClass>

and the same event writing the same id as

<epcClass>https://id.gs1.org/01/04064074123453/10/245</epcClass>

This currently leads to different hashes, but in my opinion it should not.

See the failing test in #113

@RalphTro
Owner

Dear @Echsecutor ,
Thanks for bringing this to the table. Understood your point.
Indeed - the canonical GS1 DL URI embedding a GTIN MUST represent the GTIN in its 14-digit format.
Curiously, the normaliser module should already take care of that:
https://github.com/RalphTro/epcis-event-hash-generator/blob/master/epcis_event_hash_generator/dl_normaliser.py

I just made some tests, and added the following sample epcs to https://github.com/RalphTro/epcis-event-hash-generator/blob/master/tests/examples/epclist_normalisation.jsonld (see branch 'issue112'):

                    "https://id.gs1.org/01/4064074123453/21/245",
                    "https://id.gs1.org/01/04064074123453/21/246",
                    "https://id.example.com/01/12345670/21/123",
                    "https://id.example.com/01/061414112345/21/123",
                    "https://id.example.com/01/4012345123456/21/123"

I then executed the code, and it correctly transformed the above into:

epc=https://id.gs1.org/01/04064074123453/21/245
epc=https://id.gs1.org/01/04064074123453/21/246
epc=https://id.gs1.org/01/00000012345670/21/123
epc=https://id.gs1.org/01/00061414112345/21/123
epc=https://id.gs1.org/01/04012345123456/21/123

So, from my POV, it actually works. Any idea why it doesn't work for you? E.g. do you think it makes sense to test it specifically with an XML file?
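For readers following along, the transformation shown above can be sketched roughly as follows. This is a minimal illustration, not the code in `dl_normaliser.py`: the function name `pad_gtin_in_dl_uri` and the regular expression are assumptions made for this example, and it only handles GTIN-based URIs of the simple shape `https://<host>/01/<gtin>/...`.

```python
import re

# Hypothetical pattern for this sketch: a GS1 DL URI whose primary key is a
# GTIN (AI 01) with 8, 12, 13 or 14 digits, optionally followed by more path.
_GTIN_IN_DL = re.compile(r"^https?://[^/]+/01/(\d{8}|\d{12,14})(/.*)?$")


def pad_gtin_in_dl_uri(uri: str) -> str:
    """Return the URI with host id.gs1.org and the GTIN left-padded to 14 digits.

    URIs that do not match the simple shape above are returned unchanged.
    """
    match = _GTIN_IN_DL.match(uri)
    if match is None:
        return uri
    gtin = match.group(1)
    rest = match.group(2) or ""
    # zfill adds leading zeros only; it never changes the existing digits,
    # so a wrong check digit in the input stays wrong in the output.
    return f"https://id.gs1.org/01/{gtin.zfill(14)}{rest}"


if __name__ == "__main__":
    for sample in (
        "https://id.gs1.org/01/4064074123453/21/245",
        "https://id.example.com/01/12345670/21/123",
        "https://id.example.com/01/061414112345/21/123",
    ):
        print(pad_gtin_in_dl_uri(sample))
```

Note that padding alone leaves the check digit untouched, so two URIs that differ in any GTIN digit still normalise to different strings.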

@Echsecutor
Collaborator Author

I have added an XML example here: https://github.com/RalphTro/epcis-event-hash-generator/pull/113/files. To me it looks like it doesn't work, i.e. those 3 events lead to 2 different hashes.

@RalphTro
Owner

Thanks, @Echsecutor ,

I think I found the reason for this behaviour in your XML file:
In the first and second event, the inputQuantityList looks as follows in the pre-hash string:
inputQuantityListquantityElementepcClass=https://id.gs1.org/01/04064074123453/10/245

But in the third one, the GTIN has a different indicator digit (9 instead of 0):
inputQuantityListquantityElementepcClass=https://id.gs1.org/01/94064074123453/10/245

...and this MUST of course lead to a different hash value.

I just noticed that the GTIN as part of the first GS1 DL URI has an incorrect check digit (it must be 04064074123450), the second one is correct though. When canonicalising EPC URNs to GS1 DL URIs, our tool calculates the check digit.

And THIS is the reason why the hash value is still different even if you corrected the EPC URN. The pre-hash string has the correct check digit (0) after canonicalising the EPC URN, while event 1 + 2 still have the incorrect ones. So, again, it is correct that our implementation returns different hash values.
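The check digit arithmetic referred to above can be reproduced with the standard GS1 modulo-10 rule. The sketch below is only an illustration of that rule; the function names `gtin_check_digit` and `has_valid_check_digit` are made up for this example and are not taken from the tool.

```python
def gtin_check_digit(first_13_digits: str) -> int:
    """Compute the GS1 check digit for the first 13 digits of a GTIN-14.

    Weights alternate 3, 1, 3, 1, ... starting with 3 on the leftmost digit.
    """
    if len(first_13_digits) != 13 or not (
        first_13_digits.isascii() and first_13_digits.isdigit()
    ):
        raise ValueError("expected exactly 13 ASCII digits")
    total = sum(
        int(digit) * (3 if index % 2 == 0 else 1)
        for index, digit in enumerate(first_13_digits)
    )
    return (10 - total % 10) % 10


def has_valid_check_digit(gtin14: str) -> bool:
    """Return True if the last digit of a 14-digit GTIN matches the computed one."""
    if len(gtin14) != 14 or not (gtin14.isascii() and gtin14.isdigit()):
        return False
    return gtin_check_digit(gtin14[:13]) == int(gtin14[13])


if __name__ == "__main__":
    print(gtin_check_digit("0406407412345"))        # 0
    print(has_valid_check_digit("04064074123453"))  # False
    print(has_valid_check_digit("04064074123450"))  # True
```

For `0406407412345` the weighted sum is 80, so the check digit is 0, which matches the value 04064074123450 given above.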

Now, the interesting question is: does our implementation need to check each and every identifier populating the epc/quantityLists? This would go beyond what is specified in the CBV. What is your view on this?

I hope this helps and answers your question.

Kind regards,
Ralph

@Echsecutor
Collaborator Author

Ahh... so this ultimately is a typo in my test data, and the normalization is actually already in place. Big sorry for wasting your time on this one, @RalphTro!

I do not think that validating inputs (such as check digits) is within the scope of this hash generator reference implementation. Though it would help with stupid user errors ... ;)

@RalphTro
Owner

Dear @Echsecutor ,
No worries. Glad I could help you for a change. ;-)
See you soon,
Ralph
