Re-Evaluate Anonymisation and Security Measure names for Correctness #15
I agree that Anonymisation should not be a subclass of Pseudoanonymisation, given that data cannot be both anonymised and pseudoanonymised. It could be argued that Anonymisation could be either Full (or True) Anonymisation or Pseudoanonymisation, in which case Pseudoanonymisation would be a subclass of Anonymisation, but that may introduce confusion between Anonymisation and Full Anonymisation and therefore be undesirable. So having Anonymisation and Pseudoanonymisation as parallels may be the best solution. A possible name for a superclass for both types of anonymisation as well as encryption might be Data Obfuscation. |
Hi Maya, thanks for the input, I agree with your arguments. I tried looking up EDPB and ISO definitions for these terms and how they are used, and it is similar to what you propose. But other uses (e.g. industry, technical) consider 'anonymisation' as a broad range of techniques which also includes pseudo-anonymisation. Then there is further confusion as to what data is produced as an outcome of these processes. An anonymisation process may still produce personal data (non-anonymous) if the data remains associated with an identifier. For example, consider the case where an identifier is associated with an exact location. The anonymisation technique replaces this with the country. Now the data has been through an anonymisation process but is still personal data. So there is a distinction between anonymisation as a technical term and anonymisation as applied under GDPR. To support your proposal, maybe we can have Anonymisation as the general class of anonymisation-related techniques, and specifically PseudoAnonymisation and CompleteAnonymisation as subclasses. Data Obfuscation involves other techniques in addition to anonymisation, so it can be the parent class of Anonymisation once those other concepts have been identified. |
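The location example above can be sketched in a few lines of Python. This is only an illustration: the `Record` type, the `CITY_TO_COUNTRY` lookup and the `is_linked_to_identifier` check are hypothetical names invented here, and the check is a crude stand-in for "still personal data", not a legal test.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Record:
    user_id: str   # identifier, left untouched by the technique
    location: str  # exact location, or a coarser value after generalisation


# Hypothetical lookup table, used only for this illustration.
CITY_TO_COUNTRY = {"Dublin": "Ireland", "Lyon": "France"}


def generalise_location(record: Record) -> Record:
    """Replace the exact location with its country (a generalisation technique)."""
    return replace(record, location=CITY_TO_COUNTRY[record.location])


def is_linked_to_identifier(record: Record) -> bool:
    """Crude stand-in for 'still personal data': an identifier is still attached."""
    return bool(record.user_id)


original = Record(user_id="user-42", location="Dublin")
generalised = generalise_location(original)

print(generalised.location)                  # coarser value than before
print(is_linked_to_identifier(generalised))  # the identifier is still present
```

The technique has been applied, yet the output record is still tied to `user_id`, which is the gap between the technical and the GDPR sense of the term.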
Recording conversation at PEPR'22 about Anonymisation, where Damien pointed out this problem. The potential operation is changing "Anonymisation" to "AnonymisationMeasure" and "CompleteAnonymisation" to "Anonymisation" so as to bring these concepts in line with what is defined legally and in standards (e.g. ISO 29100) while keeping the 'taxonomy' of anonymisation approaches in tech/org measures. |
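A minimal sketch of what this rename would do to the hierarchy proposed earlier in the thread. The child-to-parent map below is an assumption based on that earlier comment (PseudoAnonymisation and CompleteAnonymisation under Anonymisation), not the actual DPV serialisation.

```python
# Child -> parent edges, assumed from the earlier proposal in this thread.
before = {
    "PseudoAnonymisation": "Anonymisation",
    "CompleteAnonymisation": "Anonymisation",
}

# The two renames discussed at PEPR'22.
renames = {
    "Anonymisation": "AnonymisationMeasure",
    "CompleteAnonymisation": "Anonymisation",
}


def rename(name: str) -> str:
    """Look up each original name exactly once, so the renames do not chain."""
    return renames.get(name, name)


after = {rename(child): rename(parent) for child, parent in before.items()}
print(after)
```

Each name is looked up once against its original value, because applying the two renames one after the other would turn CompleteAnonymisation into AnonymisationMeasure as well.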
Thanks Harshvardhan! To add a bit more explanation to this, I see a fairly serious risk with calling "Anonymization" the concept that corresponds to "The class of measures/processes that are used in order to make data less identifiable": we end up in a situation where people might use "Anonymization" on their data, and end up with data that is not "anonymized" according to ISO standards & EU regulation. This confusion happens frequently in the media, due to the use of the word "anonymization" to mean "de-identification" in the US. I've seen this create problems in my previous role in a big tech company, which is partly why we decided to only call something "anonymization" if it reached that high bar of making it impossible to re-identify people. I strongly support changing "CompleteAnonymization" to simply "Anonymization", so that something is called "Anonymization" if and only if it leads to anonymized data, and the confusion disappears. Changing "Anonymization" to "AnonymizationMeasure" helps people understand that this might not be enough, so this definitely seems much better to me. It might not be enough, though. An alternative would be to call this "DeidentificationMeasure", and rename the process of removing identifiers something like "IdentifierRedaction" to avoid confusion. Yet another alternative, clearer but verbose, would be something like "ReidentificationRiskMitigation", to better capture this idea of "measure towards making it harder to identify people". |
Thanks @TedTed ; I have updated the title on this issue to (re-)evaluate all names in tech/org measures with this perspective, and make changes where necessary. |
Hi All, thanks for the feedback. The structure is now as follows:
|
I fail to see the added value of introducing
By definition, any measure that reduces identifiability of data needs to "remove information", in some sense. Therefore, Otherwise, even though you renamed |
This discussion should probably be held in parallel with |
The GDPR (Recital 26) approach to anonymity is based on a rather risk-based "reasonable likeliness", based on
Hence, these factors should be represented more precisely in the respective Class descriptions. As all of this is an active area of research and (in my opinion) not conclusively addressed by courts, it might make sense to mark these Classes as unstable or proposed, if that is possible? |
Hi.
Deidentification is a specific category of anonymisation techniques that focus on reducing identifiability. Anonymisation is broader than identifier removals because it also relates to potential re-combinations with other datasets to create identifiability.
Deidentification is a common term in this domain. E.g. there's even an ISO standard (20889:2018) about it https://www.iso.org/standard/69373.html. For HIPAA - the title explicitly states de-identification which is a strong argument to represent that concept. Further types of de-identification processes should be modelled as subclasses/sub-types of Deidentification, and not replace it. I would prefer the ISO terminology over HIPAA in this case as it is broader in scope and represents greater technical consensus, with HIPAA concepts added later within the resulting hierarchy (if needed). Pseudonymisation is declared as a DataAnonymisationTechnique (and not as a type of Anonymisation) for the sake of grouping anonymisation-related concepts together under an umbrella term.
AnonymisedDataWithinScope has been changed to ContextuallyAnonymisedData, and the note has been updated. Where SyntheticData is also personal data, the data should also be declared as a subclass/type of PersonalData. The note states it can be personal or non-personal. The description is taken from the ENISA guide on Data Protection Engineering, https://www.enisa.europa.eu/publications/data-protection-engineering
I see the value in representing this as a concept, but am unsure as to how it should be associated with processing information. My guess is to provide it as an organisational measure, similar to policies and assessments. So |
We discussed this in today's meeting and are okay with the current list. We're keeping this open in case there are further discussions. Otherwise we will close this in the coming weeks as completed. |
For context, does the "current list" refer to this comment or to the state of the world prior to this issue? |
Current list as in the concepts that are in DPV as of now, after the comments. |
Sorry for the late response, but I continue to raise the argument that Pseudonymization is not an anonymisation technique. Thank you for your clarification of With respect to the Recital 26 criteria for anonymised data, I didn't propose to add these as organizational measures - even though that's a good idea - but simply to add a reference to Recital 26 and the mentioned criteria to the Class description or note, as they define what anonymised data is in the first place. |
Hi. Thanks for your comment, I understand your point, and the need to change this.
Yes, strictly speaking this is correct, though the concept
Instead of GDPR's recitals, the techniques have been linked to ISO 29100:2011 Security Techniques -- Privacy Framework definitions which are more broadly used. |
The following typos in IRIs were fixed using the new SHACL shapes from previous commit:
- dpv:expiry relation instead of dpv:hasExpiry relation in consent
- dpv:hasConsequenceOn was used as a parent even though it was proposed. The term has been promoted to accepted status
- Typos in Technical measures where Crypto- was mistyped as Cryto-
Errors in labels:
- MaintainCreditCheckingDatabase
- MaintainCreditRatingDatabase
The following terms were updated:
- GDPR's legal bases where text has been added from Art.6 and the parent terms have been aligned with main spec's legal bases (including creation of new terms to match granularity)
- Anonymisation and Pseudonymisation have been changed to be types of Deidentification techniques (as the grouping parent concept) to distinguish them following discussions in #15
- DPV-LEGAL has laws and DPAs for USA from contributions by @JonathanBowker
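As a rough illustration of the structure this commit describes, the parent relation can be modelled as a small map and queried. This is a sketch under assumptions: the `PARENT` map and the `ancestors` and `is_a` helpers are hypothetical and only mirror the sentence above; the vocabulary itself is not published in this form.

```python
# Term -> grouping parent, mirroring the commit message above (illustrative only).
PARENT = {
    "Anonymisation": "Deidentification",
    "Pseudonymisation": "Deidentification",
}


def ancestors(term: str) -> list[str]:
    """Walk up the parent chain and return every broader term."""
    chain = []
    while term in PARENT:
        term = PARENT[term]
        chain.append(term)
    return chain


def is_a(term: str, candidate: str) -> bool:
    """True if `candidate` is a (transitive) parent of `term`."""
    return candidate in ancestors(term)


print(is_a("Pseudonymisation", "Deidentification"))
print(is_a("Pseudonymisation", "Anonymisation"))
```

With the two concepts as siblings, neither is a type of the other, which is the semantic conflict the original issue raised.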
Reviewed and closed based on implementation in https://w3id.org/dpv#vocab-TOM-technical which contains the described structure. |
| Migrated ISSUE-33: The categorisation of Pseudoanonymisation and Encryption is not (semantically) correct
State: RAISED
Raised by: Harshvardhan J. Pandit
Opened on: 2019-11-26
Description: (from presentation to Kantara CISWG) Anonymisation is a subclass of Pseudoanonymisation, which is conflicting in semantics as it specifies that anonymisation is a type of pseudoanonymisation, which might not be intended. Also, Pseudoanonymisation and Encryption should not be grouped together (as a concept).
Reporter: Harsh
Notes: suggested to start a discussion on this issue.