First version of the guide

LINCnil · Jun 11, 2020 · a0a06aa · a0a06aa
commit a0a06aa
Show file tree

Hide file tree

Showing 23 changed files with 2,844 additions and 0 deletions.
diff --git a/00-Develop in compliance with the GDPR.md b/00-Develop in compliance with the GDPR.md
@@ -0,0 +1,19 @@
+# Sheet n°0: Develop in compliance with the GDPR
+
+#### Whether you work alone, are part of a team developing a project, manage a development team, or are a service provider carrying out developments for third parties, it is essential to ensure that user data and all personal data processing are suffisiently protected throughout the lifecycle of the project.
+
+The following steps will help you in the developing privacy-friendly applications or websites:
+
+1. **Be aware of the GDPR core principles**. If you work in a team, we recommend that you identify a person responsible for monitoring compliance. If your company has a Data Protection Officer (DPO), then that person is a key asset in [understanding and meeting the GDPR obligations](https://www.cnil.fr/sites/default/files/atoms/files/guidelines_on_dpos_5_april_2017.pdf). The appointment of a DPO may also be mandatory in some cases, for example if your programs or applications process so-called "sensitive" data (see [examples](#Sheet_n°1:_Identify_personal_data)) on a large scale or conduct regular and systematic monitoring on a large scale.
+
+2. **Map and categorize the data and processing in your system**. Accurately mapping the data processing performed by your program or application will help you ensuring that they comply with legal requirements. Keeping a [record of processing activities](https://www.cnil.fr/en/record-processing-activities) (an example of which can be found on the [CNIL website](https://www.cnil.fr/sites/default/files/atoms/files/record-processing-activities.ods)), allows you to have an overall view of these data, and to identify and prioritize the associated risks. Indeed, personal data may be present in unexpected places such as in server logs, cache files, Excel files, etc., and may be stored in a number of different places. Such record-keeping is mandatory in most cases.
+
+3. **Prioritize the required actions**. On the basis of the data processing registry, identify the required actions to comply with the obligations of the GDPR in advance of the development and prioritize the attention points with regard to the risks for the data subjects by the processing. These points of attention concern in particular [the necessity and types of data collected and processed](#Sheet_n°7:_Minimize_the_data_collection) by your software, [the legal basis](#Sheet_n°15:_Take_into_account_the_legal_basis_in_the_technical_implementation) on which your data processing operations are based, [the information mentions](#Sheet_n°12:_Inform users) of your software or application, [the contractual clauses](#Sheet_n°5_:_Make_an_informed_choice_of_its_architecture) binding you to your contractors, the terms and conditions for [exercising rights](#Sheet_n°13:_Prepare_for_the_exercise_of_people_rights), the measures implemented to [secure your processing](#Sheet_n°6:_Secure_your_websites,_applications_and_servers).
+
+4. **Manage the risks**. When you identify that a processing of personal data is likely to create high risks for data subjects, make sure that you manage those risks appropriately in the context. A [Privacy Impact Assessment (PIA) ](https://www.cnil.fr/en/privacy-impact-assessment-pia) can help you manage those risks. The CNIL has developed a [method](https://www.cnil.fr/sites/default/files/atoms/files/cnil-pia-1-en-methodology.pdf), [model documents](https://www.cnil.fr/sites/default/files/atoms/files/cnil-pia-2-en-templates.pdf) and a [tool](https://www.cnil.fr/en/open-source-pia-software-helps-carry-out-data-protection-impact-assesment) that will help you to identify those risks, as well as a [catalogue of good practices](https://www.cnil.fr/sites/default/files/atoms/files/cnil-pia-3-en-knowledgebases.pdf) that will assist you in implementing measures to address the identified risks. Furthermore, a privacy impact assessment is mandatory for all processing operations that are likely to create high risks to the rights and freedoms of data subjects. The CNIL proposes, on its [website](https://www.cnil.fr/sites/default/files/atoms/files/liste-traitements-aipd-requise.pdf), a list of  types of processing operations for which a DPA is required or not.
+
+5. **put in place internal processes** to ensure compliance during all development stages, ensure that internal procedures guarantee that data protection is taken into account in all aspects of your project and into all events that may occur (e.g. security breach, requests for rectification or access fulfillment, modification of data collected, change of provider, data breach, etc.). The requirements of the [governance label](https://www.cnil.fr/sites/default/files/typo/document/CNIL_Privacy_Seal-Governance-EN.pdf) (even if this one no longer granted by the CNIL since the entry into force of the GDPR) can constitute a useful basis of inspiration to help you set up the necessary organization.
+
+6. **Document developments compliance** to prove your compliance with the GDPR at all times: the actions performed and the documents produced at each stage of development must be mastered. This implies in particular a regular review and update of the documentation of your developments so that it is constantly consistent with the features deployed on your program.
+
+The CNIL website provides numerous practical files which will assist you in setting up law-abiding treatments according to your sector of activity.
diff --git a/01-Identify personal data.md b/01-Identify personal data.md
@@ -0,0 +1,56 @@
+# Sheet n°1: Identify personal data
+
+#### Understanding the notions of "personal data", "purpose" and "processing" is essential for the development of law enforcement and user data. In particular, be careful not to confuse "anonymisation" and "pseudonymization", which have very precise definitions in the GDPR.
+
+## Definition
+* The notion of **personal data** is defined in the [General Data Protection Regulation](https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX%3A32016R0679) (GDPR) as "[any information relating to an identified or identifiable natural person (referred to as "data subject")](https://www.cnil.fr/en/personal-data-definition)". It covers a broad scope that includes both directly identifying data (e.g. first and last name) and indirectly identifying data (e.g. telephone number, license plate, terminal identifier, etc.).
+
+* Any operation on this type of data (collection, recording, transmission, modification, dissemination, etc.) constitutes **processing within the meaning of the GDPR** and must therefore meet the requirements laid down by that regulation. Such processing operations must be lawful and have a specified purpose. The personal data collected and processed must be relevant and limited to what is strictly necessary to achieve the purpose.
+
+## Examples of personal data
+
+* Where they relate to natural persons, **the following data are personal data**:
+    * Surname, first name, pseudonym, date of birth;
+    * photos, sound recordings of voices;
+    * fixed or mobile telephone number, postal address, email address;
+    * IP address, computer connection identifier or cookie identifier;
+    * Fingerprint, palm or venous network of the hand, retinal print;
+    * License plate number, social security number, ID number;
+    * Application usage data, comments, etc...
+
+* **Identification of natural persons can be carried out**:
+    * from a single piece of data (examples: surname and first name);
+    * from crossing of a set of data (example: a woman living at such and such an address, born on such and such a day and member of such and such an association).
+
+* Some data are considered **particularly sensitive**. The GDPR prohibits the collection or use of such data, unless, in particular, the data subject has given his/her express consent (active, explicit and preferably written consent, which must be free, specific and informed).
+
+* These requirements concern the following data:
+
+    * data relating to the **health of individuals**;
+    * data concerning **sexual life** or **sexual orientation**;
+    * data revealing an alleged **racial** or **ethnic** origin;
+    * political opinions, religious beliefs, philosophical beliefs or trade union membership;
+    * **genetic** and **biometric data used for the purpose of uniquely identifying an individual**.
+
+## Anonymisation of personal data
+
+* An **anonymisation process of personal data** aims at making impossible to identify individuals within data sets. It is therefore an irreversible process. When this anonymisation is effective, the data are no longer considered as personal data and the requirements of the GDPR are no longer applicable.
+
+* By default, we recommend that you **never consider raw datasets as anonymous**. Anonymisation results from processing personal data in order to irreversibly prevent  identification, whether by:
+
+    * _singling out_: it is not possible to isolate some or all records which identify an individual in the dataset;
+    * _linkability_: the dataset does not allow to link at least, two records concerning the same data  subject  or  a  group  of  data  subjects;
+    * _inference_:  it is  not  possible  to  deduce,  with  significant  probability, the value of an attribute from the values of a set of other attributes.
+
+* These data processing operations imply in most cases a **loss of quality on the produced dataset**. The Article 29 Working Party (Art. 29 WP) [opinion on anonymisation techniques](https://ec.europa.eu/justice/article-29/documentation/opinion-recommendation/files/2014/wp216_en.pdf) describes the main anonymisation techniques used today, as well as examples of datasets wrongly considered anonymous. It is important to note that anonymisation techniques have short comings. The choice to anonymize or not the data as well as the selection of an anonymisation technique must be made on a case by case basis according to contexts of use and need (nature of the data, usefulness of the data, risks for people, etc.).
+
+
+## Pseudonymization of personal data
+
+* **Pseudonymization is a compromise between retaining raw data and producing anonymized datasets**.
+
+* It refers to the processing of personal data in such a way that **data relating to a natural person can no longer be attributed without additional information**. The GDPR insists that this additional information must be kept separately and be subject to technical and organisational measures to avoid re-identification of data subjects. Unlike anonymisation, pseudonymization can be a reversible process.
+
+* In practice, a pseudonymization process consists of **replacing directly identifying data (surname, first name, etc.) in a dataset with indirectly identifying data** (alias, number in a filing system, etc.) in order to reduce their sensitivity. They may result from a cryptographic hash of the data of individuals, such as their IP address, user ID, e-mail address.
+
+* Data resulting from pseudonymization are considered as **personal data and therefore remain subject to the obligations of the DPMR**. However, the European Regulation encourages the use of pseudonymization in the processing of personal data. Moreover, the GDPR considers that pseudonymization makes it possible to reduce the risks for data subjects and to contribute to compliance with the Regulation.
diff --git a/02-Prepare your development.md b/02-Prepare your development.md
@@ -0,0 +1,34 @@
+# Sheet n°2: Prepare your development
+
+#### The principles of personal data protection must be integrated into IT developments from the design phase onwards in order to protect the privacy of the people whose data you are going to process, to give them better control over their data and to limit errors, losses, unauthorised modifications or misuses of their data in applications.
+
+## Methodological choices
+
+* **Put privacy protection at the center of your developments** by adopting a [Privacy By Design](https://edpb.europa.eu/our-work-tools/public-consultations-art-704/2019/guidelines-42019-article-25-data-protection-design_en) methodology.
+
+* If you use agile methods for your developments, consider **integrating security at the center of your process**. The ANSSI has made available a guide ["digital security & agility"](https://www.ssi.gouv.fr/uploads/2018/11/guide-securite-numerique-agile-anssi-pa-v1.pdf) (in French only) which indicates how to conduct your developments within the framework of an agile method while taking into account the security aspects. Don't hesitate to draw inspiration from it.
+
+* For any development aimed at the general public, **consider the privacy settings**, and in particular the default settings, such as the characteristics and user content visible by default.
+
+* **Conduct a [Privacy Impact Assessment (PIA)](https://www.cnil.fr/en/privacy-impact-assessment-pia)**. For [certain processing operations](https://ico.org.uk/for-organisations/guide-to-data-protection/guide-to-the-general-data-protection-regulation-gdpr/accountability-and-governance/data-protection-impact-assessments/) it is mandatory. In other cases it is a good practice that will allow you to identify and deal with all the risks upstream of your developments. The CNIL has a special section on its website and provides a [free software](https://www.cnil.fr/en/open-source-pia-software-helps-carry-out-data-protection-impact-assesment) dedicated to this type of analysis.
+
+
+## Technological choices
+
+#### Architecture and features
+
+* **Include privacy protection, including data security requirements, at the design stage of the application or service**. These requirements should influence [architecture choices](#Sheet_n°5:_Making_an_informed_choice_of_architecture) (e.g. decentralized vs. centralized) or functionality (e.g. short term anonymization, data minimization). The default settings of the application must meet minimum security requirements and comply with the law. For example, the default complexity of passwords must comply at least with the [CNIL recommendation on passwords](https://www.cnil.fr/fr/node/23803).
+
+* **Maintain control of your system**. It is important to keep control of your system, both to ensure proper operation and to guarantee a high level of security. Keeping a system simple allows you to understand precisely how it works and to identify its weak points. If a certain complexity is required, it is advisable to start with a simple, correctly designed and secure system. Then, it is possible to increase the complexity little by little, while continuing to secure the new features that are added.
+
+* **Don't rely on a single line of defense**. In spite of all the steps taken to design a secure system, it may happen that some components added later may not be sufficiently secure. To minimize the risk to end users, it is advisable to defend the system in depth. For example, checking the data entered in an online form is part of the periphery defenses. If this defense is hijacked, database query protection can take over.
+
+#### Tools and practices
+
+* **Use programming standards that take into account safety**. Often, lists of standards, best practices or coding guides improving the security of your developments are already available. Ancillary tools can also be integrated into your integrated development environments ("**IDE**") in order to automatically check that your code complies with the various rules that are part of these standards or good practices. You can easily find lists of good practices for your favourite programming language on the Internet. For example [here](https://wiki.sei.cmu.edu/confluence/display/seccode/SEI+CERT+Coding+Standards) for C, C++ or Java. For web application development, specific good practice guides exist, such as those published by [OWASP](https://www.owasp.org).
+
+* **The technological choices are critical.** A few parameters need to be taken into account:
+    * Depending on the field of application or functionality developed, one language or technology may be more appropriate than another.
+    * Time-tested languages and technologies are safer. They have, in general, been audited to correct the most known vulnerabilities. However, you should be careful to use the latest versions of each of the technology building blocks you will be using.
+    * You must avoid coding your final solution in a language you have just learned and not yet mastered. Otherwise, you expose yourself to an increased risk of a security flaw due to lack of experience.
+* **Set up a secure development environment that allows versioning of the code** by following the [dedicated sheet](#Sheet_n°3:_Secure_your_development_environment) in this guide.
diff --git a/03-Secure your development environment.md b/03-Secure your development environment.md
@@ -0,0 +1,25 @@
+# Sheet n°3: Secure your development environment
+
+#### The security of production, development and continuous integration servers as well as developer workstations must be a priority because they centralize access to a large amount of data.
+
+## Assess your risks and adopt the appropriate security measures
+
+* **Assess the risks** in the tools and processes used for your developments. Make an inventory of your existing security measures and define an action plan to improve your risk coverage. Appoint a person responsible for its implementation.
+
+* Consider the risks on all the tools you use, including risks related to SaaS (Software as a Service) and collaborative tools in the cloud (such as [Slack](https://slack.com), [Trello](https://trello.com), [GitHub](https://github.com), etc.).
+
+## Secure your servers and workstations in a homogeneous and reproducible way
+
+* Lists of **recommendations** concerning the security of servers, workstations and internal networks are available in the [sheets n° 5 to 8](https://www.cnil.fr/sites/default/files/atoms/files/cnil_guide_securite_personnelle_gb_web.pdf) of the **security of personal data guide** of the CNIL.
+
+* Write a **document listing those measures and explaining their configuration** to ensure that security measures are implemented uniformly on servers and workstations. In order to reduce the workload, **configuration management tools**, such as [Ansible](https://github.com/ansible/ansible), [Puppet](https://github.com/puppetlabs/puppet) or [Chef](https://github.com/chef/chef), can be used.
+
+* Update servers and workstations, if possible automatically. You can set up a watchlist of the most important vulnerabilities, for example the [NVD Data Feeds](https://nvd.nist.gov/vuln/data-feeds).
+
+## Put special emphasis on access management and traceability of operations
+
+* Remember to document the management of your **SSH keys** (use of state of the art cryptography and key length algorithms, protection of private keys with a passphrase, key rotation). For examples of good practice, see [the document on the secure use of (open)SSH](https://www.ssi.gouv.fr/uploads/2014/01/NT_OpenSSH_en.pdf).
+
+* Encourage strong authentication on the services used by the development team.
+
+* **Trace** access to your machines and, if possible, implement **automated log analysis**. In order to keep reliable traces, the use of generic accounts is to be avoided.