PRACTITIONER’S NOTE
DIGITAL ID AND THE DATA PROTECTION CHALLENGE
2 October 2019

Inclusive and trusted identification (ID) systems are crucial tools for achieving sustainable development, including the World Bank Group’s twin goals of ending extreme poverty and boosting shared prosperity and greater equity in the developing world.3 Indeed, the essential role that identification plays in development is explicitly recognized in Sustainable Development Goal (SDG) Target 16.9, to “provide legal identity for all, including birth registration” by 2030.4

Traditionally, proof of identity has been provided through physical documents, such as birth certificates, passports, or ID cards. As the world becomes increasingly digitized, the next generation of ID systems uses new technologies to provide digital proof of legal identity for in-person and remote transactions. These digital ID systems can help achieve multiple development goals, but they also create challenges for digital privacy and data protection. This note describes these risks and then presents concrete steps to mitigate them while harnessing the full potential of digital ID for development.

1 This note was prepared by Julia Clark and Conrad Daly as part of the Identification for Development (ID4D) Initiative, under the supervision of Vyjayanti Desai. This note benefited greatly from the inputs and reviews of World Bank Group staff including David Satola and Jonathan Marskell, as well as feedback from Kanwaljit Singh (Bill & Melinda Gates Foundation), CV Madhukar (Omidyar Network), and David Symington (Office of the UN Secretary General’s Special Advocate for Inclusive Finance for Development).
2 Much of the material in this Briefing Note draws from the ID4D Guide for Practitioners, available at http://id4d.worldbank.org/guide.
3 World Bank. 2017. Principles on Identification for Sustainable Development: Toward the Digital Age. Washington, DC: World Bank Group. http://id4d.worldbank.org/principles.
4 UN General Assembly. 2015. A/RES/70/1. Transforming our world: the 2030 Agenda for Sustainable Development. SDG 16.9. https://www.refworld.org/docid/57b6e3e44.html.

Digital ID for development

In addition to helping achieve SDG Target 16.9 directly, digital ID systems that provide proof of legal identity can support multiple rights and development goals—such as financial and economic inclusion, social protection, healthcare and education for all, gender equality, child protection, agriculture, good governance, and safe and orderly migration—through:

• Empowering individuals and facilitating their access to rights, services, and economic opportunities that require proof of identity. This includes social services, pension payments, banking, formal employment, property rights, voting, and more.
• Strengthening the transparency, efficiency, and effectiveness of governance and service delivery. Digital ID systems can help the public sector reduce fraud and leakage in government-to-person (G2P) transfers, facilitate new modes of service delivery, and increase overall administrative efficiency.5
• Supporting private sector development. In addition to the public sector, digital ID systems can also help private companies reduce operating costs associated with regulatory compliance (e.g., eKYC), widen customer bases, generate new markets, and foster a business-friendly environment more broadly.6
• Enabling the digital economy. Combined with trust services like e-signatures, digital ID systems facilitate trusted transactions, streamline “doing business,” and create opportunities for innovation—providing a core platform for the digital economy.
Digital ID and data protection—What are the risks?

While digital ID systems can support multiple development goals, they also create risks to digital privacy and data protection. Such risks are inherent to any ID system, but digitization can exacerbate their scale and frequency. These risks may have serious, often immeasurable, consequences for people, and therefore require appropriate protections.

Box 1. Data protection and privacy international good practice standards

Digital ID systems raise data privacy concerns because they collect personal data. Building upon existing principles,7 the European Union’s (EU) General Data Protection Regulation (GDPR)8 sets a new, international good-practice standard for data protection and privacy. Here, we extract some working definitions.

Data privacy differs from the fundamental right to privacy—commonly defined as the “right to be let alone”9—and should be understood as the appropriate and permissioned use and governance of personal data. In ID systems, data privacy does not necessarily mean that all data is kept secret at all times. Rather, it means that data should only be accessed, processed, or shared by and with authorized users for pre-specified purposes that have been agreed in advance (a brief illustrative sketch appears after this box). Data protection—which includes the legal, operational, and technical methods and controls for securing information and enforcing rules over access and use—is therefore fundamental to ensuring data privacy.

Not all data merits the same level of protection. Personal data refers to “any information relating to an identified or identifiable natural person” (GDPR Article 4). An identifiable natural person (or “data subject”) is defined as a natural person “who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person” (GDPR Article 4). Sensitive personal data (or “special categories of data”) refers to “personal data that, by their nature, are particularly sensitive in relation to fundamental rights and freedoms and merits specific protection as the context of their processing could create significant risks to a person’s fundamental rights and freedoms.”10 They include data consisting of racial or ethnic origin, political opinions, religious or philosophical beliefs, health, sex life or sexual orientation, as well as biometric and genetic data (GDPR Recital 51).

5 World Bank. 2018. Public Sector Savings and Revenue from Identification Systems: Opportunities and Constraints. Washington, DC: World Bank Group. http://id4d.worldbank.org/research.
6 World Bank. 2018. Private Sector Economic Impacts from Identification Systems. Washington, DC: World Bank Group. http://id4d.worldbank.org/research.
7 E.g., U.S. Federal Information Processing Standards, OECD Privacy Principles, ISO/IEC, and PbD standards (see below).
8 Regulation (EU) 2016/679 of 27 April 2016 (GDPR).
9 See Warren, Samuel and Louis Brandeis. 1890. “The Right to Privacy”. 4 Harvard Law Review. p. 193 and Cornell, Anna Jonsson. 2016. “Right to Privacy”, Max Planck Encyclopedia of Comparative Constitutional Law.
10 To facilitate readability, the term “personal data” is used throughout without distinguishing between the notions of “personal data”, “personally identifiable information” (PII), and “sensitive personal data”.
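The working definition of data privacy in Box 1 (access, processing, and sharing only by authorized users, for pre-specified purposes agreed in advance) can be made concrete with a simple authorization check. The sketch below is purely illustrative and assumes nothing about any particular ID platform; the record layout, role names, and purposes shown are hypothetical.

```python
# Illustrative sketch of purpose-limited, consent-aware access control.
# All names and purposes are hypothetical; real systems enforce this in
# their authorization layer and log every decision.
from dataclasses import dataclass, field

@dataclass
class ConsentRecord:
    subject_id: str
    permitted_purposes: set = field(default_factory=set)  # agreed in advance

def can_process(consent: ConsentRecord, requester_role: str, purpose: str,
                authorized_roles: set) -> bool:
    """Allow processing only for authorized users and pre-specified purposes."""
    return requester_role in authorized_roles and purpose in consent.permitted_purposes

consent = ConsentRecord("person-001", {"benefit-eligibility-check"})
print(can_process(consent, "caseworker", "benefit-eligibility-check", {"caseworker"}))  # True
print(can_process(consent, "caseworker", "direct-marketing", {"caseworker"}))           # False
```

In this spirit, data privacy does not require secrecy: the caseworker can legitimately verify eligibility, but the same record cannot be reused for an unrelated purpose without a new legal basis or consent.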
Data and privacy risks related to personal information

Any activity that collects, stores, or processes personal data raises certain risks, including, but not limited to:

• Security breaches: Physical or cyberattacks on data in transit or at rest.
• Unauthorized disclosure: Inappropriate transfer of data between government agencies, foreign governments, private companies, or other third parties.
• Exposure of sensitive personal information: Disclosing sensitive personal information (e.g., biometrics, religion, ethnicity, gender, medical histories) for unauthorized purposes.
• Function creep: The use (and even sharing) of data for purposes beyond those for which consent was given.
• Identity theft: Identity theft in the digital world can lead to consequences that are at least as serious as those in the “real”, physical world, and, given the global, decentralized nature of the internet, damages that are often more difficult to repair. In a digitized world, impersonation can be undertaken by just about anyone.11, 12
• Surveillance risks: The ability to correlate identifying information across databases (e.g., via facial recognition) increases surveillance risks, particularly where biometrics are involved.13
• Discrimination or persecution: Identity attributes might be used to discriminate against or persecute particular people or groups. Relatedly, reputational attacks (professional and personal) can be launched with significantly greater ease—and to significantly greater effect.14
• Unjust treatment: Incomplete or inaccurate data can lead to mistakes or unjust treatment.

Digitization: raising the stakes

While the above-discussed risks are present in any ID system, digital ID systems may augment both the risks and the harms beyond traditional, paper-based systems because they enable:

• Ever-more massive data security breaches: Consolidation of data increases the impact of data breaches,15 while also making such databases more attractive targets.
• Easy destruction of digital records: Digitization allows for the easy (or mass) deletion of data—as anyone who has had their phone wiped can attest. Without appropriate data safeguards, entire records—and therefore individuals—might be made to “disappear.” For instance, in healthcare, the use of electronic health records (EHRs) has raised concerns that loss of documentation integrity could compromise patient care, coordination, reporting and research, and even allow fraud and abuse.16
• Easy copying of digital records: When, in 1971, the so-called “Pentagon Papers” were stolen and leaked, one of the most significant obstacles was the physical copying and subsequent collation of some 7,000 pages. By contrast, in 2015, the so-called “Panama Papers” involved some 11.5 million digital files.17
• Exposure of “hidden”-but-connected personal data: Automatic data processing, as supported through AI and machine learning, makes it possible to discover vast arrays of patterns and other information, such as by connecting disparate information about a person from different sources or by using metadata about individuals or groups (illustrated in the sketch below).18
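To see how “hidden”-but-connected personal data emerges in practice, the short sketch below uses invented records to show how two separately unremarkable databases, once joined on a shared identity number, yield a sensitive profile that neither source reveals on its own. This is one reason the mitigation strategies discussed later in this note emphasize unlinkability.

```python
# Invented example: linking two datasets through a shared identifier exposes
# information that neither dataset discloses by itself.
health_visits = {
    "ID-1234": {"clinic": "HIV treatment center", "visits": 12},
}
voter_roll = {
    "ID-1234": {"name": "A. Example", "district": "North"},
}

linked = {
    uid: {**voter_roll[uid], **record}
    for uid, record in health_visits.items()
    if uid in voter_roll
}
print(linked)
# {'ID-1234': {'name': 'A. Example', 'district': 'North',
#              'clinic': 'HIV treatment center', 'visits': 12}}
```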
11 Gercke, Marco. 2007. “Internet-Related Identity Theft”, Council of Europe Discussion Paper. https://rm.coe.int/16802fa3a0.
12 World Bank; United Nations. 2017. Combatting Cybercrime: Tools and Capacity Building for Emerging Economies. Washington, DC: World Bank Group. http://www.combattingcybercrime.org/.
13 Barber, Gregory. 2019. “San Francisco Bans Agency Use of Facial-Recognition Tech”. Wired.com. https://www.wired.com/story/san-francisco-bans-use-facial-recognition-tech/.
14 Ibid.
15 Kerry, Cameron F. 2017. “Why protecting privacy is a losing game today—and how to change the game”, Brookings Institution. https://www.brookings.edu/research/why-protecting-privacy-is-a-losing-game-today-and-how-to-change-the-game/.
16 Arrowood, D, E Choate, E Curtis et al. 2013. “Integrity of the Healthcare Record: Best Practices for EHR Documentation”, Journal of AHIMA. pp. 58–62. https://library.ahima.org/doc?oid=300257#.XNt9so5JE2w.
17 See Chokshi, Niraj. 2017. “Behind the Race to Publish the Top-Secret Pentagon Papers”, New York Times. https://www.nytimes.com/2017/12/20/us/pentagon-papers-post.html and Harding, Luke. 2016. “What are the Panama Papers? A guide to history’s biggest data leak”, The Guardian. https://www.theguardian.com/news/2016/apr/03/what-you-need-to-know-about-the-panama-papers.
18 See, e.g., “Behind the Data: Investigating Metadata”, Exposing the Invisible. https://exposingtheinvisible.org/guides/behind-the-data-metadata-investigations/.

Security benefits of going digital

While the digitization of identification systems creates or augments certain risks, it also presents new opportunities and technological means for greater protection. Specifically, digital ID systems may offer:

• More accurate identification and authentication. As digital ID systems leverage computer processing and advanced technologies, they can offer a higher level of assurance and accuracy than manual, paper-based authentication processes that are subject to human error and discretion. Doing so increases trust, reduces costs, and supports sustainable, flexible systems.
• Improved data integrity. Although digital ID systems present new security risks, they can—by adopting the data protection measures described later in this note—also better assure the integrity and use of collected data compared with paper-based records systems that can be easily destroyed, damaged, or altered. Furthermore, automated, tamper-proof transaction logging provides an auditable record of data processing, thereby improving accountability and helping to address security breaches.
• Better and more nuanced data privacy guarantees. Digital technology enables new privacy-enhancing features that were previously not possible. In systems using non-digital credentials, a transaction typically involves presenting a physical ID card to a service provider, and therefore revealing all the displayed information (e.g., presenting a physical credential as proof of age reveals additional information, such as full name, date of birth and, often, address). Digital technology can help resolve this issue through digital credentials that obscure or selectively present only the data necessary (a simplified sketch follows this list).
• Increased agency and control. New technologies and design strategies give individuals greater control over their personal data, including access portals that allow users to verify the accuracy of their data and monitor data usage, and which automate data-breach notifications. Further, emerging digital ID ecosystems provide users with greater choice of ID providers.
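The selective disclosure described in the list above can be illustrated with a toy scheme based on salted hash commitments. This is not the mechanism of any particular ID system; production deployments rely on standardized credential formats, issuer signatures, and revocation. The sketch simply shows how a holder can prove a single attribute (for example, being over 18) without revealing name or date of birth.

```python
# Toy sketch of selective disclosure using salted hash commitments.
# Issuer signatures and revocation are deliberately omitted; attribute names
# and values are invented for illustration.
import hashlib, os

def commit(value: str, salt: bytes) -> str:
    return hashlib.sha256(salt + value.encode()).hexdigest()

# Issuance: the issuer commits to each attribute with its own salt and would
# sign the resulting list of commitments (signature omitted here).
attributes = {"name": "A. Example", "date_of_birth": "1990-05-14", "over_18": "true"}
salts = {k: os.urandom(16) for k in attributes}
credential = {k: commit(v, salts[k]) for k, v in attributes.items()}

# Presentation: the holder discloses only the attribute needed, plus its salt.
key, value, salt = "over_18", attributes["over_18"], salts["over_18"]

# Verification: the relying party recomputes the commitment and compares it
# with the credential, learning nothing about the undisclosed attributes.
assert commit(value, salt) == credential[key]
print(f"Verified {key} = {value} without seeing the full record")
```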
Protecting data and privacy in a digital world

Data privacy and security measures should be integrated throughout the ID lifecycle—that is, data protection must become an organizational norm. This requires a “privacy-and-security-by-design approach”19 that builds upon the following foundational principles:

1. Developing proactive—not reactive—systems that take a preventative, not remedial, approach;
2. Making privacy the default setting, rather than requiring affirmative action;
3. Embedding privacy into the technical design from the start rather than retrofitting it;
4. Construing privacy in a positive-sum manner (“win-win”), and not as a zero-sum (“either/or”);
5. Developing end-to-end security with a view to full-lifecycle protection;
6. Building in visibility and transparency and keeping systems open and accountable; and
7. Keeping the system user-centric, with an eye to respecting user data privacy.

In practice, implementing a privacy-and-security-by-design approach requires a series of complementary controls:

• Legal controls, including comprehensive legal and institutional frameworks safeguarding data and assuring user rights, especially their consent to the use and control of personal data;
• Management controls for monitoring and oversight;
• Operational controls that promote security awareness, training, and detection; and
• Technology controls that limit and protect the processing of personal data and ensure the physical and virtual security of systems that process personal data.

19 First conceptualized by Ann Cavoukian as “Privacy by Design” or PbD. See Cavoukian, Ann. 2011. Privacy by Design. https://iab.org/wp-content/IAB-uploads/2011/03/fred_carter.pdf.

Data protection begins with a comprehensive legal framework

A legal and institutional framework requires a series of interlocking instruments. To begin, data protection and privacy need to be enshrined in cross-cutting laws and principles, such as those enumerated in Box 2. These legal instruments should include explicit applications to ID systems and be policed by strong, high-capacity institutional actors (e.g., data protection agencies). Larger policy instruments, such as national cybersecurity strategies, help to assure whole-of-government approaches and should apply across actors. Enabling laws should also be technology neutral, so that they do not require legislative revision to adapt to technological progress.

Certain groups—such as ethnic, racial, or religious minorities—may also face particular concerns regarding the collection and use of data that indicates their group identity, and which could be used to profile or discriminate against them. Practitioners should carefully consider risks to these groups from collecting sensitive information and adopt sufficient legal and procedural protections against discrimination.

Box 2. Principles for processing personal data

According to the UN Personal Data Protection and Privacy Principles,20 personal data should be:

1. Processed in a fair and legitimate manner, taking into account the person’s consent and best interests, as well as larger legal bases.
2. Processed and retained consistent with specified purposes, taking into account the balancing of relevant rights, freedoms and interests.
3. Proportional to the need, by being relevant, limited and adequate to what is necessary for the specified purposes.
4. Retained only for the time necessary for the specified purposes.
5. Kept accurate and up-to-date in order to fulfill the specified purposes.
6. Processed with due regard to confidentiality.
7. Secured through appropriate safeguards and procedures (organizational, administrative, physical, and technical) that protect personal data against unauthorized or accidental access, damage, loss or other risks presented by data processing.
8. Processed with transparency to the data subjects, as appropriate and whenever possible.
9. Only transferred to a third party given appropriate protections.
10. Processed accountably, with adequate policies and mechanisms in place to adhere to these Principles.

20 Although developed specifically to govern data processing within UN organizations, these principles embody international good practice. See UN High-Level Committee on Management. 2018. UN Personal Data Protection and Privacy Principles.
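Principles 3 and 4 (proportionality and limited retention) translate directly into system logic. The sketch below is a hypothetical illustration only: the purposes, attribute sets, and retention period are invented placeholders rather than recommendations, and a real system would derive them from its legal framework and data-sharing agreements.

```python
# Hypothetical sketch of data minimization (principle 3) and retention
# limitation (principle 4). Purposes, fields, and periods are invented.
from datetime import date, timedelta

PURPOSE_ATTRIBUTES = {                       # attributes releasable per purpose
    "proof_of_age": {"over_18"},
    "benefit_payment": {"name", "payment_account"},
}
LOG_RETENTION = timedelta(days=180)          # example retention period

def minimize(record: dict, purpose: str) -> dict:
    """Release only the fields that are relevant and necessary for the purpose."""
    allowed = PURPOSE_ATTRIBUTES.get(purpose, set())
    return {k: v for k, v in record.items() if k in allowed}

def purge_expired(log_entries: list, today: date) -> list:
    """Drop entries retained longer than necessary."""
    return [e for e in log_entries if today - e["date"] <= LOG_RETENTION]

record = {"name": "A. Example", "over_18": True, "payment_account": "XX-001"}
print(minimize(record, "proof_of_age"))              # {'over_18': True}
logs = [{"date": date(2019, 1, 2)}, {"date": date(2019, 9, 1)}]
print(purge_expired(logs, today=date(2019, 10, 2)))  # keeps only the recent entry
```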
Designing systems that implement data protection principles

While legal frameworks are vital to protecting personal data in ID systems, they must be put into practice with organizational, management, and technology safeguards. ID systems must translate laws and regulations into their technical and operating specifications, including limits on data collection and usage. This includes the use of operational controls—e.g., detailed operational manuals, staff training, physical and cybersecurity measures—and privacy-enhancing technologies (PETs). These technologies and controls work to implement privacy principles through various strategies, including minimizing data processing; hiding, separating, or aggregating personal data; informing individuals and giving them control over data use; and enforcing and demonstrating compliance with legal requirements (see Table 1).

Table 1. Examples of PETs and operational controls

Data-oriented strategies
• Minimize the collection and processing of personal data to limit the impact to privacy of the system. Example solutions (not exhaustive): collecting and sharing minimal data; anonymization and use of pseudonyms when data is processed.
• Hide personal data and their interrelationships from plain view to achieve unlinkability and unobservability, minimizing potential abuse. Example solutions: encrypting data when stored or in transit; end-to-end encryption; key management/key obfuscation; anonymization and use of pseudonyms or tokenization for data processing; “zero semantics” or randomly generated ID numbers; attribute-based credentials (ABCs).
• Separate, compartmentalize, or distribute the processing of personal data whenever possible to achieve purpose limitation and avoid the ability to make complete profiles of individuals. Example solutions: tokenization or pseudonymization by sector; logical and physical data separation (e.g., of biographic vs. biometric data); federated or decentralized verification.
• Aggregate personal data to the highest level possible when processing to restrict the amount of personal data that remains. Example solutions: anonymizing data using k-anonymity, differential privacy and other techniques (e.g., aggregating data over time, reducing the granularity of location data).

Process-oriented strategies
• Inform individuals whenever their data is processed, for what purpose, and by which means. Example solutions: transaction notifications; data breach notifications.
• Give individuals tools to control the processing of their data, to implement data protection rights, and to improve the quality and accuracy of data. Example solutions: user-centric identity services; attribute-based credentials.
• Enforce a privacy and data protection policy that complies with legal requirements. Example solutions: role-based access control with two-factor authentication; remote access; physical and cyber-security measures.
• Demonstrate compliance with the privacy policy and applicable legal requirements. Example solutions: tamper-proof logs; audits.

Source: Table adapted from the ID4D Practitioner’s Guide (www.id4d.worldbank.org/guide). Original framework adapted from https://www.enisa.europa.eu/publications/privacy-and-data-protection-by-design to fit the ID system context.
Note: This table is meant to be illustrative of common privacy-enhancing technologies and operational controls, but it is not exhaustive.
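To make one of the strategies in Table 1 concrete, the following sketch derives sector-specific pseudonyms from a foundational ID number using a keyed hash (HMAC). The keys, ID format, and function names are invented for illustration; real deployments protect sector keys in dedicated hardware and may use more elaborate tokenization schemes. The point is simply that records keyed by different sector tokens cannot be trivially joined, which supports the unlinkability and purpose limitation described in the table.

```python
# Illustrative sketch of "tokenization or pseudonymization by sector".
# Keys and identifiers are invented; real systems keep sector keys in an HSM.
import hmac, hashlib

SECTOR_KEYS = {
    "health": b"hypothetical-health-sector-key",
    "tax": b"hypothetical-tax-sector-key",
}

def sector_pseudonym(unique_id: str, sector: str) -> str:
    """Derive a sector-specific token that cannot be linked across sectors
    without the corresponding secret keys."""
    return hmac.new(SECTOR_KEYS[sector], unique_id.encode(), hashlib.sha256).hexdigest()

uid = "1988-1234-5678"
print(sector_pseudonym(uid, "health"))  # token stored in the health registry
print(sector_pseudonym(uid, "tax"))     # different token stored in the tax registry
```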
In addition to designing systems with these operational and technical controls, practitioners should consider additional policy measures to identify and mitigate key data privacy and security risks, including:

• Proactive consultation and communication: Frequent engagement with the public and civil society is crucial for identifying and mitigating data protection threats and building trust in the system. Practitioners should implement outreach and education campaigns early on to consult with the public on privacy and data protection issues and ensure effective and transparent communication about the purpose and use of these systems and the protections they offer. Where threats or breaches are identified, they should be addressed promptly and transparently.
• Identifying risks to be mitigated through a privacy impact assessment (PIA): Conducting a PIA is recommended to evaluate the impact of the ID system on personal privacy and data and to articulate how various controls will help mitigate these risks.
• Undertaking threat modeling exercises: Before finalizing the design of an ID system and beginning procurement, practitioners should undertake a threat modeling exercise to assess potential internal and external threats throughout the identity lifecycle (see Table 16 for examples of potential vulnerabilities at different stages of data processing). This is crucial not only for the security of the system, but also to ensure uptake—people are less likely to participate in an ID system if they fear that their data will be misused or mismanaged.

Protecting data is a cornerstone of good practices for ID

The ten Principles on Identification for Sustainable Development21 (see Table 2) enshrine many of the legal, operational, and technical controls discussed above, such as the need to protect users’ data privacy and assure their control of their personal data from the design stage (Principle 6); to ensure that data is accurate (Principle 3); to develop a comprehensive legal framework (Principle 8); and to create mechanisms for independent oversight, grievance redress, and enforcement (Principle 10). Combined with other measures to ensure that ID systems are inclusive, designed in an interoperable and sustainable way, and meet the needs of a variety of users, proactive data protection measures are therefore essential for building ID systems that can meet development goals in the digital era.
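One control that recurs in this note, the tamper-proof and auditable logging listed in Table 1 and highlighted among the security benefits of going digital, can be approximated with a simple hash chain in which each log entry commits to the previous one, so that silent alteration or deletion becomes detectable. The sketch below is a toy illustration with invented fields; a production audit subsystem would also sign entries and anchor the chain externally.

```python
# Toy sketch of a tamper-evident (hash-chained) audit log. Field names are
# invented; signing and external anchoring are omitted.
import hashlib, json

def append_entry(log: list, event: dict) -> None:
    prev_hash = log[-1]["hash"] if log else "0" * 64
    payload = json.dumps({"event": event, "prev": prev_hash}, sort_keys=True)
    log.append({"event": event, "prev": prev_hash,
                "hash": hashlib.sha256(payload.encode()).hexdigest()})

def verify_chain(log: list) -> bool:
    prev_hash = "0" * 64
    for entry in log:
        payload = json.dumps({"event": entry["event"], "prev": prev_hash}, sort_keys=True)
        if entry["prev"] != prev_hash or \
           entry["hash"] != hashlib.sha256(payload.encode()).hexdigest():
            return False
        prev_hash = entry["hash"]
    return True

log = []
append_entry(log, {"actor": "caseworker-7", "action": "verify_identity", "subject": "person-001"})
append_entry(log, {"actor": "registrar-2", "action": "update_address", "subject": "person-001"})
print(verify_chain(log))                      # True
log[0]["event"]["action"] = "delete_record"   # simulate tampering with an old entry
print(verify_chain(log))                      # False: the chain no longer verifies
```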
Table 2. Principles for developing ID systems

INCLUSION: UNIVERSAL COVERAGE AND ACCESSIBILITY
1. Ensuring universal coverage for individuals from birth to death, free from discrimination.
2. Removing barriers to access and usage and disparities in the availability of information and technology.

DESIGN: ROBUST, SECURE, RESPONSIVE, AND SUSTAINABLE
3. Establishing a robust—unique, secure, and accurate—identity.
4. Creating a platform that is interoperable and responsive to the needs of various users.
5. Using open standards and ensuring vendor and technology neutrality.
6. Protecting user privacy and control through system design.
7. Planning for financial and operational sustainability without compromising accessibility.

GOVERNANCE: BUILDING TRUST BY PROTECTING PRIVACY AND USER RIGHTS
8. Safeguarding data privacy, security, and user rights through a comprehensive legal and regulatory framework.
9. Establishing clear institutional mandates and accountability.
10. Enforcing legal and trust frameworks through independent oversight and adjudication of grievances.

21 These Principles have now been endorsed by more than 20 organizations. See World Bank. 2017. Principles on Identification for Sustainable Development: Toward the Digital Age. Washington, DC: World Bank Group. http://id4d.worldbank.org/principles.

Conclusion

Digital ID systems can offer new possibilities for achieving sustainable development goals—if they are inclusive and trustworthy. When designed appropriately, digital ID systems can be more secure than analogue systems, with stronger, more intelligent, and more easily monitored data protection measures, which in turn offer better guarantees of data privacy. Taking advantage of these benefits, however, requires purposeful preventative action and an ongoing commitment to identifying and mitigating potential threats. This involves adopting a privacy-and-security-by-design approach from the beginning—not as an afterthought—starting with the development of a legal and institutional framework. This framework should provide extensive data protection and privacy guarantees through the application of international principles and effective oversight, and be supported by comprehensive and complementary organizational and technical controls. Only by taking data protection seriously will digital ID systems live up to their transformative potential.

About ID4D

The World Bank Group’s Identification for Development (ID4D) Initiative uses global knowledge and expertise across sectors to help countries realize the transformational potential of digital identification systems to achieve the Sustainable Development Goals. It operates across the World Bank Group, with global practices and units working on digital development, social protection, health, financial inclusion, governance, gender, and legal issues, among others.

The mission of ID4D is to enable all people to access services and exercise their rights by increasing the number of people who have an official form of identification. ID4D makes this happen through its three pillars of work: thought leadership and analytics to generate evidence and fill knowledge gaps; global platforms and convening to amplify good practices, collaborate, and raise awareness; and country and regional engagement to provide financial and technical assistance for the implementation of inclusive and responsible digital identification systems that are integrated with civil registration.
The work of ID4D is made possible with support from the World Bank Group, Bill & Melinda Gates Foundation, the UK Government, the Australian Government and the Omidyar Network.

To find out more about ID4D, visit id4d.worldbank.org. To participate in the conversation on social media, use the hashtag #ID4D.