What is PII?

April 3, 2023

What is PII? Definitions vary, but broadly speaking, Personally Identifiable Information (PII) is data that can be used to identify an individual. For a user of your product, PII includes the data they enter to identify themselves when creating an account, such as an email address or a phone number.

PII consists of “linkable” and “unlinkable” data. Generalized data that describes a person’s identity, such as gender, age, date of birth, geolocation, or income, but that cannot be used on its own to directly identify a specific individual is called “unlinkable data.” Unique data elements such as a name, address, email, or Social Security number that can be used to directly identify a specific individual are called “linkable data.”

*A Diagram Illustrating the Differences between Linkable and Unlinkable Data*
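To make the distinction concrete, here is a minimal sketch of how a team might tag the fields of a record as linkable or unlinkable. The field names and the `classify_field` and `split_record` helpers are hypothetical, and the groupings simply mirror the examples given above; a real classification would depend on your own data model and the regulations that apply to you.

```python
from enum import Enum
from typing import Optional


class Linkability(Enum):
    LINKABLE = "linkable"      # can directly identify a specific individual
    UNLINKABLE = "unlinkable"  # describes a person without directly identifying them


# Hypothetical field names; the groupings follow the examples in the text above.
LINKABLE_FIELDS = {"name", "address", "email", "social_security_number"}
UNLINKABLE_FIELDS = {"gender", "age", "date_of_birth", "geolocation", "income"}


def classify_field(field_name: str) -> Optional[Linkability]:
    """Return the linkability of a known field, or None if it is not recognized."""
    key = field_name.strip().lower()
    if key in LINKABLE_FIELDS:
        return Linkability.LINKABLE
    if key in UNLINKABLE_FIELDS:
        return Linkability.UNLINKABLE
    return None


def split_record(record: dict) -> dict:
    """Group a record's keys by linkability; unrecognized keys go under 'unknown'."""
    groups = {"linkable": [], "unlinkable": [], "unknown": []}
    for key in record:
        kind = classify_field(key)
        groups[kind.value if kind else "unknown"].append(key)
    return groups
```

Fields that land under "unknown" are the ones worth reviewing by hand, since an unclassified field may still turn out to be PII.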

The concept of PII has been around for a while, but recently it has become more important than ever for businesses to manage and secure the PII that they handle.

In this post, we’ll look at the different types of PII, which types of information can be considered PII, why protecting the privacy and security of PII is more important now than ever before, and how Skyflow Data Privacy Vault can help.

Not All PII is Created Equal

Various authorities define PII in different ways, so a good place to start is to consider what someone can do with a piece of information. For example, while a date of birth is considered “unlinkable” data, it could be used in conjunction with “linkable” data to bypass authentication methods and gain access to vital online resources like bank accounts and health records.

To this point, the US Department of Energy defines “High Risk” PII as information that “if lost, compromised, or disclosed without authorization, could result in substantial harm, embarrassment, inconvenience, or unfairness to an individual.” This includes data such as Social Security numbers, health and medical information, biometric records (like fingerprints and DNA), financial information (for example, credit card numbers, credit reports, and bank account numbers), and information used for security clearances. In other words, high-risk PII is any data that poses a much greater risk to an individual than other data does if it falls into the hands of a malicious actor.

Moreover, regulations and laws often refer to PII by many other terms such as “personal information,” “personal data,” or “sensitive information.” Ultimately, however, they all approximate the same thing: data about an individual that, because of its potential to cause harm, needs to be especially protected.

So as companies work to secure PII, these questions remain: Which data needs to be protected? And, what qualifies as PII?

PII Becomes a Priority for Regulators

As businesses have started to collect more data through the proliferation of cloud-based apps and services, governments have introduced new data regulations to ensure that companies are taking measures to protect this data. The EU’s GDPR and California’s CPRA created regulations around how PII is collected, stored, and transacted. These regulations have broadened the definition of PII and require companies to rethink how they collect, store, and process PII.

In many cases, companies have had to make significant investments, build completely new processes, and adopt new technologies to better understand what PII they collect and how it travels through their organization. Oftentimes this means rearchitecting their data stores or data lakes to gain more fine-grained control over how they access and use PII. For many companies, this has resulted in a patchwork of data privacy solutions that creates more overhead without offering much additional protection for PII.

Different Regulations Define PII Differently

Different privacy laws and regulations define PII differently and have different requirements for how PII is stored, manipulated, distributed, and audited. There are also rules around an individual’s right to request details on which data is collected and how it is used, and to request its deletion.

However, because technology and cloud-based infrastructure have broadened the reach of businesses, companies that compete in a global market usually choose to adhere to the strictest policies so that they satisfy all of them.

PII Under CPRA

While the United States has not yet passed a comprehensive privacy law similar to the EU’s GDPR, individual states have adopted their own laws and requirements. California’s privacy laws are among the strictest in the country. And, thanks to the state’s size, economy, and influence as an epicenter of technological innovation, companies in other parts of the US tend to align their data privacy practices with those of California.

The new California Privacy Rights Act (CPRA) – which amends 2018’s California Consumer Privacy Act (CCPA) – defines a new level of data known as “Sensitive Personal Information” (SPI). Sensitive Personal Information includes login ID and password, precise geolocation, race and ethnicity, sexual orientation, and genetic data. CPRA also expands the protection of California residents’ personal information beyond just the consumer (“business to consumer” or “B2C”) data that’s within the scope of CCPA. With CPRA, business to business (B2B) data, HR data, and personal information held in other contexts have the same protections as consumer data.

*A Diagram Illustrating How CPRA Considers Salary to be Sensitive Personal Information*

PII Under GDPR

The General Data Protection Regulation (GDPR) is the EU’s set of privacy requirements governing companies that conduct business within the EU or with EU residents. As with CPRA, the economic importance of EU markets and the strictness of GDPR have made it the data privacy standard for companies all over the world.

GDPR expands the definition of PII (or “personal data” as it is referred to in the legislation itself) to include financial information, login credentials, biometric data, photographs that clearly present an individual, geographic location data, race, gender, political opinions, religious or philosophical beliefs, cookie IDs, and IP addresses. This makes GDPR’s definition much more extensive than a basic definition of personal data that covers only items such as first and last name and home address.

More specifically, Article 4 of the GDPR defines personal data as: “any information relating to an identified or identifiable natural person (“data subject”); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person.”

Here’s an illustration of some of the types of data that qualify as personal data under GDPR:

*A Diagram Showing Two Examples of Personal Data Regulated by GDPR: Gender and Salary*

PII According to the NIST

The National Institute of Standards and Technology (NIST) compiled a Guide to Protecting the Confidentiality of PII that serves as a de facto standard for science and technology organizations that work with the US government.

NIST includes the following as PII:

Name, such as full name, maiden name, mother's maiden name, or alias
Personal identification number, such as social security number (SSN), passport number, driver's license number, taxpayer identification number, or financial account or credit card number
Address information, such as street address or email address
Personal characteristics, including photographic image (especially of face or other identifying characteristic), fingerprints, handwriting, or other biometric data (e.g., retina scan, voice signature, facial geometry)
Information about an individual that is linked or linkable to one of the above (e.g., date of birth, place of birth, race, religion, weight, activities, geographical indicators, employment information, medical information, education information, financial information)

*A Diagram Illustrating the NIST’s Classification of Gender, Date of Birth, and Salary as PII*
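Some of the items on this list have a recognizable shape, which means a first-pass scan for them can be automated. The sketch below is an illustration only: the `PII_PATTERNS` table and the `find_pii` helper are hypothetical names, and the two patterns (a US Social Security number in `NNN-NN-NNNN` form and a simplified email address) are deliberately naive. They will miss many real values and can flag false positives, so treat them as a starting point rather than a detector you can rely on.

```python
import re

# Deliberately simple, illustrative patterns. Real detection needs validation
# and context, and most of the categories listed above have no fixed format.
PII_PATTERNS = {
    # Personal identification number: SSN written as NNN-NN-NNNN
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    # Address information: a simplified email address shape
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}


def find_pii(text: str) -> dict:
    """Return a mapping of pattern name to the list of matches found in text."""
    found = {}
    for name, pattern in PII_PATTERNS.items():
        matches = pattern.findall(text)
        if matches:
            found[name] = matches
    return found
```

A scan like this can help you locate obvious PII in logs or free-text fields, but categories such as names, photographs, or medical information need other techniques and human review.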

So, What Really Counts as PII?

As this summary shows, the definition of PII (or “personal information”) varies considerably across laws and organizations.

And that means that the safest approach for organizations that want to protect this vital data is to use a broad definition of PII, which eases compliance with current regulations and positions them to comply with future ones.
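One way to picture a "broad definition" is as the union of the categories named by each framework. The sketch below assumes a rough, incomplete mapping of the examples from this post onto hypothetical category labels; the labels and the `DEFINITIONS`, `broad_definition`, and `covered_by` names are made up for illustration and are not terms from the regulations themselves.

```python
# Illustrative, incomplete category sets based on the examples in this post.
# The labels are hypothetical shorthand, not wording from the legal texts.
DEFINITIONS = {
    "CPRA_SPI": {
        "login_credentials", "precise_geolocation", "race_and_ethnicity",
        "sexual_orientation", "genetic_data",
    },
    "GDPR": {
        "financial_information", "login_credentials", "biometric_data",
        "photograph", "location_data", "race", "gender", "political_opinions",
        "religious_or_philosophical_beliefs", "cookie_id", "ip_address",
    },
    "NIST": {
        "name", "personal_identification_number", "address", "photograph",
        "biometric_data", "date_of_birth", "place_of_birth", "race", "religion",
        "employment_information", "medical_information",
        "education_information", "financial_information",
    },
}


def broad_definition(definitions: dict) -> set:
    """Treat a category as PII if any framework names it (set union)."""
    return set().union(*definitions.values())


def covered_by(category: str, definitions: dict) -> list:
    """List, in sorted order, the frameworks that name a given category."""
    return sorted(name for name, cats in definitions.items() if category in cats)
```

Under this approach, a category named by only one framework is still protected everywhere, which is the trade-off a broad definition makes: more data to protect, but fewer surprises when a new regulation applies.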

The Age of Data Privacy

Because PII is a fundamental part of how customers use your products, you need an efficient way to use PII without putting customer data at risk. You need a new approach that rethinks how PII is stored and used throughout your organization. And you should be able to protect your customers’ most sensitive data without sacrificing data utility. It adds up to a new paradigm: The Age of Data Privacy.

At Skyflow, we’ve developed a zero trust data privacy vault that uses privacy-preserving technology to protect PII while still allowing you to use this data to drive business growth. With Skyflow, you can store your sensitive data in an isolated, centralized data store and reduce the amount of PII you store in other applications and systems. This approach reduces the overall exposure of PII within your systems and isolates it in a central store that you can govern with access control policies.
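The general idea of isolating PII in a central store can be sketched in a few lines. The toy class below is not Skyflow's API or implementation; `ToyVault`, its methods, and the role names are hypothetical, and the store is just an in-memory dictionary. It only illustrates the pattern described above: other systems hold a meaningless token, and the original value is released only when an access policy allows it.

```python
import secrets


class ToyVault:
    """In-memory stand-in for an isolated PII store. Illustration only."""

    def __init__(self, policies: dict):
        # policies maps a role name to the set of field names that role may read
        self._policies = policies
        self._records = {}  # token -> (field name, original value)

    def tokenize(self, field: str, value: str) -> str:
        """Store the value and return a random token that is unrelated to it."""
        token = f"tok_{secrets.token_hex(8)}"
        self._records[token] = (field, value)
        return token

    def detokenize(self, token: str, role: str) -> str:
        """Return the original value only if the role's policy allows the field."""
        field, value = self._records[token]
        if field not in self._policies.get(role, set()):
            raise PermissionError(f"role {role!r} may not read {field!r}")
        return value


# Hypothetical usage: downstream systems keep only the token.
vault = ToyVault({"support": {"email"}, "billing": {"email", "card_number"}})
card_token = vault.tokenize("card_number", "4111111111111111")
```

In this sketch, a `billing` caller can exchange `card_token` for the original value, while a `support` caller gets a `PermissionError`. A production system would also need encryption, durable storage, auditing, and authentication, none of which are shown here.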

Skyflow was developed to help companies keep up with the ever-evolving patchwork of privacy laws and regulations. Skyflow Data Privacy Vault not only helps you isolate, protect, and govern PII, it also eases compliance with data privacy laws and regulations, including requirements like data residency.