Isolate and Protect: PII Is Special

April 29, 2022

Businesses run on data and they’re collecting more and more all the time. But not all data is equally important. Some data, like sensitive user data, requires better protection than others. Some data is special.

In the backends of many products, there’s a database with a users table (or equivalent) containing columns labeled name, email, phone number, and address. This data is treated and protected the same as any other application data. 

A security perimeter is put around it, but from within the perimeter, applications (and sometimes employees) have full access. It’s the opening chapter of the story of any data breach.

That’s because this design, which treats users’ PII like any other application data, is fundamentally flawed. Users' data is special and must be treated that way.

Just as I wouldn’t throw my passport, my kids’ birth certificates, and my marriage license in the kitchen junk drawer with the flashlights and batteries, user data doesn’t belong in your application storage intermixed with your other data. User data must be isolated and protected.

In this article, we explore this topic in detail, making a case that the only way to meet consumer and regulatory demands for data privacy is to fundamentally change our mindset about how to store and manage PII.

Saying Goodbye to the Wild West of Data Privacy

For years, businesses have been living in the “Wild West” of data privacy. As a consumer, I’ve created thousands of user accounts where I’ve entrusted businesses with my personal information. This includes data like my name, email, home address, and sometimes even more sensitive information like my credit card, bank information, and social security number. I’m trusting that these businesses are properly protecting this data.

Yet, as a software engineer, I also know that my data is often not that well protected. For example, Yahoo! had over 1 billion usernames, email addresses, telephone numbers, hashed passwords, and security questions/answers stolen. And we all remember when Equifax lost 145.5 million social security numbers. 

In 2020, $56 billion was lost to fraud and identity theft. Data breaches and leaks are happening more and more as the volumes of data that organizations collect grow exponentially. PII is one of the most common targets for cybercriminals, but it is one of the least regulated.

Data Privacy is No Longer Optional

Technology has been moving faster than regulations, empowering businesses to do what they want with user data. Data privacy has been an option for businesses, not a requirement. 

Historically, businesses have considered user data to be company property, and that has led them to treat customer data the same as any other data. It’s often not encrypted or de-identified, access isn’t tightly controlled, and it’s duplicated everywhere within the company’s infrastructure.

But times are changing. Growing consumer and regulatory demands not only make data privacy mandatory rather than optional, they also fundamentally shift the notion of data ownership from the business to the user.

This history of treating all data the same makes it nearly impossible to adapt to these new demands and ways of thinking. The mistake was to not treat this data as special from the start. But companies must now shift their way of thinking to recognize this data is special, and as such, it must be handled differently. Your customers are counting on you to make this shift.

The Problem with Intermixing User and Application Data

In any business of a reasonable size, there are certain applications that are critical to the operation of that business. For example, a business will likely have an HR application for its employees, an applicant tracking system (ATS) for hiring, and a CRM for managing prospects and sales.

Each of these applications has the equivalent of a users table. The HR application’s users table stores employee information while the ATS’s users table stores candidate details and applications. To make use of the data spread across these independent systems, data might be collated within a data lake and a data warehouse as shown below:

Example Applications for Business Operations

Over time, a business might add more applications to support business operations. For example, a talent app, a billing system, and machine learning pipelines might be introduced, creating more dependencies between all of these systems.

Example Evolution of Applications for Business Operations

Since the sensitive user data from all of these applications is being treated the same as regular application data, all of that sensitive user data is copied and replicated throughout this infrastructure. There’s no single “source of truth” for sensitive data.

Additionally, the data is fragmented between different application databases. This makes it extremely difficult to control access or even know which services and users can see this sensitive data.

Isolating and Protecting PII

Data duplication and fragmentation are at the heart of many of the challenges companies face when trying to protect user data. These issues make it nearly impossible to answer questions like:

  • What data are we storing and where are we storing it? 
  • Can we be sure we’re meeting regulatory requirements? 
  • Can we be sure our systems are locked down and our customers’ data is safe?
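To make the first question concrete, here is a minimal sketch of a PII inventory. It assumes you can export each system's table and column names; the system names, table names, and the `PII_COLUMNS` list are hypothetical, and matching on column names alone is a naive heuristic that misses PII stored under other names.

```python
# Assumed list of column names that usually hold PII; extend it for your schemas.
PII_COLUMNS = {"name", "email", "phone_number", "address"}

# Hypothetical schema export: system -> table -> column names.
SCHEMAS = {
    "hr_app": {"users": ["id", "name", "email", "salary_band"]},
    "ats": {"users": ["id", "name", "email", "phone_number", "resume_url"]},
    "data_warehouse": {"dim_people": ["person_id", "email", "address"]},
}


def pii_locations(schemas):
    """Map each PII column name to a sorted list of 'system.table' places storing it."""
    locations = {}
    for system, tables in schemas.items():
        for table, columns in tables.items():
            for column in columns:
                if column in PII_COLUMNS:
                    locations.setdefault(column, []).append(f"{system}.{table}")
    return {column: sorted(places) for column, places in locations.items()}


if __name__ == "__main__":
    for column, places in sorted(pii_locations(SCHEMAS).items()):
        print(f"{column}: {len(places)} copies -> {', '.join(places)}")
```

Even in this toy setup, the same email address lives in three places, which is the duplication problem in miniature.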

The way to solve these issues is by recognizing that PII is special and that all companies must isolate and protect it. Leading technology companies such as Apple, Netflix, Google, and Shopify have all taken this approach. They recognized that customer data, which is core to their business, can’t be treated like regular application data. Instead, it must be isolated, protected, stored, and managed within a zero trust Data Privacy Vault.

Protecting PII in a Data Privacy Vault 

A data privacy vault provides a logical and technical separation from your application and application data. It serves as a single source of truth for your customers’ most sensitive data, significantly simplifying the effort it takes to lock down data access.

Adapting the prior example, once the vault is introduced, the web of interdependencies is significantly reduced. Every application that needs PII goes to the vault.

Solving Data Privacy By Isolating and Protecting PII

Access can be controlled through policies rather than data duplication. 

In the example below, the vault is storing the customer’s email address while the application database is storing a tokenized version of that email. Some services will need access to the plain-text value of the email (“Readers that need PII”) while others only need partial information and many won't need the email at all.

Single Source of Truth, Access Controlled Through Policies
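As a rough illustration of the split described above, the sketch below uses a toy in-memory class as a stand-in for the vault. The `Vault` class, its `tokenize` and `detokenize` methods, the `tok_` prefix, and the sample row are all hypothetical names chosen for this example, not Skyflow's API. A real vault would also encrypt and persist what it stores.

```python
import secrets


class Vault:
    """Toy in-memory stand-in for a data privacy vault (illustration only)."""

    def __init__(self):
        self._values_by_token = {}
        self._tokens_by_value = {}

    def tokenize(self, value):
        """Store a sensitive value and return an opaque token for it."""
        # Reuse the existing token so one value always maps to one token.
        if value in self._tokens_by_value:
            return self._tokens_by_value[value]
        # The token is random, so it reveals nothing about the value.
        token = "tok_" + secrets.token_hex(8)
        self._values_by_token[token] = value
        self._tokens_by_value[value] = token
        return token

    def detokenize(self, token):
        """Return the original value; raises KeyError for an unknown token."""
        return self._values_by_token[token]


vault = Vault()

# The application database row holds only the token, never the email itself.
users_table = [
    {"id": 1, "plan": "pro", "email_token": vault.tokenize("ada@example.com")},
]
```

A service that never needs the email can work with `users_table` as-is. Only a service allowed to call `detokenize` ever sees the plain-text address.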

By having one place for all PII, fine-grained access control is easy to manage. Additionally, it significantly reduces the scale of a potential breach. Even if someone steals the credentials of a service, they gain access only to the data that particular service needs. Most services in your application don’t need sensitive data at all, and no service should need access to all user PII.
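One way to picture policy-based access is as a lookup from the calling service to the view of the data it is allowed to see. The sketch below is a simplified assumption of how that could work. The service names, the `POLICIES` table, the `reveal` function, and the masking format are all hypothetical, and a plain dictionary stands in for the vault.

```python
# Hypothetical policy table: service name -> how much of the email it may see.
POLICIES = {
    "email-sender": "plain",        # needs the real address to deliver mail
    "support-dashboard": "masked",  # only needs enough to confirm identity
}

# Stand-in for the vault's token -> value store.
VAULT = {"tok_1a2b": "ada@example.com"}


def mask_email(email):
    """Keep the first character and the domain; hide the rest of the local part."""
    local, _, domain = email.partition("@")
    return local[:1] + "***@" + domain


def reveal(service, token):
    """Return the view of the email that the calling service is entitled to."""
    level = POLICIES.get(service)  # services without a policy get nothing
    if level == "plain":
        return VAULT[token]
    if level == "masked":
        return mask_email(VAULT[token])
    raise PermissionError(f"{service} has no policy granting access to PII")
```

With this shape, stolen credentials for `support-dashboard` expose only masked addresses, and a service with no policy entry, such as an analytics job, gets a `PermissionError` instead of data.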

A Failed Experiment

Treating user PII as regular application data is a failed experiment. It turns the challenges of data privacy, such as governing access, complying with data residency requirements, and protecting user data through encryption and tokenization, into intractably complex problems.

For example, in the 2016 Uber data breach, hackers discovered credentials accidentally committed to GitHub. Those credentials gave them access to Uber’s network, which contained sensitive data hosted on AWS, resulting in PII from 600,000 drivers being compromised. There’s no reason for engineers to have credentials that give them that level of access. It’s a failure caused by not having the right mindset and the right architectural approach.

Protect to Earn their Respect

You can prevent these types of failures by shifting your thinking from a naive view — that all data is the same — to recognizing that PII should be treated differently than other data. You must recognize that PII needs to be isolated and protected. 

All businesses have a responsibility to their customers to understand this and act accordingly. Data privacy isn’t purely about rules and regulations. It’s also about respect for those who use your products. And the best way to respect your customers is to protect their data. To learn more about how Skyflow Data Privacy Vault can help you isolate and protect your customers’ data, contact us.
