Securing Your Databases Is Good, Securing Your Data Is Better

August 16, 2021

[Earth had] a problem, which was this: most of the people living on it were unhappy for pretty much all of the time. Many solutions were suggested for this problem, but most of these were largely concerned with the movements of small green pieces of paper, which is odd because on the whole it wasn’t the small green pieces of paper that were unhappy. — Hitchhiker's Guide to the Galaxy

You’ve heard about data breaches and what they do to company (and employee) fortunes, so you’re working hard to secure your database — upgrade, firewall, encryption, auditing, etc. Oh yes, and remember to change the default password. What about access control? Do you have the right policies? It feels like I'm forgetting something!

As a security professional, I like more security. You can, and should, nay, must, do all of the above; but not just for your databases. You also have to secure your backups, the servers that process this data, the data pipelines that move it around, the logs your applications generate (into which data can leak), and so on.

As a security professional, I like more job security. However, you don’t have to be a security professional to see that this is, at best, an indirect path to the most important thing that you were trying to secure — the data itself.

What is Tokenization?

When securing data, the first step is to encrypt it. If you’re following all the best practices — choosing the appropriate encryption algorithms, IV construction, chaining modes, key generation, key management, etc. — you’ve essentially replaced your sensitive data with ciphertext. Ciphertext is a bunch of bytes which are meaningless to anyone without access to decryption keys, and which leak very little information. This is a great start!

But, there is no lunch without small green pieces of paper. Apart from hiring a cryptography nerd, you now have folks complaining you broke their system — customer service was using the last four digits of social security numbers (SSNs) to validate users and now they don’t have that; the analytics stack was built on software that assumes the email field (which analytics may never read) will always look like an email and now they need to fix their stack. You can solve this by giving everyone access to the decryption keys, but if you do that, you’re not much better off than when you started.

The main point of the encryption-based approach — replacing the sensitive data with “something else” — is exactly right. What you need is to replace the sensitive data with a “something else” that solves these new problems, as well.

What you need is tokenization. With tokenization, just like with encryption, you rip out your sensitive data and replace it with a placeholder, called a token. Unlike with encryption, you don't have to worry about an “adaptive chosen ciphertext attack” (or some cryptographic attack not yet discovered): a good token generation scheme, for example one that draws tokens at random, leaves no mathematical relationship between a token and the original value for such attacks to exploit.

Tokenization is not encryption, which means that de-tokenization is not decryption! So, you don't have to hand over access to your keys to anyone who wants to work on the tokenized data. Two properties make this possible:

Format preservation: Your tokens can be “format preserving”, meaning they can look like email addresses, social security numbers, etc., so that old code that was used to seeing email addresses doesn’t crash.
Partial detokenization: You don’t have to give customer service access to the entire SSN just so that they can compare the last four digits. You can, given the right tokenization solution, make sure customer service can detokenize ONLY the last four digits. This allows you to follow the security design principle of “least privilege”, under which every process or user should be able to access only the information needed to do their job.
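To make these two properties concrete, here is a minimal sketch in Python. It is a toy, in-memory stand-in for a tokenization service, not Skyflow's API: the `ToyVault` class, its method names, and the `admin` and `support` roles are all hypothetical, invented for illustration. Tokens are drawn at random and keep the shape of the input, and the amount of data that comes back from detokenization depends on who is asking.

```python
import secrets
import string


class ToyVault:
    """Hypothetical in-memory tokenization service (illustration only)."""

    def __init__(self):
        self._store = {}  # token -> original value

    def _random_like(self, value):
        # Swap every digit for a random digit and every letter for a random
        # lowercase letter; separators such as '-', '@' and '.' stay in place.
        out = []
        for ch in value:
            if ch.isdigit():
                out.append(secrets.choice(string.digits))
            elif ch.isalpha():
                out.append(secrets.choice(string.ascii_lowercase))
            else:
                out.append(ch)
        return "".join(out)

    def tokenize(self, value):
        # Retry on the (unlikely) chance of a collision with an existing
        # token or with the original value itself.
        for _ in range(100):
            token = self._random_like(value)
            if token != value and token not in self._store:
                self._store[token] = value
                return token
        raise ValueError("could not generate a unique token for this value")

    def detokenize(self, token, role):
        value = self._store[token]
        if role == "admin":
            return value  # full value
        if role == "support":
            # Least privilege: reveal only the last four characters and
            # mask every digit that comes before them.
            head = "".join("X" if ch.isdigit() else ch for ch in value[:-4])
            return head + value[-4:]
        raise PermissionError(f"role {role!r} may not detokenize")


vault = ToyVault()
ssn_token = vault.tokenize("123-45-6789")
print(ssn_token)                               # random, but shaped like an SSN
print(vault.detokenize(ssn_token, "support"))  # XXX-XX-6789
```

Because the token has no mathematical relationship to the SSN, the only way back to the original is through the vault's lookup, and that lookup is where the access policy lives.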

In some cases, you can choose to have tokens that allow some computations (e.g., finding common users across different datasets) that might otherwise have required access to sensitive data. These tokens expose just enough information to be useful (e.g., whether or not two records refer to the same user). The key point is that you get to control the exact tradeoff between the security and usability of your data using simple token configuration, not complex cryptography. To learn more about how tokenization works at Skyflow, take a look at our documentation.
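As a rough illustration of tokens that permit this kind of computation, the sketch below uses consistent tokens: the same input always maps to the same random token, so two tokenized datasets can be compared without detokenizing either one. The `ConsistentTokenizer` class, the `tok_` prefix, and the sample email addresses are assumptions made up for this example; they do not describe how any particular product is configured.

```python
import secrets


class ConsistentTokenizer:
    """Hypothetical tokenizer that returns the same token for the same value."""

    def __init__(self):
        self._by_value = {}  # original value -> token
        self._by_token = {}  # token -> original value

    def tokenize(self, value):
        # Reuse the existing token if this value has been seen before;
        # otherwise mint a fresh random one.
        if value not in self._by_value:
            token = "tok_" + secrets.token_hex(8)
            self._by_value[value] = token
            self._by_token[token] = value
        return self._by_value[value]


def common_users(tokens_a, tokens_b):
    # Equality of tokens implies equality of the underlying values,
    # so a set intersection finds shared users without revealing them.
    return set(tokens_a) & set(tokens_b)


vault = ConsistentTokenizer()
crm = [vault.tokenize(e) for e in ["ann@example.com", "bob@example.com"]]
billing = [vault.tokenize(e) for e in ["bob@example.com", "cy@example.com"]]

shared = common_users(crm, billing)
print(len(shared))  # 1: one user appears in both datasets
```

The tradeoff is visible here: consistent tokens reveal that two records refer to the same user, which is exactly the information the analyst needs and nothing more. If even that is too much, minting a fresh token on every call removes the linkage at the cost of losing the join.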

This is really just scratching the surface. We haven’t talked about your other problems around compliance, data residency, governance, etc. We also have barely discussed the various kinds of tokens and their information-hiding properties.

If you’d like to dig deeper into the pros and cons of encryption vs. tokenization, I encourage you to check out our data vault white paper. If you want to know more about how to use a data vault to solve these problems, please contact us.
