Column encryption

solutions and ideas

Peter Eisentraut

peter@eisentraut.org
@petereisentraut@mastodon.social

https://www.enterprisedb.com/

Why encrypt?

Database encryption variants

Vendor terminology varies!

Standards and regulations

Client-side column-level encryption

          client                                      DB
+-------------------------+                      +-------------------+
|                         |                      | id| data          |
| INSERT         +--------| (1, '\xd3b07384d1')  | --+-------------  |
| VALUES         | libpq  |--------------------->| 1 | \xd3b07384d1  |
| (1, 'secret')  +--------|                      |   | ...           |
|                         |                      |                   |
+-------------------------+                      +-------------------+

Client-side column-level encryption

-- must use parameters!

INSERT INTO tbl VALUES (1, 'secret');     -- no, bad!
INSERT INTO tbl VALUES ($1, $2);          -- ok

SELECT * FROM tbl WHERE id = 1;           -- ok, not encrypted
SELECT * FROM tbl WHERE id = $1;          -- ok, not encrypted

-- lookup by encrypted fields?

SELECT * FROM tbl WHERE data = 'secret';  -- bad!
SELECT * FROM tbl WHERE data = $2;        -- maybe? (see later)
SELECT * FROM tbl WHERE substr(lower(data, ...));  -- probably not

Client-side column-level encryption

Client-side column-level encryption

Pro:

Con:

Solution: Client code only

from somewhere import encrypt

key = ...

conn = somedriver.connect(...)
cur = conn.cursor()
cur.execute("INSERT INTO t1 (name, creditcard) VALUES (%s, %s)",
            ('Jan', encrypt(key, '1234')))

cur.execute("SELECT creditcard FROM t1 WHERE name = %s", ('Jan',))
row = cur.fetchone()
print(decrypt(key, row[0]))

Client code — Pros and cons

Pros:

Cons:

Solution: pgcrypto (dubious!)

CREATE EXTENSION pgcrypto;

-- bad
INSERT INTO t1 (name, creditcard)
  VALUES ('Jan', encrypt($1, 'KEYHERE', 'aes-cbc'));

SELECT decrypt(creditcard, 'KEYHERE', 'aes-cbc')
  FROM t1 WHERE name = 'Jan';
-- slightly better?
INSERT INTO t1 (name, creditcard)
  VALUES ('Jan', encrypt($1, pg_read_file('keyfile'), 'aes-cbc'));

SELECT decrypt(creditcard, pg_read_file('keyfile'), 'aes-cbc')
  FROM t1 WHERE name = 'Jan';

pgcrypto — Pros and cons

Pros:

Cons:

Solution: framework/ORM

example Django

from django.db import models

from django_cryptography.fields import encrypt

class MyModel(models.Model):
    name = models.CharField(max_length=50)
    sensitive_data = encrypt(models.CharField(max_length=50))

Framework/ORM — Pros and cons

Pros:

Cons:

Interlude: Encryption methods

Encryption terminology

randomized:

encrypt(key, 'foo') = 'd3b073'
encrypt(key, 'foo') = '978e5b'

deterministic:

encrypt(key, 'foo') = 'd3b073'
encrypt(key, 'foo') = 'd3b073'

homomorphic:

encrypt(key, 'foo') = 'iga'
encrypt(key, 'bar') = 'Ogo'
encrypt(key, 'foobar') = 'igaOgo'

Encryption: padding

-- dubious
encrypt(key, 'none') =               '7e5b152f'
encrypt(key, 'some text') =          'b12ae13f59e1dc69b4'
encrypt(key, 'very long ... text') = '5372b3............66dd'
-- better
encrypt(key, 'none') =               '46366f48e18fe88d1b51affe5c5e5048'
encrypt(key, 'some text') =          '0bf4a9eda093afe048d11eec6edd7988'
encrypt(key, 'very long ... text') = 'bd8dc6c34d2235dfa21f355547b5b99d ....'

revealing length cannot be avoided completely

Encryption modes

mode standard padding nonce reuse res. integrity FIPS OpenSSL JDK
AES-CBC + HMAC no yes yes yes yes yes yes
AES-SIV yes no yes no no 3.0 no
AES-GCM yes no no yes yes yes yes
AES-GCM-SIV yes no yes yes no 3.2 no
ChaCha20-Poly1305 yes no no yes no yes yes
XChaCha20-Poly1305 no no no yes no no no
XChaCha20-SIV no no yes yes no no no
your own no ? ? yes? ? ? ?

Solution: pgsodium

CREATE EXTENSION pgsodium;

SELECT * FROM pgsodium.create_key();

CREATE TABLE private.users (
  id bigserial PRIMARY KEY,
  secret text
);

SECURITY LABEL FOR pgsodium ON COLUMN private.users.secret
  IS 'ENCRYPT WITH KEY ID dfc44293-fa78-4a1a-9ef9-7e600e63e101';

pgsodium — Pros and cons

Pros:

Cons:

Idea: built-in

CREATE COLUMN MASTER KEY cmk1;

CREATE COLUMN ENCRYPTION KEY cek1
  WITH VALUES (column_master_key = cmk1,
               algorithm = 'RSAES_OAEP_SHA_1',
               encrypted_value = '\x53cd6a6c91...');

CREATE TABLE employees (
  id int PRIMARY KEY,
  name text NOT NULL,
  ...
  ssn text ENCRYPTED WITH (column_encryption_key = cek1,
                           encryption_type = deterministic)
);

Built-in — Pros and cons

Pros:

Cons:

Key management

          client                                      DB
+-------------------------+                      +----------+
|                +--------| (1, '\xd3b07384d1')  |          |
| (1, 'secret')  | libpq  |--------------------->|          |
|                +--------|                      |          |
+-------------------------+                      +----------+
                     ^
                     |
                 +--------+
                 |  KMS   |
                 +--------+

Key management

Other feature ideas

Other caveats

Summary