If you are a builder, it’s an amazing time to be creating web applications, there are so many one-click hosting platforms - Vercel, Netlify, Render, Fly, and about a dozen others that I’ve forgotten about. Heroku pioneered the “global deployment with one cli command” for Rails apps all the way back in 2007, but platforms today can support multiple languages/frameworks, native and container runtimes, and hook directly into SCM systems like Github to eliminate the need for a cli. (In fact, if you are reading this on JumpWire’s blog, you are looking at Netlify, and we release new versions of our site simply by merging PRs into a Github repo.)

However serverless SQL databases, particularly PostgreSQL, have been lagging a bit, at least in a pure “serverless” configuration. Sure, Heroku and AWS provide database instances on demand, but they are “always on” processes with time-based pricing instead of utilization-based pricing. (DynamoDB fits the utilization-based model, but good luck trying to use SQL!)

This is in spite of SQL RDBMS still being the persistence layer of choice for most frameworks, and most developers! We love to model data relationally, it’s easy to reason about, and plenty of people know how to use SQL.

But now this is starting to change! Startups like Neon.tech have launched in the last year to fill this gap, providing serverless PostgreSQL to the masses. Vercel also recently announced that they are partnering with Neon to offer turnkey PG databases to apps running on the Vercel platform.

Servers are dead
Run my own database servers? naaaah

What’s the catch?

One immediate concern of this architecture marvel is with security, as your database must be connected directly to the internet. This is by design, it’s how you as the customer is going to connect to the db. But it comes with immense risk - if by mistake your credentials are leaked (who hasn’t committed passwords into code before?), it’s trivial for someone to run off with all of your data! Unfortunately, these sort of leaks aren’t so uncommon, here’s one from May 2022 - https://www.scmagazine.com/analysis/breach/digital-health-company-mynurse-reports-data-access-will-stop-operations

This is in contrast to the “always on” database server, which can be deployed into a private VPC. A VPC creates a network segmentation from the Internet, offering another layer of security defense around your data.

So what’s to be done, do we need to manage our own database infrastructure just to protect the data inside?

Let’s use JumpWire, this is a great use-case for field encryption!

Setup

In this example, our app collects customer personal information like names, addresses, and social security numbers. We’re going to deploy an API on Fly.io using express.js, and it’s backed by a database running on Neon.tech.

To get started, head to Neon and set up an account. They have a generous free tier (at the time of writing) which is great for small projects. When you log into the dashboard, you’ll see connection details for your database instance.

Neon dashboard
Neon dashboard with connection info

You can even connect a psql client using Neon’s “passwordless” connections, just pass pg.neon.tech as the host -

oh-hello:~ $ psql -h pg.neon.tech
NOTICE:  Welcome to Neon!
Authenticate by visiting:
    https://console.neon.tech/psql_session/178...

NOTICE:  Connecting to database.
psql (13.10 (Ubuntu 13.10-1.pgdg20.04+1), server 15.2)
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, bits: 256, compression: off)
Type "help" for help.

ryan=>

The schema for this example is very simple, Accounts and AccountOwners. Here’s the initial schema file in sql -

-- ./schema.sql
SET client_encoding = 'UTF8';
SET time zone 'UTC';
SET standard_conforming_strings = off;
SET client_min_messages = warning;
SET escape_string_warning = off;
SET search_path = public, pg_catalog;

CREATE TABLE IF NOT EXISTS accounts (
  id uuid PRIMARY KEY NOT NULL,
  account_owner_id TEXT NOT NULL,
  account_number TEXT NOT NULL,
  routing_number TEXT NOT NULL,
  account_status TEXT NOT NULL,
  name TEXT NOT NULL,
  created_at timestamp with time zone DEFAULT CURRENT_TIMESTAMP,
  updated_at timestamp with time zone DEFAULT CURRENT_TIMESTAMP
);

CREATE TABLE IF NOT EXISTS account_owners (
  id uuid PRIMARY KEY NOT NULL,
  first_name TEXT NOT NULL,
  last_name TEXT NOT NULL,
  email TEXT NOT NULL,
  ssn TEXT NOT NULL,
  cip_status TEXT NOT NULL,
  date_of_birth DATE NOT NULL,
  gender TEXT NOT NULL,
  created_at timestamp with time zone DEFAULT CURRENT_TIMESTAMP,
  updated_at timestamp with time zone DEFAULT CURRENT_TIMESTAMP
);

We can create the schema from the cli -

ryan=> i ./schema.sql
SET
SET
SET
SET
SET
SET
CREATE TABLE
CREATE TABLE

The API using Neon

Our express API is also pretty straight-forward, there’s an endpoint to create a new Account with an AccountOwner. It uses Sequelize to define the models, and connects to the Neon database using the connection string from the Neon dashboard. Note: for Sequelize to use SSL when connecting to Neon, we have to specify the attribute in the connection string -

//./server.js
import express from 'express'
import { Sequelize } from 'sequelize'
// factory method for defining models
import loadModels from './models.js'
// ... a bunch more imports

const sequelize = new Sequelize(
  'postgres://ryan:..........@ep-spring-sea-761357.us-east-2.aws.neon.tech/neondb?ssl=true'
)
const { Account, AccountOwner } = loadModels(sequelize)
 
const app = express()
const port = process.env.PORT || 3000
const host = process.env.HOST || 'http://localhost:3000'
// ... more app setup

app.post("/account-owners", async (req, res) => {
  console.log(
    `Creating new account: ${Object.entries(req.body).map(([k, v]) => `${k}=${v}`).join(", ")}`
  );
  const accountOwner = await AccountOwner.create(req.body);
  const account = await Account.create({
    accountOwnerId: accountOwner.id,
    routingNumber: "00100001",
    accountNumber: generateAccountNumber(),
  });
  res.send({ accountOwner, account });
});

// ... more endpoints

I spy a 🐛

Can you spot the security bug in the code above? It’s logging the request body, which contains customer data! This is very risky, as it’s likely that server logs are getting shipped to other systems which might not have strict access controls. But it’s any easy bug to miss, just by looking at this code a reviewer might not be aware that the full request parameters contains sensitive info.

JumpWire is going to rescue us in this situation! Using JumpWire’s HTTP proxy, we can clone our API by uploading an OpenAPI spec file. Any JSON property that contains PII as part of an endpoint request or response is labeled appropriately.

POST request field labels
POST request fields with PII labels

We also label fields in the GET response. This ensures the encryption is reversed (decrypted) when we need to fetch details on the AccountOwner.

GET response field labels
GET response fields with PII labels

You might notice one difference though, we’ve removed the label for the social security number (ssn) on the GET response. Why do this? For extra data protection, we never want to show a ssn in our application. It was initially collected for identity verification, and we are storing it for audit purposes, but it should never be exposed through our APIs.

We are using a policy to encrypt/decrypt PII properties as they pass through the proxy. This means that data is protected with AES encryption before it is stored in Neon. In fact, here’s what the real data looks like:

$ curl "https://backend-demo.fly.io/1658d1bf-7cde-46af-8f62-f0611d431cc2/account-owners/1a451a3a-c9bc-4e08-8b58-6fc32f83b467" -H 'Authorization: Bearer SFMyXXX.g2g...ZsLXXX' | jq
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   770  100   770    0     0   4638      0 --:--:-- --:--:-- --:--:--  4638
{
  "cipStatus": "UNVERIFIED",
  "createdAt": "2023-04-13T02:06:56.348Z",
  "dateOfBirth": "1999-01-01T00:00:00.000Z",
  "email": "jumpwire_AQNwaWkBK0FFUy5HQ00uVjEuMURCNEI2NUY3Q0ZDMzE1QjZCNTdGOEZGRDg0MTUxNTnC5lb6QSfbgEKQoXehS7ESgIh4Ly8+GKheMHSeDWq5Uc8g603MOZKjtj54RCUIL+o=",
  "firstName": "jumpwire_AQNwaWkBK0FFUy5HQ00uVjEuMURCNEI2NUY3Q0ZDMzE1QjZCNTdGOEZGRDg0MTUxNTm3dBVWg/owZi9Nf27lqWcYo3OtQaBye5lhcG096QrhmA==",
  "gender": "Male",
  "id": "1a451a3a-c9bc-4e08-8b58-6fc32f83b467",
  "lastName": "jumpwire_AQNwaWkBK0FFUy5HQ00uVjEuMURCNEI2NUY3Q0ZDMzE1QjZCNTdGOEZGRDg0MTUxNTmnQcPx4edyY+GHdU+sxfcl/EBc00eqQCDXY/CKLdk1Y/Y=",
  "ssn": "jumpwire_AQZzZWNyZXQBK0FFUy5HQ00uVjEuMURCNEI2NUY3Q0ZDMzE1QjZCNTdGOEZGRDg0MTUxNTn9Cs7xgPa0m5uVl1OXA52G5vh5uXfezZxIeGq2WAseX5aPjbxN8yQ=",
  "updatedAt": "2023-04-13T02:06:56.348Z"
}

Of course there are instances where we need to show the original data, like in the account details screen below. The HTTP proxy will reverse the encryption for the GET request, so the app receives plain text for display.

Data decrypted for display
Fields are decrypted before display in an app

Secure data flows

Putting it all together, the diagram below shows how the data flows through our application backend. Because the encryption/decryption of data is happening on the “edge” of our API, data stays secure through its entire lifecycle in our backend and database.

We’ve instantly upgraded the security of our serverless application, without running our own database server or modifying any code in our backend API!

Data flow
Instantly adds a layer of security

Of course, there are situations where we do need our backend API to operate on the plain text. For example, if we want to send a programmatic email to welcome our new customer, we’ll need their email address and name decrypted. Don’t worry, JumpWire can handle these use-cases as well, which we’ll show in a follow up post.

  • Ryan