Go to menu

DASCore, a library for databases

January 2021

At my secondary school, a part of programming classes was doing larger projects over the whole school year. In the third year, this took the form of a group project. We decided to build a new system for managing the school canteen and I took this as an opportunity to turn what would normally be a pretty boring CRUD system into something more interesting by developing custom database abstractions and a new web frontend toolkit. In the end, we didn’t manage to build a commercial-grade system, but we did get a significant amount of experience with actually building a usable system on these abstractions.

The ideas and paradigms used in DASCore, the database abstraction layer, are perhaps the most interesting.

Direct access and security

DASCore is a loosely-defined collection of database usage paradigms and PostgreSQL scripts. At the core is the notion that the database should be directly accessible and should therefore contain everything needed to ensure security and consistency.

“Direct access” in this case means actually creating a network service that allows users (or software running on the users’ computers) to run SQL queries on the database server. This by itself goes against most database best practices and necessitates a lot of further changes to the approach of building a database.

Security is the biggest immediate problem with this design. DASCore solves it with its own implementation of row-level security, which can be described as an implicit WHERE clause describing which rows the user is allowed to see/modify. This not only allows secure direct database access, but also moves the definitions of permissions closer to the data definitions, allowing both to be tackled at the same time. This isn’t a big deal if all you have are two read/write endpoints per table, but if you have a complex API with a large number of endpoints per table, you’ll need to implement the security model consistently, which can be problematic even if the endpoints are implemented by the same person at the same time.

Traditional RLS implementations use database users for end-user identities, which is manageable, but can be bothersome, especially if users are stored in an external system. I’ve seen approaches with signed user tokens, but I find these too heavy, since they can potentially compute cryptographic operations many times per query. DASCore uses a simpler design that relies on PL/Perl, (ab)using the fact that users can’t create trusted functions and that global variables are stored per-session.

When the application server identifies a user, it calls session_user_set() with a user ID and a secret logout key. Any SQL code can then call session_user_get(), which returns the previously set user ID. Until the user is logged out, any calls to session_user_set() fail, so that users couldn’t impersonate others. After the user’s commands are executed, the server calls session_logout() with the same logout key, which returns the session to the initial state.

This “secondary login” is relatively lightweight (the cost is a perl interpreter for every session, which could be replaced with a C extension) and is compatible with connection pooling.

Time travel

Another aspect of DASCore is its integration with the temporal_tables PostgreSQL extension, which basically provides “versioned” tables, with triggers saving every change to a dedicated history table. DASCore builds upon this by hiding both the current table and history table behind a view, which shows either the contents of the current table, or the state of the table at a globally specified “timepoint”. This allows the user to see the state of the whole database at a specific point without having to write specific queries targeting the history table.

This same view also implements the row-level permission checks from above.

Implementation and helper functions

Doing all this manually for each table would result in a ridiculous amount of code. DASCore deals with this problem by providing a set of functions that create the history table and create the view while setting permissions.

This is how a sample table from our dining system is created:

First we need to create the current and history tables in a migration:

CREATE TABLE IF NOT EXISTS diner_transactions_current (
	id serial PRIMARY KEY,
	id_diner integer NOT NULL REFERENCES diners_current,
	-- Positive for money added to the diner's balance, negative for expenses
	-- taken from the balance
	amount decimal NOT NULL,
	-- Not null if a food order caused the transaction
	cause_food_day date,
	-- Transactions caused by food orders should always be expenses or free
	CHECK (NOT (cause_food_day IS NOT NULL AND (amount > 0)))
);
SELECT version_table('diner_transactions');

Then, in a separate file, the view is created with the permissions checks:

SELECT dascore_setup_table('diner_transactions',
	select_perm := $$
		perm('diner_transactions.select.all')
		OR (perm('diner_transactions.select.self')
			AND ROW.id_diner = session_person_get())
	$$,
	modify_perm := $$
		perm('diner_transactions.modify.all')
	$$);

perm() simply checks if the current user has the given permission. The advantage of separating these two operations is that permissions can be changed without having to create a migration.

A sample session for user 1 could look something like this:

-- Added by the application server
SELECT session_user_set(1, 'SecretKey!');
-- Will only sum the user's own transactions unless they have
-- 'diner_transactions.select.all' permission
SELECT SUM(amount) FROM diner_transactions WHERE amount > 0;
-- Added by the application server
SELECT session_logout('SecretKey!');

Where to next?

I think the ideas of DASCore are worth exploring and eventually using in production and I’m glad that I got to try them out on something more than a trivial example project. That said, DASCore itself has quite a few rough corners in its current state. The permission system is powerful, but maybe too powerful. Since it’s not reasonably introspectible, the frontend has to reimplement much of the rules so that users’ actions don’t end with hard-to-trace database errors. This isn’t a security problem, but ease of development suffers. Furthermore, the way temporal tables are handled doesn’t allow for the creation of a simple changelist.

I think that these faults can be fixed while maintaining the basic spirit of a “library for databases”, I’ve already got some specific ideas that I hope to put to the test in the future.