#postgres

bkoehn@diaspora.koehn.com

Creative ways to optimize for cloud: offsite backups edition

Backblaze, like many providers, charges not only for storage, but also for API usage. In my case that meant that I was spending more on API usage than I was on storage (3-4x).

I spent some time debugging the issue and determined that it was #wal-g, the software I use for streaming #postgres backups that was causing the issue, running 20K+ ListObjects calls per day.

So I configured postgres to backup to a bucket on my local #S3 server (#Minio), and then configured Minio to sync that bucket to #backblaze. That incurs exactly zero charged API calls, and the problem is solved.

What I thought was the problem (#kopia, which I use for filesystem backups) is actually quite efficient wrt/charged API calls.

bkoehn@diaspora.koehn.com

I created an iSCSI volume on my #MyCloud #NAS to hold the #postgres data that the pod uses. ISCSI is fairly fast (85MB/s writes), and more importantly it’s not stored locally on any of the #k3s nodes, so the server can move to any node and still access the data on the volume. That means that I can drain that node, upgrade it, and the downtime is limited to a few seconds.

Basically I’m creating a redundant array of inexpensive compute (I’m running on three $150 mini PCs) to go with my redundant array of inexpensive disks. If a single node or disk fails, the other nodes/disks will pick up the load until the bad unit is replaced.

bkoehn@diaspora.koehn.com

I use a single #Postgres instance in my #k8s clusters, with separate users, passwords, and databases in that instance for each application. It’s simpler to manage and reduces complexity, and my databases don’t benefit from being spread across nodes (they’re small and place few demands on the system). So the same instance hosts the databases for synapse and diaspora, for example.

Today I discovered that objects in the public schema of each database (which is where they’re created by default) are accessible to all users on the database, unless they’re specifically revoked. So the system wasn’t as secure as I thought it was.

You can change this behavior with REVOKE ALL ON DATABASE <name> FROM PUBLIC;. Then only users granted access (or the owner) will have access. You can also create objects in non-public schemas, but that can be challenging.

One more thing to automate.

bkoehn@diaspora.koehn.com

I started messing about with #postgres FTS in #Diaspora.

First, create a search column:

alter table posts add column fts_text_en tsvector generated always as (to_tsvector ('english', text )) stored;

Then, create a GIN a index on the new column:

create index posts_fts_text_en_idx on posts using gin(fts_text_en);

Next, query the column:

select id from posts where fts_text_en @@ websearch_to_tsquery('english', 'Linux');

Fast, simple. Prolly should integrate it into the search bar.