Three backup methods. Pick the right one.
PostgreSQL gives you three options. Each solves a different problem.
1. pg_dump – The snapshot
# Full database dump, compressed
pg_dump -Fc -Z 6 mydb > backup_$(date +%Y%m%d).dump
# Restore
pg_restore -d mydb backup_20260210.dump
Good for: Small databases (<10 GB), simple projects, quick exports.
Bad for: Large databases (slow), point-in-time recovery (impossible).
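Two pg_restore flags worth knowing with the custom format above (a quick sketch; the filename matches the dump created earlier):
# Peek inside a dump without restoring it
pg_restore -l backup_20260210.dump | head
# Restore with 4 parallel workers (custom/directory formats only)
pg_restore -d mydb -j 4 backup_20260210.dump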
2. pg_basebackup – The full copy
# Complete cluster backup
pg_basebackup -D /backups/base \
  -Ft -z -P -v \
  -U replicator
Good for: Full server restore, setting up replicas.
Bad for: Frequent backups (copies everything every time).
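The command above assumes a dedicated replication role already exists. A minimal sketch (the password is a placeholder, and pg_hba.conf needs a matching replication entry):
-- Run once as a superuser; pg_basebackup connects as this role
CREATE ROLE replicator WITH REPLICATION LOGIN PASSWORD 'change-me';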
3. WAL archiving – The timeline
-- Enable it (ALTER SYSTEM writes postgresql.auto.conf)
ALTER SYSTEM SET wal_level = replica;
ALTER SYSTEM SET archive_mode = on;
ALTER SYSTEM SET archive_command =
  'cp %p /backups/wal/%f';
ALTER SYSTEM SET archive_timeout = 300;
SELECT pg_reload_conf();
-- Note: wal_level and archive_mode only take effect after a server restart;
-- archive_command and archive_timeout apply on reload.
Good for: Point-in-time recovery, minimal data loss, large databases.
Bad for: Using it alone – you still need a base backup to replay the WAL onto. Combined with one, this is what you want.
The solo dev answer:
Database < 5 GB?
→ pg_dump daily + WAL archiving
Database 5-50 GB?
→ pg_basebackup weekly + WAL archiving
Database > 50 GB?
→ pg_basebackup weekly + WAL archiving + incremental strategy
WAL archiving is the common thread. It gives you something the others can't: go back to any point in time.
Accidentally deleted production data at 2:47 PM?
Restore base backup → replay WAL → stop at 2:46 PM
Everything back. Like it never happened.
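Roughly what that recovery looks like on PostgreSQL 12+ – a hedged sketch, assuming the WAL archive from above and a data directory at /var/lib/postgresql/data:
# 1. Restore the base backup into the data directory, then:
touch /var/lib/postgresql/data/recovery.signal
# 2. In postgresql.conf (or via ALTER SYSTEM):
#    restore_command = 'cp /backups/wal/%f %p'
#    recovery_target_time = '2026-02-10 14:46:00'
#    recovery_target_action = 'promote'
# 3. Start the server; it replays WAL up to the target time, then promotes.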
Tomorrow: the complete system that ties all three together.
Copy-paste ready. 3⭐.
@postgres
Your backup is worthless until you test the restore.
The number one backup mistake: assuming it works.
Here's a dead simple restore test you can run right now:
#!/bin/bash
# test_restore.sh β Run this weekly
TEST_DB="restore_test_$(date +%s)"
BACKUP_FILE=$(ls -t /backups/dumps/*.dump | head -1)
echo "Testing restore of: $BACKUP_FILE"
# Create temporary database
createdb $TEST_DB
# Restore into it
pg_restore -d $TEST_DB $BACKUP_FILE 2>/dev/null
# Verify: count tables
TABLE_COUNT=$(psql -t -A -d $TEST_DB \
-c "SELECT count(*) FROM information_schema.tables
WHERE table_schema = 'public'")
# Verify: count rows in your main table
ROW_COUNT=$(psql -t -A -d $TEST_DB \
-c "SELECT count(*) FROM users" 2>/dev/null || echo "0")
# Cleanup
dropdb $TEST_DB
# Report
echo "Tables restored: $TABLE_COUNT"
echo "Users found: $ROW_COUNT"
if [ "$TABLE_COUNT" -gt 0 ] && [ "$ROW_COUNT" -gt 0 ]; then
echo "✅ Backup is valid"
else
echo "❌ BACKUP FAILED – investigate immediately"
# Add your alert here: curl webhook, send email, etc.
fi
Schedule it:
# Every Sunday at 3 AM (append to the existing crontab instead of replacing it)
(crontab -l 2>/dev/null; echo "0 3 * * 0 /usr/local/bin/test_restore.sh \
>> /var/log/backup_test.log 2>&1") | crontab -
Three things to check when testing:
1. Can you restore at all?
Sounds obvious. Many dumps silently fail (permissions, disk space, version mismatch).
2. Is the data complete?
Row counts should match production within the backup window.
3. How long does restore take?
If your 50 GB database takes 4 hours to restore, that's your minimum downtime. Plan accordingly.
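For #3, a small addition to the test script gives you the number – a sketch using bash's built-in SECONDS counter around the existing restore step:
# Replace the pg_restore step in test_restore.sh with:
START=$SECONDS
pg_restore -d $TEST_DB $BACKUP_FILE 2>/dev/null
echo "Restore took $((SECONDS - START)) seconds"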
The script above takes 2 minutes to set up. Runs automatically. Tells you immediately if something breaks.
Two minutes now vs finding out during an emergency at 3 AM.
When was the last time you tested a restore? 👇
@postgres
Week 6 done. Your backups should be smarter now.
Quick recap:
✅ Monday – The honest truth about backup failures
✅ Tuesday – Three methods: pg_dump, pg_basebackup, WAL archiving
✅ Wednesday – Complete backup system (3⭐)
✅ Thursday – Restore testing script
The one thing to take away this week: backups without tested restores are just wasted disk space.
NEXT WEEK: MULTI-TENANCY PATTERNS
Week 7 tackles the question every SaaS builder faces: how do you serve multiple customers from one database?
Shared tables with row-level security
Schema-per-tenant
When to pick which
The copy-paste RLS setup
One of the most requested topics. See you Monday.
What did you implement this week? 👇
@postgres
Every SaaS has the same question: how do you serve 100 customers from one database?
Three approaches. Each with tradeoffs.
1. Shared tables + Row-Level Security
One set of tables. Every row has a tenant_id.
PostgreSQL enforces who sees what.
2. Schema-per-tenant
Same database, separate schema for each customer.
tenant_alice.users, tenant_bob.users
3. Database-per-tenant
Fully isolated. One database per customer.
Maximum separation, maximum overhead.
Most solo devs should start with #1.
It's the simplest, cheapest, and scales surprisingly far.
This week:
Tuesday – Shared tables with RLS (the setup)
Wednesday – Complete multi-tenant system (3⭐)
Thursday – When to switch approaches
Friday – Check-in
If you're building a SaaS, this is your week.
Which approach are you using right now? 👇
@postgres
Row-Level Security in 10 minutes. Your tenants can never see each other's data.
The idea: PostgreSQL checks every query automatically. No WHERE tenant_id = ? scattered through your code.
Step 1: Add tenant_id to every table
ALTER TABLE users ADD COLUMN tenant_id UUID NOT NULL;
ALTER TABLE orders ADD COLUMN tenant_id UUID NOT NULL;
ALTER TABLE products ADD COLUMN tenant_id UUID NOT NULL;
-- (On tables that already contain rows, add the column with a DEFAULT or
--  backfill it first, then set NOT NULL.)
CREATE INDEX idx_users_tenant ON users(tenant_id);
CREATE INDEX idx_orders_tenant ON orders(tenant_id);
CREATE INDEX idx_products_tenant ON products(tenant_id);
Step 2: Create an app user (not the superuser)
CREATE USER app_user WITH PASSWORD 'strong_password';
GRANT USAGE ON SCHEMA public TO app_user;
GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA public TO app_user;
ALTER DEFAULT PRIVILEGES IN SCHEMA public
GRANT SELECT, INSERT, UPDATE, DELETE ON TABLES TO app_user;
Step 3: Enable RLS
ALTER TABLE users ENABLE ROW LEVEL SECURITY;
ALTER TABLE orders ENABLE ROW LEVEL SECURITY;
ALTER TABLE products ENABLE ROW LEVEL SECURITY;
Step 4: Create policies
CREATE POLICY tenant_isolation ON users
USING (tenant_id = current_setting('app.tenant_id')::UUID);
CREATE POLICY tenant_isolation ON orders
USING (tenant_id = current_setting('app.tenant_id')::UUID);
CREATE POLICY tenant_isolation ON products
USING (tenant_id = current_setting('app.tenant_id')::UUID);
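Two refinements worth making once this works – a hedged sketch: WITH CHECK also blocks inserts and updates that claim another tenant's tenant_id, and the two-argument current_setting() returns NULL instead of erroring when the setting isn't set:
-- Drop-in replacement for the users policy; repeat for orders and products
DROP POLICY tenant_isolation ON users;
CREATE POLICY tenant_isolation ON users
USING (tenant_id = current_setting('app.tenant_id', true)::UUID)
WITH CHECK (tenant_id = current_setting('app.tenant_id', true)::UUID);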
Step 5: Set tenant in your app (every request)
-- At the start of each API request, inside that request's transaction
-- (SET LOCAL only lasts until COMMIT/ROLLBACK):
SET LOCAL app.tenant_id = 'uuid-of-current-tenant';
-- Now every query is automatically filtered:
SELECT * FROM users;
-- Only returns rows where tenant_id matches. Automatically.
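A quick way to convince yourself the isolation holds (the UUID is a placeholder):
-- Connect as app_user, not the table owner
BEGIN;
SET LOCAL app.tenant_id = '11111111-1111-1111-1111-111111111111';
SELECT count(*) FROM users;  -- only that tenant's rows come back
ROLLBACK;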
That's it. No middleware. No query wrappers. PostgreSQL handles it at the database level.
Tomorrow: the complete system with signup, onboarding, admin access, cross-tenant queries, and all the edge cases. 3⭐.
@postgres
Shared tables vs schema-per-tenant vs database-per-tenant. When to switch.
Here's the honest breakdown.
SHARED TABLES + RLS (start here)
✅ One schema, one migration, one backup
✅ Works with connection pooling out of the box
✅ Scales to thousands of tenants easily
✅ Simplest to build and maintain
❌ Noisy neighbor risk (one tenant's huge query slows everyone)
❌ Tenant data export is a filtered query, not a dump
Best for: Most SaaS apps. 1 to 10,000 tenants.
SCHEMA-PER-TENANT
✅ Stronger isolation (separate tables)
✅ Per-tenant backup/restore possible
✅ Can customize schema per tenant
❌ Migrations run N times (once per tenant)
❌ Connection pooling gets complicated
❌ 500+ schemas starts feeling heavy
Best for: Apps where tenants need different configurations. Regulated industries where you need to prove data separation.
DATABASE-PER-TENANT
✅ Maximum isolation
✅ Independent backup, restore, scaling
✅ Can run different PostgreSQL versions
❌ Connection management nightmare
❌ $20-50/month per tenant for managed DBs
❌ Migrations across 200 databases = pain
Best for: Enterprise customers who contractually require separate databases. Very few solo devs need this.
THE DECISION TREE:
Building your first SaaS?
→ Shared tables + RLS. Don't overthink it.
Have a paying customer demanding isolation?
→ Schema-per-tenant for that one customer.
Selling to banks/hospitals with compliance requirements?
→ Database-per-tenant, but charge accordingly.
ONE TRICK: You can mix approaches.
Most tenants on shared tables.
Premium/enterprise tenants on dedicated schemas.
Same app code. RLS handles shared. search_path handles schema.
-- For shared tenants:
SET LOCAL app.tenant_id = 'uuid';
-- For premium tenants:
SET search_path TO tenant_acme, public;
Start simple. Upgrade individual tenants when they pay for it.
What's your current setup? Thinking about changing? 👇
@postgres
Week 7 done. Multi-tenancy demystified.
This week:
✅ Monday – Three approaches compared
✅ Tuesday – RLS setup in 10 minutes
✅ Wednesday – Complete multi-tenant system (3⭐)
✅ Thursday – When to switch approaches
The takeaway: start with shared tables + RLS. Upgrade individual tenants when they pay enough to justify it.
---
RUNNING TALLY
Week 1 | Auth | 3⭐ | 1 ⭐
Week 2 | Jobs | 4⭐ | 0
Week 3 | Performance | 3⭐ | ?
Week 4 | Search | 3⭐ | ?
Week 5 | Real-time | 3⭐ | ?
Week 6 | Backups | 3⭐ | ?
Week 7 | Multi-tenancy | 3⭐ | ?
---
NEXT WEEK: YOU DECIDE
We've covered a lot of ground. What do you need most?
- Caching with PostgreSQL (replace Redis for sessions, app cache)
- File storage & uploads (store and serve files without S3)
- Database migrations done right (zero-downtime, rollback-safe)
- Monitoring & observability (know what your database is doing)
Vote below 👇
See you Monday with whatever wins.
@postgres
You don't need S3 for file uploads. PostgreSQL handles it.
Controversial? Maybe. But hear me out.
If your app has:
- User avatars
- PDF invoices
- CSV exports
- Document attachments
- Images under 10 MB
You don't need a separate file storage service.
PostgreSQL has a built-in data type called bytea. It stores binary data directly in your database. And a feature called Large Objects for bigger files.
Why this matters for solo devs:
S3 + CloudFront: $5-25/month + complexity
Cloudinary: $0-89/month + API limits
DigitalOcean Spaces: $5/month minimum
PostgreSQL bytea: $0 extra (already in your database)
One less service. One less API key. One less thing that breaks at 2 AM.
This week:
Tuesday – How bytea and Large Objects work
Wednesday – Complete file system (3⭐)
Thursday – When PostgreSQL storage isn't enough
Friday – Check-in
Let's build a file system that lives next to your data. Where it belongs.
How do you handle file uploads today? 👇
@postgres
Two ways to store files in PostgreSQL. Here's when to use each.
METHOD 1: bytea (Binary Data)
Store files directly in a column. Simple. Works for files up to ~50 MB.
-- Create a files table
CREATE TABLE files (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
tenant_id UUID NOT NULL,
filename TEXT NOT NULL,
mime_type TEXT NOT NULL,
size_bytes BIGINT NOT NULL,
data BYTEA NOT NULL,
uploaded_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE INDEX idx_files_tenant ON files(tenant_id);
-- Insert a file (from your app)
INSERT INTO files (tenant_id, filename, mime_type, size_bytes, data)
VALUES (
'tenant-uuid',
'invoice-001.pdf',
'application/pdf',
48230,
'\x255044462d312e34...' -- binary content
);
-- Retrieve it
SELECT filename, mime_type, data
FROM files WHERE id = 'file-uuid';
Pros: Simple. Backed up with your database. RLS works on files too.
Cons: Large files bloat your database. TOAST compression helps but has limits.
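One habit keeps bytea tables pleasant: don't pull the data column unless you're actually serving the file, and read big values in chunks – a sketch:
-- List files without dragging binary content over the wire
SELECT id, filename, mime_type, size_bytes
FROM files WHERE tenant_id = 'tenant-uuid';
-- Serve a file in 1 MB chunks (substring works on bytea, offset is 1-based)
SELECT substring(data FROM 1 FOR 1048576)
FROM files WHERE id = 'file-uuid';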
METHOD 2: Large Objects
PostgreSQL's built-in file system. Better for files over 50 MB.
-- Store a large file (lo_import reads from the *server's* filesystem and
-- needs superuser or the pg_read_server_files role; app code normally streams
-- through the driver's large-object API or psql's \lo_import instead)
SELECT lo_import('/path/to/bigfile.zip');
-- Returns an OID (object ID)
-- Track it in a table
CREATE TABLE large_files (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
tenant_id UUID NOT NULL,
filename TEXT NOT NULL,
mime_type TEXT NOT NULL,
size_bytes BIGINT NOT NULL,
loid OID NOT NULL,
uploaded_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
-- Retrieve
SELECT lo_export(loid, '/tmp/download.zip')
FROM large_files WHERE id = 'file-uuid';
Pros: Handles files up to 4 TB. Streaming support.
Cons: Slightly more complex API. Cleanup needs lo_unlink().
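The cleanup gotcha: deleting the tracking row does not delete the large object itself. A hedged sketch of a trigger that unlinks it for you (the contrib tool vacuumlo can also sweep orphans):
CREATE FUNCTION delete_large_object() RETURNS trigger AS $$
BEGIN
  PERFORM lo_unlink(OLD.loid);  -- remove the underlying large object
  RETURN OLD;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER large_files_cleanup
AFTER DELETE ON large_files
FOR EACH ROW EXECUTE FUNCTION delete_large_object();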
THE SIMPLE ANSWER:
Files < 10 MB? → bytea (avatars, thumbnails, small docs)
Files 10-100 MB? → Large Objects or consider external storage
Files > 100 MB? → External storage (S3, disk) with URL in DB
Most SaaS apps deal with files under 10 MB. bytea covers it.
Tomorrow: the complete file system with upload API, download endpoints, image resizing, and storage quotas. 3⭐.
@postgres
⚠️ When NOT to store files in PostgreSQL. Honest take.
PostgreSQL file storage works great for most solo dev apps. But it's not always the right call. Here's where the line is.
KEEP FILES IN POSTGRESQL WHEN:
✅ Files are mostly small (<10 MB each)
Avatars, PDFs, CSVs, small images
✅ Files are tightly coupled to data
Invoice PDF belongs to invoice row
Deleting a user should delete their files
✅ You need transactional consistency
File upload + database insert succeed or fail together
No orphaned files, no missing references
✅ Your total file storage is under 50 GB
Fits comfortably on a standard VPS
✅ You want one backup for everything
Database backup = data backup + file backup
MOVE FILES OUT WHEN:
❌ You're serving files to many users simultaneously
PostgreSQL connections are expensive for streaming
A CDN serves static files much more efficiently
❌ Files are large (video, high-res images, datasets)
100 MB+ files bloat your database and slow backups
❌ Total storage exceeds 100 GB
Database size affects backup time and restore speed
❌ You need edge caching / global CDN
PostgreSQL is one server, not a global network
THE HYBRID APPROACH (best of both):
-- Store metadata in PostgreSQL
CREATE TABLE files (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
tenant_id UUID NOT NULL,
filename TEXT NOT NULL,
mime_type TEXT NOT NULL,
size_bytes BIGINT NOT NULL,
storage_type TEXT NOT NULL DEFAULT 'db',
-- For DB storage:
data BYTEA,
-- For external storage:
external_url TEXT,
uploaded_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
-- Small files: store in data column
-- Large files: upload to S3, store URL in external_url
-- Your app checks storage_type and serves accordingly
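If you go hybrid, a CHECK constraint keeps the two storage modes honest – a small sketch:
-- Each row must carry exactly the payload its storage_type promises
ALTER TABLE files ADD CONSTRAINT files_storage_payload CHECK (
  (storage_type = 'db' AND data IS NOT NULL AND external_url IS NULL) OR
  (storage_type = 'external' AND external_url IS NOT NULL AND data IS NULL)
);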
Start with everything in PostgreSQL.
Move to hybrid when you have a reason to.
Not before.
Most solo dev apps never reach the point where PostgreSQL file storage becomes a problem. Don't optimize for scale you don't have.
What's your file storage situation? 👇
@postgres
Week 8 done. Files without S3 – covered.
This week:
✅ Monday – Why you probably don't need S3
✅ Tuesday – bytea vs Large Objects
✅ Wednesday – Complete file storage system (3⭐)
✅ Thursday – When to move files out (honest limits)
The takeaway: store files in PostgreSQL until you have a specific reason not to. For most solo dev SaaS apps, that reason never comes.
---
8 WEEKS IN
We've built a lot together:
Auth → Jobs → Performance → Search → Real-time → Backups → Multi-tenancy → File storage
That's a complete SaaS backend. All PostgreSQL. No external services.
---
WHAT'S NEXT?
We're entering month 3. A few directions we could go:
- Caching (replace Redis for sessions and app cache)
- Database migrations (zero-downtime, rollback-safe)
- Email & notifications (send from PostgreSQL, template engine)
- Monitoring & observability (know what your database is doing)
Vote below 👇
Or tell me what you're struggling with right now. That's what we'll build.
See you Monday.
@postgres
⚡ You probably don't need Redis.
I know. Every tutorial says "add Redis for caching." Every boilerplate ships with it. Every "production-ready" checklist includes it.
But if you're a solo dev or small team, Redis means:
- Another service to run and monitor
- Another thing that crashes at 3 AM
- Another connection string to manage
- $10-30/month on managed hosting (Upstash, Redis Cloud)
- Session data in one place, everything else in another
PostgreSQL can do it all:
✅ Session storage
✅ Application cache (API responses, computed values)
✅ Rate limiting counters (sketch below)
✅ Feature flags
✅ Temporary data with auto-expiry
And your data stays in one place. One backup. One connection.
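As one concrete taste, a rate-limit counter is a single upsert – a hedged sketch (the table, key format, and one-minute window are assumptions):
CREATE UNLOGGED TABLE rate_limits (
  key        TEXT PRIMARY KEY,   -- e.g. 'ip:203.0.113.7:/api/search'
  hits       INT NOT NULL DEFAULT 1,
  window_end TIMESTAMPTZ NOT NULL
);
-- One statement per request: start a new window or bump the counter
INSERT INTO rate_limits (key, hits, window_end)
VALUES ('ip:203.0.113.7:/api/search', 1, now() + interval '1 minute')
ON CONFLICT (key) DO UPDATE SET
  hits = CASE WHEN rate_limits.window_end < now() THEN 1 ELSE rate_limits.hits + 1 END,
  window_end = CASE WHEN rate_limits.window_end < now()
               THEN now() + interval '1 minute' ELSE rate_limits.window_end END
RETURNING hits;  -- reject the request when hits exceeds your limit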
This week:
Tuesday – How PostgreSQL caching works (UNLOGGED tables, materialized views)
Wednesday – Complete caching system (3⭐)
Thursday – When you actually need Redis (honest take)
Friday – Check-in
Let's kill another unnecessary service.
What are you using Redis for right now? 👇
@postgres
Three PostgreSQL caching tools you already have.
TOOL 1: UNLOGGED TABLES
Regular tables write to WAL (write-ahead log) for crash safety. UNLOGGED tables skip that. Faster writes, faster reads. Data lost on crash – which is fine for cache.
CREATE UNLOGGED TABLE cache (
key TEXT PRIMARY KEY,
value JSONB NOT NULL,
expires_at TIMESTAMPTZ NOT NULL DEFAULT now() + interval '1 hour',
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE INDEX idx_cache_expires ON cache(expires_at);
-- Write cache
INSERT INTO cache (key, value, expires_at)
VALUES ('user:123:profile', '{"name":"John"}', now() + interval '15 minutes')
ON CONFLICT (key) DO UPDATE
SET value = EXCLUDED.value, expires_at = EXCLUDED.expires_at;
-- Read cache (only if not expired)
SELECT value FROM cache
WHERE key = 'user:123:profile' AND expires_at > now();
-- That's it. That's Redis GET/SET with TTL.
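Expired rows don't delete themselves, so sweep them periodically – a sketch (the pg_cron schedule is an assumption; a plain cron job calling psql works just as well):
-- One-off sweep
DELETE FROM cache WHERE expires_at < now();
-- Or every 5 minutes via the pg_cron extension, if installed
SELECT cron.schedule('cache-sweep', '*/5 * * * *',
  $$DELETE FROM cache WHERE expires_at < now()$$);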
TOOL 2: MATERIALIZED VIEWS
Pre-compute expensive queries. Refresh on schedule.
CREATE MATERIALIZED VIEW dashboard_stats AS
SELECT
count(*) as total_users,
count(*) FILTER (WHERE created_at > now() - interval '7 days') as new_this_week,
count(*) FILTER (WHERE last_login > now() - interval '24 hours') as active_today
FROM users;
-- Refresh (takes milliseconds for reasonable tables; CONCURRENTLY requires a
-- unique index on the view, so for a single-row view like this use a plain refresh)
REFRESH MATERIALIZED VIEW dashboard_stats;
-- Query is instant β reads pre-computed results
SELECT * FROM dashboard_stats;
TOOL 3: GENERATED COLUMNS
Cache computed values directly in the row. Updated automatically.
ALTER TABLE products ADD COLUMN search_text TEXT
GENERATED ALWAYS AS (
  name || ' ' || coalesce(description, '') || ' ' || coalesce(category, '')
) STORED;
-- No cache invalidation needed. PostgreSQL updates it on every write.
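To actually search that column, pair it with an index – a sketch using the pg_trgm extension (an assumption; a tsvector column with a GIN index is the full-text alternative):
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE INDEX idx_products_search ON products USING gin (search_text gin_trgm_ops);
-- Now substring searches like this can use the index
SELECT id, name FROM products WHERE search_text ILIKE '%coffee%';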
Tomorrow: the complete system – sessions, API cache, rate limiting, feature flags, all in PostgreSQL. 3⭐.
@postgres