Self-Host Paperless-ngx: Complete Document Management System Guide
Learn how to self-host Paperless-ngx, a powerful open-source document management system. Complete guide with Docker setup, OCR, and automation.
Going paperless isn’t just about scanning documents — it’s about creating a searchable, organized digital archive that you actually control. Paperless-ngx is the gold standard for self-hosted document management: it automatically organizes your scanned documents, extracts text with OCR, and makes everything searchable.
Unlike cloud services such as Evernote, Google Drive, or Dropbox, which tie you to their ecosystem and keep your files on servers you don't control, Paperless-ngx runs on your own server and keeps your sensitive documents private.
In this guide, you’ll learn how to deploy Paperless-ngx with Docker, configure OCR for multiple languages, set up automatic imports, and integrate with your existing self-hosted apps.
What is Paperless-ngx?
Paperless-ngx is an open-source document management system (DMS) that automatically:
- Scans and imports documents from various sources
- Extracts text using OCR (optical character recognition)
- Organizes with tags, correspondents, and document types
- Searches through full-text content instantly
- Manages metadata, creation dates, and custom fields
- Archives originals while creating searchable PDFs
- Exports to multiple formats
It can replace physical filing cabinets and, for many setups, expensive enterprise DMS solutions. The project began as “Paperless,” was forked and significantly improved by the community as “Paperless-ng,” and was later forked again as Paperless-ngx, which is where active development now happens.
Why Self-Host Paperless-ngx?
Privacy: Your tax documents, medical records, and legal papers stay on your server. No third-party scanning or indexing.
Control: You decide retention policies, backup schedules, and who has access.
Cost: Free forever. No per-user fees, storage limits, or feature paywalls.
Flexibility: Integrate with your scanner, mobile phone, email, or any other input source.
Compliance: Keeping sensitive business documents on-premises can make it easier to meet GDPR, HIPAA, or other regulatory requirements, although self-hosting alone does not make you compliant.
Prerequisites
Before you start, you’ll need:
- A server or VPS with at least:
  - 2GB RAM (4GB recommended for OCR performance)
  - 20GB storage + space for your documents
  - Docker and Docker Compose installed
- A domain name (optional but recommended for HTTPS access)
- Basic familiarity with Docker and the command line
Need a VPS? Check out these reliable providers:
- Hetzner — Best value (€4.15/month for 2GB RAM)
- DigitalOcean — Great documentation ($12/month)
- Vultr — Global coverage ($6/month)
All three offer one-click Docker deployments and excellent uptime.
Quick Start: Deploy Paperless-ngx with Docker Compose
The fastest way to get Paperless-ngx running is with Docker Compose. This method handles all dependencies (PostgreSQL database, Redis message broker, Tika and Gotenberg for document parsing) automatically.
Step 1: Create the Project Directory
mkdir -p ~/paperless-ngx
cd ~/paperless-ngx
Step 2: Create docker-compose.yml
Create a docker-compose.yml file with this configuration:
version: "3.8"
services:
  broker:
    image: docker.io/library/redis:7
    restart: unless-stopped
    volumes:
      - redisdata:/data
  db:
    image: docker.io/library/postgres:16
    restart: unless-stopped
    volumes:
      - pgdata:/var/lib/postgresql/data
    environment:
      POSTGRES_DB: paperless
      POSTGRES_USER: paperless
      POSTGRES_PASSWORD: paperless
  webserver:
    image: ghcr.io/paperless-ngx/paperless-ngx:latest
    restart: unless-stopped
    depends_on:
      - db
      - broker
    ports:
      - "8000:8000"
    volumes:
      - data:/usr/src/paperless/data
      - media:/usr/src/paperless/media
      - ./export:/usr/src/paperless/export
      - ./consume:/usr/src/paperless/consume
    environment:
      PAPERLESS_REDIS: redis://broker:6379
      PAPERLESS_DBHOST: db
      PAPERLESS_DBNAME: paperless
      PAPERLESS_DBUSER: paperless
      PAPERLESS_DBPASS: paperless
      PAPERLESS_SECRET_KEY: change-this-to-a-random-string
      PAPERLESS_URL: https://paperless.yourdomain.com
      PAPERLESS_TIME_ZONE: Europe/Paris
      PAPERLESS_OCR_LANGUAGE: eng
      PAPERLESS_ADMIN_USER: admin
      PAPERLESS_ADMIN_PASSWORD: changeme
      PAPERLESS_TIKA_ENABLED: 1
      PAPERLESS_TIKA_GOTENBERG_ENDPOINT: http://gotenberg:3000
      PAPERLESS_TIKA_ENDPOINT: http://tika:9998
  gotenberg:
    image: docker.io/gotenberg/gotenberg:8
    restart: unless-stopped
    command:
      - "gotenberg"
      - "--chromium-disable-javascript=true"
      - "--chromium-allow-list=file:///tmp/.*"
  tika:
    image: ghcr.io/paperless-ngx/tika:latest
    restart: unless-stopped
volumes:
  data:
  media:
  pgdata:
  redisdata:
Step 3: Configure Environment Variables
Important: Before deploying, customize these settings:
- Generate a secret key:
  openssl rand -base64 32
  Replace change-this-to-a-random-string with the output.
- Change the admin password from changeme to something secure.
- Set your domain in PAPERLESS_URL (or use your server IP for testing).
- Set your timezone in PAPERLESS_TIME_ZONE. See the full list.
- Configure OCR language: Set PAPERLESS_OCR_LANGUAGE to your language code(s). Examples:
  - eng: English
  - fra: French
  - deu: German
  - spa: Spanish
  - eng+fra: Multiple languages (English + French)
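If openssl is not available on the machine where you edit the file, the same kind of random values can be produced with Python's standard library. This is a minimal sketch; the function names and the chosen lengths are illustrative assumptions, not something Paperless-ngx requires.

```python
import secrets
import string


def make_secret_key(num_bytes: int = 32) -> str:
    """Return a URL-safe random string to use as PAPERLESS_SECRET_KEY."""
    return secrets.token_urlsafe(num_bytes)


def make_password(length: int = 20) -> str:
    """Return a random alphanumeric string to use as PAPERLESS_ADMIN_PASSWORD."""
    alphabet = string.ascii_letters + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))


if __name__ == "__main__":
    print("PAPERLESS_SECRET_KEY:", make_secret_key())
    print("PAPERLESS_ADMIN_PASSWORD:", make_password())
```

Paste the printed values into docker-compose.yml in place of the placeholders, and keep that file out of public repositories.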
Step 4: Create Directories
mkdir -p consume export
- consume/ — Drop files here for automatic import
- export/ — Exported documents appear here
Step 5: Launch Paperless-ngx
docker compose up -d
This will:
- Pull all required images (~2-3 GB)
- Create the database and apply migrations
- Start all services
- Create your admin user
Check the logs to ensure everything started correctly:
docker compose logs -f webserver
Wait for the message: Application startup complete.
Step 6: Access the Web Interface
Open your browser and navigate to:
http://your-server-ip:8000
Log in with:
- Username: admin
- Password: (the password you set in docker-compose.yml)
First login tip: Paperless-ngx offers a short welcome tour the first time you log in. You can follow it or dismiss it; the default settings are fine for most users.
Configure OCR and Document Processing
Paperless-ngx’s killer feature is automatic OCR — it extracts text from scanned documents and makes them fully searchable.
OCR Languages
If you work with documents in multiple languages, configure all of them:
PAPERLESS_OCR_LANGUAGE: eng+fra+deu+spa
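This enables English, French, German, and Spanish OCR. List the language you expect most often first.
Important: More languages increase RAM usage and processing time during OCR. Monitor performance if you enable many languages. The Docker image ships with data for English, German, Italian, Spanish, and French; any other language must also be listed in PAPERLESS_OCR_LANGUAGES so it is installed when the container starts.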
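The value is just a list of Tesseract language codes joined with a plus sign. As a small sketch of how that string could be assembled and sanity-checked before it goes into docker-compose.yml (the helper name and the validation pattern are assumptions made for illustration):

```python
import re

# Tesseract language codes look like "eng", "deu" or "chi_sim".
_CODE = re.compile(r"^[a-z]{3,4}(_[a-z]+)*$")


def ocr_language_value(codes: list[str]) -> str:
    """Join language codes into a PAPERLESS_OCR_LANGUAGE value such as 'eng+fra'."""
    seen: list[str] = []
    for code in codes:
        code = code.strip().lower()
        if not _CODE.match(code):
            raise ValueError(f"not a Tesseract-style language code: {code!r}")
        if code not in seen:  # drop duplicates but keep the original order
            seen.append(code)
    if not seen:
        raise ValueError("at least one language code is required")
    return "+".join(seen)


if __name__ == "__main__":
    print("PAPERLESS_OCR_LANGUAGE:", ocr_language_value(["eng", "fra", "deu", "spa"]))
```

The check only validates the shape of each code; it does not confirm that the language data is installed in the container.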
OCR Mode
Control how Paperless-ngx handles documents that already contain text:
PAPERLESS_OCR_MODE: skip
Options:
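- skip: OCR only pages that contain no text (fastest, recommended)
- redo: OCR every page, replacing any existing OCR text layer
- force: Rasterize the whole document and OCR it from scratch, discarding existing text
- skip_noarchive: Like skip, but don't create an archived version for documents that already contain text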
Recommendation: Use skip for best performance. Most PDFs from digital sources already contain text.
OCR Performance Tuning
If OCR is slow on your server:
- Limit parallel tasks:
PAPERLESS_TASK_WORKERS: 1
PAPERLESS_THREADS_PER_WORKER: 1
- Reduce the amount of OCR work (faster, but pages that are not processed will not be searchable):
PAPERLESS_OCR_MODE: skip
PAPERLESS_OCR_PAGES: 1 # Only OCR first page
- Disable TIKA if you don’t need advanced document parsing:
PAPERLESS_TIKA_ENABLED: 0
Then remove the gotenberg and tika services, along with the PAPERLESS_TIKA_ENDPOINT and PAPERLESS_TIKA_GOTENBERG_ENDPOINT variables, from docker-compose.yml.
Set Up Automatic Document Import
Paperless-ngx can automatically import documents from multiple sources.
1. Local Folder (Consume Directory)
The easiest method: drop files into the consume/ folder.
Paperless-ngx watches this folder (using inotify by default) and imports new files within seconds.
Create subdirectories for organization:
mkdir -p consume/receipts
mkdir -p consume/invoices
mkdir -p consume/personal
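To have files in subdirectories tagged with the folder name, set PAPERLESS_CONSUMER_RECURSIVE and PAPERLESS_CONSUMER_SUBDIRS_AS_TAGS to true. Without these settings, subdirectories are not scanned.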
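Dropping a file into the consume folder can also be scripted, for example from a scanner post-processing hook. The sketch below is one way to do it; the function name and the refusal to overwrite a waiting file are illustrative choices, and the paths are placeholders.

```python
import shutil
from pathlib import Path


def drop_into_consume(source: Path, consume_dir: Path, category: str = "") -> Path:
    """Copy source into consume_dir, optionally under a category subfolder."""
    target_dir = consume_dir / category if category else consume_dir
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / source.name
    if target.exists():
        # Avoid clobbering a file that has not been consumed yet.
        raise FileExistsError(f"{target} is already waiting to be consumed")
    shutil.copy2(source, target)
    return target


if __name__ == "__main__":
    consume = Path.home() / "paperless-ngx" / "consume"
    print(drop_into_consume(Path("scan.pdf"), consume, "receipts"))
```

Copying rather than moving keeps the source file in place until you have confirmed the document shows up in Paperless-ngx.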
2. Email Import
Paperless-ngx can fetch attachments from an IMAP mailbox. Mail is configured in the web interface rather than through environment variables: open Mail in the sidebar, add a mail account, then add a mail rule that uses it.
Mail account:
- IMAP server: imap.gmail.com
- IMAP port: 993
- IMAP security: SSL
- Username: your Gmail address
- Password: your-app-password
Mail rule:
- Folder: Paperless
- Action: what to do with processed mail (for example, mark as read)
How it works:
- Create a Gmail label/folder called “Paperless”
- Forward emails with documents to your Gmail
- Apply the “Paperless” label
- Paperless-ngx imports attachments automatically
Security tip: Use an app password instead of your real Gmail password.
3. Mobile Upload (Paperless Mobile App)
Install the Paperless Mobile app (iOS / Android):
- Configure your Paperless-ngx URL
- Generate an API token (user menu → My Profile → API Auth Token)
- Scan documents with your phone camera
- Upload directly to Paperless-ngx
Perfect for receipts, business cards, and paper mail.
4. Network Scanner Integration
Most modern scanners support “scan to network folder.” Two approaches:
Option A: SMB Share
Mount your consume folder as a network share:
sudo apt install samba
sudo nano /etc/samba/smb.conf
Add:
[paperless]
   path = /home/yourusername/paperless-ngx/consume
   writable = yes
   guest ok = yes
   create mask = 0644
Restart Samba:
sudo systemctl restart smbd
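Configure your scanner to save to \\your-server-ip\paperless. Because guest ok = yes lets anyone on the network write to this share, keep it on a trusted LAN or replace guest access with a dedicated Samba user.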
Option B: FTP Server
Some scanners only support FTP:
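docker run -d --name paperless-ftp \
  -p 21:21 -p 21100-21110:21100-21110 \
  -v ~/paperless-ngx/consume:/home/vsftpd/scanner \
  -e FTP_USER=scanner \
  -e FTP_PASS=your-password \
  -e PASV_ADDRESS=your-server-ip \
  -e PASV_MIN_PORT=21100 \
  -e PASV_MAX_PORT=21110 \
  fauria/vsftpd
Configure your scanner to upload via FTP. FTP is unencrypted, so use it only on a trusted network.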
Organize Documents with Tags, Types, and Correspondents
Paperless-ngx uses tags, document types, and correspondents to organize your archive.
Tags
Tags are flexible labels you can assign to any document:
- receipt
- tax-2026
- warranty
- contract
- medical
Create tags: Dashboard → Tags → Add Tag
Auto-tagging: Each tag can carry a matching rule that Paperless-ngx applies to the text of newly consumed documents.
Example rule (set on the receipt tag itself):
- Matching algorithm: Any word
- Match: receipt invoice
- Case insensitive: enabled
- Result: documents containing either word get the receipt tag
Document Types
Document types categorize broad document classes:
- Invoice
- Receipt
- Contract
- Letter
- Report
- Manual
- Photo
Create types: Dashboard → Document Types → Add Type
Use document types for high-level filtering, tags for granular organization.
Correspondents
Correspondents identify who sent or issued the document:
- Amazon
- IRS
- Electric Company
- Bank of America
- Dr. Smith
Create correspondents: Dashboard → Correspondents → Add Correspondent
Auto-correspondent matching: Paperless-ngx can automatically detect correspondents from document content, for example with a regular expression.
Example:
- Name: Amazon
- Matching algorithm: Regular expression
- Match: Amazon\.com|AMAZON\.COM|Order #\d+
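Before saving a pattern like this, it helps to try it against a few lines of sample text. The sketch below uses Python's re module and assumes the match behaves like a case-insensitive search over the document text; the sample strings are made up.

```python
import re

# Pattern copied from the Amazon example above.
AMAZON = re.compile(r"Amazon\.com|AMAZON\.COM|Order #\d+", re.IGNORECASE)


def matches(pattern: re.Pattern, text: str) -> bool:
    """Return True if the pattern occurs anywhere in the text."""
    return pattern.search(text) is not None


if __name__ == "__main__":
    samples = [
        "Thank you for shopping at amazon.com",
        "Order #12345 has shipped",
        "Electric Company monthly statement",
    ]
    for line in samples:
        print(matches(AMAZON, line), line)
```

Note that the Order #\d+ alternative matches order numbers from any sender, so a stricter pattern may be worth it if other shops use the same wording.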
Custom Fields
Need more metadata? Create custom fields:
Dashboard → Custom Fields → Add Field
Examples:
- Invoice Number (text)
- Due Date (date)
- Amount (monetary)
- Project (select)
Custom fields appear in search filters and can be bulk-edited.
Search and Retrieve Documents
The whole point of going paperless is finding documents instantly.
Full-Text Search
Paperless-ngx indexes every word in your documents. Search from the main dashboard:
- tax return 2025: Find all tax documents from 2025
- warranty laptop: Find laptop warranty papers
- contract smith: Find contracts related to someone named Smith
Search is instant and searches OCR’d text, filenames, and metadata.
Advanced Search Filters
Combine filters for precise queries:
- Tags: receipt AND tax-2026
- Correspondent: Amazon
- Document Type: Invoice
- Date Range: 2025-01-01 to 2025-12-31
- Custom Field: Project = “Website Redesign”
Saved views: Create saved searches for common queries:
- “2026 Tax Documents”
- “Unpaid Invoices”
- “Medical Records — Last Year”
Access saved views from the sidebar for instant filtering.
Download and Export
Download documents in multiple formats:
- Original — The file as it was uploaded
- Archive — OCR’d PDF with searchable text layer
- Thumbnail — Preview image
- Metadata (JSON) — All document metadata
Bulk export: Select multiple documents and export as a ZIP file.
Backup Your Paperless-ngx Installation
Your Paperless-ngx archive contains irreplaceable documents. Back it up.
What to Back Up
- Documents: /usr/src/paperless/media (inside the container)
- Database: PostgreSQL data
- Configuration: docker-compose.yml and .env files
- Export folder: Your exported documents
Automated Backup Script
Create backup-paperless.sh:
#!/bin/bash
# Stop at the first failed command so a broken export is never kept as a backup
set -euo pipefail
BACKUP_DIR="/mnt/backups/paperless"
PROJECT_DIR="$HOME/paperless-ngx"
DATE=$(date +%Y%m%d-%H%M%S)
# docker compose must run from the project directory (cron starts elsewhere)
cd "$PROJECT_DIR"
# Create backup directory
mkdir -p "$BACKUP_DIR"
# Backup documents
docker compose exec -T webserver \
  document_exporter ../export/ --zip
# Move export
mv "$PROJECT_DIR"/export/*.zip "$BACKUP_DIR/documents-$DATE.zip"
# Backup database
docker compose exec -T db pg_dump -U paperless paperless | \
  gzip > "$BACKUP_DIR/database-$DATE.sql.gz"
# Keep only last 30 days
find "$BACKUP_DIR" -type f -mtime +30 -delete
echo "Backup completed: $BACKUP_DIR"
Make it executable:
chmod +x backup-paperless.sh
Schedule daily backups:
crontab -e
Add:
0 3 * * * /home/yourusername/paperless-ngx/backup-paperless.sh
Runs every day at 3 AM.
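The 30-day retention step is the part of the script most worth checking before you trust it with real backups. The sketch below lists which files a rule of that kind would remove, without deleting anything; the function name is hypothetical and the cutoff only approximates how find rounds -mtime.

```python
import time
from pathlib import Path
from typing import Optional


def expired_backups(backup_dir: Path, max_age_days: int = 30,
                    now: Optional[float] = None) -> list[Path]:
    """Return files in backup_dir last modified more than max_age_days ago."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    return sorted(
        path for path in backup_dir.iterdir()
        if path.is_file() and path.stat().st_mtime < cutoff
    )


if __name__ == "__main__":
    backup_dir = Path("/mnt/backups/paperless")
    if backup_dir.is_dir():
        for path in expired_backups(backup_dir):
            print("would delete:", path)
```

Running it as a dry run after a few weeks of backups shows whether the retention window behaves the way you expect.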
Off-site backups: Copy your backup folder to a remote VPS or cloud storage. See our complete backup guide for detailed strategies.
Secure Your Paperless-ngx Instance
Your documents are sensitive. Lock down your Paperless-ngx installation.
1. Use HTTPS with a Reverse Proxy
Never expose Paperless-ngx directly to the internet over HTTP.
Use a reverse proxy (Nginx Proxy Manager, Traefik, or Caddy) with Let’s Encrypt SSL certificates.
Quick setup with Nginx Proxy Manager:
See our Nginx Proxy Manager guide for step-by-step instructions.
Example Nginx config:
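server {
    listen 443 ssl http2;
    server_name paperless.yourdomain.com;
    ssl_certificate /etc/letsencrypt/live/paperless.yourdomain.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/paperless.yourdomain.com/privkey.pem;
    # Allow uploads larger than nginx's 1 MB default
    client_max_body_size 100M;
    location / {
        proxy_pass http://localhost:8000;
        # WebSocket support for live status updates in the web UI
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}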
2. Enable Two-Factor Authentication (2FA)
Paperless-ngx supports TOTP-based 2FA:
- Open the user menu and go to My Profile
- Click Enable Two-Factor Authentication
- Scan the QR code with your authenticator app (Aegis, Authy, Google Authenticator)
- Enter the 6-digit code to confirm
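For the curious, the 6-digit code comes from the TOTP algorithm in RFC 6238: an HMAC over the current 30-second time step, truncated to a few decimal digits. The sketch below is a generic illustration using the RFC's SHA-1 test secret; that Paperless-ngx uses these exact parameters is an assumption, and real authenticator apps receive the secret base32-encoded in the QR code.

```python
import hashlib
import hmac
import struct


def totp(secret: bytes, unix_time: int, digits: int = 6, step: int = 30) -> str:
    """Compute an RFC 6238 TOTP code (HMAC-SHA1) for the given Unix time."""
    counter = struct.pack(">Q", unix_time // step)  # 8-byte big-endian time step
    digest = hmac.new(secret, counter, hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation from RFC 4226
    number = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(number % 10 ** digits).zfill(digits)


if __name__ == "__main__":
    rfc_secret = b"12345678901234567890"  # test secret from RFC 6238
    print(totp(rfc_secret, 59, digits=8))  # RFC test vector: 94287082
    print(totp(rfc_secret, 59))            # same moment, 6 digits: 287082
```

Because the code depends on the clock, a server or phone whose time has drifted by more than a step or two will produce codes that are rejected.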
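2FA is opt-in per account: each user enables it from their own profile. The PAPERLESS_ENABLE_2FA: true setting sometimes suggested for forcing 2FA on all users does not appear in the Paperless-ngx configuration reference, so verify it against the documentation for your version before relying on it.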
3. Restrict Access by IP (Optional)
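If you only access Paperless-ngx from specific locations, combine host validation with firewall rules.
These two settings do not filter by IP; they limit which hostnames and origins Paperless-ngx will answer to:
PAPERLESS_ALLOWED_HOSTS: paperless.yourdomain.com
PAPERLESS_CORS_ALLOWED_HOSTS: https://paperless.yourdomain.com
Add a firewall rule that allows only your home/office IP range:
sudo ufw allow from 203.0.113.0/24 to any port 8000
Note that ports published by Docker bypass ufw rules by default. If your reverse proxy runs on the same host, publish the port as "127.0.0.1:8000:8000" in docker-compose.yml so it is reachable only locally.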
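CIDR notation such as 203.0.113.0/24 is easy to get wrong by one bit. A quick way to confirm which addresses a range covers is Python's ipaddress module; the helper name and the sample addresses below are illustrative.

```python
import ipaddress

# The documentation range used in the firewall rule above.
ALLOWED = ipaddress.ip_network("203.0.113.0/24")


def is_allowed(client_ip: str, network: ipaddress.IPv4Network = ALLOWED) -> bool:
    """Return True if client_ip falls inside the allowed network."""
    return ipaddress.ip_address(client_ip) in network


if __name__ == "__main__":
    print(ALLOWED.num_addresses)       # size of the range
    print(is_allowed("203.0.113.42"))  # inside the range
    print(is_allowed("198.51.100.7"))  # outside the range
```

Replace the range with your own before using the result to write firewall rules.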
4. Regular Security Updates
Keep Paperless-ngx updated:
cd ~/paperless-ngx
docker compose pull
docker compose up -d
Enable automatic security updates on your server:
sudo apt install unattended-upgrades
sudo dpkg-reconfigure -plow unattended-upgrades
See our VPS hardening guide for comprehensive security.
Advanced Configuration
Multi-User Setup
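Paperless-ngx supports multiple users. Rather than fixed roles, it uses granular permissions:
- Superuser: full access to everything
- Regular user: receives view, add, change, and delete permissions per object type (documents, tags, mail rules, and so on)
- Groups: bundle permissions so they can be assigned to several users at once
Create additional users:
Settings → Users & Groups → Add User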
Share documents between users:
- Enable Permissions for a document
- Assign users or groups
- Set read/write permissions
Custom Workflows
Automate document processing with workflows, tags, and saved views.
Example workflow: Automatic invoice processing
- Create a tag: unpaid-invoice
- Create a saved view: “Unpaid Invoices” filtered by this tag
- Set up a workflow (called a consumption template in older releases):
  - Trigger: Consumption Started
  - Filter filename: *invoice*
  - Action: Assign tag unpaid-invoice
- When paid, remove the tag and add paid-invoice
Check the “Unpaid Invoices” view daily to see what needs payment.
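A "filename contains invoice" rule amounts to a case-insensitive wildcard match on the file name. The sketch below shows that idea with Python's fnmatch; that Paperless-ngx evaluates its filename filter exactly this way is an assumption, and the sample file names are made up.

```python
from fnmatch import fnmatch


def filename_matches(filename: str, pattern: str) -> bool:
    """Case-insensitive wildcard match, for example with the pattern '*invoice*'."""
    # Lowercasing both sides makes the result the same on every platform.
    return fnmatch(filename.lower(), pattern.lower())


if __name__ == "__main__":
    for name in ["2026-03_Invoice_ACME.pdf", "receipt.pdf", "INVOICE.PDF"]:
        print(filename_matches(name, "*invoice*"), name)
```

Testing a pattern against a handful of real file names from your scanner is a cheap way to catch rules that match too much or too little.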
API Integration
Paperless-ngx has a full REST API for automation:
Get API token: user menu → My Profile → API Auth Token
Upload a document via API:
curl -X POST https://paperless.yourdomain.com/api/documents/post_document/ \
  -H "Authorization: Token your-api-token" \
  -F "document=@/path/to/file.pdf" \
  -F "title=My Document" \
  -F "tags=1" \
  -F "tags=2"
Repeat the tags field once per tag ID; a comma-separated list is not accepted.
Search documents:
curl -X GET "https://paperless.yourdomain.com/api/documents/?query=tax" \
-H "Authorization: Token your-api-token"
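The same search can be run from a script. The sketch below uses only Python's standard library and mirrors the curl call above; the function names are hypothetical, and reading the hits from a "results" key assumes the API's usual paginated response format.

```python
import json
import urllib.parse
import urllib.request


def build_search_request(base_url: str, token: str, query: str) -> urllib.request.Request:
    """Build a GET request for /api/documents/?query=... with token auth."""
    params = urllib.parse.urlencode({"query": query})
    url = f"{base_url.rstrip('/')}/api/documents/?{params}"
    headers = {"Authorization": f"Token {token}", "Accept": "application/json"}
    return urllib.request.Request(url, headers=headers)


def search(base_url: str, token: str, query: str) -> list:
    """Send the request and return the list of matching documents."""
    request = build_search_request(base_url, token, query)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)["results"]  # assumed paginated response shape


if __name__ == "__main__":
    request = build_search_request("https://paperless.yourdomain.com", "your-api-token", "tax")
    print(request.full_url)
```

Building the request separately from sending it makes the URL and headers easy to inspect before any traffic reaches your server.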
Use the API to integrate with:
- n8n or Activepieces for workflow automation
- Home Assistant for smart home integration
- Custom scripts for bulk operations
Troubleshooting Common Issues
OCR Not Working
Symptom: Documents upload but text isn’t extracted.
Fix:
- Check that the OCR language is installed:
docker compose exec webserver tesseract --list-langs
- Ensure OCR mode is correct:
PAPERLESS_OCR_MODE: skip
- Test OCR manually, after copying a sample PDF into the container at /tmp/test.pdf:
docker compose exec webserver \
  ocrmypdf --language eng /tmp/test.pdf /tmp/output.pdf
Database Connection Errors
Symptom: could not connect to server: Connection refused
Fix:
- Ensure the database is running:
docker compose ps
- Check database logs:
docker compose logs db
- Verify credentials match in all services.
High Memory Usage
Symptom: Server runs out of RAM during OCR.
Fix:
- Limit parallel workers:
PAPERLESS_TASK_WORKERS: 1
- Limit OCR to the first page (later pages will not be searchable):
PAPERLESS_OCR_MODE: skip
PAPERLESS_OCR_PAGES: 1
- Increase server RAM (upgrade your VPS).
Slow Web Interface
Symptom: Dashboard loads slowly, search is sluggish.
Fix:
- Rebuild search index:
docker compose exec webserver document_index reindex
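- Optimize PostgreSQL. The official postgres image does not read tuning values from environment variables, so pass them as server flags:
# Add to the db service
command: postgres -c shared_buffers=256MB -c work_mem=16MB
- Redis is already running in our config as the task broker, so no extra caching setup is needed.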
Performance and Resource Usage
Resource Requirements
Minimum (light usage, <1000 documents):
- 2GB RAM
- 2 CPU cores
- 20GB storage
Recommended (active use, 5000+ documents):
- 4GB RAM
- 4 CPU cores
- 50GB+ storage
Heavy usage (10,000+ documents, multiple users):
- 8GB RAM
- 6+ CPU cores
- 100GB+ storage
Storage Planning
Document storage:
- Paperless-ngx keeps two copies of each document:
- Original file (as uploaded)
- Archive file (OCR’d PDF with text layer)
Average sizes:
- Scanned page (300 DPI): 200-500 KB
- Office document: 50-200 KB
- OCR’d PDF: +50% of original
Estimate: counting originals only, 1GB stores ~2000-5000 single-page scans, depending on size and type. The archive copies take additional space on top of that.
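The arithmetic behind that estimate is easy to rerun with your own numbers. The sketch below assumes 1GB is 1,000,000 KB and reads "+50% of original" as an archive copy 1.5 times the size of the original; both are assumptions, so adjust archive_ratio to match what you observe.

```python
def documents_per_gb(original_kb: float, archive_ratio: float = 1.5) -> int:
    """Estimate how many documents fit in 1 GB, counting original plus archive copy."""
    per_document_kb = original_kb * (1 + archive_ratio)
    return int(1_000_000 // per_document_kb)


if __name__ == "__main__":
    for size_kb in (200, 500):
        originals_only = documents_per_gb(size_kb, archive_ratio=0)
        with_archive = documents_per_gb(size_kb)
        print(f"{size_kb} KB scans: {originals_only} originals only, {with_archive} with archive")
```

With these assumptions the 2000-5000 figure corresponds to originals alone, and the count drops to roughly 800-2000 once archive copies are included.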
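The archive copy is written as PDF/A, which is already the default. This setting selects the output format and is not a compression switch:
PAPERLESS_OCR_OUTPUT_TYPE: pdfa # PDF/A, the default archival format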
Performance Benchmarks
Import speeds (on 4GB/2-core VPS):
- Simple PDF (text already present): ~5 seconds
- Scanned page (OCR required): ~15-30 seconds
- 10-page scanned document: ~2-5 minutes
Search performance:
- 1000 documents: Instant (<100ms)
- 10,000 documents: Fast (~200-500ms)
- 50,000+ documents: Still fast (~1-2 seconds)
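Full-text search runs on a dedicated search index kept in the data volume (the one that document_index reindex rebuilds), not on PostgreSQL full-text search. Treat the figures above as rough guides rather than guarantees.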
Migrating from Paperless or Paperless-ng
Already running an older version? Migration is straightforward.
From Paperless-ng
Paperless-ngx is a direct continuation of Paperless-ng. Just update the image:
image: ghcr.io/paperless-ngx/paperless-ngx:latest
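Restart the stack. Database migrations are applied automatically when the container starts:
docker compose down
docker compose up -d
To confirm that nothing is pending, you can also run them by hand:
docker compose exec webserver python3 manage.py migrate
Your documents and database carry over in place.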
From Original Paperless
- Backup everything first.
- Export documents:
  docker exec paperless document_exporter /export
- Set up Paperless-ngx (this guide).
- Import documents: Place exported files in the consume/ folder. Paperless-ngx reimports and OCRs them.
Note: Metadata (tags, correspondents) must be recreated. The export includes JSON metadata files you can use for reference.
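To see what needs recreating, the exported JSON can be summarized with a short script. The sketch below assumes the export contains a manifest.json in Django's serialization format (a list of records with "model" and "fields" keys) and that tags and correspondents appear under model names like documents.tag; check your own export before relying on these names.

```python
import json
from collections import Counter
from pathlib import Path


def load_records(manifest_path: Path) -> list:
    """Load the list of exported records from a manifest file."""
    return json.loads(manifest_path.read_text(encoding="utf-8"))


def count_by_model(records: list) -> Counter:
    """Count exported records per model name."""
    return Counter(record["model"] for record in records)


def names_for(records: list, model: str) -> list:
    """Return the sorted 'name' field of every record of the given model."""
    return sorted(r["fields"]["name"] for r in records if r["model"] == model)


if __name__ == "__main__":
    manifest = Path("export/manifest.json")  # placeholder path
    if manifest.is_file():
        records = load_records(manifest)
        print(count_by_model(records))
        print(names_for(records, "documents.tag"))  # assumed model name
```

The resulting lists of tag and correspondent names give you a checklist to work through in the new instance.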
Paperless-ngx vs. Alternatives
How does Paperless-ngx compare to other document management solutions?
| Feature | Paperless-ngx | Nextcloud | Docspell | Commercial DMS |
|---|---|---|---|---|
| OCR | ✅ Excellent (Tesseract via OCRmyPDF) | ❌ No built-in | ✅ Good | ✅ Varies |
| Full-Text Search | ✅ Fast (built-in index) | ⚠️ Limited | ✅ Good | ✅ Excellent |
| Auto-Tagging | ✅ Advanced rules | ❌ No | ⚠️ Basic | ✅ AI-powered |
| Mobile App | ✅ Excellent | ✅ Excellent | ❌ No | ✅ Usually |
| Email Import | ✅ IMAP support | ⚠️ Via plugins | ✅ Built-in | ✅ Advanced |
| Multi-User | ✅ Yes | ✅ Excellent | ✅ Yes | ✅ Enterprise |
| API | ✅ Full REST API | ✅ WebDAV + REST | ✅ REST API | ✅ Usually |
| Cost | 🆓 Free | 🆓 Free | 🆓 Free | 💰 $$$$ |
Verdict: Paperless-ngx is the best self-hosted DMS for most users. It’s purpose-built for document management, while Nextcloud is a general file sync tool. Docspell is good but less mature.
Conclusion
Paperless-ngx transforms your chaotic pile of scanned documents into a searchable, organized digital archive. With automatic OCR, smart tagging, and powerful search, finding any document takes seconds instead of minutes.
Self-hosting keeps your sensitive documents private and gives you complete control over retention, backups, and access. You’re not dependent on a cloud service that could shut down, raise prices, or get acquired.
Next steps:
- Deploy Paperless-ngx using our docker-compose.yml
- Secure it with HTTPS and 2FA
- Start scanning your paper backlog
- Set up automatic imports from email and mobile
- Create tags and rules to organize automatically
Once you go paperless, you’ll wonder how you ever managed with physical filing cabinets.
Ready to host your own VPS? Check out Hetzner for rock-solid performance at unbeatable prices.
Looking for more self-hosted apps? Browse our app recommendations and complete self-hosting guide.
Found this guide helpful? Share it with someone drowning in paperwork. 📄🔥