Self-Host Paperless-ngx: Complete Document Management System Guide

Learn how to self-host Paperless-ngx, a powerful open-source document management system. Complete guide with Docker setup, OCR, and automation.

Going paperless isn’t just about scanning documents — it’s about creating a searchable, organized digital archive that you actually control. Paperless-ngx is the gold standard for self-hosted document management: it automatically organizes your scanned documents, extracts text with OCR, and makes everything searchable.

Unlike cloud services such as Evernote, Google Drive, or Dropbox, which tie you to their ecosystem and process your files on servers you don’t control, Paperless-ngx runs on your own server and keeps your sensitive documents private.

In this guide, you’ll learn how to deploy Paperless-ngx with Docker, configure OCR for multiple languages, set up automatic imports, and integrate with your existing self-hosted apps.

What is Paperless-ngx?

Paperless-ngx is an open-source document management system (DMS) that automatically:

  • Scans and imports documents from various sources
  • Extracts text using OCR (optical character recognition)
  • Organizes with tags, correspondents, and document types
  • Searches through full-text content instantly
  • Manages metadata, creation dates, and custom fields
  • Archives originals while creating searchable PDFs
  • Exports to multiple formats

It can replace physical filing cabinets and, for many home and small-office users, expensive enterprise DMS products. The project began as “Paperless,” was forked and significantly improved by the community as “Paperless-ng,” and is now actively developed as Paperless-ngx.

Why Self-Host Paperless-ngx?

Privacy: Your tax documents, medical records, and legal papers stay on your server. No third-party scanning or indexing.

Control: You decide retention policies, backup schedules, and who has access.

Cost: Free forever. No per-user fees, storage limits, or feature paywalls.

Flexibility: Integrate with your scanner, mobile phone, email, or any other input source.

Compliance: Keep sensitive business documents on-premises for GDPR, HIPAA, or other regulatory requirements.

Prerequisites

Before you start, you’ll need:

  1. A server or VPS with at least:

    • 2GB RAM (4GB recommended for OCR performance)
    • 20GB storage + space for your documents
    • Docker and Docker Compose installed
  2. A domain name (optional but recommended for HTTPS access)

  3. Basic familiarity with Docker and command line

Need a VPS? Check out these reliable providers:

  • Hetzner — Best value (€4.15/month for 2GB RAM)
  • DigitalOcean — Great documentation ($12/month)
  • Vultr — Global coverage ($6/month)

All three offer one-click Docker deployments and excellent uptime.

Quick Start: Deploy Paperless-ngx with Docker Compose

The fastest way to get Paperless-ngx running is with Docker Compose. This method starts all dependencies (PostgreSQL database, Redis message broker, Tika and Gotenberg for parsing Office documents) alongside the application.

Step 1: Create the Project Directory

mkdir -p ~/paperless-ngx
cd ~/paperless-ngx

Step 2: Create docker-compose.yml

Create a docker-compose.yml file with this configuration:

version: "3.8"

services:
  broker:
    image: docker.io/library/redis:7
    restart: unless-stopped
    volumes:
      - redisdata:/data

  db:
    image: docker.io/library/postgres:16
    restart: unless-stopped
    volumes:
      - pgdata:/var/lib/postgresql/data
    environment:
      POSTGRES_DB: paperless
      POSTGRES_USER: paperless
      POSTGRES_PASSWORD: paperless

  webserver:
    image: ghcr.io/paperless-ngx/paperless-ngx:latest
    restart: unless-stopped
    depends_on:
      - db
      - broker
    ports:
      - "8000:8000"
    volumes:
      - data:/usr/src/paperless/data
      - media:/usr/src/paperless/media
      - ./export:/usr/src/paperless/export
      - ./consume:/usr/src/paperless/consume
    environment:
      PAPERLESS_REDIS: redis://broker:6379
      PAPERLESS_DBHOST: db
      PAPERLESS_DBNAME: paperless
      PAPERLESS_DBUSER: paperless
      PAPERLESS_DBPASS: paperless
      PAPERLESS_SECRET_KEY: change-this-to-a-random-string
      PAPERLESS_URL: https://paperless.yourdomain.com
      PAPERLESS_TIME_ZONE: Europe/Paris
      PAPERLESS_OCR_LANGUAGE: eng
      PAPERLESS_ADMIN_USER: admin
      PAPERLESS_ADMIN_PASSWORD: changeme
      PAPERLESS_TIKA_ENABLED: 1
      PAPERLESS_TIKA_GOTENBERG_ENDPOINT: http://gotenberg:3000
      PAPERLESS_TIKA_ENDPOINT: http://tika:9998

  gotenberg:
    image: docker.io/gotenberg/gotenberg:8
    restart: unless-stopped
    command:
      - "gotenberg"
      - "--chromium-disable-javascript=true"
      - "--chromium-allow-list=file:///tmp/.*"

  tika:
    image: docker.io/apache/tika:latest
    restart: unless-stopped

volumes:
  data:
  media:
  pgdata:
  redisdata:

Step 3: Configure Environment Variables

Important: Before deploying, customize these settings:

  1. Generate a secret key:
openssl rand -base64 32

Replace change-this-to-a-random-string with the output.

  2. Change the admin password from changeme to something secure.

  3. Set your domain in PAPERLESS_URL (or use your server IP for testing).

  4. Set your timezone in PAPERLESS_TIME_ZONE. See the full list.

  5. Configure OCR language: Set PAPERLESS_OCR_LANGUAGE to your language code(s). Examples:

    • eng — English
    • fra — French
    • deu — German
    • spa — Spanish
    • eng+fra — Multiple languages (English + French)

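Full list of language codes. The official image ships with English, German, Italian, Spanish, and French; to use any other language, also list its code in PAPERLESS_OCR_LANGUAGES so it is installed when the container starts.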
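As an optional shortcut for the secret key and admin password, the sketch below generates both values in one go. It is a minimal example, not the project's recommended method: random_hex is a hypothetical helper, and it emits hex rather than the base64 output of openssl so the values contain no characters that need quoting in YAML.

```shell
#!/bin/sh
# random_hex N: print N random bytes as 2N lowercase hex characters.
# (Hypothetical helper; reads from /dev/urandom.)
random_hex() {
  # od prints the bytes as space-separated hex; tr strips spaces and newlines.
  head -c "$1" /dev/urandom | od -An -tx1 | tr -d ' \n'
}

# Print lines that can be pasted over the placeholders in docker-compose.yml.
printf 'PAPERLESS_SECRET_KEY: %s\n' "$(random_hex 32)"
printf 'PAPERLESS_ADMIN_PASSWORD: %s\n' "$(random_hex 12)"
```

Store the generated admin password in your password manager before pasting it, since it is only printed once.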

Step 4: Create Directories

mkdir -p consume export
  • consume/ — Drop files here for automatic import
  • export/ — Exported documents appear here

Step 5: Launch Paperless-ngx

docker compose up -d

This will:

  1. Pull all required images (~2-3 GB)
  2. Create the database and apply migrations
  3. Start all services
  4. Create your admin user

Check the logs to ensure everything started correctly:

docker compose logs -f webserver

Wait for the message: Application startup complete.
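If you would rather script the wait than watch the logs, you can poll the web port until it answers. This is a minimal sketch that assumes curl is installed; wait_for_http is a hypothetical helper name, and the retry count and delay are arbitrary defaults.

```shell
#!/bin/sh
# wait_for_http URL [TRIES] [DELAY_SECONDS]
# Returns 0 as soon as URL answers with an HTTP status below 400,
# or 1 if it never does within TRIES attempts.
wait_for_http() {
  url=$1
  tries=${2:-30}
  delay=${3:-2}
  i=0
  while [ "$i" -lt "$tries" ]; do
    # -f makes curl fail on HTTP errors; a redirect to the login page counts as success.
    if curl -fsS -o /dev/null --max-time 5 "$url"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Example: wait_for_http http://your-server-ip:8000 && echo "Paperless-ngx is answering"
```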

Step 6: Access the Web Interface

Open your browser and navigate to:

http://your-server-ip:8000

Log in with:

  • Username: admin
  • Password: (the password you set in docker-compose.yml)

First login tip: You’ll see a setup wizard. The defaults are fine for most users.

Configure OCR and Document Processing

Paperless-ngx’s killer feature is automatic OCR — it extracts text from scanned documents and makes them fully searchable.

OCR Languages

If you work with documents in multiple languages, configure all of them:

PAPERLESS_OCR_LANGUAGE: eng+fra+deu+spa

This enables English, French, German, and Spanish OCR. The first language (eng) is the default.

Important: More languages increase RAM usage during OCR. Monitor performance if you enable many languages.
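When the language list is assembled by a script (for example from a per-host config), a small helper can build the value and catch typos early. A minimal sketch: join_langs is a hypothetical function, and the three-letter check is only a rough shape test, not a lookup against the languages Tesseract actually has installed.

```shell
#!/bin/sh
# join_langs CODE...: print the codes joined with '+', the format PAPERLESS_OCR_LANGUAGE uses.
# Rejects anything that does not look like a Tesseract code (three lowercase letters,
# optionally followed by an underscore suffix such as chi_sim).
join_langs() {
  out=""
  for code in "$@"; do
    case "$code" in
      [a-z][a-z][a-z] | [a-z][a-z][a-z]_*) ;;
      *) echo "unexpected language code: $code" >&2; return 1 ;;
    esac
    # Append with a '+' separator unless this is the first code.
    out="${out:+$out+}$code"
  done
  printf '%s\n' "$out"
}

# Example: join_langs eng fra deu spa
```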

OCR Mode

Control how Paperless-ngx handles documents that already contain text:

PAPERLESS_OCR_MODE: skip

Options:

  • skip: OCR only the pages that contain no text (fastest, recommended)
  • redo: OCR every page and replace any existing text layer
  • force: Rasterize the document and OCR it from scratch, discarding the original text
  • skip_noarchive: Like skip, but don’t create an archived version for documents that already contain text

Recommendation: Use skip for best performance. Most PDFs from digital sources already contain text.

OCR Performance Tuning

If OCR is slow on your server:

  1. Limit parallel tasks:
PAPERLESS_TASK_WORKERS: 1
PAPERLESS_THREADS_PER_WORKER: 1
  2. Do less OCR work (faster, but less text ends up in the index):
PAPERLESS_OCR_MODE: skip
PAPERLESS_OCR_PAGES: 1  # Only OCR the first page
  3. Disable Tika if you don’t need Office document parsing:
PAPERLESS_TIKA_ENABLED: 0

Then remove the gotenberg and tika services, along with the PAPERLESS_TIKA_ENDPOINT and PAPERLESS_TIKA_GOTENBERG_ENDPOINT settings, from docker-compose.yml.

Set Up Automatic Document Import

Paperless-ngx can automatically import documents from multiple sources.

1. Local Folder (Consume Directory)

The easiest method: drop files into the consume/ folder.

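Paperless-ngx watches this folder (using inotify by default) and automatically imports new files. If the folder lives on a network share where inotify events don’t arrive, set PAPERLESS_CONSUMER_POLLING to a number of seconds to poll instead.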

Create subdirectories for organization:

mkdir -p consume/receipts
mkdir -p consume/invoices
mkdir -p consume/personal

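Subdirectories are only scanned when PAPERLESS_CONSUMER_RECURSIVE is enabled, and files are tagged with their folder names only when PAPERLESS_CONSUMER_SUBDIRS_AS_TAGS is enabled as well. Both are off by default.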
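To preview which tags a file would pick up from its location, assuming PAPERLESS_CONSUMER_RECURSIVE and PAPERLESS_CONSUMER_SUBDIRS_AS_TAGS are both enabled, you can mirror the folder-to-tag mapping in a few lines. tags_for is a hypothetical helper that only illustrates the mapping; it does not talk to Paperless-ngx.

```shell
#!/bin/sh
# tags_for PATH: PATH is relative to consume/. Prints one tag per line,
# one for each directory level between consume/ and the file.
tags_for() {
  path=${1#./}                # tolerate a leading "./"
  dir=$(dirname "$path")
  if [ "$dir" = "." ]; then
    return 0                  # file sits directly in consume/: no tags
  fi
  printf '%s\n' "$dir" | tr '/' '\n'
}

# Example: tags_for receipts/2025/scan.pdf   prints "receipts" and "2025" on separate lines
```

Nested folders therefore produce several tags, which is worth keeping in mind before you build a deep directory tree on your scanner.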

2. Email Import

Paperless-ngx can fetch attachments from an IMAP mailbox. Mail is configured in the web interface (under Mail) rather than through environment variables: add a mail account, then a mail rule that points at a folder. For Gmail, the account settings look like this:

IMAP server: imap.gmail.com
IMAP port: 993
IMAP security: SSL
Username: your-address@gmail.com
Password: your-app-password

Then create a mail rule for that account with the folder set to Paperless.

How it works:

  1. Create a Gmail label/folder called “Paperless”
  2. Forward emails with documents to your Gmail
  3. Apply the “Paperless” label
  4. Paperless-ngx imports attachments automatically

Security tip: Use an app password instead of your real Gmail password.

3. Mobile Upload (Paperless Mobile App)

Install the Paperless Mobile app (iOS / Android):

  1. Configure your Paperless-ngx URL
  2. Generate an API token (Settings → My Profile → API Token)
  3. Scan documents with your phone camera
  4. Upload directly to Paperless-ngx

Perfect for receipts, business cards, and paper mail.

4. Network Scanner Integration

Most modern scanners support “scan to network folder.” Two approaches:

Option A: SMB Share

Mount your consume folder as a network share:

sudo apt install samba
sudo nano /etc/samba/smb.conf

Add:

[paperless]
path = /home/yourusername/paperless-ngx/consume
writable = yes
guest ok = yes
create mask = 0644

Restart Samba:

sudo systemctl restart smbd

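Configure your scanner to save to \\your-server-ip\paperless. Note that guest ok = yes lets any device on your network write to the consume folder; if your scanner supports credentials, create a Samba user and set guest ok = no instead.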

Option B: FTP Server

Some scanners only support FTP:

docker run -d --name paperless-ftp \
  -p 21:21 -p 21100-21110:21100-21110 \
  -v ~/paperless-ngx/consume:/home/vsftpd/scanner \
  -e FTP_USER=scanner \
  -e FTP_PASS=your-password \
  -e PASV_ADDRESS=your-server-ip \
  -e PASV_MIN_PORT=21100 \
  -e PASV_MAX_PORT=21110 \
  fauria/vsftpd

Configure your scanner to upload via FTP.

Organize Documents with Tags, Types, and Correspondents

Paperless-ngx uses tags, document types, and correspondents to organize your archive.

Tags

Tags are flexible labels you can assign to any document:

  • receipt
  • tax-2026
  • warranty
  • contract
  • medical

Create tags: Dashboard → Tags → Add Tag

Auto-tagging: Give a tag a matching rule, and Paperless-ngx applies the tag to new documents whose content matches.

Example rule:

  • Name: receipt
  • Matching algorithm: Any word
  • Match: receipt invoice (case-insensitive)

Document Types

Document types categorize broad document classes:

  • Invoice
  • Receipt
  • Contract
  • Letter
  • Report
  • Manual
  • Photo

Create types: Dashboard → Document Types → Add Type

Use document types for high-level filtering, tags for granular organization.

Correspondents

Correspondents identify who sent or issued the document:

  • Amazon
  • IRS
  • Electric Company
  • Bank of America
  • Dr. Smith

Create correspondents: Dashboard → Correspondents → Add Correspondent

Auto-correspondent matching: Paperless-ngx can automatically detect correspondents from document content using regex patterns.

Example:

  • Name: Amazon
  • Match: Amazon\.com|AMAZON\.COM|Order #\d+
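Before saving a pattern like this, it helps to try it against a few lines of real document text. The sketch below does that with grep as a rough stand-in: Paperless-ngx evaluates patterns with Python's regex engine, and grep -E has no \d, so the example writes [0-9]+ instead. matches_correspondent is a hypothetical helper.

```shell
#!/bin/sh
# matches_correspondent TEXT: exit status 0 if the Amazon pattern matches TEXT.
matches_correspondent() {
  # -E: extended regular expressions; -q: report only through the exit status.
  printf '%s\n' "$1" | grep -Eq 'Amazon\.com|AMAZON\.COM|Order #[0-9]+'
}

# Example:
#   matches_correspondent "Your Order #12345 has shipped" && echo "would match Amazon"
```

Note how broad the last alternative is: any document containing "Order #" followed by digits would be assigned to Amazon, so you may want to tighten it if other shops use the same wording.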

Custom Fields

Need more metadata? Create custom fields:

Dashboard → Custom Fields → Add Field

Examples:

  • Invoice Number (text)
  • Due Date (date)
  • Amount (monetary)
  • Project (select)

Custom fields appear in search filters and can be bulk-edited.

Search and Retrieve Documents

The whole point of going paperless is finding documents instantly.

Paperless-ngx indexes every word in your documents. Search from the main dashboard:

  • tax return 2025 — Find all tax documents from 2025
  • warranty laptop — Find laptop warranty papers
  • contract smith — Find contracts related to someone named Smith

Search is instant and searches OCR’d text, filenames, and metadata.

Advanced Search Filters

Combine filters for precise queries:

  • Tags: receipt AND tax-2026
  • Correspondent: Amazon
  • Document Type: Invoice
  • Date Range: 2025-01-01 to 2025-12-31
  • Custom Field: Project = “Website Redesign”

Saved views: Create saved searches for common queries:

  • “2026 Tax Documents”
  • “Unpaid Invoices”
  • “Medical Records — Last Year”

Access saved views from the sidebar for instant filtering.

Download and Export

Download documents in multiple formats:

  • Original — The file as it was uploaded
  • Archive — OCR’d PDF with searchable text layer
  • Thumbnail — Preview image
  • Metadata (JSON) — All document metadata

Bulk export: Select multiple documents and export as a ZIP file.

Backup Your Paperless-ngx Installation

Your Paperless-ngx archive contains irreplaceable documents. Back it up.

What to Back Up

  1. Documents: /usr/src/paperless/media (inside the container)
  2. Database: PostgreSQL data
  3. Configuration: docker-compose.yml and .env files
  4. Export folder: Your exported documents

Automated Backup Script

Create backup-paperless.sh:

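#!/bin/bash
# Stop on the first error so a failed export never looks like a good backup
set -euo pipefail

PROJECT_DIR="$HOME/paperless-ngx"
BACKUP_DIR="/mnt/backups/paperless"
DATE=$(date +%Y%m%d-%H%M%S)

# docker compose must run from the directory that holds docker-compose.yml
# (cron starts scripts in the home directory, not here)
cd "$PROJECT_DIR"

# Create backup directory
mkdir -p "$BACKUP_DIR"

# Export documents and metadata as a zip into the export folder
docker compose exec -T webserver \
  document_exporter ../export/ --zip

# Move the export into the backup directory
mv "$PROJECT_DIR"/export/*.zip "$BACKUP_DIR/documents-$DATE.zip"

# Backup database
docker compose exec -T db pg_dump -U paperless paperless | \
  gzip > "$BACKUP_DIR/database-$DATE.sql.gz"

# Delete backups older than 30 days
find "$BACKUP_DIR" -type f -mtime +30 -delete

echo "Backup completed: $BACKUP_DIR"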

Make it executable:

chmod +x backup-paperless.sh

Schedule daily backups:

crontab -e

Add:

0 3 * * * /home/yourusername/paperless-ngx/backup-paperless.sh

Runs every day at 3 AM.
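A backup that was never checked is only a hope. As a minimal sketch, the helper below confirms that a compressed database dump exists, is not empty, and passes gzip's integrity test; verify_backup is a hypothetical name, and the check does not prove the SQL inside can be restored.

```shell
#!/bin/sh
# verify_backup FILE.gz: exit status 0 if FILE.gz exists, is non-empty,
# and is a structurally valid gzip archive.
verify_backup() {
  if [ ! -s "$1" ]; then
    echo "missing or empty: $1" >&2
    return 1
  fi
  # gzip -t decompresses to nowhere and reports corruption through its exit status.
  if ! gzip -t "$1" 2>/dev/null; then
    echo "corrupt gzip: $1" >&2
    return 1
  fi
  return 0
}

# Example: verify_backup /mnt/backups/paperless/database-20260101-030000.sql.gz
```

For real confidence, restore a dump into a throwaway PostgreSQL container every few months.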

Off-site backups: Copy your backup folder to a remote VPS or cloud storage. See our complete backup guide for detailed strategies.

Secure Your Paperless-ngx Instance

Your documents are sensitive. Lock down your Paperless-ngx installation.

1. Use HTTPS with a Reverse Proxy

Never expose Paperless-ngx directly to the internet over HTTP.

Use a reverse proxy (Nginx Proxy Manager, Traefik, or Caddy) with Let’s Encrypt SSL certificates.

Quick setup with Nginx Proxy Manager:

See our Nginx Proxy Manager guide for step-by-step instructions.

Example Nginx config:

server {
    listen 443 ssl http2;
    server_name paperless.yourdomain.com;

    ssl_certificate /etc/letsencrypt/live/paperless.yourdomain.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/paperless.yourdomain.com/privkey.pem;

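    location / {
        proxy_pass http://localhost:8000;
        # WebSocket upgrade headers, needed for live status updates in the UI
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }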
}

2. Enable Two-Factor Authentication (2FA)

Paperless-ngx supports TOTP-based 2FA:

  1. Go to Settings → My Profile
  2. Click Enable Two-Factor Authentication
  3. Scan the QR code with your authenticator app (Aegis, Authy, Google Authenticator)
  4. Enter the 6-digit code to confirm

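2FA is enrolled per user, so ask every user to enable it from their own profile. Built-in 2FA requires a recent Paperless-ngx release; update first if the option is missing from My Profile.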

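3. Restrict Access by IP (Optional)

First, tell Paperless-ngx which hostname it is served under, so requests with any other Host header are rejected:

PAPERLESS_ALLOWED_HOSTS: paperless.yourdomain.com
PAPERLESS_CORS_ALLOWED_HOSTS: https://paperless.yourdomain.com

These settings filter by hostname, not by client IP, and setting PAPERLESS_URL already fills them in. To restrict by IP, add firewall rules that allow only your home/office range:

sudo ufw allow from 203.0.113.0/24 to any port 8000

Ports published by Docker bypass ufw by default, so this rule alone may not block anything. If a reverse proxy on the same host fronts Paperless-ngx, publish the port as "127.0.0.1:8000:8000" in docker-compose.yml and apply the IP restriction to ports 80 and 443 instead.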

4. Regular Security Updates

Keep Paperless-ngx updated:

cd ~/paperless-ngx
docker compose pull
docker compose up -d

Enable automatic security updates on your server:

sudo apt install unattended-upgrades
sudo dpkg-reconfigure -plow unattended-upgrades

See our VPS hardening guide for comprehensive security.

Advanced Configuration

Multi-User Setup

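Paperless-ngx supports multiple users. Access is controlled through users, groups, and per-object permissions rather than fixed roles; typical setups are:

  1. Admin (superuser): Full access to everything
  2. User: Can view and manage own documents
  3. Limited: View permissions only

Create additional users:

Users & Groups → Add User (under Settings or Administration, depending on your version)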

Share documents between users:

  1. Enable Permissions for a document
  2. Assign users or groups
  3. Set read/write permissions

Custom Workflows

Automate document processing with workflows (called consumption templates in early 2.x releases), which apply actions when a trigger fires.

Example workflow: Automatic invoice processing

  1. Create a tag: unpaid-invoice
  2. Create a saved view: “Unpaid Invoices” filtered by this tag
  3. Create a workflow:
    • Trigger: Consumption Started, with the filename filter *invoice*
    • Action: Assign tag unpaid-invoice
  4. When paid, remove the tag and add paid-invoice

Check the “Unpaid Invoices” view daily to see what needs payment.
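Workflow filename filters in Paperless-ngx accept wildcards such as *invoice* and ignore case. To check which of your existing filenames a filter would catch, the sketch below approximates that behavior with shell glob matching; filename_matches is a hypothetical helper, and shell globs are only an approximation of the matcher Paperless-ngx uses.

```shell
#!/bin/sh
# filename_matches NAME PATTERN: exit status 0 if NAME matches the wildcard PATTERN,
# ignoring case. The whole name must match, as with a workflow filename filter.
filename_matches() {
  # Lower-case both sides to get case-insensitive matching.
  name=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  pattern=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
  case "$name" in
    # Unquoted on purpose so the shell treats it as a glob pattern.
    $pattern) return 0 ;;
    *) return 1 ;;
  esac
}

# Example: list files in consume/ that the *invoice* filter would catch
#   for f in consume/*; do filename_matches "$(basename "$f")" '*invoice*' && echo "$f"; done
```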

API Integration

Paperless-ngx has a full REST API for automation:

Get API token: Settings → My Profile → API Token

Upload a document via API:

curl -X POST https://paperless.yourdomain.com/api/documents/post_document/ \
  -H "Authorization: Token your-api-token" \
  -F "document=@/path/to/file.pdf" \
  -F "title=My Document" \
  -F "tags=1" \
  -F "tags=2"

Search documents:

curl -X GET "https://paperless.yourdomain.com/api/documents/?query=tax" \
  -H "Authorization: Token your-api-token"

Full API documentation.
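For scripted uploads, the upload call above can be wrapped in a small function that takes any number of tag IDs. The upload endpoint expects the tags field once per tag, so the wrapper expands each ID into its own form field. This is a sketch under assumptions: upload_document is a hypothetical name, and CURL is a made-up override that lets you set CURL=echo to print the command instead of sending it.

```shell
#!/bin/sh
# upload_document BASE_URL TOKEN FILE TITLE [TAG_ID...]
# Posts FILE to the Paperless-ngx upload endpoint, sending one "tags" field per TAG_ID.
upload_document() {
  base_url=$1
  token=$2
  file=$3
  title=$4
  shift 4
  # Rewrite the remaining arguments (tag IDs) into repeated "-F tags=ID" pairs.
  n=$#
  while [ "$n" -gt 0 ]; do
    set -- "$@" -F "tags=$1"
    shift
    n=$((n - 1))
  done
  ${CURL:-curl} -fsS -X POST "$base_url/api/documents/post_document/" \
    -H "Authorization: Token $token" \
    -F "document=@$file" \
    -F "title=$title" \
    "$@"
}

# Dry run:  CURL=echo; upload_document https://paperless.yourdomain.com your-api-token /path/to/file.pdf "My Document" 1 2
```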

Use the API to integrate with:

  • n8n or Activepieces for workflow automation
  • Home Assistant for smart home integration
  • Custom scripts for bulk operations

Troubleshooting Common Issues

OCR Not Working

Symptom: Documents upload but text isn’t extracted.

Fix:

  1. Check that the OCR language is installed:
docker compose exec webserver tesseract --list-langs
  2. Ensure OCR mode is correct:
PAPERLESS_OCR_MODE: skip
  3. Test OCR manually (copy a test PDF into the container first):
docker compose cp test.pdf webserver:/tmp/test.pdf
docker compose exec webserver \
  ocrmypdf --language eng /tmp/test.pdf /tmp/output.pdf

Database Connection Errors

Symptom: could not connect to server: Connection refused

Fix:

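  1. Ensure the database is running:
docker compose ps
  2. Check database logs:
docker compose logs db
  3. Verify that POSTGRES_DB, POSTGRES_USER, and POSTGRES_PASSWORD on the db service match PAPERLESS_DBNAME, PAPERLESS_DBUSER, and PAPERLESS_DBPASS on the webserver service.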

High Memory Usage

Symptom: Server runs out of RAM during OCR.

Fix:

  1. Limit parallel workers:
PAPERLESS_TASK_WORKERS: 1
  2. Do less OCR work:
PAPERLESS_OCR_MODE: skip
PAPERLESS_OCR_PAGES: 1
  3. Increase server RAM (upgrade your VPS).

Slow Web Interface

Symptom: Dashboard loads slowly, search is sluggish.

Fix:

  1. Rebuild search index:
docker compose exec webserver document_index reindex
  2. Tune PostgreSQL. The official postgres image has no environment variables for these settings, so pass them as server flags:
# Add to the db service
command: postgres -c shared_buffers=256MB -c work_mem=16MB
  3. Enable Redis caching (already enabled in our config).

Performance and Resource Usage

Resource Requirements

Minimum (light usage, <1000 documents):

  • 2GB RAM
  • 2 CPU cores
  • 20GB storage

Recommended (active use, 5000+ documents):

  • 4GB RAM
  • 4 CPU cores
  • 50GB+ storage

Heavy usage (10,000+ documents, multiple users):

  • 8GB RAM
  • 6+ CPU cores
  • 100GB+ storage

Storage Planning

Document storage:

  • Paperless-ngx keeps two copies of each document:
    1. Original file (as uploaded)
    2. Archive file (OCR’d PDF with text layer)

Average sizes:

  • Scanned page (300 DPI): 200-500 KB
  • Office document: 50-200 KB
  • OCR’d PDF: +50% of original

Estimate: 1GB stores ~2000-5000 documents, depending on size and type.
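To turn these averages into a number for your own archive, the arithmetic can be scripted. The sketch below assumes the archive copy adds roughly 50% on top of the original, which is one reading of the figures above; estimate_storage_mb is a hypothetical helper and the result is a rough planning figure, not a measurement.

```shell
#!/bin/sh
# estimate_storage_mb DOCS PAGES_PER_DOC KB_PER_PAGE
# Prints the estimated media storage in MB: originals plus ~50% for archive copies.
estimate_storage_mb() {
  docs=$1
  pages=$2
  kb_per_page=$3
  originals_kb=$((docs * pages * kb_per_page))
  # Integer arithmetic: add half of the originals for the OCR'd archive versions.
  total_kb=$((originals_kb + originals_kb / 2))
  echo $((total_kb / 1024))
}

# Example: 5000 documents, 2 pages each, 350 KB per scanned page
estimate_storage_mb 5000 2 350   # prints 5126 (about 5 GB)
```

Add headroom for the database, the search index, and thumbnails, which this estimate leaves out.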

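PDF/A is already the default archive format. It is an archival standard, not a compression setting:

PAPERLESS_OCR_OUTPUT_TYPE: pdfa  # Default

To save space, skip the archive copy for documents that already contain text:

PAPERLESS_OCR_SKIP_ARCHIVE_FILE: with_text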

Performance Benchmarks

Import speeds (on 4GB/2-core VPS):

  • Simple PDF (text already present): ~5 seconds
  • Scanned page (OCR required): ~15-30 seconds
  • 10-page scanned document: ~2-5 minutes

Search performance:

  • 1000 documents: Instant (<100ms)
  • 10,000 documents: Fast (~200-500ms)
  • 50,000+ documents: Still fast (~1-2 seconds)

Full-text search runs on a dedicated search index stored in the data volume, not on PostgreSQL, and scales well up to 100,000+ documents.

Migrating from Paperless or Paperless-ng

Already running an older version? Migration is straightforward.

From Paperless-ng

Paperless-ngx is a direct continuation of Paperless-ng. Just update the image:

image: ghcr.io/paperless-ngx/paperless-ngx:latest

Back up first, then restart the stack:

docker compose down
docker compose up -d

Database migrations run automatically when the container starts, so your documents and database carry over without a manual migrate step.

From Original Paperless

  1. Backup everything first.

  2. Export documents:

docker exec paperless document_exporter /export
  3. Set up Paperless-ngx (this guide).

  4. Import documents:

Place exported files in the consume/ folder. Paperless-ngx reimports and OCRs them.

Note: Metadata (tags, correspondents) must be recreated. The export includes JSON metadata files you can use for reference.

Paperless-ngx vs. Alternatives

How does Paperless-ngx compare to other document management solutions?

| Feature | Paperless-ngx | Nextcloud | Docspell | Commercial DMS |
|---|---|---|---|---|
| OCR | ✅ Excellent (Tesseract + Tika) | ❌ No built-in | ✅ Good | ✅ Varies |
| Full-Text Search | ✅ Fast (built-in index) | ⚠️ Limited | ✅ Good | ✅ Excellent |
| Auto-Tagging | ✅ Advanced rules | ❌ No | ⚠️ Basic | ✅ AI-powered |
| Mobile App | ✅ Excellent | ✅ Excellent | ❌ No | ✅ Usually |
| Email Import | ✅ IMAP support | ⚠️ Via plugins | ✅ Built-in | ✅ Advanced |
| Multi-User | ✅ Yes | ✅ Excellent | ✅ Yes | ✅ Enterprise |
| API | ✅ Full REST API | ✅ WebDAV + REST | ✅ REST API | ✅ Usually |
| Cost | 🆓 Free | 🆓 Free | 🆓 Free | 💰 $$$$ |

Verdict: Paperless-ngx is the best self-hosted DMS for most users. It’s purpose-built for document management, while Nextcloud is a general file sync tool. Docspell is good but less mature.

Conclusion

Paperless-ngx transforms your chaotic pile of scanned documents into a searchable, organized digital archive. With automatic OCR, smart tagging, and powerful search, finding any document takes seconds instead of minutes.

Self-hosting keeps your sensitive documents private and gives you complete control over retention, backups, and access. You’re not dependent on a cloud service that could shut down, raise prices, or get acquired.

Next steps:

  1. Deploy Paperless-ngx using our docker-compose.yml
  2. Secure it with HTTPS and 2FA
  3. Start scanning your paper backlog
  4. Set up automatic imports from email and mobile
  5. Create tags and rules to organize automatically

Once you go paperless, you’ll wonder how you ever managed with physical filing cabinets.

Ready to host your own VPS? Check out Hetzner for rock-solid performance at unbeatable prices.

Looking for more self-hosted apps? Browse our app recommendations and complete self-hosting guide.

