docspell
A personal document management system that uses OCR, full-text search, and ML to automatically organize scanned documents, emails, and files.
Smart Download
v0.43.0 · 372.6 MB
A self-hosted smart document organizer that auto-tags your digital papers using machine learning.
Core Features
- Automatically extract metadata (correspondent, tags, dates) via ML (Stanford CoreNLP)
- Built-in OCR for scanned images and PDFs
- Full-text search and email integration
- Mobile-friendly SPA web UI plus Android client
- Custom fields, batch editing, multi-user support
What It Can't Do
- Requires familiarity with Docker
- OCR depends on Tesseract and ocrmypdf, and non-English documents may need extra language packs
- The first run downloads an ML model of about 1 GB
- Steeper learning curve than simpler alternatives such as Paperless-ngx
Use Cases
- Home document archiving (bills, contracts, medical records)
- Small office shared document library with email attachment auto-classification
- Personal knowledge management: digitize paper notes into a searchable archive
Detailed Introduction
Docspell is a self-hosted document management system (DMS) designed for home users, families, and small groups. It helps you organize scanned papers, emails, and other digital files by automatically extracting metadata such as correspondents, tags, and dates using machine learning (Stanford CoreNLP). It supports OCR for images, full-text search, email integration, custom fields, and a mobile-friendly web interface. The backend is written in Scala on a pure functional stack, and the frontend is written in Elm with Tailwind CSS. Docspell can be deployed via Docker, Debian packages, ZIP, Nix, or Helm.
Troubleshooting & FAQ (2)
Troubleshooting: Why does unoconv fail with 'Failed to connect to soffice.bin in 6 seconds' when converting XLSX to PDF?
This error occurs when unoconv times out while connecting to a LibreOffice listener, often because stale soffice.bin processes have accumulated. Workaround: kill all existing soffice processes, then start unoconv in permanent listener mode before processing any documents. Run: pkill -f soffice.bin; unoconv --listener &. The two commands are joined with ; rather than && because pkill exits non-zero when it finds no matching process, which would otherwise prevent the listener from starting. Then retry the conversion. For persistent environments such as Docker, consider moving to unoserver as a more reliable replacement for unoconv.
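The workaround can also be scripted so that a failed conversion triggers a listener restart and one more attempt. The following is a minimal sketch, assuming pkill and unoconv are on the PATH. The function names, the 5-second grace period, and the injectable run/spawn parameters are illustrative choices, not part of unoconv or Docspell.

```python
import subprocess
import time


def restart_listener(run=subprocess.run, spawn=subprocess.Popen, wait_s=5.0):
    """Kill stale soffice.bin processes, then start a fresh unoconv listener."""
    # pkill exits non-zero when nothing matched; that is acceptable here.
    run(["pkill", "-f", "soffice.bin"], check=False)
    # The listener keeps running in the background, like `unoconv --listener &`.
    listener = spawn(["unoconv", "--listener"])
    time.sleep(wait_s)  # assumed grace period for LibreOffice to come up
    return listener


def convert_to_pdf(path, run=subprocess.run):
    """Convert one file to PDF; returns True when unoconv exits with 0."""
    return run(["unoconv", "-f", "pdf", path], check=False).returncode == 0


def convert_with_retry(path, attempts=2, convert=convert_to_pdf,
                       restart=restart_listener):
    """Try the conversion; after a failure, restart the listener and retry."""
    for attempt in range(attempts):
        if convert(path):
            return True
        if attempt + 1 < attempts:
            restart()
    return False
```

Calling convert_with_retry("report.xlsx") restarts the listener at most once with the default settings. The run, spawn, convert, and restart parameters exist only so the logic can be exercised without LibreOffice installed.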
How-to: How to pass custom metadata when uploading documents to Mayan EDMS?
The upload API currently does not accept arbitrary custom metadata. As a workaround, encode the metadata in the document filename (e.g., using an absolute path with embedded info) during upload. In your addon or job-done hook, extract and parse the filename to retrieve the original metadata. Note that filenames may not be unique and could be exposed in the DMS. A feature request to support a dedicated metadata field exists in issue #2334.
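One way to apply the filename workaround is to pack key/value pairs into the name before upload and unpack them again in the hook. This is a sketch only: the ~meta~ separator, the base64url-encoded JSON payload, and both function names are assumptions chosen for illustration, not an established convention of the DMS.

```python
import base64
import json
from pathlib import PurePosixPath

# Hypothetical separator; "~" never occurs in base64url output,
# so the last occurrence always marks the start of the payload.
MARKER = "~meta~"


def encode_filename(original_name, metadata):
    """Return a filename that carries `metadata` between stem and extension."""
    p = PurePosixPath(original_name)
    payload = json.dumps(metadata, sort_keys=True, separators=(",", ":"))
    token = base64.urlsafe_b64encode(payload.encode("utf-8")).decode("ascii")
    return f"{p.stem}{MARKER}{token.rstrip('=')}{p.suffix}"


def decode_filename(uploaded_name):
    """Return (original filename, metadata dict); metadata is {} if absent."""
    p = PurePosixPath(uploaded_name)
    stem, sep, token = p.stem.rpartition(MARKER)
    if not sep:
        return p.name, {}
    padded = token + "=" * (-len(token) % 4)  # restore stripped padding
    metadata = json.loads(base64.urlsafe_b64decode(padded))
    return f"{stem}{p.suffix}", metadata
```

For example, encode_filename("invoice.pdf", {"batch": 7}) produces a name starting with invoice~meta~ and ending in .pdf, and decode_filename reverses it. Base64 is an encoding, not encryption, so the caveat about filenames being exposed still applies, and filesystem name-length limits keep the payload small.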
Getting Started
Install the software
Step 1: Clone the Docker repo: git clone https://github.com/docspell/docker docspell-docker
Step 2: Enter docker-compose directory and start: cd docspell-docker/docker-compose && docker-compose up -d
Step 3: Open http://localhost:7880, sign up, and start importing documents
Checksum not available
This project has not published a SHA-256 checksum on its GitHub Release page
Download directly from GitHub Releases and verify file integrity yourself
All SHA-256 checksums on this platform are extracted from the project's official GitHub Release page, without any modification. You can independently verify them on the GitHub Releases page.
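Because there is no published checksum to compare against, verifying integrity yourself mostly comes down to computing a digest after download and comparing it with a value you already trust, for example one recorded from an earlier download of the same release. A minimal sketch using Python's standard hashlib module; the function names are placeholders.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Chunked reads keep memory use flat for a download of several hundred MB.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Compare a file's digest with an expected value, ignoring case and whitespace."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

A matching digest only shows that two copies are identical. It says nothing about authenticity unless the expected value came from a trusted source.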
Open Source Transparency
View GitHub Source
Uninstall Info
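Stop the containers by running docker-compose down from the docspell-docker/docker-compose directory, then remove the directory with rm -rf docspell-docker. Clean up persistent volumes manually if needed, for example with docker volume prune; note that this command is not scoped to Docspell and acts on unused volumes across the whole host.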
No Extra Dependencies
With the Docker setup described above, no additional runtime is required beyond Docker and Docker Compose.
Similar Projects
Paperless-ngx
An open-source document management system that turns physical papers into a searchable, organized digital archive. Self-hosted, OCR-powered, and built to eliminate paper clutter forever.
copyparty
copyparty turns any device into a file server with resumable uploads/downloads using any web browser. Supports HTTP, WebDAV, SFTP, FTP, TFTP, SMB. Only requires Python (2 or 3).
Nextcloud Server
Nextcloud Server is a free, self-hosted productivity platform that puts you in control of your files, contacts, calendars, and communication.