OpenSource-Hub

docspell

2.2k stars · File Management

A personal document management system that uses OCR, full-text search, and ML to automatically organize scanned documents, emails, and files.

A self-hosted smart document organizer that auto-tags your digital papers using machine learning.

Core Features

  • Automatically extract metadata (correspondent, tags, dates) via ML (Stanford CoreNLP)
  • Built-in OCR for scanned images and PDFs
  • Full-text search and email integration
  • Mobile-friendly SPA web UI plus Android client
  • Custom fields, batch editing, multi-user support

What It Can't Do

  • Requires Docker familiarity
  • OCR depends on Tesseract and ocrmypdf (non-English documents may need extra language packs)
  • First run downloads a ~1GB ML model
  • Steeper learning curve than simpler alternatives like Paperless-ngx

Use Cases

  • Home document archiving (bills, contracts, medical records)
  • Small office shared document library with email attachment auto-classification
  • Personal knowledge management – digitize paper notes into searchable archive

Detailed Introduction

Docspell is a self-hosted document management system (DMS) designed for home users, families, and small groups. It helps you organize scanned papers, emails, and other digital files by automatically extracting metadata such as correspondents, tags, and dates using machine learning (Stanford CoreNLP). It supports OCR for images, full-text search, email integration, custom fields, and a mobile-friendly web interface. The backend is written in Scala (pure functional stack) and the frontend in Elm with Tailwind CSS. Docspell can be deployed via Docker, Debian packages, ZIP, Nix, or Helm.

Troubleshooting & FAQ (2)

Troubleshooting
Why does unoconv fail with 'Failed to connect to soffice.bin in 6 seconds' when converting XLSX to PDF?

This error occurs because unoconv times out while trying to connect to a LibreOffice listener, often because of stale or accumulated soffice.bin processes. Workaround: kill all existing soffice processes, then start unoconv in permanent listener mode. Run: pkill -f soffice.bin; unoconv --listener &. The semicolon (rather than &&) makes sure the listener still starts when pkill finds no process to kill and exits with a non-zero status. Make sure the listener is running before any processing begins, then retry your conversion. For persistent environments (like Docker), consider moving to unoserver as a more reliable replacement for unoconv.

GitHub Issue #3293
How-to
How to pass custom metadata when uploading documents to Mayan EDMS?

The upload API currently does not accept arbitrary custom metadata. As a workaround, encode the metadata in the document filename (e.g., using an absolute path with embedded info) during upload. In your addon or job-done hook, extract and parse the filename to retrieve the original metadata. Note that filenames may not be unique and could be exposed in the DMS. A feature request to support a dedicated metadata field exists in issue #2334.

GitHub Issue #2334
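
The filename workaround above can be sketched as a pair of helper functions: one that packs metadata into the name before upload, and one that an addon or job-done hook could use to unpack it. This is only an illustration under stated assumptions. The `~meta~` marker, the function names, and the base64-encoded JSON layout are hypothetical choices made for this example; they are not a convention defined by the upload API.

```python
import base64
import json
from pathlib import PurePosixPath
from typing import Dict, Tuple

# Hypothetical marker. "~" never appears in URL-safe base64 output,
# so the last marker in a name always belongs to the encoded payload.
MARKER = "~meta~"


def encode_filename(original_name: str, metadata: Dict) -> str:
    """Return a filename that carries the metadata as a base64 JSON token."""
    path = PurePosixPath(original_name)
    payload = json.dumps(metadata, sort_keys=True, separators=(",", ":"))
    token = base64.urlsafe_b64encode(payload.encode("utf-8")).decode("ascii")
    # Padding is stripped to keep the name short; decode_filename restores it.
    return f"{path.stem}{MARKER}{token.rstrip('=')}{path.suffix}"


def decode_filename(uploaded_name: str) -> Tuple[str, Dict]:
    """Split an uploaded filename into (original name, metadata)."""
    path = PurePosixPath(uploaded_name)
    stem, marker, token = path.stem.rpartition(MARKER)
    if not marker:
        # No embedded metadata: return the name unchanged.
        return uploaded_name, {}
    padded = token + "=" * (-len(token) % 4)
    metadata = json.loads(base64.urlsafe_b64decode(padded))
    return f"{stem}{path.suffix}", metadata
```

Encoding the payload keeps separators and special characters out of the visible name, but it does not hide the metadata, and long payloads can run into filesystem filename limits. The caveats from the answer above still apply: names may collide and remain visible in the DMS.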

Tags

document-management, ocr, fulltext-search, machine-learning, self-hosted, scala, elm

Getting Started

  1. Clone the Docker repo: git clone https://github.com/docspell/docker docspell-docker
  2. Enter the docker-compose directory and start the stack: cd docspell-docker/docker-compose && docker-compose up -d
  3. Open http://localhost:7880, sign up, and start importing documents

File Integrity

Checksum not available

This project has not published a SHA-256 checksum on its GitHub Release page. Download directly from GitHub Releases and verify file integrity yourself.

All SHA-256 checksums on this platform are extracted from the project's official GitHub Release page, without any modification. You can independently verify them on the GitHub Releases page.
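
Because no official checksum is published for this project, a locally computed digest can only be compared against a value you obtained through a channel you trust, or against a digest you recorded from an earlier download. A minimal sketch of computing and comparing a SHA-256 digest with Python's standard library follows; the function names are illustrative.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the lowercase hex SHA-256 digest of the file at path."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large downloads do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: str, expected_hex: str) -> bool:
    """Compare the file's digest with an expected hex string."""
    # Surrounding whitespace and letter case are ignored in the expected value.
    return sha256_of_file(path) == expected_hex.strip().lower()
```

Calling `verify` with the path of the downloaded file and the expected digest returns `True` only when the two match exactly; any mismatch means the file differs from the one the digest was computed for.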

Environment Guide

Uninstall Info

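Stop the containers from the docspell-docker/docker-compose directory: docker-compose down. Then remove the cloned directory: rm -rf docspell-docker. Clean up persistent volumes manually if needed (docker volume prune); note that this command affects unused volumes from every project on the host, not only Docspell's.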

Dependencies

The Docker-based install described under Getting Started requires Docker and docker-compose on the host.

Project Info
License: AGPL-3.0-or-later
Last Updated: 2026-06-26 22:40:09
GitHub Repository · Official Website
