OpenSource-Hub
Aleph

2.4k stars · Developer Tools

Aleph indexes large document collections and structured data, enabling cross-referencing of entities like people and companies against watchlists. Designed for investigative journalism.

No installer available yet — head to the source repository

Powerful document indexing and entity cross-referencing tool for investigative research, now sunsetting.

Core Features

  • Index PDF, Word, HTML, CSV, XLS, SQL sources
  • Automatic entity extraction and cross-referencing
  • Full-text search with advanced filters
  • Multi-user collaboration and access control
  • RESTful API for integration

What It Can't Do

  • The project is sunsetting: no updates after December 2025, and security patches will cease.
  • New users should consider Aleph Pro (SaaS).
  • For self-hosting, use the main branch, not develop.

Use Cases

  • Investigative journalists analyzing leaked documents
  • NGOs mapping corporate ownership networks
  • Researchers extracting entity relationships from historical archives

Detailed Introduction

Aleph is an open-source platform that indexes vast collections of documents (PDF, Word, HTML) and structured data (CSV, XLS, SQL) for easy search and browsing. Built primarily for investigative reporting, it enables users to cross-reference mentions of people, companies, and other entities against watchlists from prior research or public datasets. The project is now sunsetting and will be replaced by Aleph Pro, a fully rewritten SaaS platform. Legacy Aleph will receive no further updates after December 2025, but the codebase remains available under an open-source license.

Troubleshooting & FAQ (2)

Troubleshooting
How to fix 'SQLite objects created in a thread can only be used in that same thread' error in Aleph ingest-file when processing PDFs?

Set the ALEPH_DATABASE_URI and FTM_DATABASE_URI environment variables in your Aleph configuration, or set TAGS_DATABASE_URI directly. If these variables are left commented out in aleph.env.tmpl, the tags database URI falls back to sqlite:///, which cannot handle multi-threaded writes. Uncomment them and set them to your main database connection string (for example, postgresql://aleph:aleph@aleph-db/aleph for the default Postgres setup). This resolves the threading error reported for ingest-file versions 3.22.0 and 4.0.0.

GitHub Issue #4002
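
The fallback described in this answer can be checked before starting the ingest workers. The following is a minimal sketch, not Aleph's own code: the helper names and the lookup order (TAGS_DATABASE_URI first, then FTM_DATABASE_URI, then ALEPH_DATABASE_URI) are assumptions made for illustration, and only the variable names and the sqlite:/// fallback come from the text above.

```python
import os
from typing import Mapping

# Fallback value named in the answer above.
SQLITE_FALLBACK = "sqlite:///"

# Hypothetical lookup order, assumed for this sketch only.
CANDIDATES = ("TAGS_DATABASE_URI", "FTM_DATABASE_URI", "ALEPH_DATABASE_URI")


def resolve_tags_uri(env: Mapping[str, str]) -> str:
    """Return the first non-empty candidate URI, else the SQLite fallback."""
    for name in CANDIDATES:
        value = env.get(name, "").strip()
        if value:
            return value
    return SQLITE_FALLBACK


def uses_sqlite(uri: str) -> bool:
    """True when the URI points at SQLite, the case that triggers the error."""
    return uri.lower().startswith("sqlite")


if __name__ == "__main__":
    uri = resolve_tags_uri(os.environ)
    if uses_sqlite(uri):
        print("Tags database would use SQLite; set one of:", ", ".join(CANDIDATES))
    else:
        print("Tags database URI:", uri)
```

Running the script inside the ingest-file container, with the same environment the worker sees, shows whether the SQLite fallback would still apply.
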
Troubleshooting
How to fix 'DELETE statement expected to delete 1 row(s); Only 2 were matched' error during OAuth login in Aleph?

This error occurs when the role_membership table contains duplicate rows, which causes a row-count mismatch in SQLAlchemy during group sync. To fix it:

  1. Identify the duplicates: SELECT group_id, member_id, COUNT(*) FROM role_membership GROUP BY group_id, member_id HAVING COUNT(*) > 1;
  2. Remove the extra rows manually.
  3. Prevent recurrence by adding a unique constraint: ALTER TABLE role_membership ADD UNIQUE (group_id, member_id); alternatively, create a database migration that adds UniqueConstraint('group_id', 'member_id').

After these steps, OAuth callbacks will work correctly.

GitHub Issue #3811
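
The three steps in this answer can be rehearsed on a throwaway database before touching production. The sketch below uses Python's built-in sqlite3 module with an in-memory table that has only the two columns named in the query; the real table lives in Postgres and may have more columns. The cleanup statement relies on SQLite's rowid, and the unique index stands in for the ALTER TABLE statement, so both are assumptions of this demo and would differ on Postgres. The function and index names are hypothetical.

```python
import sqlite3

# Duplicate-detection query taken from the answer above.
FIND_DUPLICATES = """
SELECT group_id, member_id, COUNT(*)
FROM role_membership
GROUP BY group_id, member_id
HAVING COUNT(*) > 1
"""


def find_duplicates(conn: sqlite3.Connection) -> list:
    """Return (group_id, member_id, count) for every duplicated pair."""
    return conn.execute(FIND_DUPLICATES).fetchall()


def remove_duplicates(conn: sqlite3.Connection) -> int:
    """Keep one row per pair; rowid is SQLite-specific (demo only)."""
    cur = conn.execute(
        "DELETE FROM role_membership WHERE rowid NOT IN ("
        "SELECT MIN(rowid) FROM role_membership GROUP BY group_id, member_id)"
    )
    return cur.rowcount


def add_unique_guard(conn: sqlite3.Connection) -> None:
    """SQLite stand-in for the unique constraint; index name is hypothetical."""
    conn.execute(
        "CREATE UNIQUE INDEX role_membership_pair_uq "
        "ON role_membership (group_id, member_id)"
    )


def demo() -> tuple:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE role_membership (group_id INTEGER, member_id INTEGER)")
    conn.executemany(
        "INSERT INTO role_membership VALUES (?, ?)", [(1, 10), (1, 10), (2, 20)]
    )
    duplicates = find_duplicates(conn)
    removed = remove_duplicates(conn)
    add_unique_guard(conn)
    try:
        conn.execute("INSERT INTO role_membership VALUES (1, 10)")
        blocked = False
    except sqlite3.IntegrityError:
        blocked = True  # the guard rejects a repeated pair
    conn.close()
    return duplicates, removed, blocked
```

Calling demo() exercises detection, cleanup, and the guard in sequence. Back up the real table before running any equivalent DELETE against it.
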

Tags

investigative journalism, document indexing, entity cross-referencing, data search, open source

Getting Started

  1. Set up a Docker environment as described at docs.aleph.occrp.org.
  2. Clone the repository and run docker-compose up -d.
  3. Open http://localhost:8080 and create an admin account.
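
After docker-compose up -d returns, the web interface may need some time before it answers on port 8080. The following is a small sketch of a readiness check using only the Python standard library; the helper names, retry count, and delay are illustrative assumptions and are not part of Aleph.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_ok(url: str, timeout: float = 5.0) -> bool:
    """True if the URL answers with a non-error HTTP status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False


def wait_until_up(
    url: str,
    attempts: int = 30,
    delay: float = 2.0,
    probe: Callable[[str], bool] = http_ok,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll the URL until the probe succeeds or attempts run out."""
    for attempt in range(attempts):
        if probe(url):
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False


if __name__ == "__main__":
    if wait_until_up("http://localhost:8080"):
        print("Aleph UI is reachable; continue with account creation.")
    else:
        print("No response yet; check the container logs.")
```

The probe and sleep parameters are injectable so the retry loop can be exercised without a running stack.
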

File Integrity

Checksum not available

This project has not published a SHA-256 checksum on its GitHub Release page

Download directly from GitHub Releases and verify file integrity yourself
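
One way to verify a downloaded file yourself is to compute its SHA-256 digest locally and compare it with a value from a source you trust. The sketch below uses Python's hashlib. Because no checksum is published for this project, the expected digest is a placeholder you would have to supply, and the function names are illustrative.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream the file in chunks and return its SHA-256 hex digest."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path: str, expected_hex: str) -> bool:
    """Compare the file's digest with an expected value, ignoring case."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

Reading in chunks keeps memory use flat for large archives. A match only shows that the file equals whatever produced the expected digest, so the value itself must come from a trusted channel.
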

All SHA-256 checksums on this platform are extracted from the project's official GitHub Release page, without any modification. You can independently verify them on the GitHub Releases page.

Environment Guide

Uninstall Info

Back up your data first, then stop the Docker containers and remove their volumes: docker-compose down -v.

Dependencies

Requires Docker and docker-compose, as used in the install steps.

Project Info
License: AGPL-3.0
Last Updated: 2026-06-26 22:05:19
Links: GitHub Repository, Official Website
