Introduction

Aleph is a tool built by Friedrich Lindenberg during his ICFJ Knight Fellowship. It indexes large volumes of text documents (PDF, Word, HTML) and tabular data (CSV, XLS, SQL) so they can be browsed and searched easily. Investigative reporting is its primary use case: it cross-references mentions of well-known people and companies against watchlists, which are built from prior research or public data sets. The tool was first developed for ANCIR as part of Grano, a reporting tool for investigating the connections between public and private officials.
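
To make the cross-referencing idea concrete, here is a minimal Python sketch of matching watchlist names against document text. It is an illustration only and not Aleph's actual implementation: the function names (normalize, cross_reference), the sample documents and the sample watchlist are all hypothetical, and matching is reduced to a naive normalized phrase lookup.

```python
import re


def normalize(text):
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    no_punct = re.sub(r"[^\w\s]", " ", text.lower())
    return re.sub(r"\s+", " ", no_punct).strip()


def cross_reference(documents, watchlist):
    """Return {watchlist name: [ids of documents that mention it]}.

    documents: dict mapping a document id to its extracted text.
    watchlist: iterable of person or company names.
    Both are hypothetical inputs chosen for this sketch.
    """
    hits = {}
    for doc_id, text in documents.items():
        # Pad with spaces so names only match on whole-word boundaries.
        padded = f" {normalize(text)} "
        for name in watchlist:
            needle = normalize(name)
            if needle and f" {needle} " in padded:
                hits.setdefault(name, []).append(doc_id)
    return hits


# Hypothetical sample data, not taken from any real dataset.
documents = {
    "filing-001.pdf": "The contract was signed by Jane Doe on behalf of Acme Holdings Ltd.",
    "report-002.html": "No listed parties appear in this report.",
}
watchlist = ["Jane Doe", "Acme Holdings Ltd", "John Smith"]

matches = cross_reference(documents, watchlist)
for name, doc_ids in matches.items():
    print(name, "->", doc_ids)
```

A production system would likely need more than this, for example handling of name variants, transliteration and fuzzy matching, but the sketch shows the basic shape of the lookup: normalize both sides, then record which documents mention which watchlist entries.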

Following his ICFJ Knight Fellowship, Lindenberg used Aleph to power a data search feature for OCCRP's Investigative Dashboard. This tool lets reporters search more than 93 million documents and datasets from previous OCCRP investigations as well as official sources and other scraped data. The information is cross-referenced with watchlists based on OCCRP research and international sanctions lists.

The OpenOil platform also uses Aleph to search more than 2 million corporate filings related to the oil, gas and mining industries. The site indexes the full text of contracts, company disclosures, news articles and government reports and makes it searchable, so reporters can check documents from a variety of sources at the same time.

Link: https://github.com/pudo/aleph/
Story About the Tool: "OCCRP Launches New Search Engine for Investigative Journalists"
Contributors
ICFJ Knight Fellow




International Center for Journalists