Follow the Money

Underneath all of the Aleph tools is Follow the Money, a data model for the most common entities and link types of anti-corruption reporting. Aleph tools browse, import or export this data.

The Follow the Money data model is designed to organise concepts which arise in money laundering and corruption investigations in a way that is useful to investigative journalists.

Follow the Money is the name of both the data model and a command-line tool used to generate and process data formatted using this model. Aleph is an interactive viewer on top of a search index of Follow the Money entities.

Schema

The data model of Follow the Money consists of so-called schemata, i.e. object types. They exist as an inheritance hierarchy, rooted in things and intervals. You can also think of these as entities and events.
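The inheritance idea can be sketched in a few lines of Python. This is only an illustration: the parent links below are a simplified, single-parent subset assumed for the example (the real definitions live in the schema files and may differ), and `ancestors` and `is_a` are helper names invented here, not part of the Follow the Money library.

```python
# Simplified, illustrative parent links (assumed, not copied from the real schema files).
PARENTS = {
    "Thing": None,
    "Interval": None,
    "Person": "Thing",
    "Company": "Thing",
    "Document": "Thing",
    "Email": "Document",
    "Membership": "Interval",
}


def ancestors(schema):
    """Return the chain of schemata from `schema` up to its root."""
    chain = []
    while schema is not None:
        chain.append(schema)
        schema = PARENTS[schema]
    return chain


def is_a(schema, ancestor):
    """True if `schema` is `ancestor` itself or descends from it."""
    return ancestor in ancestors(schema)


print(ancestors("Email"))
print(is_a("Membership", "Thing"))
```

Under these assumed links, every schema resolves to exactly one of the two roots, which is what lets a tool decide whether to treat an entity as a thing or as an interval.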

Things

Things describe most of the real-world objects represented in Aleph and Follow the Money. This includes People and Companies, Assets and Court Cases. It also includes a large set of sub-types of Document which, while normally generated by the file ingestion service of the Aleph server, can also be created through entity mappings.

Things (or nodes) include People, Companies, Documents and similar entity types.

Intervals

Intervals are business interests, court cases, sanctions and transactions (and their descendants). Intervals tend to be useful for linking two entities together, possibly over a specific time period.

Intervals are temporary in nature, and often define a relationship between two or more things, e.g. a membership in an organisation, or a family association.
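As a rough sketch of how an interval ties two things together, the Python snippet below models a membership using the entity layout described under "Data format". The IDs, the organisation name, and the property names `member`, `organization`, `startDate` and `endDate` are assumptions made for illustration; `linked_things` is a hypothetical helper, not a library function.

```python
# Two things, keyed by made-up IDs.
person = {"id": "person-1", "schema": "Person",
          "properties": {"name": ["John Doe"]}}
org = {"id": "org-1", "schema": "Organization",
       "properties": {"name": ["Example Club"]}}

# An interval linking them over a time period (property names are assumed).
membership = {
    "id": "membership-1",
    "schema": "Membership",
    "properties": {
        "member": [person["id"]],
        "organization": [org["id"]],
        "startDate": ["2009"],
        "endDate": ["2012-06"],
    },
}


def linked_things(interval, things, link_properties):
    """Resolve the entity references held in the given interval properties."""
    by_id = {thing["id"]: thing for thing in things}
    found = []
    for prop in link_properties:
        for ref in interval["properties"].get(prop, []):
            if ref in by_id:
                found.append(by_id[ref])
    return found


ends = linked_things(membership, [person, org], ["member", "organization"])
print([end["schema"] for end in ends])
```

The interval carries no names of its own here; it only holds references and dates, so the two things can be updated independently of the relationship between them.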

Data format

Instances of Follow the Money entities are usually expressed as snippets of JSON, with three standard fields: an id, a specification of the schema to be used, and a set of properties. All properties are multi-valued.

    {
      "id": "1b38214f88d139897bbd13eabde464043d84bbf9",
      "schema": "Person",
      "properties": {
        "name": ["John Doe"],
        "nationality": ["us"],
        "birthDate": ["1982"]
      }
    }

Such entities are either generated by mapping structured data, or through the file ingestion process of Aleph.
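The three-field layout can be produced with plain Python. The sketch below is not the Follow the Money library API: `make_entity` is a hypothetical helper, and deriving the ID as a SHA1 digest of some source key parts is an assumption chosen so that the same input always yields the same ID. It does show the one rule stated above, that every property holds a list of values.

```python
import hashlib
import json


def make_entity(schema, key_parts, properties):
    """Build an entity dict with id, schema and multi-valued properties."""
    # Assumed ID scheme: SHA1 over the joined key parts (illustrative only).
    digest = hashlib.sha1("|".join(key_parts).encode("utf-8")).hexdigest()
    multi = {}
    for name, value in properties.items():
        values = value if isinstance(value, list) else [value]
        # Drop empty values and duplicates, keeping first-seen order.
        cleaned = []
        for item in values:
            if item is not None and item != "" and item not in cleaned:
                cleaned.append(item)
        if cleaned:
            multi[name] = cleaned
    return {"id": digest, "schema": schema, "properties": multi}


entity = make_entity(
    "Person",
    ["people.csv", "row-17"],  # hypothetical source file and row key
    {"name": "John Doe", "nationality": ["us"], "birthDate": "1982"},
)
print(json.dumps(entity, indent=2))
```

Wrapping single values into lists at construction time means downstream code never has to check whether a property is a scalar or a list.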

Property types

Property types constrain the possible values of a field, e.g. by limiting what countries are valid, what a phone number and email address look like, etc. They are also used to compare how similar values of the same type are: for example, the algorithm for comparing two names is not the same as the one for comparing two phone numbers.

| Name | Description | Strong |
| --- | --- | --- |
| string | General purpose string field | no |
| text | Long-form text field. Data stored in this type will be processed for named entity recognition (NER) in Aleph. | no |
| name | Name of a company or person | yes |
| address | Street address, formatted as simple text | yes |
| checksum | SHA1 of an underlying data asset (e.g. the source file for an Email entity) | yes |
| country | Country code, ISO 3166 with some additions | yes |
| date | ISO 8601 date prefix, e.g. 2009, 2009-03-01, or 2009-03-01T22:34:00 | yes |
| email | E-mail address, must contain an @ | yes |
| entity | Reference to another entity by its ID | yes |
| iban | International Bank Account Number (IBAN), ISO 13616 | yes |
| identifier | Generic identifier, e.g. a company or tax registration number | yes |
| ip | Internet host address, v4 or v6 | yes |
| json | Nested JSON object - rarely used | no |
| language | ISO 639 three-letter code | no |
| mimetype | Internet MIME type, e.g. text/plain | no |
| number | Numeric value, usually converted to floating point number | no |
| phone | Phone number, including an international dialling prefix | yes |
| url | Uniform Resource Locator, internet address path | yes |
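To make the two roles of a property type concrete (constraining values and comparing them), here is a minimal Python sketch covering only a few of the types listed. The function names `is_valid`, `normalise` and `same` are hypothetical, the checks are deliberately crude, and the comparison logic in the actual library is likely more sophisticated than the exact-match-after-normalisation approach assumed here.

```python
import re

# Matches date prefixes such as 2009, 2009-03-01 or 2009-03-01T22:34:00.
DATE_PREFIX = re.compile(
    r"^\d{4}(-\d{2}(-\d{2}(T\d{2}(:\d{2}(:\d{2})?)?)?)?)?$"
)


def is_valid(type_name, value):
    """Check a value against a simplified rule for its property type."""
    if type_name == "email":
        return "@" in value
    if type_name == "date":
        return DATE_PREFIX.match(value) is not None
    return True  # other types are not checked in this sketch


def normalise(type_name, value):
    """Reduce a value to a comparable form; the rule depends on the type."""
    if type_name == "phone":
        return re.sub(r"\D", "", value)  # keep digits only
    if type_name == "name":
        return " ".join(value.lower().split())  # fold case and whitespace
    return value


def same(type_name, left, right):
    """Type-aware equality: compare values after normalisation."""
    return normalise(type_name, left) == normalise(type_name, right)


print(is_valid("date", "2009-03"), is_valid("email", "nobody"))
print(same("phone", "+49 30 1234-567", "+49301234567"))
```

The same pair of strings can count as equal under one type and different under another, which is why comparison is attached to the type rather than done on raw strings.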

Schema definitions

The individual entity types (schemata) within the Follow the Money data model are defined using a set of YAML files. This model can be extended or modified for other deployments of Aleph to some extent, but this process is not very well established.

Follow the Money is reminiscent of a linked data ontology, and indeed it is regularly published in Turtle and RDF/XML format for use in RDF-based applications. See the ftm export-rdf documentation for help on how to export Follow the Money data to linked data.

Proposing changes

Any user of Aleph should feel free to propose changes or extensions to the Follow the Money data model by submitting a GitHub issue. Proposed changes should embody the design sense of the data model, which prioritises practicality over correctness; understandability over standards compliance.