The Follow the Money data model is designed to organise concepts which arise in money laundering and corruption investigations in a way that is useful to investigative journalists.
Follow the Money is the name of both the data model and a command-line tool used to generate and process data formatted using this model. Aleph is an interactive viewer on top of a search index of Follow the Money entities.
The data model of Follow the Money consists of so-called schemata, i.e. object types. They exist as an inheritance hierarchy, rooted in things and intervals. You can also think of these as entities and events.
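The inheritance idea can be sketched in a few lines of Python. This is only an illustration: the `PARENTS` table, the direct parent links in it, and the helper names `ancestors` and `is_a` are assumptions made for this example, not the library's API or the actual hierarchy.

```python
# Hypothetical, heavily simplified hierarchy. The parent links below are
# assumptions for illustration; the real model may place intermediate
# schemata between these types and the two roots.
PARENTS = {
    "Thing": [],
    "Interval": [],
    "Person": ["Thing"],
    "Company": ["Thing"],
    "Sanction": ["Interval"],
}


def ancestors(schema):
    """Return every schema the given schema inherits from, directly or not."""
    found = set()
    stack = list(PARENTS[schema])
    while stack:
        parent = stack.pop()
        if parent not in found:
            found.add(parent)
            stack.extend(PARENTS[parent])
    return found


def is_a(schema, other):
    """True if `schema` is `other` or one of its descendants."""
    return schema == other or other in ancestors(schema)


print(is_a("Person", "Thing"))
print(is_a("Person", "Interval"))
```

A check such as `is_a("Sanction", "Interval")` is how an application might decide whether an entity should be treated as an entity or as an event.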
Things describe most of the real-world objects represented in Aleph and Follow the Money. This includes People and Companies, Assets and Court Cases. Things also include a large set of sub-types of Document which, while normally generated by the file ingestion service of the Aleph server, can also be created through entity mappings.
Intervals are business interests, court cases, sanctions and transactions (and their descendants). Intervals tend to be useful for linking two entities together, possibly over a specific time period.
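As a rough sketch of what "linking two entities over a time period" means, the following Python models an interval as a record holding two entity IDs and optional start and end dates. The class, its field names (`source_id`, `target_id`, `start`, `end`), the `"Ownership"` schema label and the IDs are all hypothetical and chosen for illustration; they are not the property names used by the data model.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Interval:
    """Hypothetical link between two entities, optionally bounded in time."""

    schema: str
    source_id: str
    target_id: str
    start: Optional[date] = None
    end: Optional[date] = None

    def active_on(self, day: date) -> bool:
        """True if the link holds on `day`; open-ended bounds always match."""
        if self.start is not None and day < self.start:
            return False
        if self.end is not None and day > self.end:
            return False
        return True


# Illustrative IDs only: a person holding an interest in a company.
link = Interval(
    schema="Ownership",
    source_id="person-1",
    target_id="company-1",
    start=date(2009, 3, 1),
    end=date(2012, 12, 31),
)
print(link.active_on(date(2010, 1, 1)))
```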
Instances of Follow the Money entities are usually expressed as snippets of JSON, with three standard fields: an id, a specification of the schema to be used, and a set of properties. All properties are multi-valued.
{"id": "1b38214f88d139897bbd13eabde464043d84bbf9","schema": "Person","properties": {"name": ["John Doe"],"nationality": ["us"],"birthDate": ["1982"]}}
Such entities are either generated by mapping structured data, or through the file ingestion process of Aleph.
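The shape described above can be checked with the standard library alone. The sketch below parses the example entity and verifies the three standard fields and the multi-valued rule; the function name `check_entity` is an assumption for this example and is not part of the Follow the Money tooling.

```python
import json

# The example entity from the text, as a JSON string.
RAW = (
    '{"id": "1b38214f88d139897bbd13eabde464043d84bbf9",'
    ' "schema": "Person",'
    ' "properties": {"name": ["John Doe"], "nationality": ["us"],'
    ' "birthDate": ["1982"]}}'
)


def check_entity(data):
    """Check the three standard fields and that every property is a list."""
    for field in ("id", "schema", "properties"):
        if field not in data:
            raise ValueError(f"missing field: {field}")
    for name, values in data["properties"].items():
        if not isinstance(values, list):
            raise ValueError(f"property {name!r} must be a list of values")
    return data


entity = check_entity(json.loads(RAW))
print(entity["schema"], entity["properties"]["name"])
```

Because all properties are multi-valued, even a single name is stored as a one-element list, so code that reads a property should always expect a list.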
Property types constrain the possible values of a field, e.g. by limiting what countries are valid, what a phone number and email address look like, etc. They are also used to compare how similar values of the same type are, i.e. the algorithm for comparing two names is not the same as for two phone numbers.
| Name | Description | Strong |
| string | General purpose string field | no |
| text | Long-form text field. Data stored in this type will be processed for named entity recognition (NER) in Aleph. | no |
| name | Name of a company or person | yes |
| address | Street address, formatted as simple text | yes |
| checksum | SHA1 of an underlying data asset (e.g. the source file for an Email entity) | yes |
| country | Country code, ISO 3166 with some additions | yes |
| date | ISO 8601 date prefix, e.g. 2009, 2009-03-01, or 2009-03-01T22:34:00 | yes |
| email | E-mail address, must contain an @ | yes |
| entity | Reference to another entity by its ID | yes |
| iban | International Bank Account Number (IBAN), ISO 13616 | yes |
| identifier | Generic identifier, e.g. a company or tax registration number | yes |
| ip | Internet host address, v4 or v6 | yes |
| json | Nested JSON object, rarely used | no |
| language | Language code, ISO 639 three-letter code | no |
| mimetype | Internet MIME type, e.g. text/plain | no |
| number | Numeric value, usually converted to a floating point number | no |
| phone | Phone number, including an international dialling prefix | yes |
| url | Uniform Resource Locator, internet address path | yes |
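To make the two roles of a property type concrete (constraining values and comparing them), here is a minimal Python sketch. It is not how Follow the Money implements its types: the function names, the type labels used as keys, the date regular expression and the token-overlap name comparison are all assumptions chosen for illustration, based only on the descriptions in the table.

```python
import re

# Accepts progressively longer date prefixes: 2009, 2009-03, 2009-03-01,
# up to 2009-03-01T22:34:00. No calendar validation is attempted.
DATE_PREFIX = re.compile(
    r"^\d{4}(-\d{2}(-\d{2}(T\d{2}(:\d{2}(:\d{2})?)?)?)?)?$"
)


def is_valid(type_name, value):
    """Apply a type-specific constraint; unknown types accept anything."""
    if type_name == "email":
        return "@" in value
    if type_name == "date":
        return DATE_PREFIX.match(value) is not None
    return True


def _digits(value):
    """Strip everything except digits, e.g. spaces and punctuation."""
    return re.sub(r"\D", "", value)


def compare(type_name, left, right):
    """Return a similarity score between 0.0 and 1.0 for two values."""
    if type_name == "phone":
        # Phone numbers: same digits means same number in this sketch.
        return 1.0 if _digits(left) == _digits(right) else 0.0
    if type_name == "name":
        # Names: overlap of lower-cased tokens (Jaccard index).
        a, b = set(left.lower().split()), set(right.lower().split())
        union = a | b
        return len(a & b) / len(union) if union else 0.0
    return 1.0 if left == right else 0.0


print(is_valid("date", "2009-03-01T22:34:00"))
print(compare("name", "John Doe", "doe john"))
```

The point of the design is that each type carries its own notion of validity and similarity, so a word-order difference can still count as a name match while phone numbers are compared on digits only.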
The individual entity types (schemata) within the Follow the Money data model are defined using a set of YAML files. This model can be extended or modified for other deployments of Aleph to some extent, but this process is not very well established.
Follow the Money is reminiscent of a linked data ontology, and indeed it is regularly published in Turtle and RDF/XML format for use in RDF-based applications. See the ftm export-rdf documentation for help on how to export Follow the Money data to linked data.
Any user of Aleph should feel free to propose changes or extensions to the Follow the Money data model by submitting a GitHub issue. Proposed changes should embody the design sense of the data model, which prioritises practicality over correctness; understandability over standards compliance.