All processing in Aleph is coordinated via a set of queues kept in Redis, an in-memory data store. When a document is processed, the instruction to handle it is posted to one of these queues, which are read by multiple independent services. Each service can subscribe to one or more stages in order to receive tasks, e.g. the ingest stage (which extracts text from a file), the analyze stage (which performs named-entity recognition and language detection) and finally the index stage (which adds the document to Aleph's search index).
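
The flow described above can be illustrated with a minimal sketch. It is not Aleph's actual implementation: the queues are modelled with in-memory `deque` objects instead of Redis, and the names `StageQueues`, `run_worker` and the rule that each task is forwarded to the next stage in a fixed order are assumptions made purely for illustration.

```python
from collections import deque

# Stage names taken from the text; the fixed ordering is an assumption.
STAGES = ["ingest", "analyze", "index"]


class StageQueues:
    """In-memory stand-in for the Redis-backed queues (one FIFO per stage)."""

    def __init__(self):
        self.queues = {stage: deque() for stage in STAGES}

    def post(self, stage, task):
        # Append a task to the queue of the given stage.
        self.queues[stage].append(task)

    def fetch(self, subscribed):
        # Return the oldest pending task from the first subscribed stage
        # that has work, or None if all subscribed stages are empty.
        for stage in subscribed:
            if self.queues[stage]:
                return stage, self.queues[stage].popleft()
        return None


def run_worker(queues, subscribed, log):
    """Drain the subscribed stages, forwarding each task to the next stage."""
    while (item := queues.fetch(subscribed)) is not None:
        stage, task = item
        log.append((stage, task))  # placeholder for the real stage handler
        nxt = STAGES.index(stage) + 1
        if nxt < len(STAGES):
            queues.post(STAGES[nxt], task)


queues = StageQueues()
log = []
queues.post("ingest", "report.pdf")

# One service handles only ingest; another handles analyze and index.
run_worker(queues, ["ingest"], log)
run_worker(queues, ["analyze", "index"], log)
print(log)
```

Here two hypothetical workers subscribe to different stages, which mirrors how independent services can split the pipeline between them while sharing the same set of queues.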