# Scraper for the OCCRP website.
description: 'Organized Crime and Corruption Reporting Project'
# Uncomment to run this scraper automatically:
# The first stage starts the crawl from a seed URL.
# These rules specify which pages should be scraped or included:
# Parse the scraped pages to find any additional links.
# Additional rules to determine if a scraped page should be stored or not.
# In this example, we're only keeping PDFs, Word files, etc.
# Feeding the links found during parsing back into the crawl makes this a
# recursive web crawler:
# Store the crawled documents in a directory.
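#
# A minimal sketch of how the stages described above could be wired together.
# It ASSUMES a memorious-style pipeline layout. The stage names, the
# example.org seed URL and domain, and the /data/results path are hypothetical
# placeholders, not values taken from this file. The sketch is left commented
# out so that it does not change the scraper's behaviour.
#
# pipeline:
#   init:
#     method: seed              # start from one or more seed URLs
#     params:
#       urls:
#         - https://example.org/        # hypothetical seed URL
#     handle:
#       pass: fetch
#   fetch:
#     method: fetch
#     params:
#       rules:                  # which pages should be scraped or included
#         and:
#           - domain: example.org       # hypothetical domain
#           - not:
#               or:
#                 - mime_group: assets
#                 - mime_group: images
#     handle:
#       pass: parse
#   parse:
#     method: parse             # look for additional links in each page
#     params:
#       store:                  # keep only documents and archives
#         or:
#           - mime_group: documents
#           - mime_group: archives
#     handle:
#       store: store
#       fetch: fetch            # loop back to fetch: the recursive step
#   store:
#     method: directory
#     params:
#       path: /data/results     # hypothetical output directory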