This example requires a few additional libraries. You can install them using pip:

```
pip install -r requirements.txt
```

Then run the workflow:

```
redun run workflow.py main
```

By default, this will scrape web pages from https://www.python.org/ with a depth of 2 link traversals. All of the HTML files encountered will be stored in crawl/. Word frequency across all pages will be calculated and a CSV of the word counts will be stored in computed/word_counts.txt.
Lastly, an HTML report is generated in reports/report.html that summarizes the scraping and analysis. The report is generated using a jinja2 template stored in templates/report.html.
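The word-frequency step can be pictured with a short sketch. This is not the example's actual implementation (the helper names `count_words` and `write_counts` are hypothetical); it just shows the general idea of stripping tags, tallying words with `collections.Counter`, and emitting CSV rows.

```python
import csv
import re
from collections import Counter

def count_words(html_texts):
    """Tally word frequency across a collection of HTML page texts.
    Tags are removed with a crude regex; a real crawler would use a parser."""
    counts = Counter()
    for html in html_texts:
        text = re.sub(r"<[^>]+>", " ", html)  # strip tags
        counts.update(word.lower() for word in re.findall(r"[a-zA-Z]+", text))
    return counts

def write_counts(counts, path):
    """Write (word, count) CSV rows, most common first."""
    with open(path, "w", newline="") as out:
        csv.writer(out).writerows(counts.most_common())

counts = count_words(["<p>redun makes workflows easy. Workflows!</p>"])
print(counts["workflows"])  # → 2
```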
Feel free to try other URLs and scraping depths using the task arguments:

```
redun run workflow.py main --url URL --depth DEPTH
```

Also feel free to alter the report template templates/report.html. It is passed to the task make_report() as a File argument, so rerunning the workflow automatically reacts to changes in the template.
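The reactivity works because redun fingerprints File arguments: if the template's hash changes between runs, the task is considered stale and reruns. Here is a minimal stdlib sketch of that idea using a content hash (a simplification; redun's actual File hashing scheme may differ, and the file name below is hypothetical):

```python
import hashlib
import tempfile
from pathlib import Path

def content_hash(path):
    """Hash a file's bytes; a simplified stand-in for how a workflow
    engine can fingerprint an input file to decide whether to rerun."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()

with tempfile.TemporaryDirectory() as tmp:
    template = Path(tmp) / "report.html"  # hypothetical template file
    template.write_text("<h1>{{ title }}</h1>")
    before = content_hash(template)
    template.write_text("<h1>{{ title }} (v2)</h1>")  # edit the template
    after = content_hash(template)

print(before != after)  # → True: a changed template means a rerun
```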