August 1, 2018
Using Scrapy to parse FARA
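Quickstart
Run the spider as:
$ :> fprincipals.jl :> s.log; scrapy crawl active_foreign_principals -o fprincipals.jl --logfile=s.log
The two :> redirections truncate the output file and the log file before the crawl starts. This matters because Scrapy always appends to an existing JSON-lines output file, so without the truncation the items from earlier runs would stay in fprincipals.jl.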
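To see why the truncation matters, here is a minimal sketch in plain Python (no Scrapy involved) that imitates an appending JSON-lines feed. The helper names read_jsonl, append_items and demo, and the sample records, are made up for illustration; the real item fields depend on the spider.

```python
import json
import os
import tempfile


def read_jsonl(path):
    """Return a list of dicts, one per non-empty line of a JSON-lines file."""
    items = []
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:  # skip blank lines
                items.append(json.loads(line))
    return items


def append_items(path, items):
    """Append items as a JSON-lines feed grows: one JSON object per line."""
    with open(path, "a", encoding="utf-8") as fh:
        for item in items:
            fh.write(json.dumps(item) + "\n")


def demo():
    # Hypothetical records standing in for scraped items.
    run = [{"name": "A"}, {"name": "B"}]
    path = os.path.join(tempfile.mkdtemp(), "fprincipals.jl")

    append_items(path, run)  # first crawl
    append_items(path, run)  # second crawl, file not truncated
    without_truncate = len(read_jsonl(path))

    open(path, "w").close()  # same effect as `:> fprincipals.jl`
    append_items(path, run)
    with_truncate = len(read_jsonl(path))
    return without_truncate, with_truncate
```

Calling demo() returns the item counts without and with truncation. Because every line is an independent JSON object, an appended file stays parseable, but it silently accumulates duplicates across runs, which is what the truncation in the quickstart command avoids.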
Setup
Install Scrapy for Python 3.
$ virtualenv -p python3 py3
$ source py3/bin/activate
$ pip install Scrapy ipython
Now, create a crawler project:
$ mkdir ~/src/fara
$ cd ~/src/fara
$ scrapy startproject fara
$ cd fara
$ scrapy genspider active_foreign_principals efile.