Using Screaming Frog's CLI mode on a server
Command line crawling with Screaming Frog SEO Spider
By Julien on September 19, 2018
Screaming Frog just released SEO Spider v10, with a lot of impressive new features.
Among them is CLI mode: the ability to run the crawler without a GUI (on a server, for example).
Here’s a quick guide on how to get started with Screaming Frog’s CLI mode on a Debian server.
Setup
We’ll assume that you’re correctly logged in to your server, via SSH for instance, and that you have administration (sudo) rights.
Remember to upgrade your system first if needed ;-)
You’ll need to install some dependencies.
sudo apt-get install cabextract xfonts-utils
wget http://ftp.de.debian.org/debian/pool/contrib/m/msttcorefonts/ttf-mscorefonts-installer_3.6_all.deb
sudo dpkg -i ttf-mscorefonts-installer_3.6_all.deb
sudo apt-get install xdg-utils zenity libgconf-2-4 fonts-wqy-zenhei

First, let's download the latest version (check the official website for an updated link to the latest file):

wget https://download.screamingfrog.co.uk/products/seo-spider/screamingfrogseospider_10.0_all.deb
Once the file is downloaded, launch installation:
sudo dpkg -i screamingfrogseospider_10.0_all.deb

Check that everything is OK:

screamingfrogseospider --help

Licence
You’ll need to enter a licence to use SF in headless mode.
Simply edit ~/ScreamingFrogSEOSpider/licence.txt and enter your username on the first line, and your key on the second.
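If you're provisioning the server with a script, the file can also be created non-interactively. A minimal sketch, where ACCOUNT-NAME and LICENCE-KEY are placeholders for your own credentials:

```shell
# Create the licence file: username on line 1, key on line 2.
# ACCOUNT-NAME and LICENCE-KEY are placeholders — use your own values.
mkdir -p ~/ScreamingFrogSEOSpider
printf '%s\n%s\n' 'ACCOUNT-NAME' 'LICENCE-KEY' > ~/ScreamingFrogSEOSpider/licence.txt
```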
EULA agreement
At first launch, Screaming Frog’s GUI asks you to agree to the terms and conditions. This can’t be done without a GUI.
However, there’s a workaround.
Edit ~/ScreamingFrogSEOSpider/spider.config and add the following line:
eula.accepted=8

Save and exit.
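If you're scripting the setup, the same line can be appended from the shell. A small sketch that skips the append when an eula.accepted line is already present, so re-running it is harmless:

```shell
# Accept the EULA in spider.config without opening the GUI
CFG="$HOME/ScreamingFrogSEOSpider/spider.config"
mkdir -p "$(dirname "$CFG")"
touch "$CFG"
# Append only if no eula.accepted line exists yet (idempotent)
grep -q '^eula.accepted=' "$CFG" || echo 'eula.accepted=8' >> "$CFG"
```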
Start crawling
To start crawling in headless mode, you’ll need to use at least a few arguments:
--crawl <url> is the starting URL,
--headless is needed, otherwise SF will try to open a GUI (and fail),
--save-crawl enables you to save your data to a crawl.seospider file,
--output-folder <folder> will save the crawl data to the given folder,
--timestamped-output will create a timestamped folder in which your crawl.seospider file will be saved (this is useful to avoid overwriting a previous crawl).
Here’s a minimalist example:
screamingfrogseospider --crawl https://www.example.com --headless --save-crawl --output-folder /home/julien/crawls --timestamped-output

Other options and OS
Check out the Screaming Frog documentation for more details on how to use CLI mode on other operating systems, and on which command line arguments are available.
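Since the crawl runs unattended, it fits naturally into cron for recurring audits. A sketch of a crontab entry, where the URL, output folder and schedule are all assumptions to adapt to your setup:

```shell
# Hypothetical crontab entry: crawl every Monday at 03:00
# (URL and paths are placeholders — adjust to your own setup)
0 3 * * 1 screamingfrogseospider --crawl https://www.example.com --headless --save-crawl --output-folder /home/julien/crawls --timestamped-output
```

Thanks to --timestamped-output, each scheduled run lands in its own folder instead of overwriting the previous crawl.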
Many thanks to the guys at Screaming Frog for this awesome release!