Does Crawlbot support authenticated crawling?

There are many authentication schemes on the web, but two of the most common are username+password HTML forms and HTTP basic authentication.

HTML Forms


A form with username & password login

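Form-based authentication works by setting a cookie in your browser using the Set-Cookie header. Subsequent requests from your browser to the server then send that cookie back in the Cookie header. To retrieve the Cookie header value, navigate to your intended site, log in with the username and password, then enter document.cookie in your JavaScript console and save the value it returns. Note that document.cookie omits cookies marked HttpOnly; if the site's session cookie is HttpOnly, copy the Cookie header from a request in your browser's developer tools instead.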
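As a rough illustration of what the saved value looks like and how it travels on a request, here is a minimal Python sketch. The cookie names and values and the example.com URL are made up for illustration; nothing is sent over the network.

```python
from http.cookies import SimpleCookie
from urllib.request import Request

# Hypothetical value copied from `document.cookie` after logging in.
cookie_string = "sessionid=abc123; csrftoken=xyz789"

# Parse the string to check that it holds well-formed name=value pairs.
parsed = SimpleCookie()
parsed.load(cookie_string)
cookie_names = sorted(parsed.keys())

# Attach the raw string as the Cookie header on a request object.
# The request is only constructed here, never sent.
request = Request("https://example.com/members", headers={"Cookie": cookie_string})
cookie_header = request.get_header("Cookie")
```

The whole string, with its semicolon-separated pairs, is the value to keep; it does not need to be split up before use.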

Then supply this value as the Cookie header using the Crawlbot API.

HTTP Basic


An HTTP Basic login prompt

For HTTP Basic login, the browser sends an Authorization header computed from the username and password. The header has the format Authorization: Basic $hash, where $hash is the Base64 encoding of the string $username:$password. Despite the placeholder name, this is a reversible encoding rather than a cryptographic hash, so treat the header value as you would the password itself. More information about basic authentication can be found here.
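The computation described above can be sketched in a few lines of Python. The function name basic_auth_header is a hypothetical helper, and the credentials are the well-known sample pair from the HTTP Basic specification, not real ones.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Return the value of the Authorization header for HTTP Basic."""
    # Join the credentials with a colon, then Base64-encode the bytes.
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return f"Basic {token}"


# Sample credentials for illustration only.
header_value = basic_auth_header("Aladdin", "open sesame")

# Decoding the token recovers the original credentials.
decoded = base64.b64decode(header_value.split(" ", 1)[1]).decode("utf-8")
```

Because the token decodes straight back to username:password, it should only be sent over HTTPS.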

Once you have the Authorization header, supply it via the Crawlbot API, just as with the Cookie header above, to perform authenticated crawling.
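This article does not spell out how the headers are passed to the Crawlbot API. The sketch below assumes a repeatable customHeaders request parameter, with one URL-encoded Name:value pair per header. Treat that parameter name and format as an assumption and confirm them against the Crawlbot API documentation before relying on them. The header values are placeholders, and no request is made.

```python
from urllib.parse import parse_qs, quote, urlencode

# Placeholder header values; substitute the ones you saved earlier.
headers = {
    "Cookie": "sessionid=abc123",
    "Authorization": "Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==",
}

# ASSUMPTION: one `customHeaders` argument per header, formatted "Name:value".
params = [("customHeaders", f"{name}:{value}") for name, value in headers.items()]

# Percent-encode colons, equals signs and spaces so the pairs survive transport.
encoded = urlencode(params, quote_via=quote)

# Round-trip check: decoding yields the original "Name:value" strings.
decoded = parse_qs(encoded)["customHeaders"]
```

The encoded string would then be appended to the other parameters of the crawl job request.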