Running PhantomJS with cookies
In this recipe, we will learn how to use the cookies-file command-line switch to specify the location of the file for persistent cookies in PhantomJS.
Getting ready
To run this recipe, we will need a script to run with PhantomJS that accesses a site where cookies are read or written. We will also need a filesystem path to pass as the command-line argument, making sure that we have write permissions to that path.
The script in this recipe is available in the downloadable code repository as recipe05.js under chapter01. If we run the provided example script, we must change to the root directory for the book's sample code.
Lastly, the script in this recipe runs against the demo site that is included with the cookbook's sample code repository. To run that demo site, we must have Node.js installed. In a separate terminal, change to the phantomjs-sandbox directory (in the sample code's directory) and start the app with the following command:
node app.js
Note
Node.js is a JavaScript runtime environment based on Chrome's V8 engine. It has an event-driven programming model and non-blocking I/O and can be used for building fast networking applications, shell scripts, and everything in between. We can learn more about Node.js including how to install it at http://nodejs.org/.
We will use this demo for many recipes throughout this cookbook. When we run the demo app for the first time, we need to download and install the Node.js modules that it depends on. To do this, we can change to the phantomjs-sandbox directory and run the following command:
npm install
How to do it…
Given the following script:
var webpage = require('webpage').create();

webpage.open('http://localhost:3000/cookie-demo', function(status) {
  if (status === 'success') {
    // Print every property of every cookie in phantom.cookies.
    phantom.cookies.forEach(function(cookie, i) {
      for (var key in cookie) {
        console.log('[cookie:' + i + '] ' + key + ' = ' + cookie[key]);
      }
    });
    phantom.exit();
  } else {
    console.error('Could not open the page! (Is it running?)');
    phantom.exit(1);
  }
});
Enter the following command at the command line:
phantomjs --cookies-file=cookie-jar.txt chapter01/recipe05.js
Note
PhantomJS will create the cookie-jar.txt file for us; there is no need to create it manually.
The script will print out the properties for each cookie in the response, as follows:
[cookie:0] domain = localhost
[cookie:0] expires = Sat, 07 Dec 2013 02:05:06 GMT
[cookie:0] expiry = 1386381906
[cookie:0] httponly = false
[cookie:0] name = dave
[cookie:0] path = /cookie-demo
[cookie:0] secure = false
[cookie:0] value = oatmeal-raisin
[cookie:1] domain = localhost
[cookie:1] expires = Sat, 07 Dec 2013 02:04:22 GMT
[cookie:1] expiry = 1386381862
[cookie:1] httponly = false
[cookie:1] name = rob
[cookie:1] path = /cookie-demo
[cookie:1] secure = false
[cookie:1] value = chocolate-chip
We can then open cookie-jar.txt in a text editor and examine its contents. The cookie jar file should look something like the following:
[General]
cookies="@Variant(\0\0\0\x7f\0\0\0\x16QList<QNetworkCookie>\0\0\0\0\x1\0\0\0\x2\0\0\0_dave=oatmeal-raisin; expires= Sat, 07 Dec 2013 02:05:06 GMT; domain=localhost; path=/cookie-demo\0\0\0^rob=chocolate-chip; expires= Sat, 07 Dec 2013 02:04:22 GMT; domain=localhost; path=/cookie-demo)"
How it works…
Our preceding example script performs the following actions:
- It creates a webpage object and opens the target URL (http://localhost:3000/cookie-demo).
- In the callback function, we check for status of 'success', printing an error message and exiting PhantomJS if that condition fails.
Tip
Throughout this cookbook, we will use exit codes of 0 and 1 for success and failure respectively, because those are the exit codes traditionally used for those reasons on POSIX and Windows systems.
- If we successfully open the URL, then we loop through each cookie in the phantom.cookies collection and print out information about each one.
- Lastly, we exit from the PhantomJS runtime using phantom.exit.
When we start PhantomJS with the cookies-file argument, we are telling the runtime to read cookies from and write cookies to a specific location on the filesystem. This allows us to use cookies in PhantomJS as we would in any other browser. In other words, an HTTP response or client-side script can set cookies, and when we run our PhantomJS script against that URL again, we can trust that the cookies are still there in the file.
Notice that the cookie jar file itself is essentially a plain text file. The actual file extension does not matter; we used .txt in our example, but it could just as easily be .cookies or even no extension at all. When persisting the cookies, PhantomJS writes them to this file. If we examine the file, we see that it is a serialized, text-based representation of the QNetworkCookie objects that PhantomJS uses behind the scenes. Although the on-disk version is not necessarily easy to read, we can easily make a copy and parse it or transform it into its constituent cookies. This can be useful for examining their contents after a script has completed (for example, to ensure that the expected values are being written to disk).
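As a rough illustration of parsing a copy of the cookie jar, the following sketch pulls the cookies out of text shaped like the sample above. It is a heuristic based only on that sample, not a full decoder for the serialized format: the parseCookieJar function name is hypothetical, and the assumption that each cookie is preceded by \0\0\0 plus one extra character is drawn from the sample output alone. It uses no PhantomJS APIs, so we could run it with Node.js.

```javascript
// Heuristic sketch: extract cookies from cookie jar text like the sample above.
// Assumption (from the sample only): each cookie is preceded by the literal
// characters \0\0\0 plus one extra character, and it ends at the next
// backslash, closing parenthesis, or double quote.
function parseCookieJar(text) {
  var pattern = /\\0\\0\\0.([^=;\\]+)=([^;\\)"]*)((?:;[^;\\)"]*)*)/g;
  var cookies = [];
  var match;
  while ((match = pattern.exec(text)) !== null) {
    var cookie = { name: match[1], value: match[2] };
    // The remainder holds "; key=value" attributes such as expires or path.
    match[3].split(';').forEach(function (part) {
      var eq = part.indexOf('=');
      if (eq === -1) { return; }
      var key = part.slice(0, eq).trim().toLowerCase();
      cookie[key] = part.slice(eq + 1).trim();
    });
    cookies.push(cookie);
  }
  return cookies;
}

// The sample cookie jar from this recipe, with backslashes escaped for JavaScript.
var sample = '[General]\n' +
  'cookies="@Variant(\\0\\0\\0\\x7f\\0\\0\\0\\x16QList<QNetworkCookie>' +
  '\\0\\0\\0\\0\\x1\\0\\0\\0\\x2' +
  '\\0\\0\\0_dave=oatmeal-raisin; expires= Sat, 07 Dec 2013 02:05:06 GMT; ' +
  'domain=localhost; path=/cookie-demo' +
  '\\0\\0\\0^rob=chocolate-chip; expires= Sat, 07 Dec 2013 02:04:22 GMT; ' +
  'domain=localhost; path=/cookie-demo)"';

console.log(JSON.stringify(parseCookieJar(sample), null, 2));
```

To inspect a real file, we could read a copy of cookie-jar.txt into a string and pass it to the same function. Cookie jars that differ from the sample may need a different approach.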
Additionally, with the cookies written to disk, they are available for future PhantomJS script runs against URLs that expect the same cookies. For example, this can be useful when running scripts against sites that require authentication where those authentication tokens are passed around as cookies.
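To illustrate reusing persisted cookies on a later run, here is a minimal sketch that looks for a saved, unexpired cookie before deciding what to do. The findLiveCookie helper is hypothetical, and the cookie name dave is borrowed from the demo output rather than being a real authentication token. The sketch also assumes that cookies loaded from the cookie jar are visible through phantom.cookies when the script starts.

```javascript
// Return the first cookie with the given name that has not yet expired,
// or null if there is none. `cookies` is expected to look like the
// phantom.cookies entries printed earlier in this recipe.
function findLiveCookie(cookies, name, nowInSeconds) {
  for (var i = 0; i < cookies.length; i++) {
    var cookie = cookies[i];
    // Assumption: a cookie without an expiry is treated as still usable.
    var live = cookie.expiry === undefined || cookie.expiry > nowInSeconds;
    if (cookie.name === name && live) {
      return cookie;
    }
  }
  return null;
}

// This part only runs inside PhantomJS; other runtimes skip it.
if (typeof phantom !== 'undefined') {
  var now = Math.floor(Date.now() / 1000);
  var saved = findLiveCookie(phantom.cookies, 'dave', now);
  if (saved) {
    console.log('Found saved cookie: ' + saved.name + ' = ' + saved.value);
    phantom.exit();
  } else {
    console.error('No saved cookie found; the site must set it first.');
    phantom.exit(1);
  }
}
```

We would start this script with the same --cookies-file=cookie-jar.txt switch used earlier, so that it reads the cookie jar written by the previous run.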
See also
- The Managing cookies with the phantom object recipe in Chapter 2, PhantomJS Core Modules