WARNING: This is old and likely obsolete.
- published date: 2013-09-08 13:48
- keywords: [“headless”, “javascript”, “selenium”, “swaac”, “testing”, “web-development”]
- source:
Source http://www.chrisle.me/2013/08/running-headless-selenium-with-chrome/
Clipped on 2013-09-08 13:48:54 -0500
Scaling website automation for either testing or scraping can be a challenge when the site is enitrely driven by JavaScript or behaves differently when using specific browsers.
Running a headless Selenium machine with Google’s Chrome installed provides a scalable way to automate your tests on one of the most popular browsers in use.
Here are step by step instructions for installing a headless Selenium server with Chrome and Vagrant.
- Get up and running quickly
- Run a sample script that demos it works
- Use JavaScript only (via NodeJS)
The code you write locally should work when deployed at scale in production. These tools help us do that by creating identical environments for development and production.
Both are free downloads. Install with the default settings
- Download and install VirtualBox
- Download and install Vagrant
- Download and install NodeJS
I also assume you can use a command line and have some vague idea of what a virtual machine and Vagrant is.
This file tells Vagrant how configure the testing environment. It applies universally to both development and production.
Create a project directory and create a file named Vagrantfile
:
# encoding: utf-8
# -*- mode: ruby -*-
# vi: set ft=ruby :
Vagrant.configure("2") do |config|
config.vm.box = "precise64"
config.vm.box_url = "http://files.vagrantup.com/precise64.box"
config.ssh.forward_agent = true
config.vm.provider :aws do |aws, override|
aws.access_key_id = 'XXXX' # Replace this
aws.secret_access_key = 'XXXX' # Replace this
aws.keypair_name = 'XXXX' # Replace this
aws.ami = 'ami-7747d01e' # ubuntu 12.04
override.ssh.username = 'ubuntu'
override.ssh.private_key_path = '~/.ssh/amazon-ubuntu.pem'
end
config.vm.provision :shell, :path => "setup.sh"
config.vm.network :forwarded_port, guest:4444, host:4444
end
The setup.sh file executes when Vagrant creates a virtual machine for you. In the same folder as you created your VagrantFile
create a setup.sh
file:
#!/bin/sh
set -e
if [ -e /.installed ]; then
echo 'Already installed.'
else
echo ''
echo 'INSTALLING'
echo '----------'
# Add Google public key to apt
wget -q -O - "https://dl-ssl.google.com/linux/linux_signing_key.pub" | sudo apt-key add -
# Add Google to the apt-get source list
echo 'deb http://dl.google.com/linux/chrome/deb/ stable main' >> /etc/apt/sources.list
# Update app-get
apt-get update
# Install Java, Chrome, Xvfb, and unzip
apt-get -y install openjdk-7-jre google-chrome-stable xvfb unzip
# Download and copy the ChromeDriver to /usr/local/bin
cd /tmp
wget "https://chromedriver.googlecode.com/files/chromedriver_linux64_2.2.zip"
wget "https://selenium.googlecode.com/files/selenium-server-standalone-2.35.0.jar"
unzip chromedriver_linux64_2.2.zip
mv chromedriver /usr/local/bin
mv selenium-server-standalone-2.35.0.jar /usr/local/bin
# So that running `vagrant provision` doesn't redownload everything
touch /.installed
fi
# Start Xvfb, Chrome, and Selenium in the background
export DISPLAY=:10
cd /vagrant
echo "Starting Xvfb ..."
Xvfb :10 -screen 0 1366x768x24 -ac &
echo "Starting Google Chrome ..."
google-chrome --remote-debugging-port=9222 &
echo "Starting Selenium ..."
cd /usr/local/bin
nohup java -jar ./selenium-server-standalone-2.35.0.jar &
On your command line and in the directory where you created the VagrantFile
, run the following command:
vagrant up
This will kick off downloading and installing all the pieces neccessary. It should look like this:
You can check to see if everything is working by going to http://localhost:4444/wd/hub
.
The VagrantFile
has been configured to forward port 4444 on your localhost. This allows you UI control of the Selenium browser. This page shows you all the sessions that you’re running in your virtual machine. If you see this page, everything is OK.
In order to write NodeJS scripts that talk to Chrome you will need the Selenium-Webdriver for NodeJS.
On your command line, install selenium-webdriver
with the following command. This will install the modules needed for interacting with Selenium.
npm install selenium-webdriver
This first script will go to Google’s homepage, type in a query, then print out the HTML.
var webdriver = require('selenium-webdriver');
var keyword = "chris le twitter";
var driver = new webdriver.Builder().
usingServer('http://localhost:4444/wd/hub').
withCapabilities(webdriver.Capabilities.chrome()).
build();
driver.get('http://www.google.com');
driver.findElement(webdriver.By.name('q')).sendKeys(keyword);
driver.findElement(webdriver.By.name('btnG')).click();
driver.wait(function() {
return driver.getTitle().then(function(title) {
driver.getPageSource().then(function(html) {
console.log(html);
return true;
});
});
}, 1000);
driver.quit();
Run your test with node. You should see the HTML that was rendered by the Chrome browser.
So now that you have this up and running what can you use it for?
Running your automated test suites: This is great for doing integration testing against Chrome browsers and probably responsive websites.
Testing your Chrome Extentions: Debugging Chrome Extensions can be a bit of a pain. This could be your Asprin.
Taking many screenshots: If you want to make screenshots of many pages at once.
Scraping stubborn websites: I wasn’t able to scrape a website using PhantomJS because it fired JSONP requests long after the onLoad() event fired. Simply waiting for the event loop to empty itself wansn’t enough. A combination of debugging with a real browser and Selenium, I was more successful at getting the DOM after the scripts had run.