Hello, and welcome to the wonderful world of browser automation with
WebDriver and Haskell! This module is a brief tutorial on how we can
use the webdriver-w3c library to write Haskell programs that interact
with web pages just like a person would. If you need to test a web
application, or want to automate some web thing that curl and wget alone
can't handle easily, you might find this mildly interesting, maybe.
(This text is a literate program, so we have to start with some compiler noises. Nothing to see here!)
{-# LANGUAGE OverloadedStrings #-}
module Main where
import Web.Api.WebDriver
import Test.Tasty.WebDriver
import Test.Tasty
import Control.Monad.Trans.Class
import qualified System.Environment as SE
import Control.Monad
import System.IO
main :: IO ()
main = return ()
To follow along, you're going to need a few things.
- Stack. Stack is a build tool for Haskell projects. It compiles our programs, runs tests, processes documentation, generates code coverage reports, and keeps project dependencies under control.
- A copy of this repository
- A web browser; this tutorial assumes you're using Firefox.
- A WebDriver proxy server for your browser. For Firefox this is geckodriver. Don't sweat it if you don't know what "WebDriver proxy server" means right now; we'll get to that.
Next, start your proxy server. For geckodriver on unix-like OSs, that is
done with the geckodriver & command. You should see a line that looks
something like this:
1521524046173 geckodriver INFO Listening on 127.0.0.1:4444
Leave that program running. Just leave it alone.
Finally, in another shell window, navigate to the directory holding this repo and say
stack ghci webdriver-w3c:webdriver-w3c-intro
Well, don't say that out loud. Type it. :) This might take a while
the first time while stack downloads the compiler and libraries it
needs. When it finishes, this command opens a Haskell interpreter with
webdriver-w3c loaded so we can play with it. You'll know everything is
okay if you see a line like
Ok, one module loaded.
followed by a λ: prompt. To be sure, try typing in return and then
hit (enter). If you see this scary error message:
<interactive>:1:1: error:
• No instance for (Show (a0 -> m0 a0))
arising from a use of ‘print’
(maybe you haven't applied a function to enough arguments?)
• In a stmt of an interactive GHCi command: print it
then everything is working great!
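Why is an error good news? It means ghci found return, worked out that it is a function still waiting for an argument, and only gave up when asked to print it. Here is a minimal sketch of return applied to an argument, using nothing beyond the standard Prelude; the name wrapped is made up for illustration.

```haskell
module Main where

-- return (also spelled pure) wraps a plain value in a monadic action.
-- 'wrapped' is a hypothetical name used only for this illustration.
wrapped :: IO Int
wrapped = return 42

main :: IO ()
main = do
  -- Running the action hands back the value we wrapped.
  x <- wrapped
  print x
```

A bare return has nothing to wrap yet, which is why ghci has no way to show it.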
Ok! You've got your WebDriver proxy (geckodriver) running in one terminal window, and ghci running in another. Let's start with a simple example to illustrate what we can do, then explain how it works. Read this code block, even if the syntax is meaningless.
release_the_bats :: WebDriverT IO ()
release_the_bats = do
  fullscreenWindow
  navigateTo "https://www.google.com"
  performActions [typeString "bats"]
  performActions [press EnterKey]
  wait 5000000
  pure ()
Without running that code -- and maybe without being proficient in Haskell -- what do you think it does?
Now let's run it. In the interpreter, type
example1
followed by (enter). You should see a Firefox window open, go fullscreen, and search Google for "bats".
example1, by the way, is this:
example1 :: IO ()
example1 = do
  execWebDriverT defaultWebDriverConfig
    (runIsolated_ defaultFirefoxCapabilities release_the_bats)
  return ()
Let's break down what just happened.
- release_the_bats is a WebDriver session, expressed in the WebDriver DSL. It's a high-level description for a sequence of browser actions: in this case, "make the window full screen", "navigate to google.com", and so on.
- runIsolated_ takes a WebDriver session and runs it in a fresh browser instance. The parameters of this instance are specified in defaultFirefoxCapabilities.
- execWebDriverT takes a WebDriver session and carries out the steps, using some options specified in defaultWebDriverConfig.
You probably also noticed a bunch of noise got printed to your terminal starting with something like this:
λ: example1
2018-06-23 15:19:46 Request POST http://localhost:4444/session
{
"capabilities": {
"alwaysMatch": {
"browserName": "firefox"
}
},
"desiredCapabilities": {
"browserName": "firefox"
}
}
2018-06-23 15:19:48 Response
{
"value": {
"sessionId": "383edca7-3054-0544-8c1e-cc64099462de",
"capabilities": {
"moz:webdriverClick": true,
"platformVersion": "17.4.0",
"moz:headless": false,
"moz:useNonSpecCompliantPointerOrigin": false,
"browserVersion": "60.0.2",
"rotatable": false,
"pageLoadStrategy": "normal",
"moz:profile": "/var/folders/td/sxyy9wl919740vddr49g8nth0000gn/T/rust_mozprofile.aleh5JscOwwI",
"moz:accessibilityChecks": false,
"moz:processID": 88470,
"platformName": "darwin",
"timeouts": {
"implicit": 0,
"script": 30000,
"pageLoad": 300000
},
"acceptInsecureCerts": false,
"browserName": "firefox"
}
}
}
This is the log. WebDriver sessions keep track of a bunch of info to help with debugging, like all requests and responses and all raised errors. By default the logs are printed to stderr but this is configurable.
So what can you do in a WebDriver session? Not much -- but this is by design. The library includes:
- A binding for each endpoint in the WebDriver spec
- Some basic functions for reading and writing files, reading and writing at the console, and making arbitrary HTTP requests
This plus Haskell's do notation make for a tidy EDSL for running
browsers. Notably, a WebDriver session cannot do arbitrary IO by
default, and WebDriver sessions are pure values. (There is an escape
hatch for this restriction.)
WebDriver is an HTTP API for controlling web browsers like a human user would. In principle a browser could implement this API directly. In practice the major browsers have their own internally maintained APIs for automation and use a proxy server to translate between WebDriver and their internal API.
This is the role geckodriver is playing in our examples so far: deep down, our code is making HTTP requests to geckodriver, and geckodriver is passing these requests on to Firefox.
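To make "HTTP API" a little more concrete, here is a rough sketch of the very first request in the log above, built by hand. The Request type and the newSessionRequest helper are hypothetical and not part of webdriver-w3c. The JSON body is also simplified: it leaves out the desiredCapabilities key that the real log shows.

```haskell
module Main where

-- A request as the proxy server sees it: method, URL, and JSON body.
-- This type is hypothetical; it only models the shape of the log entries.
data Request = Request
  { method :: String
  , url    :: String
  , body   :: String
  } deriving (Eq, Show)

-- Hypothetical helper: the "new session" request for a given proxy host,
-- port, and browser name.
newSessionRequest :: String -> Int -> String -> Request
newSessionRequest host port browser = Request
  { method = "POST"
  , url    = "http://" ++ host ++ ":" ++ show port ++ "/session"
  , body   = "{\"capabilities\":{\"alwaysMatch\":{\"browserName\":\""
               ++ browser ++ "\"}}}"
  }

main :: IO ()
main = do
  -- geckodriver listens on port 4444 by default, as in the log above.
  let req = newSessionRequest "localhost" 4444 "firefox"
  putStrLn (method req ++ " " ++ url req)
  putStrLn (body req)
```

The library does this work for us; the point is only that each browser command boils down to a plain HTTP request to the proxy.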
This library is also tested against Chrome via chromedriver. To do that,
using chromedriver's default settings, we need to make a couple of
adjustments to the examples: replace
defaultWebDriverConfig
by
defaultWebDriverConfig
  { _environment = defaultWebDriverEnvironment
    { _env = defaultWDEnv
      { _remotePort = 9515
      }
    }
  }
and replace
defaultFirefoxCapabilities
by
defaultChromeCapabilities
(By the way - defaultWebDriverConfig has type WebDriverConfig, and
includes knobs for tweaking almost everything about how our sessions
run.)
You're probably interested in using browser automation to run
end-to-end tests on some web application -- and webdriver-w3c has
some extra bits built in to make this simpler.
In addition to the usual browser action commands, you can sprinkle your
WebDriver sessions with assertions. Here's an example.
what_page_is_this :: (Monad eff) => WebDriverT eff ()
what_page_is_this = do
  navigateTo "https://www.google.com"
  title <- getTitle
  assertEqual title "Welcome to Lycos!" "Making sure we're at the lycos homepage"
  return ()
Note the signature: (Monad eff) => WebDriverT eff () instead of
WebDriverT IO (). What's happening here is that WebDriverT is a
transformer over a monad eff within which a restricted set of effects
(like writing to files and making HTTP requests) take place. These
effects are "run" by an explicit evaluator that, for the default
configuration, happens to use IO, but both the effect monad and the
evaluator are configurable. By swapping out IO for another type we
can, for example, run our tests against a mock Internet, and swapping
out the evaluator we might have a "dry run" evaluator that doesn't
actually do anything, but logs what it would have done. It's good
practice to make our WebDriver code maximally flexible by using an
effect parameter like eff instead of the concrete IO unless there's
a good reason not to.
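The "swap the evaluator" idea can be modeled in a few lines of plain Haskell. What follows is a toy sketch, not how webdriver-w3c is actually implemented: the Evaluator record and the names visit, dryRun, and printing are all made up for illustration. The session is written against an abstract monad, and we choose afterwards whether it runs in IO or as a dry run that only records its steps.

```haskell
module Main where

-- A hypothetical record of the effects a session may use. The library's
-- real evaluator is more elaborate; this only models the idea.
data Evaluator m = Evaluator
  { httpGet :: String -> m String
  , logLine :: String -> m ()
  }

-- A "session" written against an abstract monad m, never IO directly.
visit :: Monad m => Evaluator m -> String -> m String
visit ev url = do
  logLine ev ("visiting " ++ url)
  httpGet ev url

-- Dry-run evaluator: performs no I/O and records each step instead.
-- It relies on the Monad instance for pairs, ((,) [String]), from base.
dryRun :: Evaluator ((,) [String])
dryRun = Evaluator
  { httpGet = \url -> (["GET " ++ url], "<mock page>")
  , logLine = \msg -> (["LOG " ++ msg], ())
  }

-- An IO evaluator that only prints; a real one would make HTTP requests.
printing :: Evaluator IO
printing = Evaluator
  { httpGet = \url -> putStrLn ("GET " ++ url) >> pure "<no real request made>"
  , logLine = putStrLn
  }

main :: IO ()
main = do
  -- Same session, two evaluators.
  let (steps, page) = visit dryRun "https://example.com"
  mapM_ putStrLn steps
  putStrLn page
  _ <- visit printing "https://example.com"
  pure ()
```

Because visit never mentions IO, nothing in it has to change when the evaluator does.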
Anyway, back to the example. What do you think this code does? Let's try it: type
example2
in the interpreter. You should see a browser window open briefly to
google.com, with a scary "Invalid Assertion" message in the interpreter.
assertEqual is the assertion statement: it takes two things (strings
in this case) and checks whether they are equal. Shocking, hm? The third
argument to assertEqual is a comment, so we can include some human
readable info as to why this assertion was made.
This is example2:
example2 :: IO ()
example2 = do
  (_, result) <- debugWebDriverT defaultWebDriverConfig
    (runIsolated_ defaultFirefoxCapabilities what_page_is_this)
  printSummary result
  return ()
Here's what happened:
- what_page_is_this is a WebDriver session, just like release_the_bats, this time including an assertion: that the title of some web page is "Welcome to Lycos!".
- runIsolated_ runs what_page_is_this in a fresh browser instance.
- debugWebDriverT works much like execWebDriverT, except that it collects the results of any assertion statements and summarizes them (this is result).
- printSummary takes the assertion results and prints them out all pretty like.
Documentation on assertions is on Hackage.
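If "collects the results and summarizes them" sounds abstract, here is a toy model of the idea in plain Haskell. None of these names (Result, checkEqual, summarize) come from webdriver-w3c; they are stand-ins that mimic the argument order of assertEqual: two values, then a comment.

```haskell
module Main where

-- Hypothetical model of an assertion result; not the library's types.
data Result
  = Passed String          -- the comment
  | Failed String String   -- the comment, then what went wrong
  deriving (Eq, Show)

-- Compare two values and record the outcome together with a comment.
checkEqual :: (Eq a, Show a) => a -> a -> String -> Result
checkEqual x y comment
  | x == y    = Passed comment
  | otherwise = Failed comment (show x ++ " is not equal to " ++ show y)

-- Count successes and failures, in that order.
summarize :: [Result] -> (Int, Int)
summarize rs =
  ( length [() | Passed _ <- rs]
  , length [() | Failed _ _ <- rs]
  )

main :: IO ()
main = do
  let results =
        [ checkEqual "Google" "Welcome to Lycos!" "Making sure we're at the lycos homepage"
        , checkEqual "Google" "Google" "Title matches"
        ]
  mapM_ print results
  print (summarize results)
```

A failed check here is just a value in a list, so the rest of the session's results can still be collected and reported.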
Alright. If you're writing e2e tests, you probably want to write a lot
of e2e tests. In this case, we'd like our tests to be modular, isolated,
and well-organized, so that when things go wrong we can quickly diagnose
what happened. For this, webdriver-w3c integrates with the
tasty test framework --
just import Test.Tasty.WebDriver.
Suppose we've got two WebDriver tests. These are pretty dweeby just for illustration's sake.
back_button :: (Monad eff) => WebDriverT eff ()
back_button = do
  navigateTo "https://www.google.com"
  navigateTo "https://wordpress.com"
  goBack
  title <- getTitle
  assertEqual title "Google" "Behavior of 'back' button from WordPress homepage"
  return ()
refresh_page :: (Monad eff) => WebDriverT eff ()
refresh_page = do
  navigateTo "https://www.mozilla.org"
  pageRefresh
  title <- getTitle
  assertEqual title "Mozilla's Epic HomePage on the Internets"
    "Refresh mozilla.org"
  return ()
We can organize them into a hierarchy of tests like so.
test_suite :: TestTree
test_suite = testGroup "All Tests"
  [ testCase "Back Button" back_button
  , testCase "Refresh" refresh_page
  ]
Try running the suite with
example3
in the interpreter. Here's what example3 looks like:
example3 :: IO ()
example3 = do
  SE.setEnv "TASTY_NUM_THREADS" "1"
  defaultWebDriverMain
    $ localOption (SilentLog)
    $ localOption (PrivateMode True)
    $ test_suite
Here's what happened:
- test_suite is a Tasty tree of individual WebDriverT test cases.
- defaultWebDriverMain is a Tasty function that runs test trees. Here we've also used localOption to tweak how the tests run, suppressing the usual session log output and running the browser in private mode.
Tasty gave us lots of nice things for free, like pretty printing test results and timings.
λ: example3
>>> Deployment environment is DEV
>>> Logging with colors
All Tests
  Back Button: OK (7.23s)
    1 assertion(s)
  Refresh: FAIL (4.29s)
    Invalid Assertion
    assertion: "Internet for people, not profit \8212 Mozilla" is equal to "Mozilla's Epic HomePage on the Internets"
    comment: Refresh mozilla.org
1 out of 2 tests failed (11.53s)
Other test case constructors and test options are available; see Hackage for the details.
The test suite for webdriver-w3c itself uses the Tasty integration.
There is also a function, checkWebDriver, that can be used to build
tests with QuickCheck, if you don't find that idea abominable. :)
The vanilla WebDriverT is designed to help you control a browser with
batteries included, but it has limitations. It can't possibly
anticipate all the different ways you might want to control your tests,
and it can't do arbitrary IO. But we have a powerful and very general
escape hatch: the WebDriverT monad transformer is a special case of
the WebDriverTT monad transformer transformer.
The actual definition of WebDriverT is
type WebDriverT eff a = WebDriverTT IdentityT eff a
where IdentityT is the inner monad transformer. By swapping out
IdentityT for another transformer we can add more features specific to
our application.
Here's a typical example. Say you're testing a site with two deployment
tiers -- "test" and "production". For the most part the same test suite
should run against both tiers, but there are minor differences. Say the
base URLs are slightly different; maybe production lives at
example.com while test lives at test.example.com. Also while
developing a new feature some parts of the test suite should only run on
the test tier, maybe controlled by a feature flag.
What we need is some extra read-only state to pass around. We can do
this with a ReaderT transformer. To avoid adding a dependency on a
whole transformer library, let's roll our own:
data ReaderT r eff a = ReaderT
  { runReaderT :: r -> eff a
  }
instance (Monad eff) => Monad (ReaderT r eff) where
  return x = ReaderT $ \_ -> return x
  x >>= f = ReaderT $ \r -> do
    a <- runReaderT x r
    runReaderT (f a) r
instance (Monad eff) => Applicative (ReaderT r eff) where
  pure = return
  (<*>) = ap
instance (Monad eff) => Functor (ReaderT r eff) where
  fmap f x = x >>= (return . f)
instance MonadTrans (ReaderT r) where
  lift x = ReaderT $ \_ -> x
reader :: (Monad eff) => (r -> a) -> ReaderT r eff a
reader f = ReaderT $ \r -> return $ f r
Now our actual state might look something like this:
data MyEnv = MyEnv
  { tier :: Tier
  , featureFlag :: Bool
  }
data Tier = Test | Production
env :: Tier -> MyEnv
env t = MyEnv
  { tier = t
  , featureFlag = False
  }
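To see how an environment like this might be consulted, here is a small standalone sketch. The helpers baseUrl, urlFor, and shouldRunFlagged are hypothetical and exist only for illustration. The hostnames come from the scenario described earlier; the https scheme and the exact flag policy are assumptions.

```haskell
module Main where

data Tier = Test | Production
  deriving (Eq, Show)

data MyEnv = MyEnv
  { tier        :: Tier
  , featureFlag :: Bool
  }

-- Hypothetical helper: the base URL for each deployment tier.
-- The https scheme is an assumption.
baseUrl :: Tier -> String
baseUrl Test       = "https://test.example.com"
baseUrl Production = "https://example.com"

-- Hypothetical helper: absolute URL for a path in the given environment.
urlFor :: MyEnv -> String -> String
urlFor e path = baseUrl (tier e) ++ path

-- Hypothetical policy: flagged tests run only on the test tier, and only
-- when the feature flag is switched on.
shouldRunFlagged :: MyEnv -> Bool
shouldRunFlagged e = tier e == Test && featureFlag e

main :: IO ()
main = do
  let e = MyEnv { tier = Test, featureFlag = True }
  putStrLn (urlFor e "/login")
  print (shouldRunFlagged e)
  print (shouldRunFlagged e { tier = Production })
```

Inside a session these would be plain functions applied to whatever the reader hands back, so the test code itself never hard-codes a tier.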
And we can augment WebDriverTT with our reader transformer.
type MyWebDriverT eff a = WebDriverTT (ReaderT MyEnv) eff a
Now we can build values in MyWebDriverT using the same API as before,
using the extra features of the inner monad with liftWebDriverTT.
custom_environment :: (Monad eff) => MyWebDriverT eff ()
custom_environment = do
  theTier <- liftWebDriverTT $ reader tier
  case theTier of
    Test -> navigateTo "http://google.com"
    Production -> navigateTo "http://yahoo.com"
To actually run sessions using our custom monad stack we need to make a
few adjustments. First, we use execWebDriverTT instead of
execWebDriverT.
Second, we need to supply a function that "runs" the inner transformer
(in this case ReaderT MyEnv) down to IO.
execReaderT :: r -> ReaderT r IO a -> IO a
execReaderT r x = runReaderT x r
Running our custom WebDriver monad is then straightforward.
example4 :: Tier -> IO ()
example4 t = do
  execReaderT (env t) $
    execWebDriverTT defaultWebDriverConfig
      (runIsolated_ defaultFirefoxCapabilities custom_environment)
  return ()
Try it out with
example4 Test
example4 Production
A custom inner monad works just as well for checking assertions and
with the tasty integration; there are analogous debugWebDriverTT and
testCaseTT functions.
ReaderT is just one option for the inner monad transformer. We could
put mutable state, delimited continuations, or even another HTTP API
monad in there. Use your imagination!
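As one illustration of the mutable state option, here is a hand-rolled state transformer in the same spirit as the ReaderT above. This is a standalone sketch using only base. The visitPage function is hypothetical, and wiring StateT into WebDriverTT is not shown.

```haskell
module Main where

import Control.Monad (ap, liftM)

-- Hand-rolled state transformer: a function from the old state to an
-- effectful result paired with the new state.
newtype StateT s eff a = StateT
  { runStateT :: s -> eff (a, s)
  }

instance (Monad eff) => Functor (StateT s eff) where
  fmap = liftM

instance (Monad eff) => Applicative (StateT s eff) where
  pure x = StateT $ \s -> return (x, s)
  (<*>) = ap

instance (Monad eff) => Monad (StateT s eff) where
  x >>= f = StateT $ \s -> do
    (a, s') <- runStateT x s
    runStateT (f a) s'

-- Read the current state.
get :: (Monad eff) => StateT s eff s
get = StateT $ \s -> return (s, s)

-- Apply a function to the state.
modify :: (Monad eff) => (s -> s) -> StateT s eff ()
modify f = StateT $ \s -> return ((), f s)

-- Hypothetical use: count how many pages a session has visited.
visitPage :: String -> StateT Int IO ()
visitPage url = do
  modify (+ 1)
  n <- get
  StateT $ \s -> putStrLn (show n ++ ": " ++ url) >> return ((), s)

main :: IO ()
main = do
  (_, total) <- runStateT
    (visitPage "https://example.com" >> visitPage "https://example.org") 0
  print total
```

Unlike the reader, each step here can hand a changed value to the next one, which is handy for counters, caches, or accumulated test data.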
Running browser sessions is one thing, but writing and debugging them is
another. webdriver-w3c has some tools for dealing with this as well.
Besides the log, which gives a thorough account of what happened, we can
include breakpoints in our code. When breakpoints are activated, they
stop the session and give us a chance to poke around the browser before
moving on.
Here's a simple example.
stop_and_smell_the_ajax :: (Monad eff) => WebDriverT eff ()
stop_and_smell_the_ajax = do
  breakpointsOn
  navigateTo "https://google.com"
  breakpoint "Just checking"
  navigateTo "https://mozilla.org"
  breakpoint "are we there yet"
We can run this with example5:
example5 :: IO ()
example5 = do
  execWebDriverT defaultWebDriverConfig
    (runIsolated_ defaultFirefoxCapabilities stop_and_smell_the_ajax)
  return ()
The basic breakpoint command gives the option to continue, throw an
error, dump the current state and environment to stdout, and turn
breakpoints off. A fancier version, breakpointWith, takes an
additional argument letting us trigger a custom action.
For now the canonical documentation is the haddock annotations on Hackage.