Web Browser Automation Using Geb & Spock
Abstract
Geb is a fairly new web browser automation tool that is very useful in certain
web application development projects. It combines a jQuery-like content
selector and WebDriver-based automation in one Groovy-based language. Using Geb
requires a few moving parts, but if you need extensive, in-depth web
application testing it can be extremely advantageous. It can also be useful in
smaller projects if you find yourself constantly clicking through the same
pages and scenarios to test certain web application functions. Geb is currently
not even at version 1.0, but it is increasing in popularity due to its
usefulness. In the narrative that follows you will learn when Geb should be
used, what Geb is, and what it looks like, and you will see some examples of
Geb code fragments.
Automating the Browser
The reasons for automating a web browser may seem obvious, but not all web
application projects are created equal. There are situations where using Geb
would save a large amount of time (and, in industry, a large amount of money),
and there are cases where using Geb would cost more than it is worth. To tell
these scenarios apart, one needs to know the advantages and disadvantages of
integrating a browser automation tool into a project.
Advantages - When to use Geb
The main advantage of a browser automation tool is that it eliminates the need
for tedious manual testing. You would want to use one on a large project with
many test cases. Quality assurance teams on big projects like this will have
thousands of test cases to go through before a project is ready for deployment.
Modern web applications can be very complex, and JUnit tests running against
your back end code are not enough to catch every error. Once the back end code
is integrated with the front end interface, many errors can arise that only
appear when all the parts come together. This forces quality assurance teams to
test much of the front end functionality by hand, which becomes harder the more
complex the application is.
Let’s take an industry standard e-commerce application as an example. There are
many parts. First there is back end Java code that handles the functionality of
the application. Within this code there are multiple layers needed to make the
application robust and effective. The layers depend on the design pattern that
is used; let’s use the MVC pattern as an example. MVC stands for model, view,
and controller. When using this design pattern one must define a model, view,
and controller for each feature that supports functionality on the user
interface. Once the model, view, and controller pieces all work together to
support the new feature, the component that uses the feature must be added to
the user interface, using some kind of markup or template language. Some
components will also need to talk to a database to get the information they
need to function; for example, if a customer scans a promotional code, the
application would need to look up that code in the database, put it in the
right text field, and apply the correct discount to the current total. To make
a long story short, this is a lot of moving parts. Anyone who has studied a
little computer science can see that there is a lot of potential for bugs in
code of this complexity.
With complex code comes a large amount of required testing. Using Geb to
automate certain tests can really save time. A good candidate for automation is
a scenario that involves navigation through many pages and a large amount of
data entry. Say your application requires a user to enter a large amount of
personal information before they can move on to the next page, and you need to
test one small feature three pages deeper into the application. How would Geb
help with this?
Using Geb you can write a test suite that will access all of the text fields
and drop down boxes and enter all of the information very quickly. You only
have to define the information you want in these boxes once in the Geb code.
From then on the values act like constants, and you can write the program to
insert them in whatever fields you see fit. Once you enter all the necessary
information you can advance to the next page, and so on, until you get to the
feature you actually want to test. The difference in speed is dramatic: the
same process would take at least five minutes by hand, while Geb can do it in
seconds. This is the main advantage of using Geb. Quality assurance teams could
eliminate a lot of tedious testing time and focus on testing the actual
functionality instead of mainly performing data entry.
Another main advantage is that Geb is not too hard to learn. The language is
very readable and the constructs do what they say they are going to do. For
example, the keyword “to” is followed by the name of the page you want to go
to, provided you have defined that page in your code (such as
“GoogleHomePage”), so a line of code using the “to” keyword would read “to
GoogleHomePage.” [1] Most of the constructs in Geb read similarly, making the
language something that programmers can easily pick up. I will go into the
actual features of Geb code later on.
Disadvantages - When to not use Geb
Geb’s usefulness takes a hit on smaller projects. When your web application
does not have layers of pages and many forms that require data entry, you may
find Geb costs you more time than it saves. Although you could still set up Geb
for a smaller project, installing it, learning it, and coding a Geb test for
every new piece of functionality may take more time than the quick manual tests
that would otherwise suffice. It really depends on the project at hand. There
is no cut and dried project size threshold, but since Geb is fairly new and
unknown, integrating it into a small project can be a bigger risk. If Geb
doesn’t save a lot of time, it is not worth the time it takes to become
familiar with, because that time increases the cost of the project.
Another disadvantage of using Geb is that it is very picky about the
environment your web application uses. Geb needs to be properly integrated into
a specific environment to be able to work. It integrates with build and
development environments like Grails, Maven, and Gradle. If you are already
using one of these in your project, Geb can be installed in a few minutes;
otherwise it could be very tough. If your project does not use at least one of
them in the first place, it is probably not a good fit for Geb: switching the
environment over would likely take a decent amount of work, and the cost would
be too great to be worth it. However, if your project uses one of these
environments, then Geb would probably be a great fit for your web application’s
testing process.
The last main disadvantage of Geb is the guaranteed learning curve. Since Geb
is so new and unfamiliar to many programmers, integrating it into your project
is almost certain to require extra time for developers to learn it. The upside
is that Geb is fairly easy to learn once you commit to it, but like any other
new language it will still take time away from other tasks. Sometimes project
teams are on such a tight schedule that they cannot afford a few days to learn
and integrate a new language into their project. However, those few days would
be worth it if Geb saves you more testing time in the end than it costs to
adopt. Usage comes down to software engineering management decisions, and those
are made differently from project to project.
What actually is Geb?
As previously described, Geb is a browser automation solution. This means Geb
can take complete control of your web browser and programmatically perform
tasks that you would otherwise have to perform manually, which is extremely
useful for testing web applications. Now that I have described the advantages
and disadvantages, I will get into the bare bones of Geb. Geb’s features are
based on combining features from WebDriver, jQuery, and the Groovy programming
language.
WebDriver
WebDriver is the main driving force behind Geb. The WebDriver integration is
what actually handles the browser automation. WebDriver has the capability to
modify Document Object Model (DOM) content on a web page [2]. DOM content is
the set of interface components on a page: things like text fields and
buttons. This means that WebDriver can do things like insert text into a text
field, since a text field is a DOM component.
WebDriver can also control the behavior of a web browser. You may be wondering
why anyone would use Geb if WebDriver can do its own browser automation and DOM
modification. The main reason is readability and simplicity. As you will soon
see, Geb is a lot more readable and easier to learn than WebDriver by itself.
Geb takes its main functionality from WebDriver but combines it with other
tools that make it simpler. Below is some Java code that uses WebDriver to
perform a Google search (see Figure 1). Although writing a WebDriver test is
not extremely complicated, it is still more work than writing a Geb test,
because Geb uses Groovy, which saves a lot of keystrokes and is easier to
understand.
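The listing for Figure 1 is not reproduced in this text. As a rough stand-in,
the sketch below shows the kind of raw WebDriver calls such a search involves.
It is written as a Groovy script in a deliberately Java-like style so that
every example in this article shares one language. It assumes Selenium
WebDriver and a Firefox driver are on the classpath; the Google URL and the
field name "q" are assumptions about that page, not guarantees.

```groovy
import org.openqa.selenium.By
import org.openqa.selenium.WebDriver
import org.openqa.selenium.WebElement
import org.openqa.selenium.firefox.FirefoxDriver

// Assumption: a local Firefox installation that WebDriver can launch.
WebDriver driver = new FirefoxDriver()
try {
    // Navigate to the page by URL.
    driver.get("http://www.google.com")

    // Step 1: locate the search field (assumed to be named "q").
    WebElement searchBox = driver.findElement(By.name("q"))

    // Step 2: type into it, then submit the enclosing form.
    searchBox.sendKeys("wikipedia")
    searchBox.submit()

    // Read back the page title; a real check would compare this string.
    println "Page title is: " + driver.getTitle()
} finally {
    // Always close the browser, even if a step above throws.
    driver.quit()
}
```

Locating the field and typing into it are separate calls here, and verifying
the page would need an explicit string comparison on the title.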
A Very Groovy Language
Geb speaks the language of Groovy. Groovy is a dynamic programming language for
the Java Virtual Machine. Although some consider Java to be a dynamic language,
Groovy behaves like the traditional dynamic languages that programmers are used
to, such as Ruby, Python, and Smalltalk [4]. Basically, Groovy builds upon the
strong features of Java but adds features that one would expect in dynamic
languages like Ruby. One of Groovy’s biggest advantages is its readability and
low learning curve. It also integrates easily with virtually every Java library
[4]. Another advantage is that it simplifies testing through its support for
mocking and unit testing. This is why Geb uses Groovy as its language of
choice. See Figure 2 below for some basic Groovy code.
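Figure 2 itself is not reproduced in this text. As a small illustration of the
features mentioned above, here is a hedged sketch of basic Groovy; the variable
and method names are made up for the example.

```groovy
// Dynamic typing: 'def' declares a variable without naming its type.
def greeting = "Hello"
def languages = ["Ruby", "Python", "Smalltalk"]

// A closure passed to collect(), with ${...} string interpolation.
def messages = languages.collect { name -> "${greeting}, ${name}!" }

// 'it' is the implicit closure parameter; parentheses are optional here.
messages.each { println it }

// Methods need no declared return type or explicit 'return'.
def square(n) {
    n * n
}

// Java classes are used directly, with no wrapper code.
def now = new java.util.Date()
println "Squared: ${square(4)}, checked at ${now}"
```

The list literal, the block-style closures, and the optional parentheses are
the parts that tend to feel familiar to Ruby programmers.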

As you can see above, Groovy is very similar to Ruby. Geb tests are written in
this language, which allows for some unique testing keywords that make testing
simple and readable. But before I go into how to write a Geb test, it is
important to know the final piece that Geb draws on: jQuery.
jQuery
jQuery in a nutshell is basically a content selector. Geb borrows its selection
model to improve on the DOM content selection of WebDriver, making Geb a more
powerful browser automation tool. When automating a web browser, a very
important step is being able to quickly and easily locate the content on the
page that the automation tool will use or modify. jQuery-style selection
provides that capability. jQuery itself is a JavaScript library that can go
into the HTML of a page and find or modify desired components. It can also
handle events, and it supports all the major browsers (IE, Firefox, Safari,
Chrome, etc.) [5]. Using jQuery-style selection with Geb makes browser
automation faster, and some of jQuery’s notation makes its way into the syntax
of Geb. This is why it is important to know.
Consult Figure 3 below to see a basic use of jQuery. In this example jQuery is
used to prevent a link embedded in the HTML of a website from working. Lines 11
through 16 of Figure 3 are jQuery code embedded in the HTML. This code
overrides the click event on the link defined in line 8, preventing it from
taking the user to jquery.com. Geb uses jQuery-style selectors to find and
modify content on web pages in a similar manner.
Figure 3: jQuery Example
Knowing the pieces of Geb is important before getting into using it, because
Geb becomes even easier to learn once you recognize which pieces are being used
where. WebDriver, Groovy, and jQuery all play an integral part in making Geb a
powerful browser automation tool. Now that you have a background in the main
pieces of Geb, we can get into some examples and explanations of how Geb works.
Geb Examples
Geb can be combined with many testing frameworks, all outlined in The Book of
Geb, which can be found on the Geb website, www.gebish.org. First one needs to
see what Geb looks like by itself (without importing any frameworks). After
that, one can see what a typical test looks like using the Spock framework. Geb
ships with Spock integration, so a simple “import geb.spock.GebSpec” in your
code should take care of the configuration. It does not require a large amount
of explaining, but I will point out when the Spock framework is being used in
the following examples. Using this framework with Geb is recommended because
you can write the most “clear, concise, and easy to understand test
specifications with little effort” [1].

Let’s now dive into a simple Geb example. Recall the WebDriver Google search
example from Figure 1. Next you will see a Google search example in Geb, and it
will be compared with WebDriver. The following Geb code is written in an inline
scripting style; it doesn’t use any predefined packages or frameworks, just
basic Geb. In the following example the computer is going to go to Google,
enter Wikipedia in the search field, and eventually click the link to Wikipedia
and go to the page.
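Figure 4 is not reproduced in this text. The following is a sketch of what such
an inline script could look like, loosely modeled on the introductory example
in The Book of Geb [1]. It assumes Geb and a WebDriver implementation are on
the classpath. The URL, the page titles, and the selectors are assumptions
about Google’s markup, which changes over time, so treat them as placeholders.

```groovy
import geb.Browser

Browser.drive {
    go "http://google.com/ncr"

    // A plain Groovy assert; no testing framework is imported.
    assert title == "Google"

    // Find the field named "q" and type into it, all in one line.
    $("input", name: "q").value("wikipedia")

    // Wait for the results page (assumed title suffix).
    waitFor { title.endsWith("Google Search") }

    // Assumption: the results contain a link whose text is "Wikipedia".
    def wikipediaLink = $("a", text: "Wikipedia").first()
    wikipediaLink.click()

    // Wait until the browser has arrived at the Wikipedia page.
    waitFor { title == "Wikipedia" }
}
```

Whether the results page appears without explicitly submitting the search
depends on how the site behaves, which is another reason to read this as an
illustration of the syntax and not a dependable test.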
Notice how all you need to do to go to the webpage is use the keyword “go”
(line 4) followed by the url. With the WebDriver implementation you need to
call “driver.get()” with the url as the parameter. Geb’s version is simpler and
easier to remember. You do not have to look through the methods of the driver
to figure out which one goes to a url; you just need to remember one word:
“go.” Also notice how you can write an “assert” statement without importing any
testing framework. This makes it very easy to check that you actually got to
the right page. To do the same check in WebDriver you would need to call
“driver.getTitle()” and compare that string using the Java “.equals” method.
Geb simplifies WebDriver but still keeps its powerful functionality.
A third and final note about the above example concerns the lines that use the
“$”. This is where jQuery comes in. If you recall the example from Figure 3,
this is how jQuery accesses content on the webpage, and Geb uses that notation
as well. It is able to find “q”, which is the name of the text field on the
Google home page, and enter text into it all in one line. In our pure WebDriver
example from Figure 1 this process takes two lines: you need to find the
element, then use “sendKeys” to programmatically type in what you want. As you
can see, Geb requires less coding than using WebDriver directly.
Page Object Model
Non-inline Geb programs are built on the page object model. Using this model is
the preferred way to use Geb. If you need to write a quick test you can
certainly use the inline style shown in Figure 4, but using the page object
model makes your tests more readable and can save you time in future phases of
your project. The page object model in a nutshell is based on defining page
objects in the test code and reusing them whenever you need to refer to that
page [7]. If this model were used on the example in Figure 4, the programmer
would not have to use the jQuery notation in the script; the programmer could
simply refer to content defined on the page object.

To do this you would define a page object in your Geb code. Inside that page
object class you would define all of the content on the page, so you can easily
refer to it later without having to use jQuery notation and look up the names
of the desired content on the page you are working with. You still use the
jQuery finder character ($), but you store the result under a name in the page
class that you can use whenever you are working with that page. Let’s look at
what a typical page class looks like (see Figure 5 below).
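Figure 5 is not reproduced in this text. The sketch below shows a login page
defined in this style, following the shape of the example on the Geb website
[1]. The URL, the heading texts, and the CSS selectors are placeholders for a
hypothetical application, and Geb is assumed to be on the classpath.

```groovy
import geb.Page

class LoginPage extends Page {
    // Where "to LoginPage" sends the browser (hypothetical URL).
    static url = "http://myapp.com/login"

    // How Geb decides whether the browser is "at" this page.
    static at = { heading.text() == "Please Login" }

    // Named content: each entry wraps a $ lookup under a readable name.
    static content = {
        heading { $("h1") }
        loginForm { $("form.login") }
        // Clicking this control is declared to lead to AdminPage.
        loginButton(to: AdminPage) { loginForm.login() }
    }
}

class AdminPage extends Page {
    static at = { heading.text() == "Admin Section" }
    static content = {
        heading { $("h1") }
    }
}
```

Here “loginForm.login()” relies on Geb’s shortcut for looking up a form control
by its name, so it assumes the form contains a control named “login”.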
Figure 5, above, is how you would represent a login page using the page object
model. It is very simple: name the page, specify its url and content, and
define each piece of content by using the jQuery notation to access the object
on the web page, giving it a meaningful name so your code is very readable.
Once you define your page you can refer to it as, for example, LoginPage
anywhere in your code. Notice how the last line of actual code in Figure 5
refers to another page (AdminPage), since it needs to interact with that page
as well. If you did not have AdminPage defined you would need to put in the url
or use jQuery notation to find it. Geb’s implementation of the page object
model is really what defines it and makes it more powerful than other browser
automation tools. Let’s dive in a little further to see what a test script
using the page object model looks like.
Testing
Seeing an example of a test using the page object model really brings
everything together. WebDriver functionality is used in automation, jQuery
notation is used in defining the content in page object classes, Groovy
provides the language support to code the page objects, and Spock is the
testing framework we are going to import. Below is a test specification for the
login page we defined above in Figure 5.
Figure 6: Geb Test Using Spock Framework and Page Object Model
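The figure itself is not reproduced in this text. A sketch consistent with the
walkthrough that follows might look like this. It assumes the LoginPage and
AdminPage page classes described for Figure 5 exist alongside it, that the
login form has fields named “username” and “password”, and that Geb’s Spock
integration and a WebDriver implementation are on the classpath. The class name
LoginSpec and the credentials are placeholders.

```groovy
import geb.spock.GebSpec

class LoginSpec extends GebSpec {

    def "login to admin section"() {
        given: "the browser is on the login page"
        to LoginPage

        when: "the credentials are entered"
        loginForm.with {
            // Assigning to a field name sets that form control's value.
            username = "admin"
            password = "password"
        }

        and: "the login button is clicked"
        loginButton.click()

        then: "the browser ends up at the admin page"
        at AdminPage
    }
}
```

The spec reads almost like the scenario it checks, which is the main reason for
pairing Geb with Spock.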
The first thing to do when writing a test spec for a page is to use the def
keyword to define the name of the test. In this case we are calling it “login
to admin section.” You want to name tests very specifically to increase
maintainability and readability. The first thing the test does, under the
“given:” label, is try to go to the login page. If for some reason it cannot
get there, the test is reported as failed; it only continues if the browser is
at the login page after the instruction is completed. The next instruction adds
text to the username and password text fields using the “with” method, which
sets the state of the content on the page to whatever is specified (in this
case, text for username and password). Right after the fields are filled in,
the test clicks the login button programmatically. Once this action is
completed, the test checks whether the browser is “at” the admin page. Since
the “at” function returns a Boolean, the state of the browser decides whether
the test passes: if it is at the admin page, the test is shown as passing;
otherwise it is reported as failed.
Concluding Statements
As you can see from the explanation of the test (Figure 6) in the previous
section, Geb code is very readable. This is why Geb is so powerful: it is easy
to write, easy to learn, easy to modify, and it works! There are a few moving
parts to get it going, but if it is right for your project it can save a large
amount of time and provide much more accurate test coverage. Using the page
object model efficiently makes for simple test specifications that can cover a
large number of test cases. Defining all of the pages in your web application
using the page object model makes it easy to go from page to page in testing,
since all you have to say is something along the lines of “to LoginPage.” Also,
browser automation can save a large amount of time because you don’t have to
keep entering data into text fields every time you want something tested. Type
the information once in a test spec and you can run it as many times as you
need. If you decide that Geb is right for your project, more information on how
to configure Geb in a project and further documentation can be found at
www.gebish.org. I have had some experience using Geb and would highly recommend
it for in-depth web application testing.
Bibliography
[1] Geb - very groovy browser automation… web testing, screen
scraping and more. (n.d.). Retrieved from http://www.gebish.org