ProgBlog

Sunday, April 3, 2016

Automated testing of localhost URLs with Selenium, Pytest and Sauce Labs

Selenium WebDriver is a great way to create automated tests of web applications. It can simulate a user's interactions with a web page and examine the results.

Although it's not required to use a test framework to organize and execute your Selenium WebDriver tests, a good test framework does make this task more convenient. As a Python programmer, I'm a fan of Pytest.

Selenium WebDriver supports a wide variety of web browsers, but is limited to those browsers installed on the computer where WebDriver runs. That's where a service like Sauce Labs comes in. Sauce Labs runs Selenium WebDriver on virtual machines in the cloud, offering many combinations of device, operating system and browser.

So the combination of Selenium WebDriver, Pytest and Sauce Labs enables testing of a web application across many platforms. There's just one catch. While our web application is under development, we likely wish to run it on our local machine, with a localhost URL. A Sauce Labs virtual machine can't access the local machine to run such tests -- unless we install Sauce Connect. Sauce Connect is free software from Sauce Labs that creates a secure tunnel between Sauce Labs' servers and our machine.

What follows is a step-by-step guide to creating and running tests using Selenium WebDriver, Pytest, Sauce Labs and Sauce Connect. This guide refers to Python 2.7 running on Ubuntu 15.04, but the same principles apply to any programming language and operating system.

Let's start by creating a web app to test. There are many ways to do this. Our example uses a "hello world" Python Flask app, based on the sample from the Flask website. Here's the code, which is saved as hello.py:

# Source of hello.py

from flask import Flask
app = Flask(__name__)

@app.route("/")
def hello():
    return "Hello World!"

if __name__ == "__main__":
    app.run(port=8080)

Once we run

python hello.py

we can visit http://localhost:8080 in our browser and see a web page that displays "Hello World!" Now we can write a test to verify that the web application is working as expected. Selenium WebDriver supports many programming languages and web browsers. This example uses Python and Firefox. Let's make a directory named tests and create a file in it named test_hello.py:

# Source of test_hello.py

from selenium import webdriver
driver = webdriver.Firefox()
driver.get('http://localhost:8080/')
body_text = driver.find_element_by_css_selector('body').text
assert body_text == 'Hello World!'
driver.quit()

As long as Firefox is installed and hello.py is still running, we can execute

cd tests
python test_hello.py

Selenium WebDriver will open Firefox, navigate to http://localhost:8080, and verify that the text "Hello World!" is displayed.

Let's introduce Pytest to make the test a little more convenient to run. Pytest's discovery feature automatically finds tests based on naming conventions, such as files and functions whose names start with test_. We'll make a small change to test_hello.py, placing everything inside a function named test_1:

# Source of test_hello.py

def test_1():
    from selenium import webdriver
    driver = webdriver.Firefox()
    driver.get('http://localhost:8080/')
    body_text = driver.find_element_by_css_selector('body').text
    assert body_text == 'Hello World!'
    driver.quit()

Now, as long as Pytest is installed, we can simply run

py.test

Our test will run, and we'll get a handy summary of the results:

This test ran on Firefox. Suppose we'd like to test our web application on an operating system and browser that aren't installed locally, such as Windows 8.1 with Internet Explorer 11. For such remote testing with Selenium WebDriver running on a virtual machine, we'll need a Sauce Labs account. We'll need to supply our Sauce Labs username and an access key in order to connect to a Sauce Labs virtual server. We start by changing test_hello.py, replacing driver = webdriver.Firefox() with several lines of code that connect to a remote WebDriver.

# Source of test_hello.py
# Using a remote WebDriver

import os


def test_1():
    from selenium import webdriver
    desired_capabilities = {
        'platform': "Windows 8.1",
        'browserName': "internet explorer",
        'version': "11.0",
        'screenResolution': '1280x1024'
    }
    sauce_url = 'http://%s:%s@ondemand.saucelabs.com:80/wd/hub' %\
                (os.environ['SAUCE_USERNAME'], os.environ['SAUCE_ACCESS_KEY'])
    driver = webdriver.Remote(
        desired_capabilities=desired_capabilities,
        command_executor=sauce_url
    )
    driver.get('http://localhost:8080/')
    body_text = driver.find_element_by_css_selector('body').text
    assert body_text == 'Hello World!'
    driver.quit()

The code above assumes the Sauce Labs username and access key are stored in environment variables named SAUCE_USERNAME and SAUCE_ACCESS_KEY, respectively.
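If either variable is unset, os.environ[...] raises a bare KeyError, which can be puzzling in the middle of a test run. As a minimal sketch, the URL construction could be pulled into a small helper that fails with a clearer message. The function name build_sauce_url is hypothetical and not part of Selenium or Sauce Labs; the URL format is the one used in the test above, and the credentials in the example are made up.

```python
import os


def build_sauce_url(environ=None):
    """Build the Sauce Labs remote WebDriver URL from environment variables.

    build_sauce_url is a hypothetical helper, shown for illustration only.
    """
    if environ is None:
        environ = os.environ
    # Collect the names of any required variables that are unset or empty
    missing = [name for name in ('SAUCE_USERNAME', 'SAUCE_ACCESS_KEY')
               if not environ.get(name)]
    if missing:
        raise RuntimeError('Missing environment variable(s): ' + ', '.join(missing))
    return 'http://%s:%s@ondemand.saucelabs.com:80/wd/hub' % (
        environ['SAUCE_USERNAME'], environ['SAUCE_ACCESS_KEY'])


# Example with made-up credentials passed in explicitly
print(build_sauce_url({'SAUCE_USERNAME': 'alice', 'SAUCE_ACCESS_KEY': 'secret'}))
```

In the test function, sauce_url = build_sauce_url() would then read the real values from the environment.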

Once again, run the test with

py.test

The test runs, but doesn't pass:

Why? Because the Selenium WebDriver instance running on a Sauce Labs server can't access http://localhost:8080, which is running on our local machine. The error message provides a hint about how to fix this: install Sauce Connect, which we can download here. This example uses Sauce Connect v4.3.14 for Linux. Versions for other operating systems are also available.

Installation on Linux is as simple as unzipping the downloaded file and ensuring that the resulting bin/sc file is on the path and has execute permission. Don't forget that, as above, our Sauce Labs credentials must be stored in the environment variables SAUCE_USERNAME and SAUCE_ACCESS_KEY.

Prior to running the test again, we run sc, which creates a secure tunnel between Sauce Labs' servers and our local machine. The tunnel remains available until we terminate sc (e.g., by pressing Ctrl+C). Leaving sc running (either in the background or in a separate terminal window), we run py.test once more. This time, our test should pass.

At this point, we have everything we need to run remote tests of localhost URLs. However, there are some best practices we can implement to make such tests more convenient and reliable. Sauce Labs recommends that a tunnel be established and terminated for each test run, rather than leaving the tunnel open indefinitely. We'll accomplish this in two steps. First, we'll pass command-line arguments and add a few more lines of code to run sc as a daemon (that is, in the background). Second, we'll use a Pytest fixture to ensure the tunnel is ready before any test runs.

The full list of Sauce Connect command-line arguments is available here. The ones we want are:

-t: Controls which domains are accessed via the tunnel. For improved performance, we'll tell sc to use the tunnel for localhost URLs only.
--daemonize: Runs sc as a daemon.
--pidfile: The name of a file to which sc will write its process ID (so we know which process to kill when we're done).
--readyfile: The name of a file that sc will touch to indicate when the tunnel is ready.
--tunnel-identifier: Gives our tunnel a name so we can refer to it.
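To make the role of each argument concrete, here is a minimal sketch that assembles them into the argument list a subprocess call would take. The helper name build_sc_command is hypothetical, and the file paths and tunnel name are placeholders; only the flags themselves come from the list above.

```python
def build_sc_command(ready_file, pid_file, tunnel_id, domains='localhost'):
    """Return the argument list for launching Sauce Connect as a daemon.

    build_sc_command is a hypothetical helper, shown for illustration only.
    """
    return [
        'sc',
        '-t', domains,                     # Use the tunnel for these domains only
        '--readyfile', ready_file,         # File sc touches when the tunnel is ready
        '--tunnel-identifier', tunnel_id,  # Name used to refer to the tunnel
        '--daemonize',                     # Run in the background
        '--pidfile', pid_file,             # File where sc writes its process ID
    ]


# Placeholder paths and tunnel name
command = build_sc_command('/tmp/sc_ready.txt', '/tmp/sc_pid.txt', 'my_tunnel')
print(' '.join(command))
```

Keeping the arguments as a list, rather than one long string, lets subprocess pass each value to sc unchanged without any shell quoting.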

We could place the code to launch sc at the top of function test_1() in test_hello.py, and the code to terminate the tunnel at the bottom of the same function. However, this has a drawback. Presumably, we plan to write many test functions, not just one. We need the tunnel to be created before the first test runs, and terminated after the last test finishes -- regardless of the order in which the tests might be executed. Pytest provides a powerful feature known as fixtures that can accomplish this.

We could create a module-scoped fixture that applies to every test function in our test_hello.py file. But we can do even better. Let's create a session-scoped fixture. By defining our fixture in a specially-named file, conftest.py, and using a decorator to give the fixture session scope, we make the fixture available to every test function in every test file we choose to create. The comments in the source code below provide more details.

# Source of conftest.py

import os
import pytest
import signal
import subprocess
import time


@pytest.fixture(scope="session")  # Session scope makes the fixture apply to any test function in any test file
def tunnel(request):
    sc_pid_file_name = '/tmp/sc_pid.txt'  # File where sc will store its PID
    sc_ready_file_name = '/tmp/sc_ready.txt'  # File sc will touch when the tunnel is ready
    sc_pid = None

    def fin():  # Function that executes when the last test using the fixture goes out of scope
        if sc_pid:
            os.kill(int(sc_pid), signal.SIGTERM)  # Kill sc's process, terminating the tunnel

    try:
        os.remove(sc_ready_file_name)
    except OSError:
        pass

    # Sauce Connect reads credentials from environment variables SAUCE_USERNAME and SAUCE_ACCESS_KEY
    subprocess.call([
        'sc',
        '-t', 'localhost',                            # Use tunnel for localhost URLs only
        '--readyfile', sc_ready_file_name,            # Name of the "ready" file
        '--tunnel-identifier', 'my_tunnel',           # Name of the tunnel
        '--daemonize', '--pidfile', sc_pid_file_name  # Run as daemon; store PID in specified file
    ])

    with open(sc_pid_file_name) as sc_pid_file:
        sc_pid = sc_pid_file.read()  # Read the PID
    request.addfinalizer(fin)  # Register the finalizer function

    # Wait for the tunnel to be ready
    start_time = time.time()
    while True:
        if os.path.exists(sc_ready_file_name):
            break
        if time.time() - start_time > 30:
            raise Exception('Timed out waiting for Sauce Connect')
        time.sleep(0.5)  # Pause between checks instead of spinning the CPU
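The polling logic at the end of the fixture could also be factored into a reusable helper. The following is only a sketch: the name wait_for_file and its parameters are hypothetical, and the demonstration uses a temporary file in place of the ready file that sc would touch.

```python
import os
import tempfile
import time


def wait_for_file(path, timeout=30.0, poll_interval=0.5):
    """Return True once path exists, or False if timeout seconds pass first."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        if os.path.exists(path):
            return True
        time.sleep(poll_interval)  # Sleep between checks rather than busy-waiting
    return os.path.exists(path)  # One last check at the deadline


# Quick demonstration using a temporary file in place of sc's ready file
with tempfile.NamedTemporaryFile() as ready_file:
    print(wait_for_file(ready_file.name, timeout=1.0, poll_interval=0.1))
```

Inside the fixture, a False return value would be the cue to raise the timeout exception.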

Now we just need to make two small changes to test_hello.py, passing the name of the fixture function as an argument to the test function, and including the tunnel identifier in the desired_capabilities dictionary that controls Selenium WebDriver's behavior.

# Source of test_hello.py
# Using a remote WebDriver
# Using a Pytest fixture

import os


def test_1(tunnel):
    from selenium import webdriver
    desired_capabilities = {
        'platform': "Windows 8.1",
        'browserName': "internet explorer",
        'version': "11.0",
        'screenResolution': '1280x1024',
        'tunnelIdentifier': 'my_tunnel'
    }
    sauce_url = 'http://%s:%s@ondemand.saucelabs.com:80/wd/hub' %\
                (os.environ['SAUCE_USERNAME'], os.environ['SAUCE_ACCESS_KEY'])
    driver = webdriver.Remote(
        desired_capabilities=desired_capabilities,
        command_executor=sauce_url
    )
    driver.get('http://localhost:8080/')
    body_text = driver.find_element_by_css_selector('body').text
    assert body_text == 'Hello World!'
    driver.quit()

That's it! We now have a framework for running any number of test functions, with a tunnel automatically created before the first test, and terminated after the last test.
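One refinement to consider: in the tests above, a failed assertion raises before driver.quit() is reached, so the browser session may be left open. A minimal sketch of one way to guarantee cleanup is a small context manager. The name quitting and the FakeDriver stand-in are hypothetical, used here so the example runs without Selenium; in a real test the object would be the webdriver.Remote instance.

```python
from contextlib import contextmanager


@contextmanager
def quitting(driver):
    """Yield driver, then always call driver.quit(), even if the body raises."""
    try:
        yield driver
    finally:
        driver.quit()


class FakeDriver(object):
    """Stand-in for a WebDriver object, for demonstration only."""
    def __init__(self):
        self.quit_called = False

    def quit(self):
        self.quit_called = True


fake = FakeDriver()
try:
    with quitting(fake) as driver:
        raise AssertionError('simulated test failure')
except AssertionError:
    pass
print(fake.quit_called)  # quit() ran despite the failure
```

In test_1, the body from driver.get(...) through the assert would sit inside a with quitting(webdriver.Remote(...)) as driver: block.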

Sunday, July 19, 2015

Python dictionary comprehensions

Python programmers may be familiar with list comprehensions, a compact syntax for defining a list. Here's a typical example that creates a list containing the squares of the first five positive integers:

print [n*n for n in range(1,6)]
[1, 4, 9, 16, 25]

But suppose we want to create a dictionary, rather than just a list, mapping each integer to its square. We can use a dictionary comprehension. It differs from a list comprehension in using curly braces instead of square brackets, and in specifying two expressions separated by a colon to the left of the for.

print {n: n*n for n in range(1,6)}
{1: 1, 2: 4, 3: 9, 4: 16, 5: 25}

This is equivalent to:

d = {}
for n in range(1,6):
    d[n] = n*n
print d

Dictionary comprehensions are a great example of how we can do a lot in Python with a single, very readable line of code.
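A dictionary comprehension can also take an if clause, or build its keys and values from an existing dictionary. The following sketch, with variable names chosen only for illustration, shows both:

```python
squares = {n: n*n for n in range(1, 6)}

# Keep only the odd integers by adding a condition
odd_squares = {n: n*n for n in range(1, 6) if n % 2 == 1}

# Swap keys and values of an existing dictionary
roots = {square: n for n, square in squares.items()}

print(odd_squares == {1: 1, 3: 9, 5: 25})
print(roots[16])
```

Swapping keys and values this way is only safe when the values are unique, as the squares of distinct positive integers are.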

Thursday, January 1, 2015

Avoiding inaccurate results when using SQL joins with aggregate functions

Using JOIN in a SQL query is a powerful way to combine data from two or more tables. However, if a join is used incorrectly in combination with an aggregate function such as COUNT or SUM, it may not give the output you intended. With a little care, you can get the right result.

As an example, suppose our database represents a group of students. Each student has zero or more email addresses. So we'll have a table named STUDENT, as well as a table named EMAIL that includes a STUDENT_ID column. An email address can be marked inactive, meaning the student doesn't currently use it.

Our goal is to count the number of email addresses for each student.

Here's the SQL to create the two tables:

create table student
(
  id integer,
  name character varying
);

create table email
(
  student_id integer,
  address character varying,
  is_active boolean
);

Let's populate our STUDENT and EMAIL tables with some data:

insert into student values(1, 'Steve');
insert into student values(2, 'Dana');
insert into student values(3, 'Jason');
insert into student values(4, 'Mark');

insert into email values(1, 'steve@gmail.com', true);
insert into email values(1, 'steve@yahoo.com', true);
insert into email values(2, 'dana@gmail.com', true);
insert into email values(2, 'dana@gmail.com', false);
insert into email values(4, 'mark@gmail.com', false);

Here are the resulting contents of the two tables:

Notice that Jason has no email address, and Mark's only email address is inactive.

As a first attempt at counting email addresses, we might try this:

select s.name, count(e.address)
from student s
join email e on e.student_id = s.id
group by (s.name)

The counts for Dana, Mark and Steve are correct, but Jason is missing. We forgot to use a LEFT OUTER JOIN, which will include every record from the STUDENT table even if there is no corresponding record in the EMAIL table. So let's try:

select s.name, count(e.address)
from student s
left outer join email e on e.student_id = s.id
group by (s.name)

As we wanted, Jason shows up with a count of 0. So far, so good. Next, suppose we want to count only active email addresses. We expect the count for Dana to be 1, and for Mark to be 0. Let's try this query:

select s.name, count(e.address)
from student s
left outer join email e on e.student_id = s.id
where e.is_active
group by (s.name)

Jason and Mark have gone missing. To understand why, consider the output of this simple JOIN query:

select * from student s join email e on e.student_id = s.id:

Now condition the query on e.is_active, and only three rows will remain:

select * from student s join email e on e.student_id = s.id where e.is_active

It's these three rows that are considered by the aggregate function COUNT: two for Steve, one for Dana, and none for Jason or Mark.

How can we avoid this problem and make sure results are displayed for all students? The answer is to apply the filter e.is_active as part of the JOIN condition, not in the query's WHERE clause. Now we get the desired result:

select s.name, count(e.address)
from student s
left outer join email e on e.student_id = s.id and e.is_active
group by (s.name)

This type of mistake can be easy to overlook. When dealing with a large database, if you use the wrong query, the results may still look reasonable: you might not notice that a few students are missing. The moral of the story is, when using JOIN together with aggregate functions, always consider whether you want any filter conditions to be part of the JOIN condition, or part of a WHERE clause.
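To see the difference side by side, the two queries can be run against the sample data from a short script. This is a sketch using Python's built-in sqlite3 module with an in-memory database, which is an assumption made for convenience rather than the database used for this post; is_active is stored as 1 or 0.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.executescript("""
    create table student (id integer, name text);
    create table email (student_id integer, address text, is_active integer);
    insert into student values (1, 'Steve'), (2, 'Dana'), (3, 'Jason'), (4, 'Mark');
    insert into email values
        (1, 'steve@gmail.com', 1), (1, 'steve@yahoo.com', 1),
        (2, 'dana@gmail.com', 1), (2, 'dana@gmail.com', 0),
        (4, 'mark@gmail.com', 0);
""")

# Filter in the WHERE clause: students with no active address are dropped
filter_in_where = conn.execute("""
    select s.name, count(e.address)
    from student s
    left outer join email e on e.student_id = s.id
    where e.is_active
    group by s.name
    order by s.name
""").fetchall()

# Filter in the JOIN condition: every student is kept
filter_in_join = conn.execute("""
    select s.name, count(e.address)
    from student s
    left outer join email e on e.student_id = s.id and e.is_active
    group by s.name
    order by s.name
""").fetchall()

print(filter_in_where)  # Jason and Mark are missing
print(filter_in_join)   # All four students appear
```

The order by clause is added only so the output is stable from run to run.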

Tuesday, November 11, 2014

Null values can cause unexpected results in SQL subquery

I was recently using a SQL query to examine data in my PostgreSQL 9.3 database, and the results were not what I expected. It took a while to figure out that this was due to using a subquery (also known as an inner query or nested query) that involved NULL values.

I have a table users, like this:

And a table addresses, like this:

In addresses, the user_id column indicates the user -- if any -- to whom the address belongs. I want to find any users who don't have an address. In this example, that would be betty: her user_id, 4, doesn't appear anywhere in addresses. I thought this query would do the job:

select * from users where user_id not in
(select distinct user_id from addresses)

But that query returns no rows! Here's why. Note that the row in addresses with address_id = 5 has a NULL user_id. Therefore, the query is equivalent to:

select * from users where user_id not in (1, 2, 3, NULL)

This, in turn, is equivalent to:

select * from users where user_id <> 1 and user_id <> 2 and user_id <> 3
and user_id <> NULL

In SQL, no comparison to NULL is ever true when using the = or <> operator. So no row matches the query. (The IS operator is proper to use for comparisons to NULL.)

We can avoid this problem by excluding NULL values in our subquery:

select * from users where user_id not in
(select distinct user_id from addresses where user_id is not null)

Or by using COALESCE to convert any NULL values to something we can compare against an integer:

select * from users where user_id not in
(select distinct coalesce(user_id, -1) from addresses)

Either solution will cause the query to output betty's row of the users table, as desired.
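The behavior is easy to reproduce in a throwaway database. The sketch below uses Python's built-in sqlite3 module rather than PostgreSQL, and its data is partly invented: only betty (user_id 4) and the NULL user_id on address 5 come from the example above, while the name column and the other rows are assumptions for illustration. It also tries NOT EXISTS, an alternative not covered above that is unaffected by NULLs in the subquery.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
# Names other than betty, and the address rows 1 to 4, are made up
conn.executescript("""
    create table users (user_id integer, name text);
    create table addresses (address_id integer, user_id integer);
    insert into users values (1, 'al'), (2, 'bob'), (3, 'cat'), (4, 'betty');
    insert into addresses values (1, 1), (2, 2), (3, 3), (4, 1), (5, NULL);
""")

# The original query: the NULL in the subquery means no row can match
not_in = conn.execute("""
    select name from users where user_id not in
    (select user_id from addresses)
""").fetchall()

# First fix from above: exclude NULLs in the subquery
not_in_without_nulls = conn.execute("""
    select name from users where user_id not in
    (select user_id from addresses where user_id is not null)
""").fetchall()

# Alternative: a correlated NOT EXISTS subquery
not_exists = conn.execute("""
    select name from users u where not exists
    (select 1 from addresses a where a.user_id = u.user_id)
""").fetchall()

print(not_in)                # No rows, because of the NULL
print(not_in_without_nulls)  # betty only
print(not_exists)            # betty only
```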

Thursday, July 17, 2014

Installing a Belkin N150 wireless network adapter on Ubuntu 14.04 LTS

I'm a longtime Windows user who also has some experience with Linux. This week, I'm taking the plunge and configuring a desktop system with Ubuntu that I plan to use daily for software development and management tasks. My first challenge was getting connected to the network. I purchased a Belkin N150 wireless network adapter. It came with a driver installation CD for Windows, but my computer has neither Windows nor a CD drive! Luckily, I found some good instructions thanks to theharrylucas. I'll just add a couple of points here:

The instructions are from 2011, so -- like most of the other info I found on this topic -- they pertain to older versions of Ubuntu. However, they worked fine for 14.04.
The instructions refer to the device ID 050d:935a. My device ID is slightly different, 050d:945a. Nonetheless, the instructions worked as is.

Friday, June 6, 2014

Handling null values in a SQLAlchemy query - equivalent of isnull, nullif or coalesce

SQLAlchemy allows you to use the equivalent of SQL's COALESCE function to handle NULL values. I didn't find the documentation for this easy to understand, so here's a quick tutorial...

Depending on which database management system you're familiar with, you might have used ISNULL, IFNULL or COALESCE in a query to return a specific value in case the database column contains NULL.

My example concerns a PostgreSQL database with a table named CONFIG. I want to find the record that was least recently processed, according to its LAST_PROCESSED_AT column, or was never processed, indicated by a NULL in LAST_PROCESSED_AT.

Here's the result of the query

select name, last_processed_at from config:

Note that the third row has a null value for LAST_PROCESSED_AT. Here's a revised query using SQL's coalesce function to map NULLs to the Unix epoch, 1970-01-01 00:00:00 UTC, which sorts before any later timestamp:

select name, coalesce(last_processed_at, to_timestamp(0)) from config

Note the effect on the third row:

Armed with COALESCE, I can find the record I'm looking for with this query:

select name from config order by coalesce(last_processed_at, to_timestamp(0)) limit 1

The question is how to do the same thing with Python and SQLAlchemy. And the answer is this:

import datetime
from sqlalchemy.sql.functions import coalesce

# Assumes session is a SQLAlchemy Session and Config is the class mapped to the CONFIG table
my_config = session.query(Config).order_by(coalesce(Config.last_processed_at, datetime.date.min)).first()

Friday, March 14, 2014

Python Flask - passing JSON to a template in a variable

I ran into a problem in my Python Flask app. I wanted to store some JSON-formatted data in a Python variable, and use JQuery's parseJSON function to consume that data inside a Jinja template. When I tried to parse the JSON in the template, a JavaScript error resulted. I figured out why, and how to work around it.

By the way, the same problem occurs, and the same workaround applies, whether you pass the variable in the call to render_template, or use a session variable.

Here's a copy of my first attempt at the code...

Python:


def index():
    return render_template('viewer_type.html', myVar='{"abc": "123", "def": "456"}')

JavaScript:


<script src='jquery-1.10.0.min.js'></script>
<script>
    var myVar = "{{ myVar }}";
    var myJSON = $.parseJSON(myVar);
    prompt("abc", myJSON.abc);
</script>

This failed. The call to prompt never executed, and Chrome's JavaScript console reported this error:


Uncaught SyntaxError: Unexpected token &

I realized that the contents of the variable were being encoded when Flask passed them to the template. Adding this line of JavaScript...


prompt("myVar", myVar);

... revealed that my JSON had turned into:


{&#34;abc&#34;: &#34;123&#34;, &#34;def&#34;: &#34;456&#34;}

Thanks to this helpful post on stackoverflow, I found a convenient way to decode the encoded text in JavaScript. I encapsulated that in a function:


function decodeJSON(encodedJSON) {
    var decodedJSON = $('<div/>').html(encodedJSON).text();
    return $.parseJSON(decodedJSON);
}

Then I just needed to amend my original JavaScript to call decodeJSON, so the whole thing looks like this:


<script src='jquery-1.10.0.min.js'></script>
<script>
    function decodeJSON(encodedJSON) {
        var decodedJSON = $('<div/>').html(encodedJSON).text();
        return $.parseJSON(decodedJSON);
    }

    var myVar = '{{ myVar }}';
    prompt('abc', decodeJSON(myVar).abc);
</script>

Success! The expected output, 123, was displayed, and there were no more errors in the JavaScript console.
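The same decode-then-parse sequence can be reproduced outside the browser. This is a sketch in Python 3 using only the standard library (html.unescape and json.loads); it is not part of the Flask app, and simply illustrates why parsing the escaped text fails and why decoding it first works.

```python
import html
import json

# The text that reached the browser, copied from the output shown above
encoded = '{&#34;abc&#34;: &#34;123&#34;, &#34;def&#34;: &#34;456&#34;}'

try:
    json.loads(encoded)
    parsed_directly = True
except ValueError:  # json.JSONDecodeError is a subclass of ValueError
    parsed_directly = False

decoded = html.unescape(encoded)  # Turns each &#34; back into a double quote
data = json.loads(decoded)

print(parsed_directly)
print(data['abc'])
```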