Selenium Web Automation with Python

Almost everyone uses the web at some point, so developers are under real pressure to ensure their web applications function as intended. Web automation can be very helpful for this.

For any commercial software to be successful, it has to undergo a number of tests. Automation is useful for user tests, where it simulates the use of the software just as a user would. It is also useful for penetration tests, such as attempts to crack passwords or perform SQL injections.

Aside from testing, web automation can be very handy for scraping JavaScript-heavy websites.

Selenium is one of the most efficient tools for web automation. It is also popular across programming languages, with bindings available for languages such as Java and JavaScript in addition to Python.

Installation

Selenium can be installed for Python with pip, as shown in the command below:

pip install selenium

This installs the library and the dependencies it needs. The installation can be confirmed by importing it in an interactive session.

$ python
Python 3.5.2 (default, Sep 14 2017, 22:51:06)
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import selenium

Since no error occurred, our installation was successful. However, it doesn't end there: Selenium works hand in hand with browsers such as Chrome and Firefox, and it needs a driver for the browser in order to control it.

Next, we will look at how to get the drivers installed. For Mozilla Firefox, you can download its driver, known as geckodriver, from the GitHub page. If you are a Chrome user, you can download its driver, known as chromedriver, from the official site.

After downloading, add the driver to the path. Personally, I like to keep such a file in my /usr/local/bin directory, and I'd advise you to do the same.

If you want to do the same, the commands below move the driver from your current directory to the bin directory.

$ sudo mv geckodriver /usr/local/bin
$ sudo mv chromedriver /usr/local/bin

/usr/local/bin is usually on the PATH already. If it is not, add the directory (PATH entries are directories, not individual files) with the following command.

$ export PATH=$PATH:/usr/local/bin
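Before launching a browser, it can help to check that the driver executable is actually visible on the PATH. Below is a minimal sketch using only the standard library. The helper name find_drivers is my own and not part of Selenium.

```python
import shutil


def find_drivers(names=("geckodriver", "chromedriver")):
    """Map each driver name to its full path on PATH, or None if missing."""
    return {name: shutil.which(name) for name in names}


for name, path in find_drivers().items():
    # shutil.which returns None when the executable is not on PATH
    print(name, "->", path or "not found on PATH")
```

You only need the driver for the browser you plan to use, so one "not found" entry is fine.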

After adding the driver for your desired browser to the path, you can confirm that everything works by running the following from an interactive session.

For Firefox:

$ python
Python 3.5.2 (default, Sep 14 2017, 22:51:06)
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from selenium import webdriver
>>> driver = webdriver.Firefox()

For Chrome:

>>> from selenium import webdriver
>>> driver = webdriver.Chrome()

After running that, if a browser comes up then everything is working fine. Now we can proceed to do cool stuff with Selenium.

Most of the code in the rest of this article is run in the interactive session; however, you can write it in a file just like your usual Python script.

Also, we will be working with the driver variable from the code above.

Visiting web pages

Once the browser is open, you can visit any web page by calling the get method on driver. The opened browser then loads the address passed in, just as it would if you typed it in yourself.

Do not forget to include http:// or https://, or you'll have to deal with unpleasant errors.

>>> driver.get("http://google.com")

This would load the Google homepage.
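To guard against a missing scheme, you could wrap the call in a small helper. This is only a sketch: the names with_scheme and visit are my own, and defaulting to https is an assumption that may not suit every site.

```python
def with_scheme(url, default="https"):
    """Return url unchanged if it already names a scheme, else prepend one."""
    if "://" in url:
        return url
    return default + "://" + url


def visit(driver, url):
    """Load url in the browser controlled by driver, adding a scheme if needed."""
    full_url = with_scheme(url)
    driver.get(full_url)
    return full_url
```

With the driver from earlier, visit(driver, "google.com") would ask the browser to load https://google.com.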

Getting source code

Now that we have learnt to visit web pages, we can scrape data from the visited web page.

From the driver object, we can get the source code by reading the page_source attribute. You can then do whatever you want with the HTML, for example by parsing it with the BeautifulSoup library.

>>> driver.page_source
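As a rough illustration of what "doing something with the HTML" can look like, the sketch below collects every link on a page. To keep it free of extra dependencies it uses the standard library's html.parser rather than BeautifulSoup, and it runs on a hard-coded sample string standing in for driver.page_source. The names LinkCollector and extract_links are my own.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def extract_links(html):
    """Return the list of link targets found in html, in document order."""
    collector = LinkCollector()
    collector.feed(html)
    return collector.links


# Stand-in for driver.page_source so the sketch runs without a browser.
sample = '<html><body><a href="/about">About</a><a href="/store">Store</a></body></html>'
print(extract_links(sample))
```

In a live session you would pass driver.page_source to extract_links instead of the sample string.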

Filling text boxes

Suppose we have loaded Google's homepage and want to type some text into the search box. This can easily be done.

First, we use the browser's inspector to check the source code and see the tag information of the search box. Simply right-click on the search box and select Inspect Element.

On my machine, I got the following:

With Selenium, we can select elements by tag name, id, class name and so on.

This is done with the following methods:

.find_element_by_id
.find_element_by_tag_name
.find_element_by_class_name
.find_element_by_name

On the Google home page, the search box has the id lst-ib, so we find the element by id.

>>> search_box = driver.find_element_by_id("lst-ib")

Now that we have found the element and saved it in a search_box variable, we can perform some operations on the search box.

>>> search_box.send_keys("Planet Earth")

This would input the text "Planet Earth" in the box.

>>> search_box.clear()

This would clear the entered text from the search box. Use the send_keys method again to re-enter the text, because in the next section we click the search button and need something to search for.
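The find, clear and type steps can be bundled into one small function. One caveat: as far as I know, recent Selenium releases (4.3 and later) no longer ship the find_element_by_* helpers, and the generic find_element(by, value) call is used instead, so the sketch below uses that form. The function name fill_box is my own, and "id" is the string behind Selenium's By.ID constant.

```python
def fill_box(driver, element_id, text):
    """Find a text box by its id, clear it, and type text into it."""
    # "id" is the value of selenium.webdriver.common.by.By.ID
    box = driver.find_element("id", element_id)
    box.clear()          # remove whatever is already in the box
    box.send_keys(text)  # type the new text
    return box
```

With the driver from earlier, fill_box(driver, "lst-ib", "Planet Earth") would repeat the steps above in one call.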

Clicking the right buttons

Now that we have filled the search box with some information, we can go ahead and search.

We find the search button the same way we found the search box.

On my machine, I got the following:

Looking at this we can make use of the name attribute. We can get it by using the code below:

>>> search_button = driver.find_element_by_name("btnK")

After finding the desired tag, we can then click on the button using the click method.

>>> search_button.click()

Be careful though: because of Google's auto-suggestions, you may end up searching for something else.

To bypass this, you need to make the keyboard hit the enter key immediately. Keys are beyond the scope of this article, but here's the code anyway.

>>> from selenium.webdriver.common.keys import Keys
>>> search_box = driver.find_element_by_id("lst-ib")
>>> search_box.send_keys("Planet Earth")
>>> search_box.send_keys(Keys.RETURN)

With the code above, we do not have to click the search button. It works just like it would when we hit the enter key after typing in the search values.

This method of clicking doesn't only work with buttons; it also works with links.

Taking screenshots

You read that right! You can take screenshots using Selenium, and it's as easy as the previous sections.

We call the save_screenshot method on the driver object and pass in the name of the image, and the screenshot is taken.

>>> driver.save_screenshot("Planet-earth.png")

Ensure that the image name has a .png extension, else you might end up with a corrupted image.
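If you would rather not rely on remembering the extension, a small wrapper can enforce it. This is a sketch with helper names of my own (png_name and take_screenshot). It relies on save_screenshot returning False when the file cannot be written.

```python
from pathlib import Path


def png_name(name):
    """Return name with a .png extension, replacing any other extension."""
    return str(Path(name).with_suffix(".png"))


def take_screenshot(driver, name):
    """Save a screenshot under a .png name and return the name used."""
    target = png_name(name)
    # save_screenshot returns False if the file could not be written
    if not driver.save_screenshot(target):
        raise OSError("could not save screenshot to " + target)
    return target
```

Note that with_suffix replaces whatever follows the last dot, so a name such as "shot.jpg" becomes "shot.png".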

When you are done with the operations, you can close the browser by running the following code:

>>> driver.close()
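In a script, as opposed to the interactive session, it is easy to leave a browser running when an error interrupts the code. One way to avoid that is a small context manager, sketched below. The name browser is my own. It calls quit, which ends the whole browser session, whereas close only closes the current window.

```python
from contextlib import contextmanager


@contextmanager
def browser(factory):
    """Start a browser with factory() and always shut it down afterwards."""
    driver = factory()
    try:
        yield driver
    finally:
        # quit() ends the whole session; close() only closes the current window
        driver.quit()
```

It would be used as "with browser(webdriver.Firefox) as driver:" followed by the usual driver.get and other calls in the indented block.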

Conclusion

Selenium is known as a very powerful tool, and being able to use it is considered a vital skill for automation testers. Selenium can do much more than what is discussed in this article; for example, keyboard input can be replicated, as shown with Keys.RETURN. If you wish to learn more about Selenium, you can check out its documentation, which is quite clear and easy to use.
