Automate Web Page Logins

To keep the number of users at a safe level during Covid, we had to log into web sites to book access to activities like pools, gyms and ski hills. These precautions made sense; however, the booking process was awkward, and it was easy to miss an activity if you didn’t sign up early enough.

Luckily there are some great Linux tools that can be used to automate web logins. In this blog I’ll look at two techniques:

  • Keyboard simulation using xte and xdotool
  • Python accessing the page’s HTML using Selenium

Linux Keyboard Simulation

If you are working in MS-Windows, take a look at the SendKeys command.

In the Linux environment there are a number of choices, such as xautomation and xdotool.

The xdotool package is feature rich, with special functions for working with the desktop and windows. The xautomation package is a little simpler, and it focuses on keyboard and mouse simulation.

I found it very useful to install wmctrl. It lets you easily find out which windows are running, and it can set the active window using a substring (you don’t need the full window name).

To install xautomation and wmctrl in Ubuntu:

sudo apt-get install xautomation wmctrl

To install xdotool in Ubuntu:

sudo apt-get install xdotool

Log in Example with xte

To create an automation script you need to do the required steps once manually and note where each Tab, entry field or Return key is needed. Some trial and error will probably be required for timing if you are working between page links or with pages that are slow to load.
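One way to reduce the timing guesswork is to poll for a condition instead of sleeping for a fixed time. The following is a minimal sketch, not part of the original scripts: wait_until is a hypothetical helper name, and the condition you pass in (for example "the browser window now exists") is up to you.

```python
import time


def wait_until(check, timeout=10.0, interval=0.25):
    """Call check() repeatedly until it returns a truthy value.

    Returns that value, or None if `timeout` seconds pass first.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = check()
        if result:
            return result
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval)
```

A fixed sleep has to be long enough for the slowest page load, while a polling loop like this returns as soon as the condition is met and gives up cleanly if it never is.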

A good simple example is to try and log into Netflix.

The following Bash script will: 1) open a Chromium browser page, 2) set the focus to the page and 3) send the correct Tab, text and Return keys.

#!/bin/bash
# netflix_login.sh - script logs into Netflix
#

url="https://www.netflix.com/ca/login"
email="my_email.com"
pwd="my_password"

chromium-browser "$url" & # open browser to the login page

# wait for the page to open, then set focus to it
sleep 2
wmctrl -a "Netflix - Chromium"
sleep 1 # allow time to get focus before sending keys
xte "key Tab"
xte "key Tab"
xte "str $email"
xte "key Tab"
xte "str $pwd"
xte "key Tab"
xte "key Tab"
xte "key Return"

echo "Netflix Login Done..."

Use wmctrl -l to see which windows are open.

A specific window can be given the focus based on a substring of its title, with the -a option.
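If you want to check from a script whether a window is open before sending keys to it, the wmctrl -l listing can be parsed. This is a rough sketch that assumes the usual four-column output (window id, desktop number, host name, title). The function names are illustrative only.

```python
import subprocess


def parse_wmctrl(listing):
    """Parse `wmctrl -l` text into (window_id, desktop, host, title) tuples."""
    windows = []
    for line in listing.splitlines():
        parts = line.split(None, 3)
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        title = parts[3] if len(parts) == 4 else ""
        windows.append((parts[0], int(parts[1]), parts[2], title))
    return windows


def find_windows(listing, substring):
    """Return the titles that contain `substring`, ignoring case."""
    needle = substring.lower()
    return [w[3] for w in parse_wmctrl(listing) if needle in w[3].lower()]


def list_windows():
    """Run `wmctrl -l` and return its output (needs wmctrl and an X session)."""
    result = subprocess.run(
        ["wmctrl", "-l"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

For example, find_windows(list_windows(), "netflix") returns an empty list until the browser window has appeared.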

The xte command uses a string to pass key, text and mouse actions to the active window.
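The same xte command strings can also be sent from Python. The sketch below rebuilds the Netflix login sequence above as a list and passes it to a single xte call. The function names are hypothetical, and it relies on xte accepting several command strings as separate arguments.

```python
import subprocess


def xte_login_commands(email, pwd):
    """Build the xte command strings used in the Netflix login script."""
    return [
        "key Tab",
        "key Tab",
        f"str {email}",
        "key Tab",
        f"str {pwd}",
        "key Tab",
        "key Tab",
        "key Return",
    ]


def send_with_xte(commands):
    """Send the commands to the active window (needs xte and an X session)."""
    # Each command is passed as its own argument, so no shell quoting is involved.
    subprocess.run(["xte", *commands], check=True)
```

Passing the arguments as a list avoids shell quoting problems when a password contains characters such as $ or a double quote.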

Park Booking using xdotool

The xdotool syntax to send keystrokes and text is very similar to xte, but with a few extra features.

The park booking example is a little bit more complex because a booking time needs to be selected from a list. It might be possible to tab to the required time, but a safer way would be to do a search to find the required time.

Neither the xte nor the xdotool utilities support a search text function. A simple workaround is to use the web browser’s search function. To have keystrokes go to the result of the browser search it’s important to enable Caret navigation.

The Caret dialog is shown by pressing F7. It’s important to note that the Caret Enable/Cancel or Yes/No buttons vary between browsers.

For the park booking page, the automation script needs to manage 8 entry fields. To keep things simple I’ll pass the date in the URL.
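As an illustration of passing the date in the URL, here is a small sketch that builds the booking URL for any date. The parameter names and values are copied from the script in this section. The function name is made up, and the sketch assumes the site accepts query parameters joined by a single "&".

```python
from datetime import date
from urllib.parse import urlencode

BOOKING_PAGE = "https://book.parkpassproject.com/book"


def booking_url(start_date: date,
                inventory_group: str = "1554186518",
                inventory: str = "1229284276") -> str:
    """Return the booking page URL with startDate set to `start_date`."""
    params = {
        "inventoryGroup": inventory_group,
        "inventory": inventory,
        "startDate": start_date.isoformat(),  # formatted as YYYY-MM-DD
    }
    return f"{BOOKING_PAGE}?{urlencode(params)}"
```

Building the URL this way means only the date has to change from one booking to the next.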

The important step is booking a time slot. For this, Control-F calls up the Search Dialog. After the required time is entered, the navigation moves to the selected time. The next step is to close the Search Dialog, which is done by sending three Tabs and a Return key.

The final bash script is shown below. One of the useful features of xdotool is that it can do repeat key strokes with a delay between entries.

#!/bin/bash
# book10am.sh - make a 10:00 park booking
#
sdate="startDate=2021-04-23" #adjust the date
url="https://book.parkpassproject.com/book?inventoryGroup=1554186518&&inventory=1229284276&$sdate"

chromium-browser "$url" & #open browser to park booking page
sleep 5 # wait for browser to come up
wmctrl -a "Chromium"
sleep 2
# Turn on caret browsing
xdotool key F7
xdotool key Return
sleep 1

# tab to 'Time Slot' area 
tabcnt=8
xdotool key --repeat $tabcnt --delay 100 Tab

xdotool key Return
sleep 1

# Search for 10:00 time and select it
xdotool key ctrl+f 
xdotool type '10:00'
xdotool key Return
# Close find dialog and select time
xdotool key Tab Tab Tab Return Return

echo "Park Time Booking Complete"
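The key sequence above could also be assembled from Python, which makes the tab count and the search time easy to change. This is only a sketch: xdotool_key, booking_steps and run_steps are hypothetical names, the xdotool options mirror the ones used in the script, and pausing after every step is a simplification of the script's sleeps.

```python
import subprocess
import time


def xdotool_key(*keys, repeat=1, delay_ms=None):
    """Build the argument list for one `xdotool key` call."""
    argv = ["xdotool", "key"]
    if repeat > 1:
        argv += ["--repeat", str(repeat)]
    if delay_ms is not None:
        argv += ["--delay", str(delay_ms)]
    return argv + list(keys)


def booking_steps(time_text="10:00", tabcnt=8):
    """Return the xdotool calls from the booking script, in order."""
    return [
        xdotool_key("F7"),                                # caret browsing dialog
        xdotool_key("Return"),                            # confirm the dialog
        xdotool_key("Tab", repeat=tabcnt, delay_ms=100),  # move to 'Time Slot'
        xdotool_key("Return"),
        xdotool_key("ctrl+f"),                            # browser search
        ["xdotool", "type", time_text],
        xdotool_key("Return"),
        xdotool_key("Tab", "Tab", "Tab", "Return", "Return"),
    ]


def run_steps(steps, pause=1.0):
    """Run each step, pausing between them (needs xdotool and an X session)."""
    for argv in steps:
        subprocess.run(argv, check=True)
        time.sleep(pause)
```

Calling run_steps(booking_steps("11:30")) after the browser window has focus would search for 11:30 instead of 10:00.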

Script Limitations with xdotool and xte

Using xdotool or xte is great for simple web page automation where the HTML form items are sequential and no special decision making is required.

Unfortunately, when I tried to book a park time on the weekend I started to see some limitations. During busy times, if I tried to book by time, xte or xdotool could not determine whether the time slot was full.

A simple workaround would be to search for the first ‘Available’ or ‘Not Busy’ slot, but this doesn’t allow you to pick the times that you like.

For projects that require some logic (like picking a good time from a list of times), or for more complex web automation in general, Selenium with Python is an excellent fit.
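The kind of decision that a keystroke script cannot make is easy to express once the times and their statuses are available as data. Here is a minimal sketch, assuming the page has already been read into (time, status) pairs. The function name and the data layout are illustrative, and the status strings are the ones mentioned above.

```python
BOOKABLE = {"Available", "Not Busy"}


def choose_slot(slots, preferred):
    """Return the first preferred time that can still be booked, or None.

    slots     -- list of (time, status) pairs read from the page
    preferred -- times in order of preference, e.g. ["10:00", "11:00"]
    """
    open_times = {slot_time for slot_time, status in slots if status in BOOKABLE}
    for wanted in preferred:
        if wanted in open_times:
            return wanted
    return None
```

With a preference list like this, the script falls back to the next acceptable time instead of simply taking whichever slot is listed first.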

Installing Selenium with Python

Selenium is a portable framework for testing web applications, with server/client tools and IDEs.

The Selenium WebDriver component sends commands from client APIs directly to a browser. There are webdriver components for Firefox, Google Chrome, Internet Explorer, Safari, Opera and Edge. Client APIs are available for C#, Go, Java, JavaScript, PHP, Python and Ruby.

For more information on installing the different webdrivers, see: https://www.selenium.dev/downloads/

To install the Linux 32-bit Selenium Driver (geckodriver) for Firefox:

wget https://github.com/mozilla/geckodriver/releases/download/v0.29.1/geckodriver-v0.29.1-linux32.tar.gz
tar -xvzf geckodriver-v0.29.1-linux32.tar.gz
chmod +x geckodriver
sudo mv geckodriver /usr/local/bin

To install the Selenium library for Python:

pip install selenium

The big difference between using xte and Selenium is that Selenium can directly access the HTML code of the selected web page. The xte approach to accessing form items is to tab to them like you would with a keyboard. Selenium allows code to access an HTML item using its id or name.

Selenium Log In Example

As with xte, some manual work is needed before the script is written.

Once the required web page is open, the Web Developer Inspector tool can be used to examine the HTML code. To access the Inspector in Firefox, select Tools > Web Developer > Inspector from the top menu bar, or use the shortcut Ctrl+Shift+C.

For the Netflix log in example the key HTML items are the “email or phone number” input and the password input. Using the Inspector, the ids (or names) of these inputs can be found.

The Python code to log into Netflix is:

#
# netflix_login.py - automate Netflix Login
#
from selenium import webdriver
from selenium.webdriver.common.by import By

url="https://www.netflix.com/ca/login"
email="my_email.com"
pwd="my_password"

browser = webdriver.Firefox()

browser.get(url)

# element lookups will wait up to 10 seconds for the element to appear
browser.implicitly_wait(10)

username = browser.find_element(By.ID, 'id_userLoginId')
username.send_keys(email)

password = browser.find_element(By.ID, 'id_password')
password.send_keys(pwd)

# submit the form that contains the password field
password.submit()

print("Login Complete")

When a web page is loaded it’s important to give the page some time to refresh. The implicitly_wait(10) call makes each Selenium element lookup wait up to 10 seconds for the element to appear. HTML items can be found by id or by name, using find_element(By.ID, ...) or find_element(By.NAME, ...). Older Selenium releases spelled these find_element_by_id() and find_element_by_name(), but those methods were removed in Selenium 4.3. The send_keys() method is used to pass text strings to <input> tags. Finally, calling submit() sends all the form data to the form’s action.
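As an alternative to the implicit wait, Selenium also offers explicit waits that block until one particular element is present. The sketch below is one possible way to write the login step with them. login_when_ready is a hypothetical function name, and the element ids are the ones used in the login script.

```python
def login_when_ready(browser, email, pwd, timeout=10):
    """Wait for the login fields to exist, then fill in and submit the form."""
    # Imported inside the function so the file loads without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    wait = WebDriverWait(browser, timeout)
    # until() raises TimeoutException if the element never shows up.
    username = wait.until(
        EC.presence_of_element_located((By.ID, "id_userLoginId"))
    )
    password = wait.until(
        EC.presence_of_element_located((By.ID, "id_password"))
    )
    username.send_keys(email)
    password.send_keys(pwd)
    password.submit()
```

It would be called as login_when_ready(browser, email, pwd) right after browser.get(url), in place of the implicitly_wait() call.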

Using Selenium Searches

From the earlier park booking example we saw that xte had some limitations when a variable list of options was presented. Luckily Selenium has a number of functions that can be used for searching HTML tags and text.

The first step is to manually open the web page and inspect the structure.

For this example the Inspector shows that each status entry in the list has a <div class="jss97">, with the “Available” items having a <div class="jss100"> and the “Not Busy” items being a <div class="jss100">. The very top level is <div class="jss94">, which has both the times and the status messages.

Now that I know how the park times are defined I can check for the first “Not Busy” entry, or I can check a specific time and see if it’s busy or not.

With this structure, a script can report the first “Not Busy” time. To select a specific time/status entry, use the click() method (itimes.click()).
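A minimal sketch of such a search is shown here. It assumes that each time/status row is its own <div class="jss94"> element and that the status appears in the row’s visible text. The description above does not settle either point, so check them with the Inspector first. The function names are illustrative, and class names like jss94 are typically generated, so they may change when the site is updated.

```python
def first_with_status(entries, status="Not Busy"):
    """Return the first entry whose visible text contains `status`, else None."""
    for entry in entries:
        if status in entry.text:
            return entry
    return None


def show_first_not_busy(url):
    """Open the booking page and print the first 'Not Busy' time/status row."""
    # Imported inside the function so the helper above works without Selenium.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    browser = webdriver.Firefox()
    try:
        browser.implicitly_wait(10)
        browser.get(url)
        # Assumption: every time/status row is a <div class="jss94">
        entries = browser.find_elements(By.CLASS_NAME, "jss94")
        itimes = first_with_status(entries, "Not Busy")
        if itimes is None:
            print("No 'Not Busy' time found")
        else:
            print("First 'Not Busy' time:", itimes.text)
            # itimes.click()  # uncomment to select this entry
    finally:
        browser.quit()
```

To check a specific time instead, pass the time text (for example "10:00") as the status argument and then look at the rest of the returned row’s text.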

Final Comments

Using simulated keyboard and mouse movements is a nice tool for automating simple web pages.

If you need more complex web automation, check out Selenium. It has a Python library that allows you to access the DOM of a web page.
