Better Web automation using orthogonal principles, with a practical example

Write code that is easier to maintain and less likely to break

Luis Garcia Fuentes
May 4, 2023

Orthogonal code is a programming concept that is often overlooked but plays a crucial role in software development. Orthogonal code means code that is modular and loosely coupled, which is important because it makes the code easier to maintain, understand, and modify over time.

From the book The Pragmatic Programmer

One example of how tightly coupled code becomes problematic is when one function calls another function in the same codebase. If the second function is modified or removed, the first one can break. This can create a “spaghetti code” situation where multiple functions depend on each other, making the code difficult to modify or maintain over time.

Practical Example 🍝

Take, for example, the code below for a function named “action_download_invoices”. Looking at what this function needs to accomplish, we know we can rewrite it into smaller chunks that are easier to debug. Additionally, the presence of 16 “if” and “try” statements (combined) indicates that if this function were to break, debugging it could turn into a long evening.

Downloading an invoice is indeed not too complicated. Here is the code that can do this job:

So how is it possible that the prior version required far more code? Because the function did far more than just download the invoice: it ran multiple checks to verify that the invoice could actually be downloaded (right format, etc.), checked whether the PDF had actually been downloaded into a folder, renamed the PDF, moved the file into a new folder, and deleted the file from the original folder.
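To make that list concrete, here is a minimal sketch of how the file-handling steps could each become their own small function. The function names, the folder layout, and the `invoice_<id>.pdf` naming scheme are assumptions made for illustration; this is not the original script.

```python
from pathlib import Path
import shutil


def is_downloaded(download_dir: Path, filename: str) -> bool:
    """Check that the PDF actually landed in the download folder."""
    return (download_dir / filename).is_file()


def rename_invoice(pdf_path: Path, invoice_id: str) -> Path:
    """Rename the PDF after its invoice id (hypothetical naming scheme)."""
    return pdf_path.rename(pdf_path.with_name(f"invoice_{invoice_id}.pdf"))


def archive_invoice(pdf_path: Path, archive_dir: Path) -> Path:
    """Move the PDF into the archive folder; moving also removes it from the original folder."""
    archive_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.move(str(pdf_path), str(archive_dir / pdf_path.name)))
```

Each function does one thing and only depends on the path handed to it, so a failure in renaming, for instance, can be debugged without touching the download or archive steps.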

As you read this, you probably get the feeling that many of these actions could be their own functions if we were to follow orthogonal principles. But how do you get started? And where does error handling come into play?

Begin with a diagram 🌱

I once heard that “People who program in C++ use Python as a quick prototyping tool, while people who program in Python use paper to prototype”. While this is probably an attempt at dissing Pythonistas, there is value in living up to the stereotype.

Diagrams allow you to get your mind around what you are going to build without worrying about whether your code runs. Diagrams also allow you to visualize where you expect to encounter errors (the highlighted steps) and to plan how to proceed in those instances.

While the diagram will not determine your final solution, it will capture the most important functions to get you started. As you test your code, you will learn that there are unforeseen bugs to be accounted for. But now that you already have 12 main functions written (8 general functions and 4 error-handling functions), you can tell far more easily whether you need additional functions or tweaks to the work you have already defined.

Accounting for unforeseen bugs 🐞

Example 1: Selenium fails to deliver key inputs

In my diagram above I go directly from step 3) inputting invoice details to step 4) clicking search. This seems quite intuitive, but after testing the code, I learned that Selenium sometimes failed to interact with the browser. As a result, the script could run the search without having finished inputting all the necessary information, which can lead to downloading the wrong data.

To account for this, I created a new function that is meant to run in between these two steps. This new function reads the content of the cells that I have just filled in and compares the text found there to what I was supposed to pass in via send_keys(). The script uses this comparison to verify that what we send as input keys is what the browser actually recorded.
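The comparison step can be sketched as follows. This is a minimal illustration, not the original function: the name `find_input_mismatches`, the field names, and the `read_field` callable (which stands in for whatever reads a field's current value from the browser) are all assumptions.

```python
from typing import Callable, Dict


def find_input_mismatches(
    expected: Dict[str, str],
    read_field: Callable[[str], str],
) -> Dict[str, str]:
    """Return {field_name: actual_value} for every field that differs from what was typed."""
    mismatches = {}
    for field_name, expected_value in expected.items():
        actual_value = read_field(field_name)
        # Ignore leading/trailing whitespace, which browsers may add or trim.
        if actual_value.strip() != expected_value.strip():
            mismatches[field_name] = actual_value
    return mismatches
```

With Selenium, `read_field` could wrap something like `driver.find_element(By.ID, field_name).get_attribute("value")`, assuming the input fields can be located by id. An empty result means it is safe to move on to clicking search; anything else signals that the inputs should be re-entered first.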

Notice that none of this is complex thinking. In fact, both identifying the issue and coming up with a solution are very simple work. This is the beauty of orthogonal code principles.

Example 2: A table that should be filled with content is empty

Step 5 in the diagram is to check if there is an invoice in the table that provides links to PDF attachments. This was originally handled by reading the table contents first, as the website is coded in such a way that if there are no attachments provided for an invoice, the text “No invoice found” is displayed where you would expect the invoice link to be.

However, during testing, we discovered that in some scenarios the website would not render any message; instead, it would simply show an empty table. We accounted for this with the function below, which was placed right after what our diagram names step 5.
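A minimal sketch of such a check is shown here. It is an illustration under assumptions rather than the original function: the name `classify_invoice_table` and the three return labels are hypothetical, and the table is assumed to have already been read into a list of cell texts.

```python
from typing import List

NO_INVOICE_TEXT = "No invoice found"  # the message the website normally displays


def classify_invoice_table(cell_texts: List[str]) -> str:
    """Classify the result table as 'empty', 'no_invoice', or 'has_invoice'."""
    non_blank = [text.strip() for text in cell_texts if text.strip()]
    if not non_blank:
        # The unforeseen case: the table rendered without any message at all.
        return "empty"
    if any(NO_INVOICE_TEXT.lower() in text.lower() for text in non_blank):
        return "no_invoice"
    return "has_invoice"
```

With Selenium, `cell_texts` could be built with something like `[cell.text for cell in driver.find_elements(By.XPATH, "//table//td")]`, where the XPath is a placeholder. Keeping the classification separate from the page reading means the empty-table case can be tested without opening a browser.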

While I set out to decompose a script of 3 functions into 12 different functions, the final version of the script now counts 23 distinct functions (I do not provide a breakdown of all the other useful functions created). Each of these is fully independent of the others, except for the output expected by functions further down the run.

Explaining the code 👩‍💻

You may have noticed that I did not explain any of my code in this article so far. This was intentional, as it was unnecessary to understand the gist of orthogonal code principles. However, in the remainder of this article, I will explain some key decisions made while writing this script.

The function below follows a common template used by each function written in this exercise, so understanding it will help you understand over half of the entire code I’ve written.

[1] It defines a list of two different XPath expressions that point to the search button element on the webpage. This is useful because some web pages alternate between different versions of their HTML, and having multiple options to choose from increases the chances of finding a working XPath expression.

By passing XPaths as a list, we can quickly iterate between them, until one is loaded by the browser. This is better than writing code that requires waiting a long time for one version of the XPath to load before even attempting the second.

bad code

[2] Next, the function sets the attempts variable to 5, which is the maximum number of times the function will attempt to click on the search button. This ensures that the function won't get stuck in an infinite loop if the search button is never clickable.

[4] The function then uses a loop to iterate over the list of XPath expressions. Within the loop, it uses Selenium's WebDriverWait to wait for up to 1 second for the search button to become clickable. If the search button is clickable within that time, the function clicks on the button and sets the clicked variable to True. If the search button is not clickable within 1 second, the function continues to the next XPath expression in the list.

[3] If the search button has been clicked (i.e., clicked is True), the function breaks out of both loops and returns. This means that the function was successful in clicking the search button and the search process can continue.

If after 5 attempts the search button has still not been clicked, the function prints a message stating that it was unable to click the search button and returns. This means that the function was unable to find a working XPath expression for the search button and the search process cannot continue.
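The template described above can be sketched roughly as follows. This is an approximation under assumptions, not the original function: it returns a boolean instead of tracking a `clicked` flag, the names `click_first_available` and `make_selenium_try_click` are hypothetical, and the Selenium imports are deferred so the retry logic can be exercised without a browser.

```python
from typing import Callable, Sequence


def click_first_available(
    xpaths: Sequence[str],
    try_click: Callable[[str], bool],
    attempts: int = 5,
) -> bool:
    """Cycle through the XPaths up to `attempts` times; stop at the first successful click."""
    for _ in range(attempts):
        for xpath in xpaths:
            if try_click(xpath):
                return True  # leaves both loops at once
    print("Unable to click the search button")
    return False


def make_selenium_try_click(driver, timeout: float = 1.0) -> Callable[[str], bool]:
    """Build a `try_click` backed by Selenium (imported lazily, only when this is called)."""
    from selenium.common.exceptions import TimeoutException
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    def try_click(xpath: str) -> bool:
        try:
            # Wait up to `timeout` seconds for this XPath to become clickable.
            button = WebDriverWait(driver, timeout).until(
                EC.element_to_be_clickable((By.XPATH, xpath))
            )
        except TimeoutException:
            return False
        button.click()
        return True

    return try_click
```

A call could then look like `click_first_available(search_xpaths, make_selenium_try_click(driver))`, where `search_xpaths` holds the two XPath expressions. Because each wait is short, the second XPath is tried about a second after the first one fails, rather than after one long timeout.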

Hope you enjoyed the read! Let's get coding!
