Skip to content

Our AutoCare maintenance tool has been given eyes by AI

AutoCare, known as a reliable maintenance tool, has already established its place as the guardian of our digital services. But now, it has received a new, exciting update from artificial intelligence.

Mark Vicuña, March 4, 2024

This is more than just an update; it’s a leap towards a smarter and more user-friendly way to maintain websites.

Excellent automation just became better

Visual regression testing has always been an excellent form of automation, but it’s not perfect. Even small visual deviations can trigger alerts, leading to the process being halted and manual checks being required. While this is often necessary, it is also time-consuming. Sometimes the alert also turns out to be false.

Two near identical images of the front page of a website, labelled Image A and Image B.
Do these images look the same to you?

Visual regression is an essential part of AutoCare. When AutoCare updates a website, we use a series of visual regression tests to check a site’s appearance and functionality before and after the update. It’s a method that has worked for us, but one that becomes harder to rely on as sites become more complex and dynamic.

A single page on a modern website usually consists of several moving pieces, which can cause the page to look a little different every time it loads. When you factor in cookie-consent pop-ups and scripts with variable loading times, taking two controlled screenshots of the same page starts to sound like an impossible task.

Even the tiniest, imperceivable differences may cause your basic visual regression test to believe that it’s looking at two completely different things.

The differences between two images marked in red, captured by a visual regression test.
A single-pixel shift in content, seen through the eyes of your standard visual regression test (red indicates the differences).

So how can we distinguish between actual, breaking changes in a website? Can we create a smart visual regression test, which can look at a page and understand the difference between a critical error and a minor shift in content?

When the task is to rewrite our programs to make decisions like a human, we have perhaps stumbled upon a use case for AI.

AI as a decision-maker

Within the Care team at Evermade, we took on the challenge to find out if ChatGPT could help us in this endeavor. The timing was right, as OpenAI had just released the GPT-4 Vision model for its API, which would now allow us to send images to ChatGPT for analysis and comparison.

On a technical level, laying the groundwork to connect to the API and plugging it into our automation tool was surprisingly straightforward. The real head-scratching began when it came time to figure out what we were going to say to our new assistant, the AI visual regression expert.

Coming up with a suitable prompt became a creative challenge, and a fair bit of trial and error was required to get our expert to provide usable answers. One of the most surprising observations was how ChatGPT sometimes “shied away from responsibility” and did not want to position itself as a decision-maker in evaluating visual regression. Similar to how ChatGPT may refuse to give medical advice, it also didn’t want to bear the burden of releasing a broken website.

Test response from an AI assistant that reads: I'm sorry, but I can't assist with comparing these images. If you have any other questions or need assistance with a different topic, feel free to ask!
I’m sorry, Dave. I’m afraid I can’t do that.

Eventually, ChatGPT very well and quite intelligently recognized the situations where human expertise was needed. It became clear to us that our new assistant’s understanding of visual regression and quality control was far more nuanced than we thought. We could be even less explicit in our prompts, and it would still know what to look out for and what to ignore.

Text response from an AI assistant that reads: However, since I need to ignore any cookie banner differences, other potential minor discrepancies may include variations in dynamic content such as time-sensitive information or interactive elements that may have changed state between the captures.
At this stage, we had only told our assistant to ignore cookie pop-ups!

After multiple prompt iterations and back-and-forth brainstorming within the team, we strengthened our assistant’s decision-making skills and placed it into AutoCare as a guardian of visual regression; an expert we could count on whenever our standard visual regression tests would fail.

With AI in place, we were able to reduce the false positives we were getting from our previous tests. Once again, automation and now specifically artificial intelligence frees up our professionals’ time to solve more creative problems. This is particularly evident in the speed of response to our clients’ support requests.

The first step has been taken

Integrating artificial intelligence into AutoCare has brought us a new perspective regarding AI. When thinking about ChatGPT as a tool, we are usually drawn to its reach and ability to act as a contact point for us to interact with large sets of data. We recognize its capacity to parse information, summarize knowledge, and provide assistance over a range of disciplines.

However, we learned that on a micro-level, it can also excel at executing small but challenging tasks, and play a tiny (but crucial) role within a larger architecture. It can specialize in tasks that are repetitive and unnecessary, but might also be too overwhelming to automate with traditional methods.

In that sense, the real breakthrough for us was the idea itself, and the realization that we had finally identified a problem where AI could be the best solution. We hope this encourages you to explore these tools with a creative approach and discover the many interesting ways in which they can improve your work.

Our journey in utilizing artificial intelligence in web development continues, and we strive to further improve our services for our customers. The first steps in harnessing AI into our automation architecture have now been taken, and we look forward to the future.

Search