IMUG: Computer Vision and Behavior Driven Visual Automation in L10n and i18n QA

George Betak from Yahoo! gave an excellent talk on “Computer Vision and Behavior Driven Visual Automation in L10n and i18n QA” at IMUG tonight. This is the future of QA automation.

George is an interesting fellow. Besides writing cutting-edge software for QA, he is also a leader of SF BayLEAF, the Nissan LEAF Owners Association.

TL;DR: Selenium plus the Sikuli computer vision library allows automated visual testing of web-based apps. It already works in the Yahoo! environment for mature apps that don’t change drastically.


– functional testing is low-level
– object level is still too low level
– complex frameworks take days or weeks to set up, which is tough for QA staff who are not generally expert programmers, and they add even more bugs to find and fix
– e.g. Selenium, TestNG, Java, Hudson/Jenkins

– Angry Birds demo (computer able to play a game, and well)

– Ostriches: Spot the Difference

– tofu, mojibake, text swell
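
Two of these defects are easy to reproduce outside a browser. The snippet below is only an illustrative sketch, not something shown in the talk: the helper names `make_mojibake` and `text_swell` are made up here. It simulates mojibake by decoding UTF-8 bytes as Latin-1, and measures text swell as the length ratio between a translation and its source string.

```python
def make_mojibake(text: str) -> str:
    """Simulate a classic encoding bug: UTF-8 bytes read back as Latin-1."""
    return text.encode("utf-8").decode("latin-1")


def text_swell(source: str, translated: str) -> float:
    """Ratio of translated length to source length (1.0 means no swell)."""
    return len(translated) / len(source)


# "é" is two bytes in UTF-8 (0xC3 0xA9); read as Latin-1 they become "Ã©".
print(make_mojibake("é"))

# German "Speichern" is 2.25 times as long as English "Save".
print(text_swell("Save", "Speichern"))
```

A label that swells this much may no longer fit its button, which is exactly the kind of problem that only shows up on screen.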

Sikuli adds visual test capabilities
– computer vision project based on OpenCV
– MIT and BSD licensed

– Robot Framework, based on Python (it can also run on the JVM via Jython)
– Cucumber framework based on Ruby

– keyword-driven testing is more productive than low-level programming
– paves the way to true localization automation testing
– replaces human testers and increases productivity
– cheaper and more consistent than manual review
– cross-platform and cross-device
– HTML object IDs and names replaced with images
– works with native apps and native dialogs (Save dialog, file picker, etc.) too
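
To make the keyword-driven idea concrete, here is a minimal sketch of my own, not code from the talk. A test is a table of keyword rows, and images stand in for HTML object IDs. `FakeScreen` and the `.png` file names are hypothetical placeholders: the fake screen only records what would happen, where a real setup would hand each keyword to Sikuli or Selenium.

```python
class FakeScreen:
    """Hypothetical stand-in for a real screen driver; it only logs actions."""

    def __init__(self):
        self.log = []

    def click(self, image):
        self.log.append(f"click {image}")

    def type_text(self, text):
        self.log.append(f"type {text}")

    def should_see(self, image):
        self.log.append(f"verify {image}")


def run(steps, screen):
    """Dispatch each (keyword, argument) row to the matching screen action."""
    keywords = {
        "Click": screen.click,
        "Type": screen.type_text,
        "Should See": screen.should_see,
    }
    for keyword, argument in steps:
        keywords[keyword](argument)


# The test itself is plain data that a non-programmer can read and edit.
login_test = [
    ("Click", "username_field.png"),
    ("Type", "tester"),
    ("Click", "login_button.png"),
    ("Should See", "welcome_banner.png"),
]

screen = FakeScreen()
run(login_test, screen)
print("\n".join(screen.log))
```

The point of the design is that the table, not the Python, is what QA staff maintain.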

Sikuli Issues

– Open Source project with limited support
– stability of software stack (bugs, crashes)
– making it work is an art, like configuring the tolerance setting
– 3x or 4x faster than Selenium plus manual
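
The tolerance setting is easier to reason about with a toy model. The sketch below is an assumption-laden illustration and not Sikuli's actual matching algorithm: it scores two tiny grayscale grids by mean pixel difference and accepts a match when the score reaches a threshold. The names `similarity` and `matches` and the sample grids are invented for this example.

```python
def similarity(a, b):
    """Return 1.0 for identical grids and 0.0 for maximally different ones."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("grids must have the same shape")
    diffs = [abs(pa - pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb)]
    return 1.0 - sum(diffs) / (255 * len(diffs))


def matches(expected, actual, tolerance=0.9):
    """Accept the screenshot if it is at least `tolerance` similar to the master."""
    return similarity(expected, actual) >= tolerance


master = [[0, 0, 255], [0, 255, 0]]
tinted = [[10, 10, 245], [10, 245, 10]]      # every pixel off by 10
inverted = [[255, 255, 0], [255, 0, 255]]    # every pixel flipped

print(matches(master, tinted))                  # loose tolerance accepts the tint
print(matches(master, tinted, tolerance=0.99))  # strict tolerance rejects it
print(similarity(master, inverted))
```

Set the threshold too low and real regressions slip through; set it too high and harmless rendering differences fail the run, which is presumably why George called tuning it an art.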

Commercial Tools

– branchauto

Audience Questions

– can multiple browsers and languages be data-driven, meaning parameter-driven? Yes.
– what happens with global font change or color? Sikuli tolerance, or review major changes and promote as master
– false-positives? 97% with image-based comparison tools on global changes, TBD with Sikuli.
– CAPTCHA challenge? Sikuli gets great results.
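
The data-driven answer can be pictured as a simple parameter matrix. This is a sketch under my own assumptions, not Yahoo!'s setup: the browser and locale lists, the `build_matrix` helper and the `masters/...` path layout are all hypothetical. Each combination gets its own master image, since a Japanese page in one browser will not look like a German page in another.

```python
from itertools import product

# Hypothetical parameters; a real suite would load these from configuration.
BROWSERS = ["firefox", "chrome"]
LOCALES = ["en-US", "ja-JP", "de-DE"]


def build_matrix(browsers, locales):
    """Return one test case per browser/locale pair, each with its own master image."""
    return [
        {
            "browser": browser,
            "locale": locale,
            "master": f"masters/{browser}/{locale}/home.png",
        }
        for browser, locale in product(browsers, locales)
    ]


cases = build_matrix(BROWSERS, LOCALES)
for case in cases:
    print(case["browser"], case["locale"], case["master"])
```

Adding a language then means adding one entry to the locale list and capturing a new set of masters, not writing new tests.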

Audience Reaction

Many audience members seemed to mistake machine vision for old-school image screenshot comparison.
Some of the translation-oriented members were unable to understand the programmerish aspects of the talk.

Thanks once again to Yahoo! for hosting the meeting, providing WebEx remote access, and the lovely wine and cheese assortment.

wikipedia: eggPlant, Sikuli
George’s Google+

