As a Master's student studying Embedded Systems at Delft University of Technology, I was very interested in automated testing. At one of the university’s recruitment days, most recruiters didn’t really know what Embedded Systems are, until by chance I met an Adyen representative who told me that the company was using robots to automate the testing of their point-of-sale system. These robots simulate the actions normally done by a real person, like inserting a card, typing in a PIN code, and so on. My interest was immediately sparked.
In mid-2016 I was excited to join Adyen as an intern on the development team. From day one, interns are expected to solve real problems for the company, and I was no exception. In my case, I had the opportunity to improve Adyen’s point-of-sale (POS) test setup by looking at data using passive learning.
The challenge
From a shopper’s perspective, doing an in-store payment seems to be fairly simple. However, the configurations behind a point-of-sale payment are actually quite complex. Some of the possible variations to consider are:
- Card brands
- Card entry modes (chip, swipe, or contactless)
- Currency conversion (aka Dynamic Currency Conversion or DCC)
- Loyalty points
- Validation type (PIN versus signature)
- Issuer responses (such as declined due to insufficient balance)
- Cancellations by shopper or merchant
In addition, there are multiple other scenarios to consider. For example, the shopper may abort the transaction (at any stage), remove their card, and so on. In all, there are hundreds of possible payment flows.
While automation software such as Jenkins can run hundreds of tests for Adyen’s backend code within seconds, the robots that test the transactions need much longer. With the mushrooming number of payment configurations, it was not feasible to run all our tests in this manner, so we began to investigate new ways to scale our testing and analyze the performance of our solution.
The tools
I investigated possibilities to use the log files produced by the payment terminals, to see if the terminals in the field perform as we expect, and whether what the robots are testing actually covers the main flows in the field. The approach that sparked our interest for this was passive learning.
Passive learning tools try to describe the behavior of a system by transforming observations into a model. The observations in our case are log files, and the model is a graph model. In this model, it is possible to see which flows occur the most frequently, where flows split (for example: PIN versus signature), what happens to transactions that do not get approved, and so on.
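To make the idea of turning observations into a graph concrete, here is a minimal sketch. It is not what the tools below do internally (they infer state machines, for example by merging states); it only counts how often one event directly follows another, which is already enough to see the most frequent flows and where flows split. The event names and the `build_transition_counts` helper are hypothetical and chosen for illustration.

```python
from collections import Counter


def build_transition_counts(traces):
    """Count how often each (event, next_event) pair occurs across all traces."""
    counts = Counter()
    for events in traces:
        # Add artificial START/END markers so entry and exit points show up as edges.
        path = ["START", *events, "END"]
        for src, dst in zip(path, path[1:]):
            counts[(src, dst)] += 1
    return counts


# Hypothetical traces: one list of event names per transaction.
traces = [
    ["card_inserted", "pin_entered", "approved"],
    ["card_inserted", "pin_entered", "declined"],
    ["card_inserted", "signature_requested", "approved"],
]

counts = build_transition_counts(traces)
for (src, dst), n in counts.most_common():
    print(f"{src} -> {dst}: {n}")
```

Each counted pair corresponds to an edge in the graph, and the count can be used as the edge weight when drawing it.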
Some of the tools we used are:
- Synoptic: a tool that aims to make state machine inference from log files easy, mainly for system debugging purposes.
- InvariMint: an approach that aims to make inference algorithms easier to understand by describing them declaratively.
- DFASAT: a tool with a greedy merging algorithm based on Blue-Fringe. It takes traces in a fixed format as input, each containing the events and the trace type (accepting or rejecting), and outputs an edge-labeled automaton.
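As an illustration of what a fixed trace format with a trace type can look like, the sketch below writes traces in an Abbadingo-style layout: a header line with the number of traces and the alphabet size, followed by one line per trace holding the label, the trace length, and the encoded events. That this matches the exact input DFASAT expects is an assumption, as are the event names and the `to_abbadingo` helper.

```python
def to_abbadingo(traces):
    """Encode (accepted, events) pairs as Abbadingo-style text.

    Assumed layout: header "<num_traces> <alphabet_size>", then one line per
    trace: "<label> <length> <symbol ids...>", with label 1 = accepting
    and 0 = rejecting.
    """
    # Map every distinct event name to a small integer id.
    symbols = sorted({event for _, events in traces for event in events})
    index = {symbol: i for i, symbol in enumerate(symbols)}

    lines = [f"{len(traces)} {len(symbols)}"]
    for accepted, events in traces:
        label = 1 if accepted else 0
        encoded = " ".join(str(index[event]) for event in events)
        lines.append(f"{label} {len(events)} {encoded}".rstrip())
    return "\n".join(lines), index


# Hypothetical transactions: approved flows are treated as accepting traces.
example = [
    (True, ["card_inserted", "pin_entered", "approved"]),
    (False, ["card_inserted", "cancelled"]),
]

text, index = to_abbadingo(example)
print(text)
```

Returning the `index` mapping alongside the text makes it possible to translate the integer labels on the learned automaton's edges back into readable event names.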
The process
Passive learning tools are generally used on synthetic examples from academia, or for (reverse engineering) competitions like STAMINA. We couldn’t find any evidence of real industrial use of those tools, and it turns out that applying them to a real system is actually challenging.
Under the direction of my supervisor and with an Adyen colleague, over a nine-month period I improved the output produced by the tools, by following an iterative evaluation process. In each iteration, we discussed possible improvements to the use of passive learning at the company with the expert developer, worked on the improvements, and analyzed the new results.
In particular, after a few experiments, we chose DFASAT as the tool to customize. All our changes and improvements are available as open source on our fork on Bitbucket. We adjusted one of the existing heuristics for our payment domain. For example, we introduced colors and numbers corresponding to different end states (Approved, Cancelled, Declined, Error), and we incorporated timestamp differences between log lines to see the bottlenecks in our system.
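The idea of using timestamp differences between log lines can be sketched as follows. This is not the DFASAT heuristic itself; it only averages the elapsed time for each pair of consecutive events, which is the kind of number that can be attached to an edge to spot slow steps. The log line layout (an ISO timestamp followed by an event name) and all names in the code are assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime


def parse_line(line):
    """Split a hypothetical log line '<ISO timestamp> <event>' into its parts."""
    stamp, event = line.split(" ", 1)
    return datetime.fromisoformat(stamp), event.strip()


def mean_step_durations(logs):
    """Return the average number of seconds spent on each (event, next_event) step."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for log in logs:
        parsed = [parse_line(line) for line in log]
        for (t0, e0), (t1, e1) in zip(parsed, parsed[1:]):
            totals[(e0, e1)] += (t1 - t0).total_seconds()
            counts[(e0, e1)] += 1
    return {step: totals[step] / counts[step] for step in totals}


# Two hypothetical transactions, one list of log lines each.
logs = [
    [
        "2016-06-01T12:00:00 card_inserted",
        "2016-06-01T12:00:02 pin_entered",
        "2016-06-01T12:00:07 approved",
    ],
    [
        "2016-06-01T12:05:00 card_inserted",
        "2016-06-01T12:05:04 pin_entered",
        "2016-06-01T12:05:07 approved",
    ],
]

means = mean_step_durations(logs)
for (src, dst), seconds in means.items():
    print(f"{src} -> {dst}: {seconds:.1f}s on average")
```

Steps with a high average stand out as candidate bottlenecks once the values are shown on the graph.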
The results
We learned about potential bottlenecks in the system, differences between card brands, bugs in upcoming releases, and much more. Some of the key things we found included:
- By including time differences in the graph models, we were able to identify the potential bottlenecks in the system. This meant that we were able to see how long on average each step in a transaction takes, and use these insights to speed up the transaction time by 3–4 seconds for certain flows.
- With one of the versions under test, we found that the terminal asked for both PIN and signature for cardholder verification, where only one of those two is necessary to complete a transaction. With this knowledge, we were able to fix this bug before we released this version.
- The graph models revealed that if the network connectivity in the store is unstable, some transactions may time out at different stages of the transaction. This is not a bug, but it is not optimal, and can interfere with the payment process. In later versions, we added retry mechanisms, which drastically reduced this type of issue.
- The data also revealed that for some card brands, there were twice as many chip errors as normal. While there was not a lot we could do to resolve this issue, we could alert our merchants.
Identifying potential bottlenecks and issues had a real impact on improving Adyen’s POS system, and a flow-on effect for its POS customers, which include some of the major luxury and retail brands in the world.
One of the robot setups
Presenting my research in China — and working at Adyen
Another great outcome of my research at Adyen was the opportunity to present my findings at the International Conference on Software Maintenance and Evolution (ICSME) in Shanghai. The slides for this are available on SlideShare.
Moreover, after graduating, I joined Adyen’s point-of-sale quality assurance team where we are still using DFASAT/flexfringe to analyze and improve our point-of-sale solution. Also, as the robots have been so valuable in our testing process, we recently added another 13 to increase our testing capacity.
If you are interested in the paper we published, it can be found here: An Experience Report on Applying Passive Learning in a Large-Scale Payment Company.
Update November 3, 2017
Arie van Deursen, Professor in Software Engineering at Delft University and supervisor on this project, recently presented a keynote at the Automated Software Engineering Conference in Chicago on the topic Software Engineering without Borders, including discussion on Rick’s research. The video is now available on YouTube, with the most relevant part from 23:15 to 27:10.
Technical careers at Adyen
We are on the lookout for talented engineers and technical people to help us build the infrastructure of global commerce!
Check out developer vacancies