The Verification Playground
The Verification Playground focuses on the human side of verification and promotes discussion of, and hands-on experience with, formal verification.
When testing a software application, we rely on verification and validation (V&V) to build confidence in the final product and to keep our tests resilient to changes in the underlying system. The goal of V&V is to make sure that the software under test delivers all of its intended functionality, meets user expectations, and can be used safely and securely.
The VerifyThis competition is a popular way for researchers and practitioners to exchange ideas and techniques, and to showcase their work in an organized and controlled environment. Organizers typically design the challenges to be demanding yet feasible, and provide enough detail for participants to work out a solution with their preferred verification tool.
This year’s competition asked participants to tackle a variety of interesting and relevant verification challenges, including a few that were new to the event. Using data from previous editions, we looked at how well teams fared on these problems and compared each team's results with those of the other teams.
The challenge problems became progressively harder over the course of the competition, and few teams managed to solve all of them completely and correctly. Some interesting trends in the results may help explain why.
In particular, we noticed that a team’s performance on a challenge depends largely on that challenge's level of difficulty: the more difficult the challenge, the fewer correct solutions it elicits. Moreover, the more difficult the challenge, the more likely a team is to need additional information to establish that its proof is complete and correct.
Another factor is the timing of a challenge’s appearance: the later a challenge appears in a competition, the fewer teams manage to solve it correctly. A likely explanation is that teams tire and lose focus as the competition stretches over a long period of time.
To help us understand why this is the case, we performed a regression analysis on the challenges used at the last three VerifyThis events. We measured how much time it took to complete a correct solution, and analyzed the number of teams that solved each challenge completely. In addition, we analyzed whether the number of correct solutions changed over the course of the competition.
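As a rough illustration of the kind of regression described above, the following minimal Python sketch fits a straight line relating a challenge's position in the competition to the number of correct solutions it received. The function name `fit_line` and the data values are hypothetical placeholders chosen for illustration; they are not actual VerifyThis results, and the analysis itself may well have used a different model.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of squared deviations of x from its mean.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    # Sum of products of the deviations of x and y from their means.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical placeholder data, NOT actual competition results:
# position of each challenge in the schedule, and the number of
# teams assumed to have solved it correctly.
positions = [1, 2, 3]
correct_solutions = [6, 4, 2]

slope, intercept = fit_line(positions, correct_solutions)
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
```

A negative slope in such a fit would be consistent with the observation that later challenges attract fewer correct solutions, although with only a handful of challenges per event a fit of this kind is suggestive at best.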