Setup for parallel test runs
A reliable & efficient CI pipeline: that’s something we would all like. One of the best ways to achieve this is to run test cases in parallel. For our main Rust backend at Fieldwire, we chose to stick to cargo test, which by default runs the test cases in parallel across multiple threads. In such parallel processing, the main issue to watch out for is globally shared mutable state. This usually comes in these two shapes:
- Sharing memory across multiple threads
- A database, which is usually shared between & mutable from multiple processes/threads
Of these, the first one is easily solved in our case because of the nature of Rust & our codebase. Rust makes it a bit involved to share mutable memory across threads. We are big fans of not fighting it in our production code, where the only global state we share between threads is immutable.
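To illustrate what immutable global state looks like in practice, here is a minimal sketch using only the standard library. `Config`, `config`, `max_connections` and `read_from_threads` are hypothetical names chosen for illustration, not taken from our codebase.

```rust
use std::sync::OnceLock;
use std::thread;

/// Hypothetical read-only settings shared by the whole process.
pub struct Config {
    pub max_connections: u32,
}

// Written once on first access, then only ever read.
static CONFIG: OnceLock<Config> = OnceLock::new();

/// Returns the shared configuration, initializing it on first use.
pub fn config() -> &'static Config {
    CONFIG.get_or_init(|| Config { max_connections: 8 })
}

/// Reads the shared value from `n` threads; no locking is needed
/// because nothing mutates `CONFIG` after initialization.
pub fn read_from_threads(n: usize) -> Vec<u32> {
    let handles: Vec<_> = (0..n)
        .map(|_| thread::spawn(|| config().max_connections))
        .collect();
    handles
        .into_iter()
        .map(|handle| handle.join().expect("reader thread panicked"))
        .collect()
}
```

Since every thread only gets a shared `&'static Config`, parallel test cases cannot interfere with each other through this state.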
Database sharing is a bit tricky. A lot of codebases have the luxury of encapsulating the whole test case inside a transaction which gets rolled back at the end, thereby undoing all the changes made by the test. With the proper isolation level, this would make sure that a single database could be shared across all test cases without them stepping on each other’s toes. Unfortunately, this isn’t something we could depend on, for multiple reasons which are outside the scope of this blog post. So, we had to instead go for each test case having its own isolated database. At the same time, we don’t want to set up a database from scratch for every single test case since it carries non-trivial performance overhead. So, we need both isolation & maximum reusability of databases to ensure correctness & efficiency.
Let’s start diving into how we achieved it by taking a look at how the test cases are written:
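A minimal sketch of that shape is given here. The `TestDatabase` body is a hypothetical stand-in that only carries a name, and the synchronous, argument-free `setup` signature, the `name` accessor and the test name are assumptions made for illustration.

```rust
/// Hypothetical stand-in for the real `TestDatabase`; it only carries a name.
pub struct TestDatabase {
    name: String,
}

impl TestDatabase {
    /// Assumed signature: hands back a database reserved for the caller.
    pub fn setup() -> Self {
        TestDatabase {
            name: String::from("test_db_0"),
        }
    }

    pub fn name(&self) -> &str {
        &self.name
    }
}

impl Drop for TestDatabase {
    fn drop(&mut self) {
        // The real type would do its cleanup work here.
    }
}

#[test]
fn test_db_is_ready_to_use() {
    // The only database-related line a test case needs.
    let test_db = TestDatabase::setup();

    // ... exercise the code under test against `test_db` ...
    assert!(!test_db.name().is_empty());
} // `test_db` goes out of scope here, so `Drop::drop` runs automatically.
```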
The TestDatabase instance & the underlying physical database created above are only accessible inside that particular test case. That’s all the test cases need to do to achieve our requirements! All the interesting stuff happens in the setup method & when the test_db binding created above is dropped automatically as it goes out of scope. Let’s take a look at the underlying pseudo-code:
database_instance.rs
- DatabaseInstance denotes a single physical database
- On setup, a new database is created with a random name
- Required tables, triggers etc. are set up before returning
- On drop, the backing physical database is dropped
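The lifecycle in the list above can be sketched without a real database server. In this sketch, `FakeServer` is a hypothetical in-memory stand-in (just a set of database names), the `setup` signature is an assumption, and a process id plus a counter stands in for the random name.

```rust
use std::collections::HashSet;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for a database server: a set of database names.
#[derive(Default)]
pub struct FakeServer {
    databases: Mutex<HashSet<String>>,
}

impl FakeServer {
    pub fn database_count(&self) -> usize {
        self.databases.lock().unwrap_or_else(|e| e.into_inner()).len()
    }
}

/// Denotes a single physical database on the (fake) server.
pub struct DatabaseInstance {
    name: String,
    server: Arc<FakeServer>,
}

impl DatabaseInstance {
    /// Creates a new, uniquely named database.
    pub fn setup(server: Arc<FakeServer>) -> Self {
        static NEXT_ID: AtomicU64 = AtomicU64::new(0);
        let id = NEXT_ID.fetch_add(1, Ordering::Relaxed);
        // Unique within one run; a real version might use a random suffix.
        let name = format!("test_db_{}_{}", std::process::id(), id);

        // A real version would create the database here, then set up
        // the required tables, triggers etc. before returning.
        server
            .databases
            .lock()
            .unwrap_or_else(|e| e.into_inner())
            .insert(name.clone());

        DatabaseInstance { name, server }
    }

    pub fn name(&self) -> &str {
        &self.name
    }
}

impl Drop for DatabaseInstance {
    fn drop(&mut self) {
        // A real version would drop the backing physical database here.
        // A poisoned lock is tolerated so that drop never panics.
        self.server
            .databases
            .lock()
            .unwrap_or_else(|e| e.into_inner())
            .remove(&self.name);
    }
}
```

Tying removal to `Drop` means the database goes away whenever the instance does, including when a test panics and the stack unwinds.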
test_database.rs
- TestDatabase manages a pool of DatabaseInstances
- On setup, an existing DatabaseInstance is checked out
- If no such instance exists, a new one is created
- On drop, the associated instance is cleaned up & sent back to the pool
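The check-out and return steps above can be sketched as follows. This is an in-memory illustration under stated assumptions, not our actual implementation: `DatabaseInstance` is reduced to a name plus a list of rows, the pool is a process-wide `static`, and `insert`, `row_count`, `name` and `clean` are hypothetical helpers.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;

/// Hypothetical in-memory stand-in for one physical database.
pub struct DatabaseInstance {
    name: String,
    rows: Vec<String>,
}

impl DatabaseInstance {
    fn setup() -> Self {
        static NEXT_ID: AtomicUsize = AtomicUsize::new(0);
        let id = NEXT_ID.fetch_add(1, Ordering::Relaxed);
        DatabaseInstance {
            name: format!("test_db_{}", id),
            rows: Vec::new(),
        }
    }

    /// Wipes all data but keeps the database itself for reuse.
    fn clean(&mut self) {
        self.rows.clear();
    }
}

/// Idle instances waiting to be reused by any test thread in the process.
static POOL: Mutex<Vec<DatabaseInstance>> = Mutex::new(Vec::new());

pub struct TestDatabase {
    // `Option` lets `drop` move the instance out of `&mut self`.
    instance: Option<DatabaseInstance>,
}

impl TestDatabase {
    pub fn setup() -> Self {
        // The lock is held only for the `pop`, so creating a new
        // instance does not block other test threads.
        let reused = POOL.lock().unwrap_or_else(|e| e.into_inner()).pop();
        let instance = reused.unwrap_or_else(DatabaseInstance::setup);
        TestDatabase {
            instance: Some(instance),
        }
    }

    pub fn name(&self) -> &str {
        &self.instance.as_ref().expect("present until drop").name
    }

    pub fn insert(&mut self, row: &str) {
        let instance = self.instance.as_mut().expect("present until drop");
        instance.rows.push(row.to_string());
    }

    pub fn row_count(&self) -> usize {
        self.instance.as_ref().expect("present until drop").rows.len()
    }
}

impl Drop for TestDatabase {
    fn drop(&mut self) {
        if let Some(mut instance) = self.instance.take() {
            // Clean first, so the pool only ever holds empty databases.
            instance.clean();
            POOL.lock().unwrap_or_else(|e| e.into_inner()).push(instance);
        }
    }
}
```

Because the instance is owned by exactly one `TestDatabase` between check-out and return, no two running test cases can hold the same database at once.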
So, for the initial test cases, since the pool would be empty, new databases are created. As each of them completes (regardless of passing or failing), it sends the database it was using back to the pool after all the changes are wiped from the database. When the next test case starts, one of these existing databases is checked out & the process repeats. Thus, we provide completely clean, isolated databases to each running test case for reliability and reuse them for subsequent test cases for efficiency.