Setup for parallel test runs

A reliable & efficient CI pipeline is something we all want. One of the best ways to get there is to run test cases in parallel. For our main Rust backend at Fieldwire, we chose to stick with cargo test, which by default runs test cases in parallel across multiple threads. With this kind of parallel execution, the main issue to watch out for is globally shared mutable state. It usually takes one of these 2 shapes:

Memory shared across multiple threads
A database, which is usually shared between & mutable from multiple processes/threads

Of these, the first one is easily solved in our case because of the nature of Rust & our codebase. Rust makes it a bit involved to share mutable memory across threads. We are big fans of not fighting it in our production code, where the only global state we share between threads is immutable.
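To make the "immutable global state" idea concrete, here is a minimal sketch using only the standard library. The AppConfig struct, its fields and the read_from_threads helper are hypothetical names chosen for illustration; they are not taken from our codebase.

```rust
use std::sync::OnceLock;
use std::thread;

// Hypothetical read-only configuration; the field names are illustrative only.
#[derive(Debug)]
struct AppConfig {
    service_name: String,
    max_connections: u32,
}

// Initialised at most once, then only ever handed out as a shared reference.
static CONFIG: OnceLock<AppConfig> = OnceLock::new();

fn config() -> &'static AppConfig {
    CONFIG.get_or_init(|| AppConfig {
        service_name: "backend".to_string(),
        max_connections: 16,
    })
}

// Reads the global from several threads at once. Nothing can mutate it after
// initialisation, so the readers cannot interfere with each other.
fn read_from_threads(thread_count: usize) -> Vec<u32> {
    let handles: Vec<_> = (0..thread_count)
        .map(|_| thread::spawn(|| config().max_connections))
        .collect();
    handles
        .into_iter()
        .map(|handle| handle.join().unwrap())
        .collect()
}
```

Because the value is never mutated after it is initialised, test cases running on different threads can read it freely without affecting each other.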

Database sharing is a bit trickier. A lot of codebases have the luxury of wrapping the whole test case inside a transaction which gets rolled back at the end, thereby undoing all the changes made by the test. With the proper isolation level, this makes sure that a single database can be shared across all test cases without them stepping on each other’s toes. Unfortunately, this isn’t something we could depend on, for several reasons which are outside the scope of this blog post. So, we instead had to give each test case its own isolated database. At the same time, we don’t want to set up a database from scratch for every single test case, since that carries non-trivial performance overhead. So, we need both isolation & maximum reuse of databases to ensure correctness & efficiency.

Let’s start diving into how we achieved it by taking a look at how the test cases are written:

#[tokio::test(flavor = "multi_thread")]
async fn some_descriptive_test_name() {
    let test_db = TestDatabase::setup().await;

    // Rest of the test case:
    //  1. Rest of test specific setup
    //  2. Invoking system under test
    //  3. Assertions
}

The TestDatabase instance & the underlying physical database created above are only accessible inside that particular test case. That’s all a test case needs to do to meet our requirements! All the interesting stuff happens in the setup method & when the test_db binding created above is dropped automatically as it goes out of scope. Let’s take a look at the underlying pseudo-code:

database_instance.rs

DatabaseInstance denotes a single physical database
On setup, a new database is created with a random name
Required tables, triggers etc. are set up before returning
On drop, the backing physical database is dropped

// database_instance.rs

use tokio::runtime::Handle;
use uuid::Uuid;

#[derive(Debug)]
pub struct DatabaseInstance {
    pub name: Uuid,
    pub connection_pool: Pool,
}

impl DatabaseInstance {
    pub async fn setup() -> DatabaseInstance {
        let name = Uuid::new_v4();
        let connection_pool = create_db(&name).await;
        load_structure(&connection_pool).await;
        
        DatabaseInstance {
            name,
            connection_pool,
        }
    }

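    // Removes all rows so that the next test case starts from a clean slate
    pub async fn truncate(&mut self) {
        helper::truncate_all_tables(self).await;
    }

    // Terminates any connections that are still open against this database
    pub async fn close(&mut self) {
        helper::terminate_connections(self).await;
    }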
}

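impl Drop for DatabaseInstance {
    fn drop(&mut self) {
        // Drop cannot be async, so we block on the cleanup. block_in_place panics
        // on a current-thread runtime, hence flavor = "multi_thread" on the tests.
        tokio::task::block_in_place(|| {
            Handle::current().block_on(async move {
                helper::terminate_connections(self).await;
                helper::drop(self).await;
            })
        });
    }
}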
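The helper module is not shown in this post, and the database engine is not named either. Purely as an illustration, and assuming PostgreSQL, the statements behind create_db, truncate_all_tables, terminate_connections and drop could look roughly like the sketch below. Every function name here is hypothetical, and the sketch only builds SQL strings rather than executing them.

```rust
// Assumption: PostgreSQL. All function names below are hypothetical.

/// Quotes a name as a SQL identifier, doubling any embedded double quotes.
/// A UUID-based database name contains hyphens, so it has to be quoted.
fn quote_ident(name: &str) -> String {
    format!("\"{}\"", name.replace('"', "\"\""))
}

/// Statement that creates the backing physical database.
fn create_database_sql(name: &str) -> String {
    format!("CREATE DATABASE {}", quote_ident(name))
}

/// Statement that removes the backing physical database.
fn drop_database_sql(name: &str) -> String {
    format!("DROP DATABASE IF EXISTS {}", quote_ident(name))
}

/// Statement that wipes the given tables; None when there is nothing to wipe.
fn truncate_tables_sql(tables: &[&str]) -> Option<String> {
    if tables.is_empty() {
        return None;
    }
    let list = tables
        .iter()
        .map(|table| quote_ident(table))
        .collect::<Vec<_>>()
        .join(", ");
    Some(format!("TRUNCATE TABLE {} RESTART IDENTITY CASCADE", list))
}

/// Query that terminates every other session connected to the database whose
/// name is bound to the $1 parameter.
fn terminate_connections_sql() -> &'static str {
    "SELECT pg_terminate_backend(pid) FROM pg_stat_activity \
     WHERE datname = $1 AND pid <> pg_backend_pid()"
}
```

Terminating the remaining connections before dropping matters in this setup because PostgreSQL refuses to drop a database that still has other sessions connected to it.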

test_database.rs

TestDatabase manages a pool of DatabaseInstances
On setup, an existing DatabaseInstance is checked out
If no such instance exists, a new one is created
On drop, the associated instance is cleaned up & sent back to the pool

// test_database.rs

use tokio::runtime::Handle;

static DB_POOL: DbPool = DbPool::new_empty_pool();

#[derive(Debug)]
pub struct TestDatabase {
    db_instance: Option<DatabaseInstance>
}

impl TestDatabase {
    pub async fn setup() -> TestDatabase {
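        // Reuse an idle instance if the pool has one, otherwise create a new one
        let db_instance = match DB_POOL.acquire().await {
            Some(db) => db,
            None => DatabaseInstance::setup().await,
        };

        TestDatabase {
            db_instance: Some(db_instance),
        }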
    }
}

impl Drop for TestDatabase {
    fn drop(&mut self) {
        tokio::task::block_in_place(|| {
            Handle::current().block_on(async move {
                // Remove all data & close the DB instance
                let mut db_instance = self.db_instance.take().unwrap();
                db_instance.truncate().await;
                db_instance.close().await;

                // Let's put the instance back into the pool
                DB_POOL.put_back(db_instance).await;
            })
        });
    }
}

So, for the initial test cases, since the pool is empty, new databases are created. As each test case completes (regardless of passing or failing), it sends the database it was using back to the pool after all its changes are wiped from the database. When the next test case starts, one of these existing databases is checked out & the process repeats. Thus, we provide completely clean, isolated databases to each running test case for reliability & reuse them across subsequent test cases for efficiency.
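The DbPool type itself is not shown above. As a rough sketch of the check-out & put-back cycle, here is a dependency-free version built on a Mutex-guarded Vec. It is a simplification under stated assumptions: FakeInstance stands in for DatabaseInstance, the methods are synchronous rather than async, and idle_count & name are hypothetical helpers added only so the behaviour can be observed.

```rust
use std::sync::{Mutex, MutexGuard};

// Stand-in for DatabaseInstance; only the name matters in this sketch.
#[derive(Debug)]
struct FakeInstance {
    name: String,
}

struct DbPool {
    idle: Mutex<Vec<FakeInstance>>,
}

impl DbPool {
    // A const fn, so the pool can live in a plain static.
    const fn new_empty_pool() -> DbPool {
        DbPool { idle: Mutex::new(Vec::new()) }
    }

    // Tolerates a poisoned lock so one panicking thread cannot wedge the pool.
    fn lock_idle(&self) -> MutexGuard<'_, Vec<FakeInstance>> {
        self.idle.lock().unwrap_or_else(|poisoned| poisoned.into_inner())
    }

    // Hands out an idle instance, or None when the caller has to create one.
    fn acquire(&self) -> Option<FakeInstance> {
        self.lock_idle().pop()
    }

    fn put_back(&self, instance: FakeInstance) {
        self.lock_idle().push(instance);
    }

    fn idle_count(&self) -> usize {
        self.lock_idle().len()
    }
}

static DB_POOL: DbPool = DbPool::new_empty_pool();

struct TestDatabase {
    db_instance: Option<FakeInstance>,
}

impl TestDatabase {
    // fresh_name is only used when the pool has nothing to reuse.
    fn setup(fresh_name: &str) -> TestDatabase {
        let db_instance = match DB_POOL.acquire() {
            Some(db) => db,
            None => FakeInstance { name: fresh_name.to_string() },
        };
        TestDatabase { db_instance: Some(db_instance) }
    }

    fn name(&self) -> &str {
        &self.db_instance.as_ref().expect("present until drop").name
    }
}

impl Drop for TestDatabase {
    // Runs on normal scope exit and also while a panic unwinds, which is why
    // a failing test still returns its database to the pool.
    fn drop(&mut self) {
        if let Some(instance) = self.db_instance.take() {
            DB_POOL.put_back(instance);
        }
    }
}
```

With this shape, a second TestDatabase::setup call made after the first guard has been dropped receives the same instance instead of creating a new one, which is the reuse described above.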