Tidying Up Your PHPUnit Tests with Data Providers

Writing tests for real-world applications can be messy. Our intentions are good: we write a few tests to cover our edge cases, but we soon notice very similar logic repeated in each of them with only slight variances in the setup data. Maybe we're testing form input validation, or implementations of different objects that share the same contract. Fortunately, PHPUnit gives us a way to consolidate this common logic while varying our setup data, without losing the benefit of dedicated, smaller test methods. In this post, we'll explore PHPUnit's data providers. While data providers are available in any PHPUnit test suite, let's look at how they can help us tidy up our tests in a Laravel application.

PHPUnit Data Providers

Data Providers allow us to define tests once and run them multiple times with different datasets. Using a data provider requires three steps:

  1. Define a public method that returns an array of datasets.
  2. Reference the data provider with a @dataProvider annotation in the docblock of your test.
  3. Accept the dataset's values as arguments to the test method.

/**
 * @test
 * @dataProvider myTestProvider
 */
public function we_are_testing_something_good_here($dataProvider)
{
    //
}

public function myTestProvider()
{
    return [
        ['set 1'],
        ['set 2'],
        ['set 3'],
    ];
}

This test will be executed three times. The first time, $dataProvider will equal 'set 1'; the second, 'set 2'; and the third, 'set 3'.
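
To make the mechanics concrete, here is a framework-free sketch of roughly what the runner does with a provider. It is an illustration only, not PHPUnit's actual implementation, and both functions are hypothetical stand-ins for the provider and test method above.

```php
<?php

// Hypothetical stand-in for a provider method: an array of datasets,
// where each dataset is itself an array of arguments.
function myTestProvider(): array
{
    return [
        ['set 1'],
        ['set 2'],
        ['set 3'],
    ];
}

// Hypothetical stand-in for the test method; it just reports its argument.
function weAreTestingSomethingGoodHere(string $value): string
{
    return "ran with {$value}";
}

// One invocation per dataset, with the dataset's values unpacked
// into the method's parameters.
$results = [];
foreach (myTestProvider() as $name => $dataset) {
    $results[$name] = weAreTestingSomethingGoodHere(...$dataset);
}

print_r($results);
```

Because each dataset is unpacked into the parameter list, a dataset with two values maps onto a test method with two parameters.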

Let's look at some real-world uses of data providers to better understand how they can help tidy up our tests.

Form Validation

Let's say we have a form endpoint with a handful of required inputs, and we want to test that each one is required by our application. If our form has three inputs, we'd ordinarily need three tests to cover them all. Using data providers, we can instead define a single test which verifies that all three inputs in our form are required:

/**
 * @test
 * @dataProvider clientFormValidationProvider
 */
public function required_inputs_are_required($formInput)
{
    $response = $this->json('POST', route('client.store'), [
        $formInput => '',
    ]);

    $response->assertStatus(422);
    $response->assertJsonValidationErrors($formInput);
}

public function clientFormValidationProvider()
{
    return [
        ['name'],
        ['company'],
        ['email'],
    ];
}

The clientFormValidationProvider defines three datasets. In this case, the value of each dataset corresponds to the name value of the input under test. These three datasets will cause the required_inputs_are_required test to be run three times, once for each input.

We can also use a data provider to test multiple validation rules. For example, name, company, and email are required, but the email input must also be in a valid email format. Fortunately, data providers allow us to define multiple values per dataset, and give us access to these values in our tests through additional test method arguments.

/**
 * @test
 * @dataProvider clientFormValidationProvider
 */
public function test_form_validation($formInput, $formInputValue)
{
    $response = $this->json('POST', route('client.store'), [
        $formInput => $formInputValue,
    ]);

    $response->assertStatus(422);
    $response->assertJsonValidationErrors($formInput);
}

public function clientFormValidationProvider()
{
    return [
        ['name', ''],
        ['company', ''],
        ['email', ''],
        ['email', 'not-an-email'],
    ];
}

Now our data provider will allow us to test all four validation rules with a single test. Just as in the previous example, the provider datasets cause the test_form_validation test to be run four times, once per dataset.

One thing you'll quickly notice when using data providers is that the output of failed tests has changed. When a test with a data provider fails, the failure message includes the numeric index of the dataset that failed, along with the values that caused the failure.

Time: 134 ms, Memory: 14.00MB

There were 2 failures:

2) Tests\Feature\ExampleTest::test_form_validation with dataset #3 ('email', 'not-an-email')

This isn't the most descriptive failure message. Fortunately, PHPUnit allows us to set a more descriptive identifier simply by defining an array key for each of our datasets.

public function clientFormValidationProvider()
{
    return [
        'Test name is required' => ['name', ''],
        'Test company is required' => ['company', ''],
        'Test email is required' => ['email', ''],
        'Test email is valid' => ['email', 'not-an-email'],
    ];
}

Now when a test using this provider fails, we'll get a clear description of the dataset that was responsible.

Time: 134 ms, Memory: 14.00MB

There were 2 failures:

2) Tests\Feature\ExampleTest::test_form_validation with dataset "Test email is valid" ('email', 'not-an-email')

Our test output is getting better, but we still can't tell at a glance which fields our test covers. It would be nice to see a list of all the fields we intend to test without having to dive into the provider array. Once again, this takes very little effort: we define a separate data provider for each input. We'll give each provider a descriptive name so we have a better idea of its contents, and enable them all by adding additional @dataProvider annotations to our test.

/**
 * @test
 * @dataProvider nameInputValidation
 * @dataProvider companyInputValidation
 * @dataProvider emailInputValidation
 */
public function test_form_validation($formInput, $formInputValue)
{
    $response = $this->json('POST', route('client.store'), [
        $formInput => $formInputValue,
    ]);

    $response->assertStatus(422);
    $response->assertJsonValidationErrors($formInput);
}

public function nameInputValidation()
{
    return [
        'Name is required' => ['name', ''],
    ];
}

public function companyInputValidation()
{
    return [
        'Company is required' => ['company', ''],
    ];
}

public function emailInputValidation()
{
    return [
        'Email is required' => ['email', ''],
        'Email must be valid' => ['email', 'not-an-email'],
    ];
}

Testing Object Implementations

Let's say we have designed an app that allows users to bill clients with a payment processor of their choosing (e.g. Stripe, Square, or Braintree). We want to test common actions on each processor object: it can charge an amount, and it can refund an amount. We could create one test class per processor (StripeProcessorTest, BraintreeProcessorTest, etc.) and duplicate the tests for each action. Alternatively, we could use data providers to reduce this to a single test class, PaymentProcessorTest.

/** 
 * @test
 * @dataProvider paymentProcessorProvider
 */
public function can_charge_an_amount($processor)
{
    $processor->charge(2000);

    $this->assertEquals(2000, $processor->totalCharges());
}

/** 
 * @test
 * @dataProvider paymentProcessorProvider
 */
public function can_refund_an_amount($processor)
{
    $processor->refund(2000);

    $this->assertEquals(-2000, $processor->totalCharges());
}

public function paymentProcessorProvider()
{
    return [
        'Braintree processor' => [new BraintreeProcessor],
        'Stripe processor' => [new StripeProcessor],
        'Square processor' => [new SquareProcessor],
        // ...
    ];
}

Now we've written a single test class that tests each of our processor objects with exactly the same logic. Since we only have one test class, and one test for each action the processor object should support, we have a stronger guarantee that all of these objects will be tested exactly the same way. We've also reduced the possibility of deviations between our tests that could occur if we split them up into multiple files, and made adding tests for a new processor as simple as adding one line of code to our data provider.
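
For this to work, every object in the provider has to expose the same methods. The post doesn't show the real contract, so the following is only a sketch under assumptions: the PaymentProcessor interface, the InMemoryProcessor class, and the refund method are all hypothetical names chosen for illustration.

```php
<?php

// Hypothetical contract; the real one is not shown in this post.
interface PaymentProcessor
{
    public function charge(int $amount): void;

    public function refund(int $amount): void;

    public function totalCharges(): int;
}

// Hypothetical in-memory implementation, for illustration only.
final class InMemoryProcessor implements PaymentProcessor
{
    private int $total = 0;

    public function charge(int $amount): void
    {
        $this->total += $amount;
    }

    public function refund(int $amount): void
    {
        $this->total -= $amount;
    }

    public function totalCharges(): int
    {
        return $this->total;
    }
}

// Any object that satisfies the same three methods could be dropped
// into the provider array without touching the test methods.
$processor = new InMemoryProcessor();
$processor->charge(2000);
$processor->refund(500);

echo $processor->totalCharges(), PHP_EOL; // prints 1500
```

Type-hinting the test method's parameter against a shared interface like this is one way to catch a processor that drifts from the contract.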

Caveat

There is one gotcha to consider when using PHPUnit's data providers.

The test runner builds a test suite by scanning all of your test directories and identifying test classes and the test methods within them. During this process, whenever the runner identifies a test method (one whose name begins with test or that carries the @test annotation), it evaluates any @dataProvider annotations on it. When a @dataProvider annotation is found, the referenced data provider is EXECUTED, and a TestCase is created and added to the TestSuite for each dataset in the provider.

Here's the rub:

First, if you use factory methods in your data providers, these factories will run once for each test that uses the data provider, BEFORE your first test even runs. So a data provider with a complicated dataset (requiring, say, numerous database records to exist) that is used by ten tests will run ten times before your first test even runs. This could drastically increase the time until your first test executes. Even if you run a single test using phpunit --filter, every data provider in the suite will still run. Filtering occurs after the test suite has been generated, and therefore after all data providers have been executed.

Second, Laravel bootstraps itself in PHPUnit's setUp method. This means that out of the box, we can't use the application container, facades, or any other Laravel niceties in our data providers if they depend on the application/container instance being available. If you try to use a Laravel feature in your data provider, the tests that implement it won't execute and you'll get a No tests executed! message.

Fortunately, we can solve this too. By returning a closure from our dataset and calling it at the top of our test method, Laravel will have a chance to load the application before any code in our data provider is executed. Now we can rely on factory methods, and they won't hit the database until our test is run.

Let's modify our example slightly by assuming our PaymentProcessor needs to be resolved from the container.

/** 
 * @test
 * @dataProvider paymentProcessorProvider
 */
public function user_can_charge_an_amount($paymentProcessorProvider)
{
    $paymentProcessorProvider();
    $paymentProcessor = $this->app->make(PaymentProviderContract::class);

    $paymentProcessor->charge(2000);

    $this->assertEquals(2000, $paymentProcessor->totalCharges());
}

public function paymentProcessorProvider()
{
    return [
        'Braintree processor' => [function () {
            $container = Container::getInstance();
            $container->bind(PaymentProviderContract::class, BraintreeProvider::class);
        }],
        // ...
    ];
}
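
To see why the closure defers the work, here is a framework-free sketch of the timing. It is a simplified model rather than PHPUnit or Laravel code: lazyProvider and the log messages are hypothetical, and the "application booted" entry merely stands in for the framework's setUp step.

```php
<?php

// Records the order in which things happen.
$log = [];

// Hypothetical provider: the expensive setup is wrapped in a closure
// instead of being performed while the provider itself runs.
function lazyProvider(array &$log): array
{
    $log[] = 'provider executed';

    return [
        'Lazy dataset' => [function () use (&$log) {
            $log[] = 'setup executed';

            return 'expensive value';
        }],
    ];
}

// Suite-building phase: the provider runs, but the closure is only returned.
$datasets = lazyProvider($log);

// Stand-in for the framework booting in setUp().
$log[] = 'application booted';

// Test phase: the test body invokes the closure it received as an argument.
foreach ($datasets as $dataset) {
    [$setup] = $dataset;
    $value = $setup();
}

print_r($log);
```

The log ends up as provider, then boot, then setup, which is the ordering that lets the closure body rely on a booted application.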

Conclusion

Data providers can help DRY up your test suites and reduce visual debt. As with anything, their use comes with tradeoffs, but when carefully implemented they can add value and help your test suites spark joy.

Happy coding!