A mere claim that an intervention will make a difference is no longer sufficient to make business sense. A solid, mathematically grounded method is needed to measure the difference the intervention actually makes. Measuring impact in an economically volatile time such as the new normal is challenging, yet few people outside the discipline know that economics offers many powerful impact-assessment tools that have been refined over decades.
What is Econometrics?
Econometrics is the branch of Economics that contains a wealth of statistical and mathematical tools used in Data Analysis and Data Science. For economists, data represents the signals that show how businesses and individuals in an economy act in order to maximize their well-being within a set of constraints. Economics is an example of a 'causal science'. Economists are preoccupied with establishing whether A actually causes B, rather than settling for the observation that A and B move together. Consequently, they invest a lot of time ensuring that an estimated relationship is genuinely causal and not an accidental correlation.
In this sense, the primary goal of an economist is to develop a perfect counterfactual. To measure the effect of an intervention, for example a policy change, they need to observe the difference between a world where the policy has been implemented and the counterfactual world where it has not been. In reality, we only get to live in a world where the policy has occurred, and it is impractical to create a similar world where the policy has not been implemented to compare. A good economist can hypothetically construct a near-perfect counterfactual to compare the two states.
One technique, often shown to be quite naive, is to compare outcomes before and after implementation. For example, consider measuring the impact of a tax cut on laptops. The change in demand over time could be due to the tax cut or to changes in any other macroeconomic variable, such as business-cycle fluctuations. It is also possible that the demand for laptops was rising regardless of the tax cut. The pre-post analysis therefore fails to account for underlying trends. Now consider instead comparing the outcomes of the treatment group and the control group during the treated period. Here, there is no guarantee that socioeconomic characteristics affect both groups equally; these features may be missing from the dataset, or they may be fundamentally unobservable. Controlling for all these variables is difficult. This is where the Difference-in-Difference (DiD) method comes to the rescue.
Difference-in-Difference (DiD) combines the before-after and the treatment-control comparisons. Its applications are widespread in economics, public policy, and the corporate world. DiD requires a panel (longitudinal) dataset covering both the treatment and the control group. In this approach, we first calculate the before-after difference for the control group; that value is then subtracted from the before-after difference of the treatment group. This method achieves two goals:
· Any changes happening over time, such as the effect of fluctuating economic cycles, are nullified.
· Any characteristics that determine the outcomes of the treatment and control groups are also accounted for, so long as they are constant over time. Thus, both the time-variant and the time-invariant differences that are often difficult to observe are accounted for.
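The two differences described above can be computed directly. Below is a minimal sketch of the 2x2 DiD calculation on hypothetical data (the group names, periods, and outcome values are all illustrative):

```python
# Each row is (group, period, outcome). Values are made up for illustration.
data = [
    ("control",   "before", 10.0), ("control",   "after", 12.0),
    ("treatment", "before", 11.0), ("treatment", "after", 16.0),
]

def mean_outcome(group, period):
    """Average outcome for one group in one period."""
    vals = [y for g, p, y in data if g == group and p == period]
    return sum(vals) / len(vals)

# Before-after change within each group
control_change = mean_outcome("control", "after") - mean_outcome("control", "before")        # 2.0
treatment_change = mean_outcome("treatment", "after") - mean_outcome("treatment", "before")  # 5.0

# Subtracting the control group's change nets out the common time trend
did_estimate = treatment_change - control_change  # 3.0
print(did_estimate)
```

The control group's change (2.0) stands in for what would have happened to the treatment group without the intervention, so the remaining 3.0 is attributed to the intervention itself.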
A Deep Dive into a Sample Use Case
Consider the following example.
Suppose you run an airline that operates in North America, and it is contemplating a recently invented airplane fuel that claims to increase profits. In the pilot project, you used this fuel in the Canadian operation only. Having used the fuel from 2016 to 2018, you need to measure the impact on profits during the pilot period. Grounded in your background in economics, you decide to use DiD. Your treatment group is Canada, and your control group is the United States, where the intervention has not been implemented. You have collected data from 2016 to 2018. Your goal in the impact measurement can be illustrated as follows.
Estimating the Model
A well-defined regression model allows you to estimate the causal effect alongside any other control variables. We use dummy variables to capture the before-after states and the treatment-control difference, plus an interaction between the two. For our sample use case, the model is Profit = β0 + β1·Treated + β2·Post + β3·(Treated × Post) + ε, where Treated equals 1 for Canada and 0 for the United States, and Post equals 1 for the period after the fuel was introduced. β3 will give you the impact of the intervention.
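This regression is straightforward to run with statsmodels. Here is a hedged sketch on fabricated data, where profit is generated so that the true treatment effect is 3.0; the coefficient on the interaction term recovers it (all variable names and numbers are illustrative, not real airline data):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 200
treated = rng.integers(0, 2, n)  # 1 = Canada (treatment), 0 = United States (control)
post = rng.integers(0, 2, n)     # 1 = after the new fuel was introduced

# Simulated profit: baseline 10, group gap 1, time trend 2, treatment effect 3
profit = 10 + 1.0 * treated + 2.0 * post + 3.0 * treated * post + rng.normal(0, 0.5, n)

df = pd.DataFrame({"profit": profit, "treated": treated, "post": post})

# profit ~ b0 + b1*treated + b2*post + b3*(treated x post)
model = smf.ols("profit ~ treated + post + treated:post", data=df).fit()
print(model.params["treated:post"])  # b3: the DiD estimate, close to 3.0
```

A convenience of the regression form over the raw difference-of-means is that it yields standard errors for β3 and lets you add further control variables to the formula.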
The Equal Trends Assumption
The researcher needs to convince the audience that the "equal trends" assumption holds between the treatment and control groups over time. Essentially, we need to ensure that the two groups are affected equally by any trends occurring across time. If the trends diverge drastically between the groups, the estimates could be biased. There are no definitive tests for this, but time-series graphs of the pre-treatment periods will often do the job. This is the assumption that is hardest to justify in a DiD study, so the success of the study hangs on your ability to find a control group that satisfies it. Apart from the equal trends assumption, we must also make sure there are no spillover effects from the treatment manifesting in the control group.
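Alongside the graphical check, a quick informal diagnostic is to compare the pre-treatment slopes of the two groups. A minimal sketch on hypothetical yearly figures (the years and profit values below are invented for illustration):

```python
import numpy as np

years = np.array([2013, 2014, 2015])          # pre-treatment years (illustrative)
treatment_pre = np.array([10.0, 11.0, 12.1])  # e.g. Canada's pre-period profits
control_pre = np.array([8.0, 9.1, 10.0])      # e.g. United States' pre-period profits

# Fit a straight line to each pre-period series; polyfit returns [slope, intercept]
slope_treat = np.polyfit(years, treatment_pre, 1)[0]
slope_ctrl = np.polyfit(years, control_pre, 1)[0]

# Similar slopes (here 1.05 vs 1.00) lend support to the equal-trends assumption
print(slope_treat, slope_ctrl)
```

This is not a formal test; it only summarizes the same information a pre-trend plot shows, and it says nothing about what would have happened after the treatment began.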
Another important factor to consider is that the data points in a panel dataset are typically serially correlated: the same unit is observed repeatedly over time, so its errors are not independent across periods. This violates the classical assumption of independent, homoscedastic errors in linear regression and, if ignored, tends to understate the standard errors. Clustering the standard errors at the unit level is the common remedy.
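In statsmodels, clustering is a one-line change at fitting time. A sketch on fabricated panel data, where each unit carries a persistent shock that induces serial correlation within its own observations (unit counts, periods, and coefficients are all illustrative):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
units, periods = 40, 6
unit = np.repeat(np.arange(units), periods)   # unit id for each row
t = np.tile(np.arange(periods), units)        # time period for each row
treated = (unit < units // 2).astype(int)     # first half of units are treated
post = (t >= 3).astype(int)                   # treatment starts at period 3

# A fixed per-unit shock makes errors correlated within each unit over time
unit_shock = rng.normal(0, 1.0, units)[unit]
y = 5 + 2 * treated + 1 * post + 3 * treated * post + unit_shock + rng.normal(0, 0.5, len(unit))

df = pd.DataFrame({"y": y, "unit": unit, "treated": treated, "post": post})

# Cluster standard errors by unit so inference respects within-unit correlation
robust = smf.ols("y ~ treated + post + treated:post", data=df).fit(
    cov_type="cluster", cov_kwds={"groups": df["unit"]}
)
print(robust.params["treated:post"], robust.bse["treated:post"])
```

The point estimate is unchanged by clustering; only the standard errors (and hence the confidence intervals) are adjusted.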
The DiD method is also quite flexible and can be combined with other methods such as Fixed Effects.
DiD has produced great results, especially when the resources to conduct a comprehensive experimental design are not available. It is only one of many panel-data methods employed in economics, but a well-executed DiD study is an impact-assessment tool that could become your competitive weapon. It can effectively assess impact even in an economically volatile period, provided you can successfully defend the equal-trends assumption.
Written by Praveen Ekanayake, Data Scientist.