Powering Environmental Decisions with Causal AI

September 1, 2023

Dr Jordan Hart

Carbon management

Climate change is one of the most pressing issues facing the modern world. An important way in which businesses are combating climate change is through carbon management strategies. Such strategies involve several steps, often defined as 1) measure, 2) report, and 3) act. A key part of carbon management is the use of measured data to drive actions, with the aim of reducing or negating carbon emissions. Machine learning algorithms (often referred to simply as AI) have become a popular tool for processing such data, because they can ingest large amounts of data and make predictions based on detected patterns. Unfortunately, despite their widespread use, traditional machine learning algorithms only learn patterns, not causal processes. This means they cannot reliably predict the impact of actions or interventions such as a change in sustainability policy. This is where causal machine learning comes in.

Example: Company size and carbon emissions

To demonstrate the power and necessity of causal AI for carbon management, we've built a simple simulated example using Causa's structural causal modelling approach.

Consider a company wishing to reduce its overall carbon emissions. It's logical to assume that larger companies tend to emit more carbon than smaller companies. Such emissions come from a range of factors, such as increased economic activity, greater numbers of commuters, and greater facilities requirements, to name a few. On the flip side, larger companies often have more opportunities for remote or hybrid working across large distributed teams, which can go some way towards mitigating these increased carbon costs. This contrasts with smaller companies, where many employees may work on-site. On-site working has a greater environmental impact than remote or hybrid working because of the carbon cost of commuting and office facilities.

Environmentally conscious companies may be concerned about the carbon emissions due to business decisions such as expanding the workforce, and may turn to data-driven carbon management strategies to help inform decisions. Unfortunately, traditional AI approaches are widely misused to answer questions of a causal nature. To get a grasp on the nature of the problem, we've simulated an example dataset, a sample of which is shown below:

Traditional carbon management approaches using machine learning often use all available variables (in our case the size of a company and the number of remote workers) as inputs. An all too common - but mistaken - approach to predicting the impact of an action with AI is to use predictions alone to estimate a causal effect. Let's say a company of 500 people wishes to know the carbon cost of expanding its workforce by 10% (50 employees). A traditional AI approach would calculate the difference in predicted CO2 emissions between the current workforce size of 500 and the potential future size of 550, while all other variables (in our case, the number of remote workers) are held at their values in the original dataset. For demonstration purposes we fitted a standard regression model to our simulated dataset.
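The prediction-only approach described above can be sketched in a few lines of Python. This is a minimal illustration, not the simulation used for this article: the data-generating equations, their coefficients, and the names simulate, fit_ols, and predict are all assumptions made up for the example, so the numbers it produces will not match the figures quoted here.

```python
import random

random.seed(0)

# Hypothetical data-generating process (NOT the article's simulation):
#   remote = 0.6 * size + noise               larger companies have more remote workers
#   co2    = 5.0 * size - 6.0 * remote + noise   emissions in tonnes
def simulate(n=2000):
    rows = []
    for _ in range(n):
        size = random.uniform(100, 1000)
        remote = 0.6 * size + random.gauss(0, 20)
        co2 = 5.0 * size - 6.0 * remote + random.gauss(0, 30)
        rows.append((size, remote, co2))
    return rows

def fit_ols(rows):
    """Least-squares fit of co2 ~ b0 + b1*size + b2*remote via the normal equations."""
    xtx = [[0.0] * 3 for _ in range(3)]
    xty = [0.0] * 3
    for size, remote, co2 in rows:
        x = (1.0, size, remote)
        for i in range(3):
            xty[i] += x[i] * co2
            for j in range(3):
                xtx[i][j] += x[i] * x[j]
    # Solve the 3x3 system by Gauss-Jordan elimination with partial pivoting.
    a = [xtx[i] + [xty[i]] for i in range(3)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(3):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * p for v, p in zip(a[r], a[col])]
    return [a[i][3] / a[i][i] for i in range(3)]

b0, b1, b2 = fit_ols(simulate())

def predict(size, remote):
    return b0 + b1 * size + b2 * remote

# Naive "what if": change size from 500 to 550 but leave remote untouched.
remote_now = 300.0  # assumed current number of remote workers
naive_change = predict(550, remote_now) - predict(500, remote_now)
print(f"Naive predicted change in CO2: {naive_change:.1f} tonnes")
```

Under these assumed equations the naive difference is roughly 50 × 5.0 = 250 tonnes, because the number of remote workers is frozen. The change implied by the generating equations themselves is 50 × (5.0 − 6.0 × 0.6) = 70 tonnes, since a larger workforce also brings more remote workers.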

The standard model predicts that carbon emissions would be around 339 tonnes higher if the company increased its workforce by 10%. This information could discourage an environmentally-minded company from going ahead with expansion plans. However, if the company pushed ahead with the expansion anyway, it would find that its CO2 emissions increased by only around 75 tonnes, a fraction of the predicted amount. What has gone wrong here? The discrepancy arises because traditional AI algorithms are not causally aware, so they cannot reliably predict the outcomes of actions. Here, the prediction holds the number of remote workers fixed, even though a larger workforce would be expected to bring more remote and hybrid working, which offsets part of the increase.

Who knows how many bad or damaging decisions have been made on the back of traditional predictive AI unwittingly masquerading as a causal tool?

Structural causal models

At Causa we use structural causal models (SCMs) to estimate the impacts of actions and decisions. SCMs are different to traditional AI approaches because they integrate causal information and expert knowledge into their core structure. This means that they can accurately predict the outcome of an action or intervention without the need to actually carry it out. SCMs use basic knowledge about potential and likely causal mechanisms, encoded in a structure known as a directed acyclic graph (DAG), visualised below for our simple case study:
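A graph of this kind can also be written down directly in code. The sketch below assumes three nodes (size, remote, co2) with edges inferred from the case-study description: company size affects both remote working and emissions, and remote working affects emissions. The node names and the helper causal_order are illustrative assumptions, and this is not CausaDB's own representation.

```python
from graphlib import CycleError, TopologicalSorter

# Assumed causal graph for the case study: each node maps to its direct causes.
dag = {
    "size": set(),               # company size has no modelled causes
    "remote": {"size"},          # size influences the number of remote workers
    "co2": {"size", "remote"},   # both size and remote working influence emissions
}

def causal_order(graph):
    """Return the nodes with causes before effects, or fail if the graph has a cycle."""
    try:
        return list(TopologicalSorter(graph).static_order())
    except CycleError as exc:
        raise ValueError("graph contains a cycle, so it is not a DAG") from exc

order = causal_order(dag)
print(order)
```

Ordering the nodes so that causes come before effects is what lets an intervention on one variable be propagated to everything downstream of it.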

Causal models use DAGs to understand potential causal processes and reason over causal structure in a dataset, whereas traditional AI approaches have no concept of causation. In our case study, causal reasoning is a necessity, because it allows us to see the true carbon impact of increasing the workforce without actually doing it. The quantity of interest in our causal analysis is the expected change in CO2 emissions should the company increase its workforce by 10%. There are a few ways to approach this, ideally using a concept called counterfactuals, which can make highly personalised estimates at the company level. For the purposes of our case study, a less ideal but simpler way is to imagine how an average company of size 500 would respond to an increase in workforce of 10%. We can calculate this in our causal model using the do() operator, which simulates interventions and allows their effect to propagate through the causal structure - something that traditional AI algorithms aren't capable of. We did this using CausaDB and got an estimate of approximately 72 tonnes (plus a small amount of uncertainty).
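To make the idea of an intervention concrete, here is a minimal sketch of do() applied by simulation. It does not use CausaDB, and the structural equations and coefficients are hypothetical values chosen for illustration rather than the ones behind the 72-tonne estimate. The names sample_company and expected_co2 are likewise assumptions. In practice the equations would be estimated from data rather than written by hand.

```python
import random

def sample_company(rng, do_size=None):
    """Draw one company from a hypothetical SCM.

    Passing do_size replaces the equation for size with a fixed value,
    which is what the do() operator means; downstream equations are unchanged.
    """
    size = rng.uniform(100, 1000) if do_size is None else do_size
    remote = 0.6 * size + rng.gauss(0, 20)                # assumed: size -> remote
    co2 = 5.0 * size - 6.0 * remote + rng.gauss(0, 30)    # assumed: size, remote -> co2
    return {"size": size, "remote": remote, "co2": co2}

def expected_co2(do_size, n=20000, seed=0):
    """Monte Carlo estimate of E[co2 | do(size = do_size)]."""
    rng = random.Random(seed)
    return sum(sample_company(rng, do_size)["co2"] for _ in range(n)) / n

# Average effect of growing an average 500-person company by 10%.
effect = expected_co2(550) - expected_co2(500)
print(f"Estimated change in CO2 under do(size=550) vs do(size=500): {effect:.1f} tonnes")
```

Because the intervention on size flows through to remote before emissions are computed, the estimate reflects both the direct increase and the offsetting effect of additional remote workers. With these assumed coefficients that is 50 × (5.0 − 6.0 × 0.6) = 70 tonnes. Reusing the same seed for both interventions means the noise terms cancel in the difference.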

This value is very close to the true expected change in CO2 emissions of around 75 tonnes in our simulated data. If the slight difference from the true value is a concern, CausaDB also provides uncertainty estimates, but that is a topic for another time. Only by using causal models are we able to discover that increasing the size of the workforce raises carbon emissions far less than the traditional AI algorithm predicted. This highlights the importance of causal modelling in carbon management practices. This example might seem exaggerated, but in fact more subtle issues in data can lead to much more severe problems than those we present here.