How to Build a Causal Diagram for Your Financial Model

A Step-by-Step Guide

Jul 04, 2025

Your analyses don’t need to be this broken. Image generated with Leonardo AI

Your model isn’t just crunching numbers—it’s telling a story about how the world works. A causal diagram helps you get that story straight.

In financial analysis, we’re trained to follow the numbers. P&L statements, ratios, forecasts—we model them, compare them, regress them. But when it comes to sustainability, this approach often falls short. Why?

Because sustainability isn’t just a collection of metrics. It’s a web of cause-and-effect relationships—between energy use and regulation, climate risk and credit spreads, supply chains and social stability.

If your model doesn’t account for these causal links, it might look precise but lead you astray.

This article will show you how to take the first step toward causal modeling: building a causal diagram, also known as a causal Directed Acyclic Graph (DAG). By the end, you’ll know how to sketch one from scratch, encode it in Python, and use it to structure smarter models and insights.

What Is a Causal DAG and Why Should You Care?

A causal DAG is a visual representation of how you believe variables in your system causally influence each other. Each node is a variable; each arrow represents a directional, cause-and-effect relationship.

Here’s why this matters:

Causal DAGs help you structure thinking before touching data.
They surface hidden assumptions that affect your conclusions.
They guide what variables to control for (or not) in a model.
They underpin causal-inference libraries such as DoWhy and EconML, as well as methods like Bayesian networks.

Think of a DAG as the blueprint for your financial model. If sustainability is part of the building, it needs to be in the structure—not bolted on afterward.

Step-by-Step: Build Your First Causal DAG

Let’s walk through the creation of a simple DAG related to sustainability risk in a company. We'll model how carbon intensity may affect stock return volatility, using regulation, energy prices, and sector as causal factors.

Step 1: List your variables

Start with the key variables relevant to your hypothesis. For this example:

carbon_intensity: Scope 1 and 2 emissions per unit of revenue
energy_prices: Proxy for exposure to oil/gas price volatility
regulatory_pressure: Country-level or sector-specific climate policy stringency
sector: Controls for baseline differences (e.g. utilities vs tech)
stock_volatility: 1-year rolling standard deviation of returns
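As a rough sketch of how the stock_volatility variable could be computed, the snippet below applies a rolling standard deviation to a pandas Series of daily returns. The returns are synthetic, and the 252-trading-day window and the square-root-of-time annualization are assumptions made for illustration, not part of the definition above.

```python
import numpy as np
import pandas as pd

# Synthetic daily returns for illustration only (not real market data).
rng = np.random.default_rng(0)
dates = pd.bdate_range("2022-01-03", periods=600)
returns = pd.Series(rng.normal(0.0, 0.01, size=len(dates)), index=dates, name="ret")

TRADING_DAYS = 252  # assumed length of a "1-year" window

# Rolling standard deviation of daily returns over the window,
# annualized by sqrt(252) (an assumed convention).
stock_volatility = returns.rolling(TRADING_DAYS).std() * np.sqrt(TRADING_DAYS)

# The first 251 observations are NaN because the window is not yet full.
print(stock_volatility.dropna().head())
```

With real data, you would replace the synthetic series with your own return history and pick the window that matches your reporting frequency.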

Step 2: Define your causal beliefs

Now ask: what affects what? This is a theory, not yet tested with data.

sector → carbon_intensity
sector → stock_volatility
energy_prices → carbon_intensity
regulatory_pressure → carbon_intensity
carbon_intensity → stock_volatility

Optional additions:

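regulatory_pressure might also affect stock_volatility directly. The code below includes this edge, which makes regulatory_pressure a confounder of the carbon_intensity and stock_volatility relationship.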
There may be feedback loops—but we’ll ignore cycles for now (DAGs must be acyclic).

Step 3: Draw the graph

You can draw it by hand or in Python. Here we’ll use Python.

Visualizing the DAG in Python

We’ll use the networkx and matplotlib libraries:

import networkx as nx
import matplotlib.pyplot as plt

# Define the DAG
G = nx.DiGraph()

# Add edges
edges = [
    ("sector", "carbon_intensity"),
    ("sector", "stock_volatility"),
    ("energy_prices", "carbon_intensity"),
    ("regulatory_pressure", "carbon_intensity"),
    ("carbon_intensity", "stock_volatility"),
    ("regulatory_pressure", "stock_volatility")
]

G.add_edges_from(edges)

# Draw the DAG
pos = nx.spring_layout(G, seed=42)
nx.draw(G, pos, with_labels=True, node_size=3000, node_color="lightblue", arrowsize=20)
plt.title("Causal DAG: Sustainability and Volatility")
plt.show()

This gives you a clean, inspectable map of the world as you currently believe it works.
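Because a DAG must be acyclic, it can be worth checking that property in code before going further. The sketch below rebuilds the same edge list and uses networkx's is_directed_acyclic_graph, topological_sort, and find_cycle. The feedback edge from stock_volatility back to regulatory_pressure is hypothetical and added only to show what a violation looks like.

```python
import networkx as nx

edges = [
    ("sector", "carbon_intensity"),
    ("sector", "stock_volatility"),
    ("energy_prices", "carbon_intensity"),
    ("regulatory_pressure", "carbon_intensity"),
    ("carbon_intensity", "stock_volatility"),
    ("regulatory_pressure", "stock_volatility"),
]
G = nx.DiGraph(edges)

# The graph as specified has no cycles.
print(nx.is_directed_acyclic_graph(G))  # True

# A topological order lists causes before their effects.
order = list(nx.topological_sort(G))
print(order)

# Hypothetical feedback loop: volatility influencing regulation.
G_cyclic = G.copy()
G_cyclic.add_edge("stock_volatility", "regulatory_pressure")
is_dag_after = nx.is_directed_acyclic_graph(G_cyclic)
cycle = nx.find_cycle(G_cyclic)  # list of edges forming one cycle
print(is_dag_after, cycle)
```

If the check fails, one common workaround is to split the looping variable into time-indexed versions (for example, regulation this year and regulation next year) so the arrows only point forward in time.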

Using the DAG to Structure Analysis

Now that you’ve built your DAG, how do you use it?

Let’s say you want to estimate the causal effect of carbon_intensity on stock_volatility. The DAG tells you which confounding variables you need to control for, so your estimate isn’t biased by them.

This is where do-calculus and tools like DoWhy come in.

Identify a Valid Adjustment Set

From the DAG, we can spot that:

sector affects both carbon_intensity and stock_volatility (confounder)
regulatory_pressure does too
energy_prices only affects carbon_intensity

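So, to isolate the causal impact of carbon_intensity on stock_volatility, we should control for sector and regulatory_pressure. We don’t need to control for energy_prices: in this DAG it reaches stock_volatility only through carbon_intensity, so it opens no backdoor path between treatment and outcome.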

This “adjustment set” can be passed to your regression or causal estimation tool.
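The same reasoning can be sketched in code. The snippet below is a simple heuristic, not a general identification algorithm: it assumes every variable in the graph is observed, and it flags as confounders those ancestors of the treatment that can still reach the outcome once the treatment node is removed. The graph includes the optional regulatory_pressure to stock_volatility edge.

```python
import networkx as nx

G = nx.DiGraph([
    ("sector", "carbon_intensity"),
    ("sector", "stock_volatility"),
    ("energy_prices", "carbon_intensity"),
    ("regulatory_pressure", "carbon_intensity"),
    ("carbon_intensity", "stock_volatility"),
    ("regulatory_pressure", "stock_volatility"),
])
treatment, outcome = "carbon_intensity", "stock_volatility"

# Drop the treatment so we only follow paths to the outcome that bypass it.
G_no_treatment = G.copy()
G_no_treatment.remove_node(treatment)

# Causes of the treatment that also reach the outcome by another route.
confounders = sorted(
    node
    for node in nx.ancestors(G, treatment)
    if nx.has_path(G_no_treatment, node, outcome)
)
print(confounders)  # ['regulatory_pressure', 'sector']
```

For larger graphs, or graphs with unobserved variables, a dedicated identification routine such as the one in DoWhy is a safer choice than this shortcut.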

Implementing in DoWhy

Let’s do a minimal DoWhy setup (assuming your data is in a pandas DataFrame called df):

from dowhy import CausalModel

model = CausalModel(
    data=df,
    treatment="carbon_intensity",
    outcome="stock_volatility",
    common_causes=["sector", "regulatory_pressure"]
)

# View the causal graph
model.view_model(layout="dot")

# Identify and estimate the effect
identified_estimand = model.identify_effect()
estimate = model.estimate_effect(
    identified_estimand,
    method_name="backdoor.linear_regression"
)

print(estimate)

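This gives you a causal estimate rather than a bare correlation, provided your DAG is right. The number is only as credible as your theory of the system, including the assumption that no important confounder is missing from the graph.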
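To see why the adjustment set matters, here is a small simulation using only NumPy. Everything in it is an assumption made for illustration: the data are synthetic, the variables are in arbitrary standardized units, the relationships are linear, and the coefficients (including the "true" effect of 0.4) are invented. It compares a naive regression with one that controls for sector and regulatory_pressure.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 20_000
TRUE_EFFECT = 0.4  # invented effect of carbon_intensity on stock_volatility

# Synthetic data generated to follow the DAG (all coefficients are made up).
sector = rng.integers(0, 2, size=n).astype(float)  # e.g. 1 = utilities, 0 = tech
energy_prices = rng.normal(size=n)
regulatory_pressure = rng.normal(size=n)
carbon_intensity = (
    2.0 * sector + 0.8 * energy_prices - 0.5 * regulatory_pressure
    + rng.normal(size=n)
)
stock_volatility = (
    TRUE_EFFECT * carbon_intensity + 1.5 * sector + 0.3 * regulatory_pressure
    + rng.normal(size=n)
)


def ols_slope(y, *regressors):
    """Return the OLS coefficient on the first regressor (intercept included)."""
    X = np.column_stack([np.ones_like(y), *regressors])
    coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coefs[1]


naive = ols_slope(stock_volatility, carbon_intensity)
adjusted = ols_slope(stock_volatility, carbon_intensity, sector, regulatory_pressure)

print(f"naive estimate:    {naive:.3f}")
print(f"adjusted estimate: {adjusted:.3f}")
print(f"true effect:       {TRUE_EFFECT:.3f}")
```

In this toy setup the naive slope overstates the effect because sector pushes both variables up, while the adjusted slope lands close to the value used to generate the data.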

Extensions: Where to Go From Here

Now that you’ve built a basic causal model, you can:

Add intermediate outcomes (e.g., revenue, operating margin)
Include treatment heterogeneity (does carbon matter more in some sectors?)
Simulate interventions using do-operations: what happens to volatility if carbon intensity is halved?
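The last idea, simulating an intervention, can be sketched with a toy structural model. The function simulate_volatility and all of its coefficients are hypothetical. Halving is modeled here as scaling each firm's natural carbon intensity by 0.5 and then recomputing volatility from the structural equation, which is one possible reading of "carbon intensity is halved".

```python
import numpy as np


def simulate_volatility(carbon_scale=1.0, n=50_000, seed=0):
    """Simulate stock_volatility from a toy linear model of the DAG.

    carbon_scale=1.0 leaves carbon intensity at its natural value;
    carbon_scale=0.5 intervenes by halving it for every firm.
    """
    rng = np.random.default_rng(seed)  # same seed => same firms and noise
    sector = rng.integers(0, 2, size=n).astype(float)
    energy_prices = rng.normal(size=n)
    regulatory_pressure = rng.normal(size=n)
    natural_carbon = (
        2.0 + 1.0 * sector + 0.5 * energy_prices - 0.5 * regulatory_pressure
        + rng.normal(scale=0.5, size=n)
    )
    # Intervention: override the value carbon intensity would naturally take.
    carbon_intensity = carbon_scale * natural_carbon
    stock_volatility = (
        0.4 * carbon_intensity + 1.5 * sector + 0.3 * regulatory_pressure
        + rng.normal(size=n)
    )
    return stock_volatility


baseline = simulate_volatility(carbon_scale=1.0).mean()
halved = simulate_volatility(carbon_scale=0.5).mean()
effect = halved - baseline

print(f"average change in volatility: {effect:.3f}")
```

Because both runs share a seed, the only difference between them is the intervention itself, so the change in the mean reflects the assumed structural effect rather than sampling noise.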

You can even refactor your entire portfolio model using a causal graph backend, turning your spreadsheet into a directed model of the world.

Common Pitfalls and How to Avoid Them

Mistaking correlation for causation: Always start with theory.
Forgetting time: If one variable lags another, encode that in your DAG (or extend to dynamic DAGs).
Overfitting the DAG: Don’t just add arrows to fit the data—stick to causal logic.
Using all variables as controls: Controlling for a mediator blocks part of the causal path you’re trying to measure, and controlling for a collider can open a spurious one.
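The last pitfall is easy to demonstrate with synthetic data. In the sketch below, compliance_costs is a hypothetical mediator that is not part of the DAG above: carbon intensity raises compliance costs, which in turn raise volatility. The coefficients are invented, and the total effect implied by them is 0.8 × 0.5 = 0.4.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 20_000

# Toy chain: carbon_intensity -> compliance_costs -> stock_volatility.
carbon_intensity = rng.normal(size=n)
compliance_costs = 0.8 * carbon_intensity + rng.normal(size=n)  # hypothetical mediator
stock_volatility = 0.5 * compliance_costs + rng.normal(size=n)


def ols_coefs(y, *regressors):
    """Return OLS coefficients: intercept first, then one per regressor."""
    X = np.column_stack([np.ones_like(y), *regressors])
    coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coefs


# Without the mediator, the slope recovers the total effect (about 0.4).
total_effect = ols_coefs(stock_volatility, carbon_intensity)[1]

# Controlling for the mediator blocks the path, so the slope drops toward 0.
with_mediator = ols_coefs(stock_volatility, carbon_intensity, compliance_costs)[1]

print(f"total effect:          {total_effect:.3f}")
print(f"controlling mediator:  {with_mediator:.3f}")
```

An analyst who adds every available column to the regression would conclude here that carbon intensity has no effect, even though the data were generated with one.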

Causal modeling rewards careful thought more than raw statistical power.

Why This Matters in Sustainable Finance

Sustainability is not a “nice to have.” It is systemic, complex, and non-linear. Causal DAGs are one of the few tools that can help financial professionals:

Make decisions under uncertainty
Anticipate policy or climate-driven shocks
Build models that explain, not just fit

If you're integrating sustainability into credit risk, equities, or strategic advisory work, a causal diagram can transform how you think and communicate.

The Bottom Line: Your Causal Blueprint Starts Here

Most analysts treat modeling like wiring up an output: “If I tweak this cell, what happens over there?”

But real insight begins before the spreadsheet. With causal modeling, you’re not just adjusting numbers—you’re designing a hypothesis about how the world works.

And when sustainability forces start to reshape that world—as they already are—you’ll be glad you took the time to get your blueprint right.

Next up: In next Tuesday’s article, we’ll step back and ask a deeper question: Are we modeling the world, or just modeling ourselves?

Until then, build wisely.

—
Note to my readers: I’ll be sharing a starter DAG template and Python notebook with selected subscribers next week. Leave a comment or reply to this email if you’d like early access.