Stochastic Volatility — A Heston Pricing Engine with Carr-Madan FFT and Surface Calibration
A derivatives pricing engine where volatility is itself a mean-reverting random process rather than a constant. Black-Scholes assumes one volatility for all strikes and maturities; markets disagree, quoting a persistent implied volatility smile and skew that no single lognormal can reproduce. The Heston (1993) model fixes this with a second stochastic factor — a CIR variance process correlated with the spot — while remaining tractable: the log-spot characteristic function is known in closed form, so European options price by Fourier inversion. The module implements the numerically stable "little trap" characteristic function, the Carr-Madan FFT method that prices an entire strike slice with a single \(O(N \log N)\) transform, Black-Scholes implied vol reconstruction of the model surface, calibration of all five parameters to an option chain with bounded least squares in seconds, and a full-truncation Euler Monte Carlo engine for path-dependent payoffs the transform cannot reach. Option chains come from Yahoo Finance's .options API, spot history for a realised-vol reference point, and the 3-month T-bill from FRED to strip carry. Built on Python 3.11+ using numpy, scipy, pandas, yfinance, plotly, streamlit, duckdb, and pydantic v2; packaged with hatchling and tested with pytest against an independent quadrature pricer and known limiting cases.
%==========%
I. Interactive Dashboard:
The dashboard below runs entirely in the browser via stlite (Streamlit on WebAssembly — no server required). The five Heston parameters sit on sidebar sliders with a live Feller-condition indicator. The four tabs show the 3D implied volatility surface rebuilt by FFT on every slider move, smile slices with the 90–110 skew and ATM term structure, a side-by-side Black-Scholes versus Heston pricing table across strikes, and a Monte Carlo simulator whose European price is checked against the FFT while pricing an arithmetic Asian option the transform cannot handle. First load downloads Pyodide and may take 20–40 seconds; subsequent loads are cached.
%==========%
II. Project Layout:
heston-vol/
├── pyproject.toml # Build config, deps, ruff + pytest settings
├── .env.example # DB_PATH
├── data/ # Populated by scripts/download_data.py (git-ignored)
│ └── heston.duckdb # DuckDB: options + spot + calibrations tables
├── scripts/
│ └── download_data.py # yfinance chain + FRED → DuckDB
├── src/heston_vol/
│ ├── data/
│ │ ├── schemas.py # Pydantic v2: OptionQuote, SpotRecord, CalibrationRecord
│ │ ├── fetchers.py # yfinance option chains + spot, FRED 3m T-bill
│ │ └── store.py # DuckDB init, upsert, read for all three tables
│ ├── pricing/
│ │ ├── charfn.py # HestonParams, little-trap CF, Feller condition
│ │ ├── carr_madan.py # FFT pricer + independent Gil-Pelaez quad cross-check
│ │ ├── black_scholes.py # BS price, vega, Brent implied vol
│ │ └── surface.py # IV surface, ATM term structure, 90-110 skew
│ ├── calibration/
│ │ └── calibrate.py # Bounded least-squares calibration, synthetic quotes
│ ├── montecarlo/
│ │ └── simulate.py # Full-truncation Euler; European + Asian MC pricing
│ ├── report/
│ │ └── plots.py # Plotly: 3D surface, smile slices, MC paths, BS diff
│ ├── cli.py # Typer CLI: fetch | price | calibrate | surface | mc | dashboard
│ └── app.py # Streamlit: 4 tabs (Surface, Smiles, BS vs Heston, MC)
└── tests/
├── conftest.py # Equity-index-like true params, strike grid (seed 42)
├── test_pricing.py # CF properties, FFT vs quad, BS limit, skew direction
├── test_calibration.py # Parameter recovery, Feller, noise robustness
└── test_montecarlo.py # Martingale, mean reversion, MC vs FFT, Asian < European
%==========%
III. Data Sources:
Live option chains with strikes, expiries, bids, and asks come from Yahoo Finance's .options API: for each of the nearest expiries the fetcher pulls calls and puts, filters out zero-bid (stale, untradeable) quotes, computes mid-prices, and converts expiry dates to year fractions. Underlying spot history provides a realised-volatility reference point for sanity-checking \(\sqrt{v_0}\). The 3-month T-bill yield from FRED strips carry from option prices so the calibration is not absorbing a rate error into \(\theta\). Quotes, spot, and calibration results all persist to DuckDB so calibrations can be tracked across days.
# fetchers.py
def fetch_option_chain(ticker: str, max_expiries: int = 8) -> pd.DataFrame:
tk = yf.Ticker(ticker)
rows = []
for expiry_str in tk.options[:max_expiries]:
chain = tk.option_chain(expiry_str)
maturity = (pd.Timestamp(expiry_str).date() - date.today()).days / 365.25
for opt_type, df in (("call", chain.calls), ("put", chain.puts)):
df = df[(df["bid"] > 0) & (df["ask"] > 0)] # drop untradeable quotes
...
rows.append({..., "mid": float(0.5 * (row["bid"] + row["ask"]))})
return pd.DataFrame(rows)
%==========%
IV. Why Black-Scholes Cannot Smile:
Invert the Black-Scholes formula on market option prices and the resulting implied volatility is not constant: equity index options show a steep downward skew (low-strike puts carry far higher implied vols than high-strike calls) and a convex smile that flattens with maturity. Under Black-Scholes assumptions this is impossible — one underlying has one volatility. The smile is the market pricing two empirical facts the lognormal misses: fat tails (large moves are more frequent than Gaussian returns allow, so wings are worth more) and the leverage effect (volatility rises when prices fall, so the left tail is fatter than the right). A stochastic volatility model generates both endogenously. Making variance itself random produces excess kurtosis in returns — mixing lognormals over a random variance always fattens both wings — and a negative correlation \(\rho\) between the variance and spot shocks skews the distribution leftward, loading the extra mass onto OTM puts exactly as the index market quotes it.
| Market fact | BS verdict | Heston mechanism |
|---|---|---|
| Smile (wings rich) | Impossible | Random variance ⇒ fat tails via mixture |
| Skew (puts richer than calls) | Impossible | \(\rho < 0\): vol spikes when spot falls |
| Smile flattens with maturity | — | Variance mean-reverts at speed \(\kappa\) |
| Vol clustering in time series | Impossible | CIR persistence of \(v_t\) |
%==========%
V. Heston Dynamics and the Feller Condition (pricing/charfn.py):
The Heston model couples the risk-neutral spot to a Cox-Ingersoll-Ross variance process through correlated Brownian motions:
\[ dS_t = r S_t\,dt + \sqrt{v_t}\,S_t\,dW_t^S \] \[ dv_t = \kappa(\theta - v_t)\,dt + \sigma\sqrt{v_t}\,dW_t^v, \qquad d\langle W^S, W^v\rangle_t = \rho\,dt \]Five parameters: \(\kappa\) (mean-reversion speed), \(\theta\) (long-run variance), \(\sigma\) (vol-of-vol), \(\rho\) (spot-vol correlation), and \(v_0\) (initial variance). The square-root diffusion keeps variance non-negative and switches off the noise as \(v_t \to 0\); whether the origin is actually reached is governed by the Feller condition:
\[ 2\kappa\theta \geq \sigma^2 \]When it holds, the drift pushes variance away from zero faster than the diffusion can carry it there and \(v_t\) stays strictly positive. When it fails, variance touches zero with positive probability (then reflects). Index calibrations violate Feller routinely — fitting the steep short-dated skew demands a large \(\sigma\) that the fitted \(\kappa\theta\) cannot cover. The pricing transform remains valid either way; the casualties are simulation schemes (naive Euler produces negative variance and NaNs — hence the full-truncation scheme in section X) and calibration stability, since near the violated regime several parameter combinations produce nearly identical surfaces.
%==========%
VI. The Characteristic Function:
Heston's tractability rests on the affine structure of \((\,\ln S_t, v_t)\): the characteristic function of the terminal log-spot is exponentially affine in the state, with coefficients solving Riccati ODEs that integrate in closed form:
\[ \varphi(u) = \mathbb{E}\bigl[e^{iu \ln S_T}\bigr] = \exp\Bigl( iu(\ln S_0 + rT) + \frac{\kappa\theta}{\sigma^2}\Bigl[(\kappa - \rho\sigma iu - d)T - 2\ln\frac{1 - g e^{-dT}}{1 - g}\Bigr] + \frac{v_0(\kappa - \rho\sigma iu - d)}{\sigma^2}\,\frac{1 - e^{-dT}}{1 - g e^{-dT}} \Bigr) \] \[ d = \sqrt{(\rho\sigma iu - \kappa)^2 + \sigma^2(iu + u^2)}, \qquad g = \frac{\kappa - \rho\sigma iu - d}{\kappa - \rho\sigma iu + d} \]The branch choice matters. Heston's original formulation used the reciprocal of \(g\) and crosses the negative real axis of the complex logarithm at long maturities, producing discontinuous — silently wrong — prices. The implementation follows Albrecher et al.'s "little Heston trap" convention above, which keeps the log argument away from the cut for all maturities. Two identities pin the implementation in tests: \(\varphi(0) = 1\) (it is a characteristic function) and \(\varphi(-i) = \mathbb{E}[S_T] = S_0 e^{rT}\) (the martingale property).
# charfn.py
def heston_charfn(u, T, params, S0, r):
"""E[exp(i u ln S_T)] — Albrecher 'little trap' branching."""
iu = 1j * np.asarray(u)
d = np.sqrt((rho * sigma * iu - kappa) ** 2 + sigma**2 * (iu + np.asarray(u) ** 2))
g = (kappa - rho * sigma * iu - d) / (kappa - rho * sigma * iu + d)
exp_dT = np.exp(-d * T)
C = (kappa * theta / sigma**2) * (
(kappa - rho * sigma * iu - d) * T - 2.0 * np.log((1.0 - g * exp_dT) / (1.0 - g))
)
D = ((kappa - rho * sigma * iu - d) / sigma**2) * ((1.0 - exp_dT) / (1.0 - g * exp_dT))
return np.exp(iu * (np.log(S0) + r * T) + C + D * v0)
%==========%
VII. Carr-Madan FFT Pricing (pricing/carr_madan.py):
A call price is not square-integrable in log-strike \(k = \ln K\) (it tends to \(S_0\) as \(k \to -\infty\)), so Carr and Madan damp it with \(e^{\alpha k}\), \(\alpha > 0\), before transforming. The Fourier transform of the damped call has a closed form in terms of the characteristic function:
\[ \psi(v) = \int_{-\infty}^{\infty} e^{ivk}\,e^{\alpha k}\,C(k)\,dk = \frac{e^{-rT}\,\varphi\bigl(v - (\alpha + 1)i\bigr)}{\alpha^2 + \alpha - v^2 + i(2\alpha + 1)v} \]and inverting gives the price at any log-strike:
\[ C(k) = \frac{e^{-\alpha k}}{\pi} \int_0^{\infty} e^{-ivk}\,\psi(v)\,dv \]Discretising the integral on \(N = 4096\) points with spacing \(\eta = 0.25\) and Simpson weights turns the inversion into a single FFT that returns prices on a log-strike grid of spacing \(\lambda = 2\pi/(N\eta) \approx 0.006\) — one transform prices every strike in a maturity slice, which is precisely why full-surface calibration takes seconds rather than hours. Two numerical details surfaced during testing. First, the Simpson weight pattern must run \(\tfrac{1}{3}, \tfrac{4}{3}, \tfrac{2}{3}, \tfrac{4}{3}, \dots\) — transposing the 4/3 and 2/3 weights biases every price by \(O(10^{-2})\). Second, interpolating off the grid linearly leaves \(O(\lambda^2\, \partial^2 C/\partial k^2) \approx 10^{-3}\) errors; a cubic spline over the grid segment spanning the requested strikes removes them. Both were caught by an independent pricer: the Gil-Pelaez inversion \(C = S_0 P_1 - Ke^{-rT}P_2\) evaluated with adaptive quadrature, slow but sharing no code with the FFT path.
# carr_madan.py
def carr_madan_fft(S0, r, T, params, alpha=1.5, N=4096, eta=0.25):
lam = 2.0 * np.pi / (N * eta)
b = 0.5 * N * lam
v = np.arange(N) * eta
k = -b + lam * np.arange(N)
phi = heston_charfn(v - (alpha + 1.0) * 1j, T, params, S0, r)
psi = np.exp(-r * T) * phi / (alpha**2 + alpha - v**2 + 1j * (2.0 * alpha + 1.0) * v)
# Simpson's rule weights 1/3, 4/3, 2/3, 4/3, ...
simpson = (3.0 - (-1.0) ** np.arange(N)) / 3.0
simpson[0] = 1.0 / 3.0
integrand = np.exp(1j * b * v) * psi * eta * simpson
calls = np.exp(-alpha * k) / np.pi * np.fft.fft(integrand).real
return np.exp(k), calls
The damping factor must satisfy \(\mathbb{E}[S_T^{\alpha+1}] < \infty\); Heston moments explode beyond a maturity-dependent order, so \(\alpha = 1.5\) is the conventional safe choice. Far wings where the true option value falls below numerical resolution (\(\sim 10^{-9} S_0\), e.g. 30% OTM at one month) are excluded from the implied-vol surface — real markets quote nothing there either.
%==========%
VIII. Calibration to the Option Surface (calibration/calibrate.py):
Calibration finds the parameter vector \((\kappa, \theta, \sigma, \rho, v_0)\) whose model surface best matches market mids, by bounded trust-region least squares on residuals normalised by spot so every maturity contributes on the same scale. Each residual evaluation prices the full quote set with one FFT per maturity. The pitfalls are well known and the module is built around them. Parameter degeneracy: \(\kappa\) and \(\sigma\) trade off against each other (a faster-reverting, noisier variance produces nearly the same smile as a slower, calmer one), so point estimates of \(\kappa\) are unstable even when the fitted surface is excellent — the tests therefore assert tight recovery of the well-identified parameters (\(v_0\), \(\rho\), \(\theta\)) and judge the rest by fit RMSE. Local minima: the objective is non-convex; bounded starts at economically sensible values (and in production, yesterday's calibration) matter more than optimiser choice. Feller violations: the optimiser will happily cross \(2\kappa\theta < \sigma^2\) chasing short-dated skew; the result records the Feller margin so the user sees when the fit lives in the unstable regime. Data hygiene: zero-bid quotes and deep wings are excluded before fitting — they contribute noise, not information.
# calibrate.py
_LOWER = np.array([0.05, 1e-3, 0.01, -0.999, 1e-3]) # kappa, theta, sigma, rho, v0
_UPPER = np.array([20.0, 1.5, 3.0, 0.5, 1.5])
def calibrate(quotes, S0, r, x0=None, max_nfev=200) -> CalibrationResult:
def residuals(x):
params = HestonParams.from_array(x)
res = []
for T in maturities: # one FFT prices the whole slice
slice_ = quotes[quotes["maturity"] == T]
model = heston_call(S0, slice_["strike"].values, r, float(T), params)
res.append((model - slice_["mid"].values) / S0)
return np.concatenate(res)
result = least_squares(residuals, x0 or _DEFAULT_X0, bounds=(_LOWER, _UPPER),
max_nfev=max_nfev, xtol=1e-12, ftol=1e-12)
%==========%
IX. Parameter Intuition — How Each Parameter Moves the Surface:
The model earns practitioner trust because each parameter has one dominant, visible effect, all reproducible on the dashboard sliders:
| Parameter | Dominant effect on the surface | Mechanism |
|---|---|---|
| \(\rho\) (spot-vol correlation) | Skew tilt: \(\rho < 0\) makes low strikes rich, high strikes cheap | Down-moves coincide with vol spikes, fattening the left tail |
| \(\sigma\) (vol-of-vol) | Smile curvature: lifts both wings relative to ATM | Random variance mixes lognormals, adding kurtosis |
| \(\kappa\) (mean reversion) | Term-structure decay: how fast the smile flattens with maturity | Variance shocks have half-life \(\ln 2/\kappa\); long maturities average them out |
| \(\theta\) (long-run variance) | Long-end level: anchors ATM vol at long maturities to \(\sqrt{\theta}\) | Ergodic mean of the CIR process |
| \(v_0\) (initial variance) | Short-end level: anchors short-maturity ATM vol to \(\sqrt{v_0}\) | Today's instantaneous variance |
The ATM term structure interpolates between \(\sqrt{v_0}\) at the short end and \(\sqrt{\theta}\) at the long end at rate \(\kappa\) — an upward-sloping vol term structure simply means \(v_0 < \theta\). The skew decays roughly like \(1/T\) at long maturities, slower than the \(1/T\) of jump models at the short end, which is why short-dated index skew is the one feature Heston chronically struggles to match without jumps (Bates' model adds them for exactly this reason).
%==========%
X. Monte Carlo under Heston Dynamics (montecarlo/simulate.py):
The FFT prices anything European; everything path-dependent needs simulation. Discretising the CIR variance is the hard part: a naive Euler step makes \(v_t\) negative (the Gaussian increment ignores the square-root cutoff) and \(\sqrt{v_t}\) then produces NaNs. Among the standard fixes — absorption, reflection, full truncation, and the exact-but-costly Broadie-Kaya scheme that samples the integrated variance from its true distribution — the module uses full truncation (Lord et al. 2010), which applies \(v^+ = \max(v, 0)\) inside both drift and diffusion and empirically carries the smallest discretisation bias of the Euler family:
\[ v_{t+\Delta} = v_t + \kappa\bigl(\theta - v_t^+\bigr)\Delta + \sigma\sqrt{v_t^+ \Delta}\; Z_v, \qquad \ln S_{t+\Delta} = \ln S_t + \bigl(r - \tfrac{1}{2}v_t^+\bigr)\Delta + \sqrt{v_t^+ \Delta}\; Z_S \]with \(Z_v = \rho Z_S + \sqrt{1 - \rho^2}\,Z_\perp\). The European MC price agreeing with the FFT within standard error is the engine's end-to-end validation; the engine then prices an arithmetic-average Asian call — no transform exists for it — which comes out cheaper than the European since averaging shrinks effective volatility.
# simulate.py
for t in range(n_steps):
z1 = rng.standard_normal(n_paths)
z2 = rho * z1 + np.sqrt(1.0 - rho**2) * rng.standard_normal(n_paths)
v_plus = np.maximum(v[:, t], 0.0)
sqrt_v_dt = np.sqrt(v_plus * dt)
log_s = log_s + (r - 0.5 * v_plus) * dt + sqrt_v_dt * z1
v[:, t + 1] = v[:, t] + kappa * (theta - v_plus) * dt + sigma * sqrt_v_dt * z2
S[:, t + 1] = np.exp(log_s)
%==========%
XI. Heston vs SABR — Practitioner Context:
The other stochastic volatility workhorse is SABR (Hagan et al. 2002), and the two divide the market by asset class and use case. SABR's lognormal (non-mean-reverting) volatility admits the Hagan asymptotic expansion — an algebraic formula mapping parameters to implied vol with no transform at all — making it the standard for quoting and interpolating single-expiry smiles, especially in rates (swaptions, caps), where each expiry-tenor pair is fitted independently. Heston, with mean reversion and a known characteristic function, is built for one consistent dynamic model across the whole surface: a single parameter set prices every strike and maturity, evolves states through time, and supports exotics by simulation. The trade-offs follow: SABR fits any one slice nearly perfectly but says nothing about how smiles at different maturities cohere (and the Hagan expansion famously breaks down at low strikes/long expiries); Heston coheres by construction but fits each individual slice less tightly, and underprices very short-dated skew without jump extensions (Bates). Equity and FX desks lean Heston-family for exotics and term-structure-sensitive products; rates desks quote SABR. The two are complements, not competitors — one is a smile interpolator, the other a model of the world.
| Heston | SABR | |
|---|---|---|
| Variance dynamics | CIR, mean-reverting | Lognormal vol, no mean reversion |
| Pricing route | Characteristic function + FFT | Hagan asymptotic IV formula |
| Scope | Whole surface, one parameter set | One expiry slice per fit |
| Term structure | Endogenous via \(\kappa\) | None (refit per expiry) |
| Typical habitat | Equity/FX exotics, VIX-linked | Rates: swaptions, caps/floors |
| Weakness | Short-dated skew too flat | Low-strike breakdown, no dynamics |
%==========%
XII. CLI — cli.py:
Six subcommands cover the pipeline from chain ingestion through calibration to Monte Carlo, all sharing the DuckDB store.
# Install
pip install -e ".[dev]"
# Live option chain + spot from Yahoo, 3m T-bill from FRED
heston fetch --ticker SPY --max-expiries 8
# Price one call: Carr-Madan FFT vs Gil-Pelaez quad cross-check vs Black-Scholes
heston price --strike 105 --maturity 0.5 --rho -0.7
# Calibrate the five parameters to the stored chain (near-the-money, T > 0.05y)
heston calibrate --ticker SPY --rate 0.03
# ATM term structure and 90-110 skew implied by a parameter set
heston surface --sigma 0.5 --rho -0.7
# Monte Carlo: European (vs FFT benchmark) and arithmetic Asian
heston mc --strike 100 --n-paths 50000
# Launch Streamlit server-side dashboard
heston dashboard
| Command | Key options | Output |
|---|---|---|
heston fetch | --ticker, --max-expiries, --db | Spot, filtered option chain, FRED rate to DuckDB |
heston price | --strike, --maturity, five params | FFT vs quad vs BS prices + Heston implied vol |
heston calibrate | --ticker, --rate | Five parameters, RMSE, Feller margin; stored to DuckDB |
heston surface | five params | ATM IV and 90-110 skew by maturity |
heston mc | --n-paths, --strike, five params | European MC vs FFT, Asian price, standard errors |
heston dashboard | — | Launches streamlit run src/heston_vol/app.py |
%==========%
XIII. Test Suite:
All 36 tests are offline and deterministic (seed 42), built around an equity-index-like parameter set (\(\kappa = 2\), \(\theta = 0.04\), \(\sigma = 0.3\), \(\rho = -0.7\), \(v_0 = 0.04\)). Characteristic-function tests assert the two defining identities — \(\varphi(0) = 1\), \(\varphi(-i) = S_0 e^{rT}\) — and \(|\varphi(u)| \leq 1\). The FFT pricer is validated against the independent Gil-Pelaez quadrature pricer across a strike-maturity grid, against Black-Scholes in the vanishing vol-of-vol limit, and against the no-arbitrage constraints: prices within \([\,(S_0 - Ke^{-rT})^+, S_0]\), decreasing and convex in strike. Surface tests pin the economics — \(\rho = -0.7\) produces a positive 90–110 skew, \(\rho = 0\) a symmetric smile, and higher vol-of-vol strictly more curvature. Calibration tests recover \(v_0\), \(\rho\), and \(\sqrt{\theta}\) from a synthetic surface generated by known parameters (starting the optimiser from a deliberately wrong guess), keep RMSE under \(10^{-3} S_0\), and confirm the skew direction survives bid-ask-scale noise. Monte Carlo tests check the discounted-spot martingale property within four standard errors, CIR mean reversion of terminal variance to \(\theta\), European MC agreement with the FFT, and the Asian-below-European ordering.
# test_pricing.py — the cross-validation that caught two real bugs
def test_fft_matches_quad(self, true_params, market, strike_grid):
for T in [0.25, 1.0, 2.0]:
for K in strike_grid[::2]:
c_fft = heston_call(market["S0"], float(K), market["r"], T, true_params)
c_quad = heston_call_quad(market["S0"], float(K), market["r"], T, true_params)
assert c_fft == pytest.approx(c_quad, rel=2e-4, abs=2e-4)
def test_bs_limit_when_vol_of_vol_vanishes(self, market):
params = HestonParams(kappa=2.0, theta=0.04, sigma=1e-4, rho=0.0, v0=0.04)
for K in [80.0, 100.0, 120.0]:
c_h = heston_call(market["S0"], K, market["r"], 1.0, params)
c_bs = bs_price(market["S0"], K, market["r"], 1.0, 0.2)
assert c_h == pytest.approx(c_bs, rel=1e-4)