Black-Litterman — Portfolio Construction with Equilibrium Priors and Bayesian Views

A portfolio construction engine built around the two notorious failure modes of classical mean-variance optimisation: extreme sensitivity to expected-return estimates and corner-solution weights that short or zero out most of the universe. Black and Litterman (1992) fix the inputs rather than the optimiser. The prior comes from reverse optimisation — the expected returns \(\pi = \delta \Sigma w_{mkt}\) that would make the observed market-cap portfolio optimal — and investor views enter as a normal distribution over linear combinations of assets, so conviction is a first-class input rather than an afterthought. The module implements the equilibrium prior with risk aversion estimated from the market's own Sharpe ratio, the pick-matrix view machinery with both proportional-to-prior and Idzorek (2005) confidence-based view uncertainty, the posterior master formula in a form that remains valid at 100% confidence, constrained mean-variance optimisation with cvxpy (plus closed-form unconstrained fallbacks), and a rolling out-of-sample backtest comparing 1/N, plain mean-variance, and Black-Litterman at weak and strong conviction. Daily adjusted prices come from Yahoo Finance for a ten-ETF multi-asset universe, fund AUM proxies build the market portfolio, and the 3-month T-bill from FRED strips the risk-free rate. Built on Python 3.11+ using numpy, scipy, cvxpy, pandas, yfinance, plotly, streamlit, duckdb, and pydantic v2; packaged with hatchling and tested with pytest against the published He-Litterman (1999) seven-country results.


%==========%


I. Interactive Dashboard:

The dashboard below runs entirely in the browser via stlite (Streamlit on WebAssembly — no server required). Pick an outperformer and an underperformer, state the outperformance and your Idzorek confidence, and watch the Black-Litterman weights tilt away from the market portfolio in real time — only along the view, never elsewhere. The four tabs show BL versus market weights with the active tilt isolated, the prior-versus-posterior return scatter, the corner-solution pathology of plain mean-variance next to BL's stable allocation, and the \(\Omega\)/\(\tau\) sensitivity charts including the tau-cancellation result. First load downloads Pyodide and may take 20–40 seconds; subsequent loads are cached.


%==========%


II. Project Layout:

black-litterman/
├── pyproject.toml                              # Build config, deps, ruff + pytest settings
├── .env.example                                # DB_PATH
├── data/                                       # Populated by scripts/download_data.py (git-ignored)
│   └── black_litterman.duckdb                  # DuckDB: prices + market_caps + riskfree tables
├── scripts/
│   └── download_data.py                        # yfinance prices/AUM + FRED → DuckDB
├── src/black_litterman/
│   ├── data/
│   │   ├── schemas.py                          # Pydantic v2: PriceRecord, MarketCapRecord, RateRecord
│   │   ├── fetchers.py                         # yfinance adjusted closes + AUM, FRED 3m T-bill
│   │   └── store.py                            # DuckDB init, upsert, read for all three tables
│   ├── model/
│   │   ├── equilibrium.py                      # pi = delta·Sigma·w_mkt, delta estimation, inverse map
│   │   ├── views.py                            # Pick matrix P, view vector Q, proportional + Idzorek Omega
│   │   ├── posterior.py                        # Master formula (Omega-robust form, valid at Omega = 0)
│   │   ├── optimizer.py                        # cvxpy max-Sharpe/min-var/target-return + closed forms
│   │   └── compare.py                          # Weight comparison, tau grid, confidence interpolation
│   ├── backtest/
│   │   ├── engine.py                           # Rolling walk-forward: 1/N, plain MV, BL weak/strong
│   │   └── metrics.py                          # Sharpe, drawdown, one-way turnover
│   ├── report/
│   │   └── plots.py                            # Plotly: weights, prior/posterior, tau, backtest
│   ├── cli.py                                  # Typer CLI: fetch | equilibrium | posterior | compare | backtest | dashboard
│   └── app.py                                  # Streamlit: 4 tabs (Weights, Prior vs Posterior, Sensitivity, Backtest)
└── tests/
    ├── conftest.py                             # He-Litterman 1999 seven-country dataset
    ├── test_equilibrium.py                     # Published-pi reproduction, round trip, delta estimation
    ├── test_posterior.py                       # Zero views, Omega → 0 limit, uncertainty shrinkage, span property
    ├── test_views.py                           # P/Q construction, proportional Omega, Idzorek calibration
    ├── test_optimizer.py                       # Closed-form vs cvxpy, KKT conditions, market recovery
    └── test_backtest.py                        # Seed-42 determinism, turnover ordering, tau invariance
  

%==========%


III. Data Sources:

Daily adjusted closes for a ten-ETF multi-asset universe (SPY, QQQ, IWM, EFA, EEM, TLT, IEF, LQD, GLD, VNQ) come from Yahoo Finance with dividends and splits folded in — total-return series, not price series, since the equilibrium argument concerns total wealth. The market portfolio is built from fund AUM (totalAssets), the standard market-cap proxy when the universe is funds rather than single names. The 3-month T-bill from FRED converts raw returns into the excess returns the model is stated in. Everything persists to DuckDB; with no database present, every command falls back to a deterministic synthetic eight-asset market (seed 42) so the full pipeline runs offline.


# fetchers.py
from datetime import date

import pandas as pd
import yfinance as yf


def fetch_market_caps(tickers: list[str]) -> pd.DataFrame:
    """Market-portfolio weights input: fund AUM (ETFs) or market cap (stocks)."""
    today = date.today()  # snapshot date stamped on every row
    rows = []
    for ticker in tickers:
        info = yf.Ticker(ticker).info
        # ETFs report totalAssets; single stocks report marketCap
        cap = info.get("totalAssets") or info.get("marketCap")
        if cap is None:
            raise ValueError(f"no AUM or market cap available for {ticker}")
        rows.append({"ticker": ticker, "date": today, "market_cap": float(cap)})
    return pd.DataFrame(rows)
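
As a rough sketch of the two conversions described above, the snippet below turns AUM figures into market weights and strips a risk-free rate from daily returns. The AUM numbers, the returns, and the 252-day de-annualisation are illustrative assumptions, not values or conventions taken from the project.

```python
import numpy as np

# Hypothetical AUM figures in USD billions; illustrative only, not fetched data.
aum = {"SPY": 500.0, "TLT": 50.0, "GLD": 60.0}

tickers = list(aum)
caps = np.array([aum[t] for t in tickers])
w_mkt = caps / caps.sum()                 # market weights, summing to 1

# Assumption: the T-bill yield is quoted as an annualised percentage.
tbill_pct = 5.0
rf_daily = tbill_pct / 100.0 / 252.0      # simple de-annualisation over 252 trading days

raw_returns = np.array([0.0010, -0.0020, 0.0005])   # made-up daily total returns
excess_returns = raw_returns - rf_daily

print(dict(zip(tickers, np.round(w_mkt, 3))))
print(excess_returns)
```

The same normalisation applies whether the cap column holds fund AUM or single-stock market caps, which is why the fetcher can fall back from one to the other.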
  

%==========%


IV. The Disease — Mean-Variance as an Error Maximiser:

Markowitz optimisation is exactly right given its inputs and catastrophically wrong given estimated ones. A sample mean carries standard error \(\sigma/\sqrt{T}\), with \(T\) measured in years when \(\sigma\) is annualised (sampling more often does not help). For a 16%-vol asset and five years of daily data that is roughly 7% annualised, which is larger than the cross-sectional spread of plausible true expected returns. The optimiser cannot tell estimation noise from signal, and because it is a maximiser it loads hardest exactly where the noise is most flattering: Michaud (1989) called mean-variance optimisers "estimation-error maximisers". Two symptoms follow. Corner solutions: long-only optimisation on historical means typically holds two or three of ten assets and zeroes the rest; unconstrained optimisation takes wild long-short bets on near-collinear pairs. Input hypersensitivity: perturbing one expected return by 50 basis points can swing weights by tens of percentage points, because the inverse covariance matrix amplifies differences between correlated assets. Practitioners historically responded with ad hoc weight bounds, which merely move the corner solutions to the bounds. Black-Litterman's diagnosis is that the disease lives in the expected-return vector, and its cure is to supply a disciplined one: an equilibrium prior updated by explicitly uncertain views.

| Symptom | Mechanism | BL treatment |
| --- | --- | --- |
| Corner solutions | Optimiser maximises into estimation noise | Prior anchored at the diversified market portfolio |
| Hypersensitivity to inputs | \(\Sigma^{-1}\) amplifies return differences of correlated assets | Views move the posterior smoothly, weighted by conviction |
| Unintuitive weights | No connection between stated opinions and positions | Weights deviate from market only along the stated views |
| Ad hoc constraints | Bounds imposed to hide the symptoms | Stability emerges from the inputs; constraints become optional |
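
The two numbers quoted in this section can be checked with a few lines of arithmetic. The sketch below computes the standard error of the mean and the weight swing caused by a 50 basis point bump for two highly correlated assets; the 0.95 correlation, the 5% expected returns, and the risk aversion of 2.5 are illustrative assumptions.

```python
import math
import numpy as np

# Standard error of an annualised sample mean: sigma / sqrt(years of data).
vol, years = 0.16, 5
std_err = vol / math.sqrt(years)          # about 0.072, i.e. roughly 7% a year

# Two assets with 16% vol and 0.95 correlation (illustrative numbers).
rho = 0.95
sigma = vol**2 * np.array([[1.0, rho], [rho, 1.0]])
delta = 2.5                               # assumed risk aversion
mu = np.array([0.05, 0.05])

def mv_weights(expected):
    """Unconstrained mean-variance weights w = (delta * Sigma)^-1 mu."""
    return np.linalg.solve(delta * sigma, expected)

bump = np.array([0.005, 0.0])             # +50 bp on the first asset only
swing = mv_weights(mu + bump) - mv_weights(mu)

print(f"std error of mean: {std_err:.3f}")
print("weight swing from a 50 bp bump:", np.round(swing, 2))
```

Under these assumptions the bumped asset gains roughly 80 percentage points of weight and its near-twin loses almost as much, which is the amplification through \(\Sigma^{-1}\) that the table attributes to correlated assets.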

%==========%


V. The Equilibrium Prior — Reverse Optimisation (model/equilibrium.py):

Instead of asking "what returns do I forecast?", Black-Litterman first asks "what returns would justify the portfolio everyone already holds?". A mean-variance investor with risk aversion \(\delta\) maximises \(w^{\top}\mu - \tfrac{\delta}{2} w^{\top}\Sigma w\), whose first-order condition is \(\mu = \delta \Sigma w\). Evaluating at the market-cap weights gives the equilibrium expected excess returns:

\[ \pi = \delta\, \Sigma\, w_{mkt}, \qquad \delta = \frac{\mathbb{E}[r_{mkt}] - r_f}{\sigma_{mkt}^2} \]

with \(\delta\) estimated as the market's excess return over its variance — the market Sharpe ratio divided by the market volatility. This is the CAPM in portfolio-weight clothing: \(\pi_i = \delta\,\mathrm{cov}(r_i, r_{mkt})\), i.e. each asset's prior return is proportional to its beta. Three properties make \(\pi\) the right anchor. It is guaranteed consistent with a diversified portfolio — feed \(\pi\) back through the optimiser and you recover \(w_{mkt}\) exactly, a round trip the test suite asserts to machine precision. It embeds the covariance structure — high-beta assets get high priors, so the optimiser sees no free lunch to exploit. And it is opinion-free — the investor who knows nothing holds the market, which is exactly what equilibrium reasoning says they should.


# equilibrium.py
import numpy as np


def implied_equilibrium_returns(delta, sigma, w_mkt):
    """pi = delta * Sigma * w_mkt — excess returns that make w_mkt optimal."""
    return delta * sigma @ w_mkt


def implied_market_weights(delta, sigma, mu):
    """Inverse map: w* = (delta Sigma)^-1 mu. Round trip recovers w_mkt exactly."""
    # solve the linear system rather than forming the explicit inverse
    return np.linalg.solve(delta * sigma, mu)
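
A small worked example may make the reverse-optimisation round trip concrete. The three-asset covariance, market weights, and 5% market excess return below are made-up inputs; the sketch estimates \(\delta\) from them, builds \(\pi\), and checks both the CAPM reading and the inverse map.

```python
import numpy as np

# Illustrative three-asset market; all numbers are made up for the example.
vols = np.array([0.16, 0.20, 0.08])
corr = np.array([[1.0, 0.8, 0.1],
                 [0.8, 1.0, 0.0],
                 [0.1, 0.0, 1.0]])
sigma = np.outer(vols, vols) * corr
w_mkt = np.array([0.5, 0.2, 0.3])

# delta = market excess return / market variance = Sharpe ratio / volatility.
mkt_var = w_mkt @ sigma @ w_mkt
mkt_excess = 0.05                            # assumed market excess return
delta = mkt_excess / mkt_var

pi = delta * sigma @ w_mkt                   # equilibrium excess returns
beta = (sigma @ w_mkt) / mkt_var             # CAPM betas against the market
w_back = np.linalg.solve(delta * sigma, pi)  # inverse map

assert np.allclose(pi, beta * mkt_excess)    # pi_i = beta_i * market excess return
assert np.allclose(w_back, w_mkt)            # round trip recovers w_mkt
print(np.round(pi, 4), round(delta, 2))
```

By construction the market portfolio's own prior return \(w_{mkt}^{\top}\pi\) equals the assumed market excess return, which is a quick sanity check on any estimated \(\delta\).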
  

%==========%


VI. Encoding Views — P, Q, and \(\Omega\) (model/views.py):

A view is a distribution, not a number: "\(P_k^{\top} r = Q_k + \varepsilon_k\), \(\varepsilon_k \sim N(0, \omega_k)\)" — a normal belief about a linear combination of asset returns. The pick matrix \(P\) (one row per view, one column per asset) encodes which assets the view touches: an absolute view ("gold earns 6% excess") has a single 1; a relative view ("QQQ beats TLT by 3%") has +1 and −1; a basket view weights each leg by market cap within the leg, the He-Litterman convention. The view vector \(Q\) stacks the expected values; the diagonal matrix \(\Omega\) stacks the uncertainties — and choosing it is where most implementations go wrong, because \(\Omega\) has no natural units a user can supply directly. The module offers two constructions: proportional-to-prior, \(\Omega = \mathrm{diag}(\tau\, P \Sigma P^{\top})\), which makes each view exactly as credible as the prior on that combination and has the convenient side effect that \(\tau\) cancels from the posterior mean; and the Idzorek confidence-percentage method of section IX, which lets the user state conviction as a number between 0 and 1.


# views.py
def relative_view(long, short, q, w_mkt=None, confidence=0.5) -> View:
    """Long basket outperforms short basket by q; legs cap-weighted if w_mkt given."""
    picks = {}
    for leg, sign in ((long, 1.0), (short, -1.0)):
        if w_mkt is not None:
            total = sum(w_mkt[a] for a in leg)
            for a in leg:
                picks[a] = picks.get(a, 0.0) + sign * w_mkt[a] / total
        ...
    return View(assets=picks, q=q, confidence=confidence)

def omega_proportional(p, sigma, tau):
    """Omega = diag(tau * P Sigma P') — view variance proportional to prior variance."""
    return np.diag(np.diag(tau * p @ sigma @ p.T))
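
To illustrate the shapes involved, the sketch below assembles \(P\), \(Q\), and the proportional \(\Omega\) for the two example views quoted above. The helper `pick_row`, the four-asset universe, and the diagonal covariance are hypothetical and chosen only for readability; they are not the module's API.

```python
import numpy as np

names = ["SPY", "QQQ", "TLT", "GLD"]
idx = {n: i for i, n in enumerate(names)}

def pick_row(picks):
    """Turn {asset: coefficient} into one row of P (hypothetical helper)."""
    row = np.zeros(len(names))
    for asset, coef in picks.items():
        row[idx[asset]] = coef
    return row

# View 1 (absolute): gold earns 6% excess. View 2 (relative): QQQ beats TLT by 3%.
p = np.vstack([pick_row({"GLD": 1.0}),
               pick_row({"QQQ": 1.0, "TLT": -1.0})])
q = np.array([0.06, 0.03])

# Illustrative covariance: made-up vols, zero correlation for readability.
sigma = np.diag(np.array([0.16, 0.22, 0.14, 0.15]) ** 2)
tau = 0.05
omega = np.diag(np.diag(tau * p @ sigma @ p.T))   # proportional-to-prior

print(p)
print(np.diag(omega))
```

The relative view's row sums to zero, so it says nothing about the level of returns, only about the spread between its two legs.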
  

%==========%


VII. The Master Formula — a Conjugate-Normal Derivation (model/posterior.py):

The model is a textbook Bayesian update in disguise. Treat the unknown mean return vector \(\mu\) as the parameter: the prior says \(\mu \sim N(\pi, \tau\Sigma)\) (equilibrium, held loosely — \(\tau\) scales how loosely), and the views are noisy observations of linear functions of the parameter, \(Q \mid \mu \sim N(P\mu, \Omega)\). Normal prior, normal likelihood, linear map — the posterior is normal with precision-weighted mean:

\[ \mu_{BL} = \left[(\tau\Sigma)^{-1} + P^{\top}\Omega^{-1}P\right]^{-1}\left[(\tau\Sigma)^{-1}\pi + P^{\top}\Omega^{-1}Q\right] \] \[ M = \left[(\tau\Sigma)^{-1} + P^{\top}\Omega^{-1}P\right]^{-1}, \qquad \Sigma_{post} = \Sigma + M \]

Prior precision \((\tau\Sigma)^{-1}\) plus view precision \(P^{\top}\Omega^{-1}P\), each multiplied by what it believes — conviction is literally the weight in a weighted average. \(M\) is the remaining uncertainty about the mean; since realised returns are the uncertain mean plus market noise, the covariance relevant for optimisation is \(\Sigma + M\) (the "posterior predictive" covariance — a point Meucci stresses and many implementations silently drop). The implementation uses the algebraically equivalent form

\[ \mu_{BL} = \pi + \tau\Sigma P^{\top}\left(P\tau\Sigma P^{\top} + \Omega\right)^{-1}(Q - P\pi) \]

which never inverts \(\Omega\) and therefore survives the 100%-confidence limit \(\Omega \to 0\), where the posterior satisfies \(P\mu_{BL} = Q\) exactly — an identity the test suite checks to \(10^{-12}\). Read as prior plus gain times innovation, this is precisely a Kalman filter measurement update with state \(\mu\), and \(Q - P\pi\) — how much the view disagrees with equilibrium — is the innovation.


# posterior.py
import numpy as np


def bl_posterior(sigma, pi, p, q, omega, tau):
    """Return (mu_BL, Sigma + M) via the gain form, which never inverts Omega."""
    ts = tau * sigma
    if p is None or np.size(p) == 0:           # no views: posterior = prior
        return pi.copy(), sigma + ts

    pts = p @ ts                               # K x N
    a = pts @ p.T + omega                      # K x K; SPD at Omega = 0 if P has full row rank
    a_inv = np.linalg.solve(a, np.eye(a.shape[0]))

    mu_bl = pi + pts.T @ a_inv @ (q - p @ pi)  # prior + gain * innovation
    m = ts - pts.T @ a_inv @ pts               # remaining uncertainty about the mean
    return mu_bl, sigma + m
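
The claimed equivalence between the precision form and the gain form is a matrix-inversion-lemma identity, and it is easy to confirm numerically. The sketch below does so on a random positive-definite covariance; the dimensions, seed, and scales are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k, tau = 5, 2, 0.05

a = rng.normal(size=(n, n))
sigma = a @ a.T / n + 0.01 * np.eye(n)      # random SPD covariance
pi = rng.normal(scale=0.05, size=n)
p = rng.normal(size=(k, n))
q = rng.normal(scale=0.05, size=k)
omega = np.diag(np.diag(tau * p @ sigma @ p.T))
ts = tau * sigma

# Precision form: inverts both tau*Sigma and Omega.
prec = np.linalg.inv(ts) + p.T @ np.linalg.inv(omega) @ p
m_prec = np.linalg.inv(prec)
mu_prec = m_prec @ (np.linalg.solve(ts, pi) + p.T @ np.linalg.solve(omega, q))

# Gain form: only solves against the K x K matrix P tau Sigma P' + Omega.
s = p @ ts @ p.T + omega
mu_gain = pi + ts @ p.T @ np.linalg.solve(s, q - p @ pi)
m_gain = ts - ts @ p.T @ np.linalg.solve(s, p @ ts)

# Omega = 0 is fine in the gain form and pins the views exactly.
mu_hard = pi + ts @ p.T @ np.linalg.solve(p @ ts @ p.T, q - p @ pi)

assert np.allclose(mu_prec, mu_gain)
assert np.allclose(m_prec, m_gain)
assert np.allclose(p @ mu_hard, q)
```

The precision form would fail at the last step because it needs \(\Omega^{-1}\), which is the practical reason for preferring the gain form.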
  

%==========%


VIII. The Role of \(\tau\) — the Most Argued-About Scalar in Finance:

\(\tau\) scales the prior covariance: the prior on the mean is \(N(\pi, \tau\Sigma)\), so \(\tau\) answers "how uncertain am I about equilibrium itself?". The classical reading sets \(\tau \approx 1/T\) — the mean of \(T\) observations is estimated with covariance \(\Sigma/T\) — giving values like 0.02–0.05; Black and Litterman used 0.025–0.05, others use 1, and the literature has spent three decades arguing. The module's position, demonstrated rather than asserted: the argument mostly doesn't matter, because \(\tau\) and \(\Omega\) only enter through their ratio. When \(\Omega\) is chosen proportional to the prior, \(\Omega = \tau\,\mathrm{diag}(P\Sigma P^{\top})\), the \(\tau\)'s cancel in the posterior-mean gain \(\tau\Sigma P^{\top}(\tau P\Sigma P^{\top} + \tau\,D)^{-1}\), and the BL weights are identical across \(\tau \in [0.01, 1]\) — the test suite pins the spread below \(10^{-10}\), and the dashboard's tau-grid chart shows eight perfectly flat lines. What \(\tau\) does still move is the posterior covariance \(\Sigma + M\) (since \(M \approx \tau\Sigma\) away from the views), hence the implied leverage of unconstrained portfolios. The practical advice falls out: fix \(\tau\) at something defensible like \(1/T\), spend your modelling effort on \(\Omega\), and treat any result that is sensitive to \(\tau\) as a sign the \(\Omega\) convention is inconsistent.
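
The cancellation can be seen directly in a few lines. The sketch below, on a made-up three-asset market with a single relative view, recomputes the posterior over a grid of \(\tau\) values with \(\Omega = \tau\,\mathrm{diag}(P\Sigma P^{\top})\); the local `bl_posterior` is a compact stand-in for the module's function, not a copy of it.

```python
import numpy as np

def bl_posterior(sigma, pi, p, q, omega, tau):
    """Gain-form posterior mean and predictive covariance (sketch)."""
    ts = tau * sigma
    s = p @ ts @ p.T + omega
    mu = pi + ts @ p.T @ np.linalg.solve(s, q - p @ pi)
    m = ts - ts @ p.T @ np.linalg.solve(s, p @ ts)
    return mu, sigma + m

# Illustrative inputs; all numbers are made up.
vols = np.array([0.16, 0.20, 0.08])
corr = np.array([[1.0, 0.8, 0.1], [0.8, 1.0, 0.0], [0.1, 0.0, 1.0]])
sigma = np.outer(vols, vols) * corr
pi = 2.5 * sigma @ np.array([0.5, 0.2, 0.3])
p = np.array([[1.0, -1.0, 0.0]])            # asset 0 beats asset 1 ...
q = np.array([0.02])                        # ... by 2%

means, covs = [], []
for tau in (0.01, 0.05, 0.25, 1.0):
    omega = tau * np.diag(np.diag(p @ sigma @ p.T))   # proportional convention
    mu, cov = bl_posterior(sigma, pi, p, q, omega, tau)
    means.append(mu)
    covs.append(cov)

mean_spread = np.ptp(np.array(means), axis=0).max()   # numerically zero: tau cancels
cov_spread = np.ptp(np.array(covs), axis=0).max()     # clearly positive
print(mean_spread, cov_spread)
```

Anything derived from the posterior mean alone inherits the invariance, while anything that uses \(\Sigma + M\) still depends on \(\tau\), consistent with the leverage remark above.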


%==========%


IX. Idzorek's Confidence Percentages (model/views.py):

"My view has variance 0.000841" is not a sentence an investment committee will ever say; "I'm 60% confident" is. Idzorek (2005) bridges the gap by defining confidence operationally in weight space. For each view in isolation: compute the unconstrained BL weights at 100% confidence (\(\omega_k = 0\)), giving the full tilt \(w_{100} - w_{mkt}\); define the target as the market portfolio plus the stated fraction of that tilt, \(w_{target} = w_{mkt} + c_k\,(w_{100} - w_{mkt})\); then solve numerically for the scalar \(\omega_k\) whose posterior weights land closest to the target. The module solves the inner problem by bounded scalar minimisation over \(\log\omega_k\) (the objective is smooth and unimodal in the log). The definition is self-calibrating in exactly the way users expect: 50% confidence produces almost exactly half the tilt (the test suite asserts the ratio to 2%), the tilt is monotone in confidence, and \(c_k = 1\) reproduces the hard-constraint limit \(P\mu = Q\). On the dashboard, the confidence slider is this \(c_k\).


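# views.py
import numpy as np
from scipy.optimize import minimize_scalar

from black_litterman.model.equilibrium import implied_market_weights
from black_litterman.model.posterior import bl_posterior

# Assumed search range for log(omega_k): from near-certain to near-ignored views
bracket = (-30.0, 5.0)


def omega_idzorek(p, q, sigma, pi, w_mkt, delta, tau, confidences):
    """Calibrate each view's variance so its tilt is the stated fraction of the full tilt."""
    omegas = np.zeros(len(q))
    for k in range(len(q)):
        pk, qk = p[k:k+1], q[k:k+1]            # view k in isolation
        # full tilt: weights at 100% confidence (omega_k = 0)
        mu_100, _ = bl_posterior(sigma, pi, pk, qk, np.zeros((1, 1)), tau)
        w_100 = implied_market_weights(delta, sigma, mu_100)
        w_target = w_mkt + confidences[k] * (w_100 - w_mkt)

        def depart(log_omega):
            """Squared distance between posterior weights and the target."""
            mu, _ = bl_posterior(sigma, pi, pk, qk,
                                 np.array([[np.exp(log_omega)]]), tau)
            w = implied_market_weights(delta, sigma, mu)
            return float(np.sum((w - w_target) ** 2))

        res = minimize_scalar(depart, bounds=bracket, method="bounded")
        omegas[k] = float(np.exp(res.x))
    return np.diag(omegas)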
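
For a single view, and assuming weights are the unconstrained \((\delta\Sigma)^{-1}\mu\) with \(\Sigma\) held fixed as in the snippet above, the gain form implies that the tilt scales by \(\tau v/(\tau v + \omega_k)\) with \(v = P_k\Sigma P_k^{\top}\). Setting that ratio equal to \(c_k\) gives \(\omega_k = \tau v\,(1 - c_k)/c_k\). The sketch below uses this derived closed form as a cross-check on a made-up market; it is not presented as what the module does, which is the numerical search.

```python
import numpy as np

def posterior_mean(sigma, pi, p, q, omega, tau):
    """Gain-form posterior mean (sketch)."""
    ts = tau * sigma
    s = p @ ts @ p.T + omega
    return pi + ts @ p.T @ np.linalg.solve(s, q - p @ pi)

# Illustrative inputs; all numbers are made up.
vols = np.array([0.16, 0.20, 0.08])
corr = np.array([[1.0, 0.8, 0.1], [0.8, 1.0, 0.0], [0.1, 0.0, 1.0]])
sigma = np.outer(vols, vols) * corr
w_mkt = np.array([0.5, 0.2, 0.3])
delta, tau = 2.5, 0.05
pi = delta * sigma @ w_mkt
p = np.array([[1.0, -1.0, 0.0]])
q = np.array([0.02])

def weights(mu):
    """Unconstrained weights implied by a return vector."""
    return np.linalg.solve(delta * sigma, mu)

conf = 0.6
view_var = (p @ sigma @ p.T).item()
omega_closed = tau * view_var * (1.0 - conf) / conf   # derived single-view closed form

full_tilt = weights(posterior_mean(sigma, pi, p, q, np.zeros((1, 1)), tau)) - w_mkt
tilt = weights(posterior_mean(sigma, pi, p, q, np.array([[omega_closed]]), tau)) - w_mkt
tilt_ratio = (tilt @ full_tilt) / (full_tilt @ full_tilt)
print(round(tilt_ratio, 6))
```

Under these assumptions the tilt ratio equals the stated confidence up to floating-point error, so any larger gap in a numerical calibration would point at solver tolerance or search bounds rather than at the definition.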
  

%==========%


X. Why the Weights Come Out Stable — the He-Litterman Property:

The deepest result in He and Litterman (1999) explains structurally why BL portfolios behave. For the unconstrained investor, the optimal weights under the posterior decompose as

\[ w^{*} = \frac{w_{mkt}}{1+\tau} + P^{\top}\Lambda \]

that is, the (slightly shrunk) market portfolio plus one tilt per view, along that view's own pick portfolio, with \(\Lambda\) a \(K\)-vector of tilt sizes determined by conviction and disagreement with equilibrium. Assets not named in any view keep their shrunk market weight \(w_{mkt,i}/(1+\tau)\) exactly, regardless of how correlated they are with the view assets: the correlation effects are already absorbed into \(\pi\), so they do not leak into spurious positions. This is the property that kills both MV pathologies at once. No corner solutions, because the baseline is the fully diversified market portfolio and deviations are deliberate, bounded bets; no hypersensitivity, because a small change in a view moves only its own \(\Lambda_k\), smoothly, rather than being amplified through \(\Sigma^{-1}\) into every weight. The test suite verifies the decomposition directly (projecting \(w^{*} - w_{mkt}/(1+\tau)\) onto the span of \(P^{\top}\) leaves a residual below \(10^{-10}\)), and the dashboard's tilt chart makes it visible: move any slider and only the view assets move.
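
The decomposition can be reproduced in a few lines. The sketch below uses a made-up four-asset market with one view on the first two assets, computes the unconstrained optimum under the posterior, and projects the deviation from \(w_{mkt}/(1+\tau)\) onto the span of \(P^{\top}\); all inputs are illustrative assumptions.

```python
import numpy as np

# Illustrative four-asset market with uniform 0.3 correlation.
vols = np.array([0.16, 0.20, 0.08, 0.15])
corr = 0.3 * np.ones((4, 4)) + 0.7 * np.eye(4)
sigma = np.outer(vols, vols) * corr
w_mkt = np.array([0.4, 0.3, 0.2, 0.1])
delta, tau = 2.5, 0.05
pi = delta * sigma @ w_mkt

p = np.array([[1.0, -1.0, 0.0, 0.0]])      # one view, touching assets 0 and 1 only
q = np.array([0.03])
omega = np.diag(np.diag(tau * p @ sigma @ p.T))

# Gain-form posterior mean and predictive covariance Sigma + M.
ts = tau * sigma
s = p @ ts @ p.T + omega
mu_bl = pi + ts @ p.T @ np.linalg.solve(s, q - p @ pi)
sigma_post = sigma + ts - ts @ p.T @ np.linalg.solve(s, p @ ts)

w_star = np.linalg.solve(delta * sigma_post, mu_bl)   # unconstrained optimum
dev = w_star - w_mkt / (1.0 + tau)

# Project the deviation onto span(P'); the residual should vanish.
lam, *_ = np.linalg.lstsq(p.T, dev, rcond=None)
residual = np.linalg.norm(dev - p.T @ lam)

print(np.round(w_star, 4), residual)
```

The two assets outside the view end at \(w_{mkt,i}/(1+\tau)\) even though they are correlated with the view assets, which is the behaviour the tilt chart displays.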


%==========%


XI. Weight Comparison and Rolling Backtest (backtest/engine.py):

The point-in-time comparison feeds the same long-only max-Sharpe optimiser four different return estimates over the synthetic eight-asset market. Plain MV on one year of sample means holds 3 of 8 assets with 53% in the single luckiest one; BL with a weak view holds all eight within a few points of market weights; BL with a strong view takes a visible but diversified bet. The rolling backtest then walks forward (252-day estimation window, 21-day rebalance), regenerating a mechanical momentum view each period — deliberately mediocre signal, because the interesting question is not whether the view adds alpha but how each construction degrades when views are noisy or wrong. Plain MV churns: its trailing-mean estimates swing every window, producing the highest turnover of the four strategies and a Sharpe no better than 1/N. BL-weak hugs the market with a fraction of the turnover — the equilibrium anchor acts as a no-trade region — and BL-strong sits between, paying for conviction in turnover but degrading gracefully rather than catastrophically when its view is wrong, because even a 100%-confidence view only moves the portfolio along one pick vector. The turnover ordering (BL-weak < BL-strong < plain MV) is asserted in the test suite on the seed-42 market.


# engine.py — one rebalance step
est = returns.iloc[t - window : t]
sigma = est.cov().values * 252
pi = implied_equilibrium_returns(delta, sigma, w_mkt)
p, q = build_pq([_momentum_view(est)], names)        # trailing best vs worst

weights["mv_historical"] = max_sharpe(est.mean().values * 252, sigma, long_only=True)
for label, scale in (("bl_weak", 10.0), ("bl_strong", 0.1)):
    omega = scale * omega_proportional(p, sigma, tau)
    mu_bl, sigma_bl = bl_posterior(sigma, pi, p, q, omega, tau)
    weights[label] = max_sharpe(mu_bl, sigma_bl, long_only=True)
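
The three statistics reported by the backtest have short standard definitions. The functions below are a sketch under common conventions (half the sum of absolute weight changes for one-way turnover, 252 periods per year, drawdown measured on the compounded wealth curve); the function names and conventions are assumptions, not necessarily those used in backtest/metrics.py.

```python
import numpy as np

def one_way_turnover(w_old, w_new):
    """Half the sum of absolute weight changes at one rebalance."""
    return 0.5 * float(np.abs(np.asarray(w_new) - np.asarray(w_old)).sum())

def annualised_sharpe(excess_returns, periods_per_year=252):
    """Mean over standard deviation of per-period excess returns, annualised."""
    r = np.asarray(excess_returns, dtype=float)
    return float(r.mean() / r.std(ddof=1) * np.sqrt(periods_per_year))

def max_drawdown(returns):
    """Largest peak-to-trough loss of the compounded wealth curve (a negative number)."""
    growth = np.cumprod(1.0 + np.asarray(returns, dtype=float))
    wealth = np.concatenate([[1.0], growth])      # include the starting wealth
    peak = np.maximum.accumulate(wealth)
    return float((wealth / peak - 1.0).min())

# Moving 10% of the book from one asset to another is 10% one-way turnover.
turnover = one_way_turnover([0.5, 0.3, 0.2], [0.4, 0.4, 0.2])
drawdown = max_drawdown([0.10, -0.20, 0.05])
print(turnover, drawdown)
```

Counting turnover one-way means a complete switch from one portfolio to a disjoint one scores 100%, not 200%.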
  

%==========%


XII. Constrained Optimisation and the CLI (model/optimizer.py, cli.py):

Constrained problems (long-only, fully invested, target return) go through cvxpy: minimum variance and target return are direct quadratic programs, and long-only max-Sharpe uses the standard homogenisation, which minimises \(y^{\top}\Sigma y\) subject to \(\mu^{\top}y = 1\), \(y \geq 0\), then sets \(w = y/\mathbf{1}^{\top}y\). Every unconstrained problem also has a closed form (\(w \propto \Sigma^{-1}\mu\)), so the core BL results never strictly require a convex solver; the stlite dashboard runs the same mathematics in the browser where cvxpy cannot follow. Six CLI subcommands cover the pipeline; with no database present every command runs on the synthetic seed-42 market, so the whole walkthrough below works offline. Note that bl posterior prints prior, posterior, and weights side by side, which is the fastest way to sanity-check a view before trusting it.


# Install
pip install -e ".[dev]"

# Adjusted prices + ETF AUM from Yahoo, 3m T-bill from FRED
bl fetch --tickers SPY,QQQ,IWM,EFA,EEM,TLT,IEF,LQD,GLD,VNQ

# Reverse-optimise market weights into the equilibrium prior pi
bl equilibrium --delta 2.5

# Blend one relative view (Idzorek confidence) into the prior
bl posterior --long QQQ --short TLT --q 0.03 --confidence 0.5

# Equal-weight vs market vs plain MV vs BL weights side by side
bl compare --tau 0.05

# Rolling out-of-sample backtest: Sharpe and turnover per approach
bl backtest --window 252 --rebalance 21

# Launch Streamlit server-side dashboard
bl dashboard
  
| Command | Key options | Output |
| --- | --- | --- |
| bl fetch | --tickers, --start, --db | Prices, AUM/caps, FRED rate to DuckDB |
| bl equilibrium | --delta (0 = estimate from market) | w_mkt, pi, and vol per asset |
| bl posterior | --long, --short, --q, --confidence, --tau | pi vs mu_BL vs weights, Idzorek Omega |
| bl compare | --tau, --delta | 1/N, market, plain MV, BL weak/strong weights |
| bl backtest | --window, --rebalance, --tau | Sharpe, vol, drawdown, turnover per strategy |
| bl dashboard | | Launches streamlit run src/black_litterman/app.py |
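
The max-Sharpe homogenisation described in this section can be sketched without a solver when the sign constraint is slack. Dropping \(y \geq 0\), the problem of minimising \(y^{\top}\Sigma y\) subject to \(\mu^{\top}y = 1\) has the Lagrangian solution \(y = \Sigma^{-1}\mu / (\mu^{\top}\Sigma^{-1}\mu)\). The inputs below are made up and chosen so that the resulting weights happen to be positive; with binding constraints a QP solver such as cvxpy would be needed instead.

```python
import numpy as np

# Illustrative inputs; all numbers are made up.
vols = np.array([0.16, 0.20, 0.08])
corr = np.array([[1.0, 0.8, 0.1], [0.8, 1.0, 0.0], [0.1, 0.0, 1.0]])
sigma = np.outer(vols, vols) * corr
mu = np.array([0.06, 0.08, 0.02])

# Homogenised problem without the sign constraint:
#   minimise y' Sigma y  subject to  mu' y = 1
sinv_mu = np.linalg.solve(sigma, mu)
y = sinv_mu / (mu @ sinv_mu)
w = y / y.sum()                              # rescale to fully invested

def sharpe(weights):
    """Ex-ante Sharpe ratio of a weight vector under (mu, sigma)."""
    return (weights @ mu) / np.sqrt(weights @ sigma @ weights)

# No random long-only, fully invested portfolio should beat the tangency Sharpe ratio.
rng = np.random.default_rng(42)
trials = rng.dirichlet(np.ones(3), size=1000)
best_random = max(sharpe(t) for t in trials)

print(np.round(w, 4), round(sharpe(w), 4), round(best_random, 4))
```

Because all three weights come out positive here, the long-only constraint would not bind, which is the situation in which a constrained solver and the closed form are expected to agree.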

%==========%


XIII. Test Suite and References:

All 46 tests are offline and deterministic (seed 42), anchored to the He-Litterman (1999) seven-country dataset as external ground truth: reverse optimisation reproduces the published equilibrium returns to 0.2%, and the Germany-versus-Europe view with \(\Omega = \tau P\Sigma P^{\top}\) reproduces the published posterior to 0.25%. Limit tests pin the structure of the formula — zero views return the prior to machine precision, \(\Omega \to 0\) enforces \(P\mu_{BL} = Q\) to \(10^{-12}\), enormous \(\Omega\) recovers the prior, and the posterior view return always lies strictly between prior and view. Uncertainty tests assert that \(\mathrm{diag}(M)\) shrinks monotonically as views sharpen. Optimiser tests check the KKT conditions of the closed forms, agreement between cvxpy and the closed form when the long-only constraint is slack, and the round trip that unconstrained max-utility on \((\pi, \Sigma, \delta)\) is exactly the market portfolio. Idzorek tests assert 50% confidence yields half the full tilt within 2% and monotonicity throughout. Backtest tests require bit-identical reruns, fully-invested long-only weights, and the BL-weak < plain-MV turnover ordering.


# test_posterior.py — the external ground truth
def test_posterior_matches_published_table(self, hl, pi):
    """Germany-vs-Europe view with Omega = tau P Sigma P' reproduces Table 4."""
    omega = omega_proportional(hl["p"], hl["sigma"], hl["tau"])
    mu, _ = bl_posterior(hl["sigma"], pi, hl["p"], hl["q"], omega, hl["tau"])
    assert mu == pytest.approx(hl["mu_published"], abs=2.5e-3)

def test_full_confidence_enforces_views_exactly(self, hl, pi):
    """Omega -> 0: the posterior satisfies P mu = Q exactly."""
    mu, _ = bl_posterior(hl["sigma"], pi, hl["p"], hl["q"], np.zeros((1, 1)), hl["tau"])
    assert (hl["p"] @ mu).item() == pytest.approx(float(hl["q"][0]), abs=1e-12)
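
The remaining limit tests described above follow the same pattern. The sketch below shows how they might look on a made-up three-asset market standing in for the He-Litterman fixture; the function names, the local `bl_posterior`, and the grid of \(\Omega\) values are hypothetical.

```python
import numpy as np

def bl_posterior(sigma, pi, p, q, omega, tau):
    """Gain-form posterior: returns (mu_BL, Sigma + M)."""
    ts = tau * sigma
    s = p @ ts @ p.T + omega
    mu = pi + ts @ p.T @ np.linalg.solve(s, q - p @ pi)
    m = ts - ts @ p.T @ np.linalg.solve(s, p @ ts)
    return mu, sigma + m

# Made-up inputs standing in for the published dataset.
vols = np.array([0.16, 0.20, 0.08])
corr = np.array([[1.0, 0.8, 0.1], [0.8, 1.0, 0.0], [0.1, 0.0, 1.0]])
sigma = np.outer(vols, vols) * corr
tau = 0.05
pi = 2.5 * sigma @ np.array([0.5, 0.2, 0.3])
p = np.array([[1.0, -1.0, 0.0]])
q = np.array([0.02])
prior_view = (p @ pi).item()
OMEGA_GRID = (1e-2, 1e-3, 1e-4, 1e-5)        # loosest to sharpest view

def test_huge_omega_recovers_prior():
    mu, _ = bl_posterior(sigma, pi, p, q, np.array([[1e12]]), tau)
    assert np.allclose(mu, pi)

def test_view_return_between_prior_and_view():
    lo, hi = sorted((prior_view, float(q[0])))
    for w in OMEGA_GRID:
        mu, _ = bl_posterior(sigma, pi, p, q, np.array([[w]]), tau)
        assert lo < (p @ mu).item() < hi

def test_mean_uncertainty_shrinks_as_views_sharpen():
    diags = []
    for w in OMEGA_GRID:
        _, cov = bl_posterior(sigma, pi, p, q, np.array([[w]]), tau)
        diags.append(np.diag(cov - sigma))   # diag(M)
    for looser, tighter in zip(diags, diags[1:]):
        assert np.all(tighter <= looser)
```

The strict inequality in the second test holds only for finite, positive \(\Omega\); at \(\Omega = 0\) the posterior view return equals \(Q\), which is the case the full-confidence test above covers.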
  

References. Black, F. and Litterman, R. (1992), "Global Portfolio Optimization", Financial Analysts Journal 48(5). He, G. and Litterman, R. (1999), "The Intuition Behind Black-Litterman Model Portfolios", Goldman Sachs Investment Management Research. Idzorek, T. (2005), "A Step-by-Step Guide to the Black-Litterman Model", Zephyr Associates working paper. Meucci, A. (2010), "The Black-Litterman Approach: Original Model and Extensions", The Encyclopedia of Quantitative Finance. Michaud, R. (1989), "The Markowitz Optimization Enigma: Is 'Optimized' Optimal?", Financial Analysts Journal 45(1).