Chapter 2: Markets as Data Objects
A market lives in your computer as a time-indexed Pandas object. In this chapter you go from raw closing prices all the way to the statistics that drive every risk and performance report:
- Prices, raw vs adjusted close, simple and log returns, and the rules for aggregating returns across horizons.
- Cumulative gross returns (equity curves), resampling daily → monthly, and rolling statistics.
- Volatility with \(\sqrt{T}\) scaling, drawdowns via the running maximum, Sharpe ratios with proper annualisation, cross-asset correlation, and the subtle but unavoidable volatility drag that pulls compound returns below the arithmetic mean.
These are the gateway from raw return series to portfolio-level diagnostics.
Chapter Introduction
Where you’ll see this: every time you open Yahoo Finance, Robinhood, or a TikTok video screaming “this stock is up 200%!”, you are looking at the output of a pipeline that started exactly where this chapter starts — with a long, sorted list of (time, price) rows. By the end of the chapter you will be able to build that pipeline yourself in Python, and (more importantly) to spot when somebody else’s pipeline is lying to you.
Finance is one of the oldest data-intensive industries on earth. From the merchants of Venice in the 14th century, who kept ledger books of grain and silver prices, to a quant (a quantitative analyst — someone who builds math/code-driven trading strategies) on a 2026 trading desk staring at a 5-millisecond tick stream, the underlying object has not changed: a time-stamped sequence of prices. What has changed is the toolkit. The Venetian merchant had quills, ink, and arithmetic. You have Python, pandas (the standard Python library for working with tables of data), and a laptop that can scan a hundred years of daily prices in milliseconds.
This chapter is the first of two on markets as data objects. The premise is simple but consequential: every price quote a market produces — every trade, every closing price, every option chain (the menu of contracts that let you bet on a stock’s future price) — is just a piece of data tagged with a time stamp. Once you accept this view, finance stops being mysterious and becomes a problem of time-series engineering (the discipline of cleaning, aligning, and transforming data that is indexed by time). The rest of the course — risk, portfolios, factor models, sentiment, machine learning signals — is built on the foundations laid in this chapter.
The view from a real trading floor is unforgiving. At a hedge fund, the data layer is the source of more bugs than the model layer. A wrong dividend adjustment, a missing splits column, a date index that silently misaligns by one row — any of these will corrupt a backtest (a simulation of how a strategy would have done historically) in ways that no statistical test can detect until you have lost real money. The discipline you build here — always know what each column means, always check the index, always know whether prices are raw or adjusted — is not a textbook ritual. It is the difference between a profitable strategy and an embarrassing post-mortem.
Why “data objects” instead of “data”
The phrase data objects is deliberate. A pandas.Series of closing prices is not a passive list of numbers. It is an object that knows its own time index, its dtype, its name, and a battery of methods — .pct_change(), .rolling(), .resample(), .cumprod() — that encode decades of empirical finance practice into a single dot-method. The skill is less about memorizing formulas (most of them are one line of math) and more about learning to think in objects: every transformation produces a new object, every object carries its time index, and every plot you make is a window into that object’s state.
What you bring in, what you take out
You should already be comfortable with the material of the previous chapter: loading a CSV, examining df.head(), the difference between a Series and a DataFrame, basic boolean indexing, and df.plot(). This chapter adds the time dimension on top: dates as index, returns as differenced data, compounding as accumulation, volatility as standard deviation, and the \(\sqrt{T}\) scaling rule that lets you move risk between horizons. By the end you will have the vocabulary needed for Chapter 4, where we introduce portfolios as weighted combinations of these return series.
Table of Contents
- Prices and the time index
- Raw close vs adjusted close: dividends and splits
- Simple returns
- Log returns
- Simple vs log: when each one matters
- Aggregating returns over time
- Cumulative gross return and the equity curve
- Resampling: daily → weekly → monthly
- Rolling statistics: means and volatility
- Annualizing volatility: the \(\sqrt{252}\) rule
- Worked example: SPY end-to-end
- Exercises
Prices and the time index
Why this matters: before you can compute returns, risk, or anything else, you need to be able to load a price series and trust it. This section is about the rules for the underlying object — what a price series is, why the time index matters more than the prices themselves, and the kinds of silent bugs that destroy backtests before the model is even written.
What a price series actually is
Open a financial chart on any phone app and you see a wavy line moving from left to right. Strip away the chart and what remains is a table with two columns: a time stamp and a price. Everything else — moving averages (a smoothed running mean of the price), candlesticks (a chart style that shows open/high/low/close on each bar), volume bars, indicators — is derived from this minimal pair. Internally, a market data feed at any modern fund looks like exactly this: a long, sorted sequence of (timestamp, price) rows.
In pandas the natural home for such an object is a Series with a DatetimeIndex. A pandas Series is just a one-column table with row labels; a DatetimeIndex is the special kind of row label that holds actual dates instead of plain integers like 0, 1, 2. The Series stores the prices, the index stores the time stamps, and the two are kept in lock-step by pandas: when you slice the Series by date, the prices follow; when you reindex the Series to a new calendar, the prices align automatically. This tight coupling between values and time is the single most important property of a pandas time-series object.
Three practical implications follow.
First, chronological order is sacred. The whole machinery of returns, rolling windows, and resampling assumes the index is sorted. A common bug — and a costly one — is to feed pandas a Series whose dates are out of order; many methods will not raise an error, they will simply produce nonsense. Always call .sort_index() after loading a new source.
Second, the index defines the universe. When you compute \(P_t / P_{t-1}\) (read aloud as “today’s price divided by yesterday’s price” — but be careful what “yesterday” means), “\(t-1\)” does not mean “yesterday in the real world”; it means “the row immediately before \(t\) in the index”. If your data is sampled daily but skips weekends and holidays, then \(t-1\) for a Monday row is the previous Friday. If you accidentally resample to a calendar-day frequency, the weekend rows are filled with NaN (pandas’s marker for “missing value”) and the Monday return, computed against a NaN Sunday, becomes NaN as well. The arithmetic is unforgiving — the index, not the calendar, is the ground truth.
Third, the index enables time-aware methods. Series.rolling("22D"), Series.resample("1M"), Series.shift(1), and Series.asof() all rely on the index being a DatetimeIndex. Without it these methods either fall back to positional behavior or fail outright. The very first thing you do after loading a price file is verify that the index is a proper date type:
print(type(price.index))
# <class 'pandas.core.indexes.datetimes.DatetimeIndex'>
Acquiring price data with yfinance
For the rest of this chapter we use yfinance, a lightweight Python package (an open-source library you install with pip install yfinance) that scrapes Yahoo Finance for end-of-day OHLCV bars — that acronym just means the five standard daily numbers a market reports: Open (first trade of the day), High (highest price), Low (lowest price), Close (last trade), Volume (number of shares traded). It also gives you corporate actions (dividends and splits, which we explain in the next section). It is free, fast enough for teaching, and adequate for prototypes. Real work at a fund uses paid feeds (Bloomberg, Refinitiv, Polygon, IEX), but the data structure — a date-indexed OHLCV table — is identical.
A minimal first call looks like this:
import yfinance as yf
import pandas as pd
import numpy as np
msft = yf.Ticker("MSFT").history(period="5000d", interval="1d")
msft.head()
The result is a DataFrame whose index is a DatetimeIndex of trading dates and whose columns are Open, High, Low, Close, Volume, Dividends, and Stock Splits. The first five rows for MSFT look like this:
Open High Low Close Volume Dividends Stock Splits
Date
2006-04-26 00:00:00-04:00 18.903 19.008 18.847 18.917 39190000 0.0 0.0
2006-04-27 00:00:00-04:00 18.826 19.287 18.806 19.022 96509600 0.0 0.0
2006-04-28 00:00:00-04:00 16.914 17.102 16.753 16.858 591052200 0.0 0.0
2006-05-01 00:00:00-04:00 16.977 17.451 16.816 16.956 174800900 0.0 0.0
2006-05-02 00:00:00-04:00 17.095 17.451 16.683 16.760 190533500 0.0 0.0
A handful of details deserve attention. The index is timezone-aware (-04:00 means US Eastern time, the timezone of the NYSE — so pandas knows that 9:30 AM in this Series is not the same instant as 9:30 AM Hong Kong time). The Open/High/Low/Close numbers are reported to fractional cents, because Yahoo back-adjusts historical prices for splits and dividends — a subject we will return to in the next section. The Volume column is in shares traded, and Dividends and Stock Splits are non-zero only on corporate-action days (days when the company paid a dividend or split its stock).
What this gave us: a single DataFrame (a DataFrame is the pandas word for a multi-column table) where each row is one trading day and each column is one piece of information — exactly the shape we need.
Anatomy of an adjusted-close price series
Throughout this chapter we will treat the closing-price column as a stand-alone Series — one date label per row, one number per row. The diagram below sketches that object so you have a clear mental picture before we start computing returns.
When you read code in the rest of this chapter, keep this picture in mind: a single column of floats with a DatetimeIndex on the left. Every operation we apply — .pct_change(), .cummax(), .rolling(), .resample() — touches one of these two parts (the values, or the index), never anything else.
A 5,000-day window is roughly 20 trading years — enough to span at least one major regime change (the 2008 crisis, the 2020 pandemic crash, the 2022 inflation shock). When you measure volatility or test a strategy, the question is not what was the last year like but what kinds of years has this asset lived through. A one-year sample is a dangerous teacher.
Now plot the closing series — the canonical first chart in any financial analysis:
The trajectory of the curve — slow drift, sharp drawdowns, long recoveries — is the visual signature of equity returns. Notice that the level of the price is not what matters; the chart looks essentially the same whether the y-axis runs from $20 to $400 or from $2 to $40. What matters are the percentage changes between adjacent points, which are the actual quantity an investor experiences. That is the topic of the next several sections.
Beyond a single stock: ETFs, bonds, options
The same code pattern downloads any tradable asset Yahoo covers. ETFs (Exchange-Traded Funds) are the building blocks of most retail and many institutional portfolios because each ETF gives you a single-ticker bet on an entire basket — an index, a sector, a country, a bond segment.
qqq = yf.Ticker("QQQ").history(period="5000d", interval="1d") # NASDAQ-100
tlt = yf.Ticker("TLT").history(start="2020-01-01") # 20+ year Treasuries
lqd = yf.Ticker("LQD").history(start="2020-01-01")   # IG corporate bonds
A few ETFs you should know by name:
| Ticker | Asset class | Why it matters |
|---|---|---|
| SPY | S&P 500 | The benchmark for US large-cap equity |
| QQQ | NASDAQ-100 | Tech-heavy growth benchmark |
| XLK | S&P 500 Information Technology sector | A cleaner tech-only exposure |
| TLT | 20+ year US Treasuries | Long-duration safe-asset proxy |
| IEF | 7–10 year US Treasuries | Intermediate Treasuries |
| SHY | 1–3 year US Treasuries | Cash-like rate exposure |
| LQD | Investment-grade corporate bonds | Credit spread exposure |
| HYG | High-yield (junk) corporate bonds | High-credit-risk proxy |
For derivatives, yfinance also exposes option chains:
t = yf.Ticker("AAPL")
t.options # tuple of expiration dates
chain = t.option_chain(t.options[0])
calls = chain.calls # DataFrame of call contracts
puts = chain.puts    # DataFrame of put contracts
We will return to options in Chapter 6. For now the point is structural: every tradable instrument fits into the same date-indexed-table mold. Once you can manipulate a daily close Series, you can manipulate the entire investable universe.
yfinance is not a production data source
yfinance scrapes Yahoo, which means its data is best-effort: sometimes prices are missing, sometimes a ticker is silently delisted, sometimes a corporate action is applied incorrectly. For coursework and prototyping it is fine. For real money use a vendor with service-level guarantees (Bloomberg, Refinitiv, Polygon, IEX, Norgate). Even then, always spot-check.
Raw close vs adjusted close: dividends and splits
Where you’ll see this: when an Instagram post shows a chart of “Apple before and after the 2020 split” and claims investors “lost 75% overnight”, they are confusing raw with adjusted prices. Nobody lost a cent — Apple just multiplied each share into four. This section is the antidote: it tells you exactly what to look for so you never get tricked by the same bug.
The single most common source of silent error in equity analysis is confusing raw close with adjusted close. You must always know which one you are holding.
Imagine a friend gives you a HK$100 bill, then later asks for it back and hands you ten HK$10 bills instead. You haven’t gained or lost anything — but if you only tracked the biggest single bill in your wallet, it would look like you went from HK$100 to HK$10, a “90% loss”. Stock splits create that exact illusion in raw price data. Adjusted prices undo the illusion so the percentage changes match the true experience.
What corporate actions do to a price series
A company can change its share price for two reasons that have nothing to do with investor demand. First, it may pay a cash dividend — a per-share cash payment to shareholders (think: “the company takes some of its cash and mails it out to whoever owns the stock”). On the ex-dividend date (the cutoff day for who gets paid), the share price drops by roughly the dividend amount, because that cash is no longer inside the firm. Second, it may declare a stock split (or reverse split) — the number of shares is multiplied, and the per-share price is divided to match. Example: in a 4-for-1 split, every 1 share becomes 4, but the price drops to 1/4 of what it was. Your total holding is worth exactly the same; only the unit changed (like splitting a HK$100 bill into ten HK$10 bills).
Both events create mechanical “drops” in the raw price line that have no economic meaning for a buy-and-hold investor (someone who just owns the stock and doesn’t trade it). The investor either gets the dividend as cash or ends up holding more shares after the split. A naive return computation on raw prices will record these drops as losses, which is wrong.
The fix is to use adjusted close prices: a synthetic (artificially constructed) series in which historical prices are scaled so that the percentage change between any two adjacent adjusted closes equals the true total return — the price change plus any dividends paid (treated as if you immediately reinvested them) — that a buy-and-hold investor would have earned over that interval.
The construction is straightforward. Let \(P_t\) be the raw close at date \(t\), \(D_t\) the dividend paid at \(t\) (zero on non-dividend days), and \(s_t\) the split ratio at \(t\) (1 on non-split days, e.g. 2 for a 2-for-1 split). Define the adjustment factor that propagates backward in time:
\[ a_t = a_{t+1} \cdot \frac{P_{t+1} - D_{t+1}}{P_{t+1}} \cdot \frac{1}{s_{t+1}}, \qquad a_T = 1. \]
The adjusted close is then \(P_t^{\text{adj}} = a_t \cdot P_t\). The exact algorithm depends on the vendor, but the goal is universal: adjusted close returns should equal total returns.
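A minimal sketch of this backward recursion, assuming you start from unadjusted prices (for example Ticker.history(auto_adjust=False)) and hold the raw close, dividend, and split columns as aligned pandas Series; the function name is illustrative:

import pandas as pd

def adjusted_close(close, dividends, splits):
    # close: raw close P_t; dividends: cash dividend D_t (0 on most days);
    # splits: split ratio s_t (yfinance reports 0 on non-split days, so map 0 -> 1)
    splits = splits.replace(0, 1)
    f = (close - dividends) / close / splits        # per-day factor f_t
    b = f.iloc[::-1].cumprod().iloc[::-1]           # b_t = product of f_j for j >= t
    a = b.shift(-1).fillna(1.0)                     # a_t = product of f_j for j > t, with a_T = 1
    return a * close                                # adjusted close: P_t_adj = a_t * P_t

The level of the result is arbitrary; what should match the vendor's adjusted series are its percentage changes.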
Which one does yfinance give you?
Recent versions of yfinance set auto_adjust=True by default in yf.download(), and Ticker.history() likewise returns prices that are already split- and dividend-adjusted. The Close column in the DataFrame above is therefore already an adjusted close. The legacy Adj Close column is no longer separately reported.
The default has flipped at least twice in yfinance’s history. Never assume — print the first few rows, inspect the Dividends and Stock Splits columns, and convince yourself the Close column has been adjusted before computing any return.
A quick diagnostic for adjustment is to look at a known split day for a major stock — for example, Apple’s 4-for-1 split on 31 August 2020. The unadjusted close fell from about $499 to $129; the adjusted close shows a smooth percentage change of roughly +3%, the true one-day total return.
Dividend, split, and total return defined
To keep terminology clean:
- Price return between \(t-1\) and \(t\) is \(P_t/P_{t-1} - 1\), computed on raw close.
- Total return is the price return plus dividends received, treated as if reinvested at the close on the ex-date.
- Adjusted close return is computed as \(P_t^{\text{adj}}/P_{t-1}^{\text{adj}} - 1\) on the adjusted series, and approximates the total return.
For equity strategy research, you almost always want total returns — dividends are a real cash flow and a meaningful fraction of long-horizon equity returns (about 2% per year of the long-run ~7% real US equity return historically). The shortcut is to use adjusted close throughout and never compute returns on raw prices unless you have a specific reason.
Simple returns
Why this matters: every time a finance app says “AAPL +1.2% today”, that 1.2% is a simple return. This is the most common number in all of finance — but as you’ll see in two sections, it has one quirky property (it doesn’t add up cleanly across time) that forces quants to invent a second kind of return.
A return is “how much wealthier you got, expressed as a fraction of what you started with”. If your $100 grew to $102, the return is +2% — i.e. $2 of profit divided by the $100 you put in. That’s it. Everything below is just notation for this one idea.
Definition
Below, \(P_t\) is just shorthand for “the price on day \(t\)”, and \(R_t\) is “the return on day \(t\)”. Don’t memorise the symbols — just remember the picture: today’s price compared to yesterday’s. Given a price series \(P_t\), the simple return over one period is
\[ R_t = \frac{P_t - P_{t-1}}{P_{t-1}} = \frac{P_t}{P_{t-1}} - 1. \]
It is the percentage change in price between \(t-1\) and \(t\). A 1% return means \(P_t = 1.01 \, P_{t-1}\) (today’s price is 1.01 times yesterday’s); a \(-2\%\) return means \(P_t = 0.98 \, P_{t-1}\). Simple returns are how brokers report performance, how regulators define disclosures, and how investors intuitively reason about gains and losses.
In pandas: .pct_change() and the manual form
Pandas exposes this transformation in two equivalent forms. The first is .pct_change() (read it as “percent change” — it’s literally the method that does what its name says), which directly returns the simple return:
qqq["simpleR"] = qqq["Close"].pct_change()The second writes out the arithmetic explicitly, which is useful when you want to see what is happening:
qqq["simpleR"] = (qqq["Close"] - qqq["Close"].shift(1)) / qqq["Close"].shift(1)Both produce identical output. .shift(1) slides the entire series down by one row — so qqq["Close"].shift(1) puts yesterday’s closing price on today’s row, lining up “yesterday” and “today” side by side so we can subtract. The first row is NaN because there is no prior price to shift in.
Historical vs forward returns
Two related quantities show up constantly in practice and deserve clean naming.
Historical return at \(t\) is computed from \(t-1\) to \(t\): it is the return you would have earned over the period ending at \(t\). This is what .pct_change() gives you.
Forward return at \(t\) is computed from \(t\) to \(t+1\): it is the return you would earn over the next period, conditional on holding from the close at \(t\). Forward returns are constructed with .shift(-1):
qqq["H"] = qqq["Close"].pct_change() # historical: t-1 -> t
qqq["F"] = qqq["Close"].shift(-1) / qqq["Close"] - 1 # forward: t -> t+1The distinction matters whenever you build predictive models. The dependent variable in a return-prediction model is almost always a forward return — what will the next-period return be, given features observed at or before time \(t\)? Using historical returns as a target by mistake leaks information from the future into the model and produces backtests that look spectacular and lose money live.
A model that uses today’s return as both a feature and a target will achieve near-perfect in-sample \(R^2\) and zero out-of-sample profit. The bug is always the same: aligning the target to the wrong date. Make it a habit to give forward-return columns a clearly distinct name (fwd_ret_1d, y_1d, target_1d) so they never get confused with historical returns.
A quick numerical sanity check
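A minimal sketch of such a check, using three made-up prices:

import pandas as pd

p = pd.Series([100.0, 102.0, 100.98],
              index=pd.to_datetime(["2024-01-02", "2024-01-03", "2024-01-04"]))
print(p.pct_change())          # NaN, 0.02, -0.01
print(p / p.shift(1) - 1)      # the same numbers, written out by hand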
What this gave us: a tiny three-row example where we can see by hand that .pct_change() produces exactly the formula’s output — useful sanity check before trusting it on millions of rows.
Anatomy: from prices to returns
The diagram below shows the two columns side by side. The price column has five values; the returns column has only four — the first row is NaN because there is no \(P_{t-1}\) to subtract from. This one-row offset is the single most common source of confusion when students start computing returns.
Read the diagram from left to right. Each return on the right is built from two prices on the left — the one on the same row, divided by the one on the row above. The very first row has no row above it, which is why returns always start with a NaN. Whenever you join a returns column back onto a price table, expect the first row to be missing and decide on purpose whether to drop it or to fill it.
Two observations: returns are unitless (they are ratios — they have no dollar sign, no percentage sign attached, they’re just a number), and a \(+2\%\) followed by a \(-1\%\) does not net out to a \(+1\%\) gain — after you think about it for a moment, you can see why: the \(-1\%\) is applied to the new (post-gain) price of $102, not to the original $100. You end at \(100 \cdot 1.02 \cdot 0.99 = 100.98\), slightly below $101. This non-symmetry of gains and losses is the entry point to log returns.
Log returns
Why this matters: simple returns are intuitive, but they don’t add up cleanly over multiple days — to combine them you have to multiply, which is annoying in statistics and finance models. Log returns fix this: they turn multiplication into addition, which is why every statistical model in this course (and most quant papers) is written in log-return space.
A log return is just a re-labelling of the simple return that makes the math nicer. Think of it like switching from Celsius to Kelvin: same physical reality, different scale that happens to make formulas cleaner. For tiny daily moves (under a couple of percent), the log return and the simple return are visually identical.
Definition
Below, \(\ln\) is the natural logarithm — the log to base \(e \approx 2.718\). If you remember log from high-school as base 10, just know that “natural log” is the version mathematicians prefer because of how it interacts with calculus and exponential growth. The log return (also called continuously compounded return) over one period is
\[ \ell_t = \ln\!\left(\frac{P_t}{P_{t-1}}\right) = \ln P_t - \ln P_{t-1}. \]
It is the natural logarithm of the gross return \(1 + R_t\) (the “gross return” is just “1 + the return”, e.g. a +3% return has gross return 1.03 — the factor by which your wealth was multiplied). Equivalently:
\[ \ell_t = \ln(1 + R_t), \qquad R_t = e^{\ell_t} - 1. \]
For small returns, \(\ln(1 + R) \approx R - R^2/2 + \cdots\), so \(\ell_t \approx R_t\) to first order. At a daily frequency, where typical returns are well below 1%, the two are numerically very close — within a few basis points (a basis point, or “bp”, is 0.01% — the standard finance unit for “a tiny amount”).
In pandas
qqq["logR"] = np.log(qqq["Close"]) - np.log(qqq["Close"].shift(1))
# equivalently:
qqq["logR"] = np.log(qqq["Close"] / qqq["Close"].shift(1))A typical tail of the joint output:
Ticker QQQ simpleR logR
Date
2026-03-05 608.91 -0.003013 -0.003017
2026-03-06 599.75 -0.015043 -0.015158
2026-03-09 607.76 0.013356 0.013267
2026-03-10 607.77 0.000016 0.000016
2026-03-11 607.69 -0.000132 -0.000132
Notice how simple and log returns agree to four decimal places when the magnitude is small, and start to differ noticeably (in the fourth decimal) at \(\pm 1.5\%\).
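The chart described next can be reproduced with a few lines; this is a sketch, and the ±3% band is an illustrative choice for the typical daily range:

import numpy as np
import matplotlib.pyplot as plt

R = np.linspace(-0.4, 0.4, 401)              # simple returns from -40% to +40%
plt.axvspan(-0.03, 0.03, color="green", alpha=0.15, label="typical daily range")
plt.plot(R, R, "--", label="45° line (log = simple)")
plt.plot(R, np.log1p(R), label="log return ln(1 + R)")
plt.xlabel("simple return"); plt.ylabel("log return")
plt.legend(); plt.show()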
Inside the green band — the typical daily move for an equity — the curve and the 45° line are visually indistinguishable, so simple and log returns are interchangeable in routine work. Outside the band the curve bends below the line: a \(+30\%\) simple return is only a \(+26\%\) log return, and a \(-30\%\) simple return is a \(-36\%\) log return. The asymmetry — losses look worse in log space than in simple space — is exactly why compound returns drag below the arithmetic mean (a fact we return to later in the chapter).
What this gave us: a chart that visualises the divergence between simple and log returns — and importantly, it shows that for the everyday ±3% range you’ll see in daily stock data, the two curves are basically the same.
Why log returns exist at all
For an asset that compounds continuously, \(\ell_t\) has two properties that simple returns lack.
Time additivity. If you hold the asset from \(t\) to \(t+k\), the log return over the whole window is the sum of the per-period log returns:
\[ \ell_{t \to t+k} = \ln\!\left(\frac{P_{t+k}}{P_t}\right) = \sum_{j=1}^{k} \ell_{t+j}. \]
This is the central reason quants love log returns: aggregating across time is addition, not multiplication. Means, sums, OLS regressions, time-series models — everything in classical statistics assumes additive structure, which simple returns do not have.
Symmetry around zero. A \(+10\%\) simple return is not the inverse of a \(-10\%\) simple return — they leave you with \(0.99\) of your original capital, not \(1.00\). A \(+10\%\) log return is the inverse of \(-10\%\). This symmetry is convenient when modelling.
Why simple returns still exist
For all the elegance of log returns, simple returns dominate one situation: portfolios. If you hold three assets with weights \(w_1, w_2, w_3\) that sum to 1, the portfolio’s one-period simple return is
\[ R_p = w_1 R_1 + w_2 R_2 + w_3 R_3. \]
A weighted sum of simple returns gives the portfolio’s simple return exactly. The analogous identity is not true for log returns: in general,
\[ \ell_p \neq w_1 \ell_1 + w_2 \ell_2 + w_3 \ell_3, \]
except as a first-order approximation.
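A tiny numeric illustration, with made-up weights and returns:

import numpy as np

w = np.array([0.5, 0.3, 0.2])          # portfolio weights, summing to 1
R = np.array([0.10, -0.05, 0.02])      # one-period simple returns

R_p = w @ R                            # exact portfolio simple return: 0.039
print(np.log1p(R_p))                   # true portfolio log return, about 0.0383
print(w @ np.log1p(R))                 # weighted sum of log returns, about 0.0362 -- not the same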
The rule of thumb most practitioners settle on:
| Use simple returns when… | Use log returns when… |
|---|---|
| Combining assets into a portfolio (cross-section) | Aggregating one asset across time |
| Reporting performance to a client | Building statistical models on returns |
| Computing weighted averages | Computing means, OLS, Sharpe ratios on long data |
| Anything that says “percent gain” | Anything that says “log-normal”, “Brownian” |
In practice both columns often coexist in a research dataframe, and you switch fluently between them.
Aggregating returns over time
Where you’ll see this: “this stock returned 10% per month for the past year — so 120% per year, right?” Wrong, and the gap between 120% and the true answer is exactly what this section unpacks. Aggregating returns is also where most spreadsheet errors happen in finance internships, because the rules feel obvious until you actually try them.
This is where the simple-vs-log distinction earns its keep.
Simple-return compounding
If you hold an asset for \(k\) periods with simple returns \(R_1, R_2, \ldots, R_k\), the gross return over the whole window is the product of the gross per-period returns:
\[ 1 + R_{1 \to k} = (1 + R_1)(1 + R_2)\cdots(1 + R_k) = \prod_{j=1}^{k} (1 + R_j). \]
The net cumulative return is \(R_{1 \to k} = \prod (1+R_j) - 1\). In pandas:
cum = (1 + r).prod() - 1   # net cumulative return over the window
Log-return summation
For log returns the same window gives
\[ \ell_{1 \to k} = \sum_{j=1}^{k} \ell_j, \]
and to convert back to the cumulative gross return: \(1 + R_{1 \to k} = e^{\ell_{1 \to k}}\).
cum = np.exp(lr.sum()) - 1
The two computations produce identical answers up to floating-point rounding — they are just two ways of writing the same algebra. The choice between them is purely about which form is easier to manipulate in the surrounding code.
Worked numerical example
Suppose a stock has three daily simple returns: \(R_1 = +2\%\), \(R_2 = -1\%\), \(R_3 = +1.5\%\).
Cumulative gross return (simple form):
\[ 1 + R_{1\to 3} = 1.02 \cdot 0.99 \cdot 1.015 = 1.02495. \]
So the three-day return is \(+2.49\%\), not \(+2.5\%\). The small shortfall (about half a basis point) is the convexity drag from the loss day.
Equivalently in log form, \(\ell_j = \ln(1 + R_j)\):
\[ \ell_1 = 0.019803, \quad \ell_2 = -0.010050, \quad \ell_3 = 0.014889, \] \[ \ell_{1\to 3} = 0.024642, \quad e^{0.024642} - 1 = 0.02495. \checkmark \]
The two routes agree. The log form makes it obvious that the sign of \(-1\%\) is the only source of drag; if all three returns were \(+2\%\) the cumulative would beat \(3\times 2\% = 6\%\) by a small amount, a phenomenon known as positive compounding.
The arithmetic mean of \(\{+2\%, -1\%, +1.5\%\}\) is \(0.833\%\). Compounded over 3 days, \((1.00833)^3 - 1 = 2.52\%\) — close to but not equal to the true cumulative \(2.49\%\). The arithmetic mean of returns is not the per-period return that would generate the observed cumulative. The quantity that does is the geometric mean, which is the per-period equivalent of the cumulative product. For investment-performance reporting, geometric means (or equivalently, annualized log-return means) are the honest measure.
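In code, the distinction looks like this (a sketch reusing the three returns above):

import numpy as np

r = np.array([0.02, -0.01, 0.015])
arith = r.mean()                               # 0.00833, the arithmetic mean
geom = (1 + r).prod() ** (1 / len(r)) - 1      # about 0.00825, the honest per-period rate
print((1 + arith) ** len(r) - 1)               # 0.0252: overstates the cumulative
print((1 + geom) ** len(r) - 1)                # 0.02495: recovers the true cumulative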
Cumulative gross return and the equity curve
Why this matters: the equity curve is the single chart every fund manager, retail investor, and YouTube finance influencer puts at the top of their pitch — it’s a visual answer to “if I had given you $1, what would I have now?” Learning to build one yourself (and read one critically) is the most important visual skill in this course.
An equity curve just answers the question “how much would $1 have grown to, day by day?”. Every up-tick is a profitable day; every down-tick is a losing day. The dramatic-looking shape of a stock chart is mostly an equity curve in disguise.
The equity curve
If you put one dollar into an asset at \(t = 0\) and reinvest all gains, your wealth at \(t\) is
\[ W_t = \prod_{j=1}^{t}(1 + R_j), \]
where \(W_0 = 1\) by convention. The series \(\{W_t\}\) is called the equity curve (or sometimes the cumulative gross return). It is the single most informative chart in performance analysis: rising stretches are profit, falling stretches are drawdowns, and the steepness encodes the rate of compounding.
In pandas there are two one-liners, mirroring the simple/log split:
eq_simple = (1 + r).cumprod() # from simple returns
eq_log = np.exp(lr.cumsum())     # from log returns
Both produce the same \(\{W_t\}\) up to floating-point error (tiny rounding errors that come from computers storing decimals in binary; harmless here). cumprod() is short for “cumulative product” — it walks down the column multiplying as it goes, so the value in row \(t\) is the product of all values from row 1 through \(t\). cumsum() (“cumulative sum”) does the same with addition. These are the two “accumulator” methods you will use constantly in performance analysis.
Code: build an equity curve
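A minimal sketch of such a cell, assuming qqq is the OHLCV DataFrame downloaded earlier (variable names are illustrative):

import numpy as np
import matplotlib.pyplot as plt

r = qqq["Close"].pct_change().dropna()       # daily simple returns
lr = np.log1p(r)                             # matching log returns

eq_simple = (1 + r).cumprod()                # wealth path from simple returns
eq_log = np.exp(lr.cumsum())                 # same wealth path from log returns

ax = eq_simple.plot(label="(1 + r).cumprod()")
eq_log.plot(ax=ax, linestyle="--", label="exp(lr.cumsum())")
ax.set_title("Growth of $1")
ax.legend()
plt.show()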
The two curves are visually indistinguishable, which is the point: they are the same object expressed two ways. In practice you pick whichever form composes more cleanly with the rest of your code. For instance, when you have a mix of cash periods (return = 0) and invested periods, np.log1p(r).cumsum() handles zeros without precision loss, while (1+r).cumprod() is easier to read.
The four panels are progressively more processed views of the same return series. Panel (a) is the raw atom — a histogram of daily returns. Panel (b) compounds those returns into a wealth path via cumprod. Panel (c) overlays the running maximum, the bookkeeping needed for drawdown. Panel (d) is the wealth path expressed as a percentage shortfall from that peak — the underwater chart — which is the picture investors actually care about because it shows depth and duration of pain simultaneously.
Reading the equity curve
Three quantities you can eyeball off any equity curve:
- CAGR (compound annual growth rate). If the curve runs from \(W_0 = 1\) at date \(t_0\) to \(W_T\) at date \(t_T\), and the number of years is \(\tau = (t_T - t_0)/365.25\), then
\[ \text{CAGR} = W_T^{1/\tau} - 1. \]
- Maximum drawdown. At each point, the drawdown is \(W_t / \max_{s \leq t} W_s - 1\) — i.e., how far below the running peak you are. The minimum of this series over the whole sample is the worst peak-to-trough loss the investor would have lived through.
- Time underwater. The fraction of dates on which the equity is below its previous peak. A strategy with a 10% drawdown that recovers in two weeks feels very different from one with a 10% drawdown that takes three years.
We will compute all three in Chapter 4.
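As a preview, all three can be read off a wealth series in a few lines; a sketch, assuming eq is an equity curve like the one built above:

years = (eq.index[-1] - eq.index[0]).days / 365.25
cagr = (eq.iloc[-1] / eq.iloc[0]) ** (1 / years) - 1

running_peak = eq.cummax()
drawdown = eq / running_peak - 1          # 0 at new highs, negative below the peak
max_dd = drawdown.min()                   # worst peak-to-trough loss

time_underwater = (drawdown < 0).mean()   # fraction of days spent below a prior peak
print(cagr, max_dd, time_underwater)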
Resampling: daily → weekly → monthly
Why this matters: academic research papers almost always work in monthly returns, while traders almost always work in daily (or faster). To read either literature, you need to be able to convert between them — and the conversion has one small trap that catches almost every beginner.
resample is just the pandas way of saying bucket these timestamps into wider windows. Daily → monthly means “for each calendar month, collapse all the daily rows inside it into a single monthly row”. The only question is how you collapse them: take the last price? Sum the returns? Average something? The choice depends on what the column means.
The two flavors of frequency conversion
A daily series can be aggregated to weekly, monthly, quarterly, or annual frequency. Pandas has one universal method — .resample(rule) — that handles this, but what you put inside it depends on whether you are aggregating a price or a return.
For a price series you usually want the last observation in the period: the closing price at the end of the week or the month is what an investor would have realized. Use .last():
monthly_close = price.resample("1M").last()
weekly_close = price.resample("1W").last()
For a return series you want the compounded return over the period. Each monthly return is the product of the daily gross returns inside the month, minus one:
monthly_ret = (1 + daily_ret).resample("1M").prod() - 1
The two operations are not interchangeable. Taking the last daily return of the month is a one-day return at month-end — it has nothing to do with the monthly return.
Common frequency rules
| Rule string | Meaning |
|---|---|
"B" |
Business day |
"1W" |
Weekly (default: Sunday-end) |
"1W-FRI" |
Weekly, anchored to Friday |
"1M" |
Calendar month end |
"BM" |
Business month end |
"1Q" |
Calendar quarter end |
"1Y" or "1A" |
Calendar year end |
For US equity work, "BM" (business month-end) is the most natural choice, because the last trading day of the calendar month is what an investor would actually transact on.
Code: daily → monthly, two ways
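A minimal sketch of the two routes, assuming price is a daily adjusted-close Series and daily_ret = price.pct_change().dropna():

import pandas as pd

# Route (a): resample the price to month-end, then take percentage changes.
monthly_from_price = price.resample("BM").last().pct_change()

# Route (b): compound the daily returns inside each month.
monthly_from_ret = (1 + daily_ret).resample("BM").prod() - 1

side_by_side = pd.DataFrame({"from_price": monthly_from_price,
                             "from_returns": monthly_from_ret})
print(side_by_side.tail())
print((side_by_side["from_price"] - side_by_side["from_returns"]).abs().max())

The first row differs by construction: route (a) has no prior month-end to difference against, while route (b) compounds a partial first month.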
What this gave us: a side-by-side comparison showing both routes produce identical monthly returns — proof that they’re algebraically the same. The two columns agree to many decimal places, as they should: \(\prod (1 + R_j) = P_{\text{end}}/P_{\text{start}}\) is an algebraic identity (the product of all the daily gross returns inside a month equals the end-of-month price divided by the start-of-month price — they’re literally the same number). The route you choose is a matter of which intermediate object you want to keep — sometimes you need the monthly price (e.g. to plot it), sometimes only the monthly return.
Why monthly?
Monthly returns are the most common research frequency in academic finance for three reasons. Noise. Daily returns are dominated by microstructure noise (bid-ask bounce, intraday flow), monthly returns less so. Macro alignment. Most macroeconomic series — CPI, unemployment, GDP, factor returns — are released monthly or less often, so monthly is the natural join frequency. Sample size. A 50-year monthly sample is 600 observations, comfortably enough for cross-sectional regressions; a 50-year daily sample is 12,600, which sounds bigger but provides less independent information per observation.
For trading, the choice is the opposite: higher frequency means more independent decisions per year and (if your edge is real — i.e. you genuinely have a small probabilistic advantage over the market) a higher Sharpe ratio. Daily and intraday data dominate quant trading research. The course will keep both perspectives alive.
Rolling statistics: means and volatility
Why this matters: “the market is more volatile than usual” — how would you actually check that? You need a moving (rolling) estimate of volatility that updates each day. Rolling statistics are also how every technical indicator on TradingView is built — the 50-day moving average, the Bollinger Bands, RSI, all of them.
A rolling window is like looking at the data through a fixed-width sliding picture frame. Today, you look at the last 22 days. Tomorrow, you slide the frame one day to the right and look at the last 22 days from tomorrow’s vantage point. The “statistic” inside the frame (mean, std, whatever) updates each time.
Rolling windows in pandas
A rolling window slides a fixed-size window across the time index and computes a statistic at each step. Pandas exposes this through .rolling(window) followed by an aggregation method (an aggregation method is just a function that collapses many numbers into one — mean, std, max, min, sum, etc.):
ma_22 = price["Close"].rolling(22).mean() # 22-day moving average
sd_22 = ret.rolling(22).std()            # 22-day rolling sample std
The first 21 values of each output are NaN because the window is not yet full — by default min_periods equals the window length.
The choice of window length is partly conventional. 22 trading days is the standard for a monthly window (a calendar month averages ~21 trading days). 63 days is a quarter, 252 days is a year. Always state explicitly which convention you are using.
Rolling mean: the moving average
A rolling mean of returns gives a slow, smoothed estimate of the local drift. A rolling mean of prices gives a smoothed trajectory beloved of technical analysts:
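For example (a sketch, assuming price is the OHLCV DataFrame used just above):

import matplotlib.pyplot as plt

ax = price["Close"].plot(alpha=0.5, label="Close")
price["Close"].rolling(22).mean().plot(ax=ax, label="22-day MA")
price["Close"].rolling(252).mean().plot(ax=ax, label="252-day MA")
ax.legend()
plt.show()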
The longer the window, the smoother the line, and the more lagged it is relative to fast price moves. This trade-off — smoothness vs lag — is the single most important design choice in any technical indicator. A 22-day MA captures monthly trends but reacts quickly; a 252-day MA defines the long-run trend but turns slowly.
Rolling volatility: the standard deviation of returns
Volatility is just statistics jargon for “how wild are the daily moves?”. Technically it’s the standard deviation of returns — but you can think of it as the typical size (positive or negative) of a daily wiggle. A volatility of 1% means “on a normal day, the price moves by roughly ±1%”. The Greek letter \(\sigma\) (sigma) is the standard symbol for it.
For a return series, rolling(window).std() computes the sample standard deviation over the window. The word “sample” just means we’re estimating from observed data, as opposed to knowing the “true” underlying value. This is the empirical analog of \(\sigma\), the most common single-number summary of risk for an asset.
vol_22 = ret.rolling(22).std()
vol_22.plot(title="22-day rolling volatility of daily returns")
A typical equity (stock) series has a daily standard deviation in the neighborhood of \(0.5\%\)–\(2\%\). It is not constant in time — periods of calm (vol ~0.5%) alternate with periods of crisis (vol > 3%). This phenomenon, called volatility clustering — turbulent days tend to come in groups, like aftershocks after an earthquake — is one of the empirical regularities every financial model must accommodate.
Why \(\sigma\) measures risk
The intuition is mechanical. If returns are roughly symmetric around zero, then the standard deviation tells you the typical size of a deviation from the mean — both upward and downward. A 22-day vol of \(0.01\) implies a one-day shock of about 1% is normal; a \(0.02\) shock is roughly two standard deviations, and a \(0.05\) shock would be a 5-sigma event under a normal distribution.
The reality is messier — return distributions are fat-tailed, with more extreme events than a normal distribution predicts — and we will refine the risk measure in Chapter 4 with Value-at-Risk and Expected Shortfall. For now, \(\sigma\) is the right starting point.
Position sizing with rolling vol
A practical application: suppose you have $1M of capital (the money you have available to invest) and want to allocate it to QQQ such that your portfolio’s daily standard deviation does not exceed 1% ($10,000). If today’s rolling 22-day vol of QQQ daily returns is \(\sigma = 0.012\), then the dollar position size \(\$X\) (the amount you actually put into the trade — could be smaller than your capital) satisfies
\[ \$X \cdot \sigma = \$10{,}000 \implies \$X = \$10{,}000 / 0.012 = \$833{,}333. \]
Equivalently, you would deploy about 83% of your capital. When QQQ vol spikes to \(0.025\), the same risk budget (the amount of daily wiggle you’ve decided to tolerate) would require shrinking the position to $400,000 — 40% of capital. This style of position sizing, called volatility targeting (sizing your position up when markets are calm and down when they’re stormy, to keep your daily risk roughly constant), is one of the simplest and most powerful risk-management tools in quant trading.
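A sketch of the same rule applied date by date, assuming ret is the QQQ daily-return Series used above:

capital = 1_000_000
risk_budget = 10_000                                    # target daily dollar standard deviation

vol_22 = ret.rolling(22).std()                          # rolling estimate of daily vol
position = (risk_budget / vol_22).clip(upper=capital)   # dollars deployed, capped at capital
weight = position / capital                             # fraction of capital in the trade
print(weight.dropna().tail())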
Annualizing volatility: the \(\sqrt{252}\) rule
Where you’ll see this: every hedge fund factsheet and every Bloomberg terminal reports volatility per year, but the raw calculation almost always happens on daily data. The conversion uses one famous number — \(\sqrt{252}\) — and applying it incorrectly is the most common mistake in student finance projects.
Means add up linearly with time (a 0.04% daily mean over 252 days is roughly 10% per year), but volatilities grow more slowly — only with the square root of time. The deeper reason: daily wiggles partially cancel each other out, so a year of random walks doesn’t accumulate 252× the daily noise, only about \(\sqrt{252} \approx 15.9\times\).
The rule
If you have an estimate of \(\sigma_{\text{daily}}\) — the standard deviation of daily returns — and you want \(\sigma_{\text{annual}}\), the rule is
\[ \sigma_{\text{annual}} = \sigma_{\text{daily}} \cdot \sqrt{252}. \]
The 252 is the number of US trading days in a year (it varies slightly across calendars; people use 252 for US equities, 256 for many futures markets, 260 for FX). Square-root, not linear: this is the famous distinguishing feature of how variance scales with time.
Why \(\sqrt{T}\) and not \(T\)?
The intuition is that returns over different days are approximately independent. Independence is the key assumption: variances of independent random variables add. If \(r_1, r_2, \ldots, r_T\) are independent with common variance \(\sigma^2\), then
\[ \text{Var}(r_1 + r_2 + \cdots + r_T) = T \sigma^2, \]
and therefore
\[ \sigma(r_1 + \cdots + r_T) = \sqrt{T} \sigma. \]
In contrast, expected values add linearly: \(\mathbb{E}[r_1 + \cdots + r_T] = T \mu\). Hence:
\[ \mu_{\text{annual}} = 252 \, \mu_{\text{daily}}, \qquad \sigma_{\text{annual}} = \sqrt{252} \, \sigma_{\text{daily}}. \]
The combination is the annual Sharpe ratio:
\[ \text{SR}_{\text{annual}} = \frac{\mu_{\text{annual}}}{\sigma_{\text{annual}}} = \frac{252 \mu_{\text{daily}}}{\sqrt{252} \sigma_{\text{daily}}} = \sqrt{252} \cdot \text{SR}_{\text{daily}}. \]
Sharpe scales with \(\sqrt{T}\) also — a strategy with a daily Sharpe of 0.06 has an annualized Sharpe of about \(0.06 \cdot \sqrt{252} \approx 0.95\).
Some practitioner numbers to memorize
| Asset | Approx. daily \(\sigma\) | Approx. annual \(\sigma\) |
|---|---|---|
| Short-term Treasuries (SHY) | 0.05% | 0.8% |
| Investment-grade credit (LQD) | 0.3% | 5% |
| 20-yr Treasuries (TLT) | 0.8% | 13% |
| S&P 500 (SPY) | 1.0% | 16% |
| NASDAQ-100 (QQQ) | 1.3% | 20% |
| Bitcoin | 3.5% | 56% |
These are order-of-magnitude figures over multi-year windows; the realized vol any given year can deviate substantially. The point is to develop a sense of scale — an equity allocation with 5% annualized vol is suspect (probably hedged or stale-priced); a fixed-income strategy with 30% annualized vol is taking equity-like risk.
The \(\sqrt{T}\) rule assumes daily returns are i.i.d. In reality they exhibit volatility clustering (vol today predicts vol tomorrow) and modest return autocorrelation (especially negative at the daily horizon for individual stocks, and positive at the monthly horizon for momentum). The \(\sqrt{T}\) approximation is good enough for back-of-envelope work; for production risk models, GARCH and realized-volatility models do better.
Worked example: SPY end-to-end
Where you’ll see this: SPY (the ETF that tracks the S&P 500, the most-traded fund on earth) is the standard “first analysis” object in quant finance. Pasted-together versions of the script below run on every desk every morning. If you can build this end-to-end from scratch, you have the core of a quant intern’s daily workflow.
We close the chapter with a full end-to-end example that exercises every concept introduced above: load SPY daily prices, build a daily return series, resample to monthly, plot the equity curve, and compute rolling volatility — all in a single self-contained script.
import yfinance as yf
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# 1. Download adjusted-close SPY prices.
spy = yf.Ticker("SPY").history(start="2010-01-01")["Close"]
spy.name = "SPY"
# 2. Daily simple returns.
daily_ret = spy.pct_change().dropna()
# 3. Monthly returns (compound the daily returns within each month).
monthly_ret = (1 + daily_ret).resample("BM").prod() - 1
# 4. Equity curves from monthly and daily returns.
eq_daily = (1 + daily_ret).cumprod()
eq_monthly = (1 + monthly_ret).cumprod()
# 5. Rolling 22-day annualized volatility.
vol_22d = daily_ret.rolling(22).std() * np.sqrt(252)
# 6. Summary statistics.
mu_d, sd_d = daily_ret.mean(), daily_ret.std()
SR_ann = (mu_d * 252) / (sd_d * np.sqrt(252))
print(f"Daily mean return: {mu_d:.5f}")
print(f"Daily std (vol): {sd_d:.5f}")
print(f"Annualized mean: {mu_d*252:.4f}")
print(f"Annualized vol: {sd_d*np.sqrt(252):.4f}")
print(f"Annualized Sharpe: {SR_ann:.3f}")
fig, axes = plt.subplots(2, 1, figsize=(9, 6), sharex=True)
eq_daily.plot(ax=axes[0], color="#1a4d80", linewidth=1.0, label="Daily-compounded")
eq_monthly.plot(ax=axes[0], color="#c43d3d", linewidth=1.3, label="Monthly-compounded")
axes[0].set_title("SPY equity curve, starting at $1")
axes[0].legend(); axes[0].grid(True, alpha=0.3)
vol_22d.plot(ax=axes[1], color="#7a3f9e", linewidth=0.9)
axes[1].set_title("SPY rolling 22-day annualized volatility")
axes[1].set_ylabel("Annual vol")
axes[1].grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
A live version of the same pipeline, with a synthetic SPY stand-in so it runs offline:
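A sketch of how such a stand-in could be generated; the drift, baseline volatility, and the two crisis windows below are invented for illustration, not estimated from SPY:

import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
dates = pd.bdate_range("2010-01-01", "2024-12-31")

# Baseline daily vol of about 1%, with two invented high-vol "crisis" windows.
sigma = pd.Series(0.010, index=dates)
sigma.loc["2015-08":"2015-10"] = 0.030
sigma.loc["2020-02":"2020-04"] = 0.035

fake_ret = pd.Series(0.0004 + sigma.to_numpy() * rng.standard_normal(len(dates)),
                     index=dates, name="fake_SPY")

# Feed the synthetic returns through the same pipeline as the real script above.
eq = (1 + fake_ret).cumprod()
vol_22d = fake_ret.rolling(22).std() * np.sqrt(252)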
A few features to read off the output.
The two equity curves — daily-compounded and monthly-compounded — line up almost exactly, because compounding is associative: aggregating daily returns into monthly first and then compounding gives the same wealth as compounding daily throughout. The monthly curve is just a downsampled version of the daily one.
The volatility series shows the two stress periods clearly as spikes. In the calm middle years annualized vol sits around 12–16%, in line with the long-run SPY number. During the engineered crises it briefly exceeds 40%. The shape of this curve — long quiet stretches punctuated by sharp peaks — is universal across markets.
At any equity-strategy hedge fund, the script above is the first thing a researcher runs on any new ticker — usually as a Jupyter notebook cell (Jupyter is an interactive Python environment, the standard tool for exploratory data work). The equity curve plus rolling vol plot is the standard “first card” in a strategy review deck. If a researcher cannot reproduce these numbers from scratch, they cannot do the rest of the work. Drill these patterns until they are reflex.
Summary
The big ideas of this chapter, condensed:
- A market price series is a time-indexed pandas Series. The index is the ground truth; everything else is computed from it.
- Use adjusted close prices (split- and dividend-adjusted) for return calculations. Always check whether the data is adjusted before computing anything.
- Simple return \(R_t = P_t/P_{t-1} - 1\), in pandas: price.pct_change(). Use for portfolios and reporting.
- Log return \(\ell_t = \ln(P_t/P_{t-1})\), in pandas: np.log(price/price.shift()). Use for time aggregation and statistical modelling.
- Cumulative gross return: (1+r).cumprod() or np.exp(lr.cumsum()). This is the equity curve.
- Resampling: .resample("BM").last() on prices, (1+r).resample("BM").prod() - 1 on returns.
- Rolling stats: .rolling(n).mean(), .rolling(n).std(). Standard windows are 22 (month), 63 (quarter), 252 (year).
- Annualize volatility with \(\sigma_{\text{annual}} = \sqrt{252} \, \sigma_{\text{daily}}\). Mean returns scale by \(252\); Sharpe scales by \(\sqrt{252}\).
The next chapter extends this machinery to portfolios — weighted combinations of return series — and to the matrix language of covariance, correlation, and diversification.
Exercises
Download AAPL daily history for the last 5,000 trading days with yfinance. Then:
- Print the dtype of the index and confirm it is a DatetimeIndex.
- Print the first and last dates, and the number of rows.
- Plot the Close column. On the same axes, overlay a 252-day moving average.
- Identify the day of the 4-for-1 split on 31 August 2020. What value does the Stock Splits column take on that date? What does the Close column do across that date — does it jump or is it adjusted smooth?
Using the AAPL series from Exercise 1, construct two columns: simpleR = Close.pct_change() and logR = np.log(Close).diff(). Then:
- Plot both as histograms on the same figure. How different do they look?
- Compute and report simpleR.mean(), logR.mean(), simpleR.std(), logR.std() over the full sample.
- Verify numerically that \(\ell_t = \ln(1 + R_t)\) for the most recent 5 observations. Compute the maximum absolute discrepancy across the entire series.
Continuing with AAPL:
- Build eq1 = (1 + simpleR).cumprod().
- Build eq2 = np.exp(logR.cumsum()).
- Plot them on the same axes. They should be visually identical.
- Compute (eq1 - eq2).abs().max() to verify the equivalence numerically. What order of magnitude is the discrepancy, and where does it come from?
- Report the final wealth from $1 over the full sample, and the implied CAGR over the period.
From the AAPL daily return series:
- Compute monthly returns two ways: (a) resample the price with .resample("BM").last() and take pct_change(); (b) resample the return with (1 + ret).resample("BM").prod() - 1. Confirm they agree.
- Report the monthly mean and standard deviation of returns. Compare to the daily mean and standard deviation scaled by 21 and \(\sqrt{21}\) respectively (since there are ~21 trading days per month). How close is the empirical scaling to the i.i.d. prediction?
- Plot monthly returns as a bar chart. Highlight in red any month in which the absolute return exceeds 10%. How many such months are there?
Suppose you manage $1M and target a daily portfolio standard deviation of $10,000 (1% of capital). Using AAPL:
- Compute a 22-day rolling standard deviation of daily returns.
- For each date, compute the dollar position size that would have hit the 1% vol target.
- Plot the time series of position sizes (in $). On which dates was the position smallest (i.e. when was AAPL most volatile)?
- What is the average position size over the sample? How does it compare to the $1M cap?
- Bonus: clip the position size at $1M (you cannot lever beyond your capital in this exercise). What fraction of dates is the position constrained at the cap?
Before computing returns on any new dataset, work through this checklist:
- Is the index a DatetimeIndex? print(type(df.index)).
- Is the index sorted ascending? df.index.is_monotonic_increasing.
- Are there missing dates inside the range? pd.date_range(df.index.min(), df.index.max(), freq="B").difference(df.index).
- Are prices adjusted for splits and dividends? Spot-check a known split date.
- Are there NaN values in the price column? df["Close"].isna().sum().
A return series computed on a dataset that fails any of these checks is suspect — and finance has a long tradition of strategies that looked profitable until someone re-ran the analysis with a properly cleaned input.
Chapter Introduction
Where you’ll see this: the second half of this chapter is what every “fund report” you’ll ever read is really made of. When CNBC shows a graphic with bullet points “annual return: 12% — volatility: 18% — max drawdown: -25% — Sharpe: 0.6”, those four numbers are exactly what we now compute, by hand, in Python.
From Returns to Risk
The first half of this chapter stopped at the construction of returns. Returns are the raw object — a clean, dimensionless number per period — and they are nearly useless on their own. A stock that returned \(+0.4\%\) yesterday could be a quiet dividend payer or a leveraged technology name caught in a temporary pause. Two funds that delivered the same \(12\%\) annual return could have travelled there along radically different paths: one ground steadily upward, the other careened through a \(40\%\) drawdown and a near-bankruptcy event before recovering. The investor who finds out only at year-end what happened in between has been flying blind.
This chapter is about the diagnostics that make a return series legible. We will compute four numbers — volatility, drawdown, Sharpe ratio, correlation — that together transform a column of returns into a defensible risk profile (the standard summary of “how risky is this thing, in what ways?”). Each is one line of pandas code in real work. Each carries a non-trivial amount of statistical theory underneath. The combination is what every institutional report, every hedge-fund tear sheet (a one- or two-page fund performance summary), and every robo-advisor dashboard rests on.
Why These Four Statistics, in This Order
There is a deliberate logic to the sequence. Volatility is the simplest one-number summary of risk: the standard deviation of returns. It captures short-term fluctuation. It is symmetric in gains and losses, and it scales with the square root of the horizon — a fact we will see force itself onto every annualization choice you ever make. Drawdown captures something volatility cannot: persistent loss. A portfolio that drifts down 30% over six months and stays there has caused real pain to its holders, even if its daily standard deviation is unremarkable. Drawdown is the metric an end investor actually feels in their stomach. Sharpe ratio is the per-unit-of-risk return, the cleanest comparator across strategies of different scale. Correlation is the bridge from one asset to two — and from two assets to a portfolio. Correlation is what diversification trades in. Without it, the entire portfolio-construction industry has no language.
We close with two ideas that connect the dots: the volatility drag that makes geometric return strictly below arithmetic return for any risky asset, and a complete worked example — a 60/40 SPY/AGG portfolio analyzed end-to-end.
Returns describe outcomes. The four statistics in this chapter describe the path that produced those outcomes.
Two assets can have identical mean returns and tell completely different risk stories.
Volatility and the Square-Root-of-Time Rule
Why this matters: “this fund has 12% volatility” — what does that even mean, and how do you tell whether that’s risky? This section makes the number concrete and gives you a mental yardstick (e.g. SPY ≈ 16%, Bitcoin ≈ 55%) so you can immediately sanity-check any volatility claim you see.
Volatility is the typical size of a price wiggle. If a fund has 20% annualised volatility, then in a typical year its return will be roughly within ±20% of its mean. Higher volatility = wilder ride. The Sharpe ratio (later in the chapter) is essentially “return per unit of this wiggle”.
Definition
The starting point is simple. Given a series of period returns \(r_1, r_2, \ldots, r_T\) (just “returns on day 1, day 2, …, day T” — capital \(T\) is the total number of days in the sample), the sample volatility is the sample standard deviation:
\[ \hat{\sigma} = \sqrt{\frac{1}{T-1}\sum_{t=1}^{T}(r_t - \bar{r})^2}, \]
where \(\bar{r} = \frac{1}{T}\sum_{t=1}^{T} r_t\) is the sample mean return (the average daily return). The \(T-1\) in the denominator is the standard Bessel correction — a tiny statistical adjustment that fixes a bias when you estimate a standard deviation from a sample (you lost “one degree of freedom” by using the sample mean instead of the true mean). In pandas, this is what returns.std() returns by default — and the choice matters: NumPy’s np.std() defaults to \(T\) in the denominator instead. For any return series you will care about in this course, \(T\) is large enough that the difference is cosmetic, but you should know which library uses which convention before you reproduce someone else’s number.
Why Standard Deviation, Not Variance?
Variance has the wrong units. If returns are reported in decimal form, variance is in \(\text{decimal}^2\), which is uninterpretable. Standard deviation has the same units as the returns themselves — a daily volatility of \(0.012\) means “a typical daily move is roughly \(1.2\%\).” This makes it directly comparable to the return itself, and that comparability is exactly what powers the Sharpe ratio later in the chapter. Variance is what the math uses internally (it adds linearly when returns are independent); volatility is what the human reports.
The Square-Root-of-Time Scaling Rule
Almost every volatility number you will read in a research note, a fund factsheet, or a Bloomberg terminal is annualized. The raw computation, however, is almost always done at the daily frequency, because that is the highest-frequency clean data most investors have. The conversion from daily to annual uses a single, simple, frequently-misapplied rule:
\[ \sigma_{\text{annual}} = \sigma_{\text{daily}} \cdot \sqrt{252}. \]
The factor \(\sqrt{252}\) is not arbitrary. It comes from an assumption: if daily returns are independent and have the same variance each day, then variance of the \(T\)-day sum is \(T\) times the one-day variance, and standard deviation is \(\sqrt{T}\) times the one-day standard deviation. The U.S. equity market trades roughly \(252\) days per year (every business day minus holidays), so a one-year horizon corresponds to \(T \approx 252\) in this formula.
More generally, for any aggregation factor \(k\):
\[ \sigma_{k\text{-period}} = \sigma_{\text{1-period}} \cdot \sqrt{k}. \]
Monthly returns annualize by \(\sqrt{12}\). Weekly returns annualize by \(\sqrt{52}\). Daily returns annualize by \(\sqrt{252}\). Hourly returns over a \(24/7\) market like crypto annualize by \(\sqrt{24 \cdot 365}\).
The square-root rule assumes independence and identical distribution of returns across periods. Both fail in the real world:
- Autocorrelation. Returns exhibit small but non-zero autocorrelation at daily frequencies and substantial autocorrelation at longer horizons (momentum, mean reversion).
- Volatility clustering. Big moves cluster — a \(-3\%\) day is more likely to be followed by another large-magnitude day than the i.i.d. assumption allows. GARCH models exist precisely to model this departure.
In practice, \(\sqrt{T}\) scaling is still the industry default — it is wrong in the small, defensible in the large, and almost always the number you will be asked to compare to.
A Worked Numerical Example
Suppose a stock has a daily volatility of \(\hat{\sigma}_{\text{d}} = 0.015\), i.e. \(1.5\%\) per day. Then:
- \(\sigma_{\text{weekly}} = 0.015 \cdot \sqrt{5} \approx 0.0335\) (about \(3.35\%\) per week)
- \(\sigma_{\text{monthly}} = 0.015 \cdot \sqrt{21} \approx 0.0687\) (about \(6.87\%\) per month)
- \(\sigma_{\text{annual}} = 0.015 \cdot \sqrt{252} \approx 0.2381\) (about \(23.8\%\) per year)
A \(23.8\%\) annualized volatility is a fair description of a typical large-cap U.S. equity over the last twenty years — for reference, SPY itself has run between \(13\%\) and \(22\%\) annualized depending on the regime, with a long-run average near \(16\%\).
The blue bars are the volatilities measured directly on resampled return series at each horizon; the red bars are what the \(\sqrt{k}\) rule predicts from the daily number alone. They line up closely because the simulated returns are genuinely i.i.d. The “×” annotations above each pair are the volatility ratios relative to daily — \(\sqrt{5} \approx 2.2\), \(\sqrt{21} \approx 4.6\), \(\sqrt{252} \approx 15.9\) — exactly the multipliers practitioners memorise. In real data the agreement is approximate (volatility clustering distorts it), but the qualitative shape is universal: variance adds with time, so standard deviation grows with \(\sqrt{T}\).
Volatility in Code
The pattern is so common in practice that you should memorize the line:
`ann_vol = ret.std() * np.sqrt(252)`

It is the one-liner that converts a raw return series into the number every risk report displays.
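A slightly fuller sketch makes the \(\sqrt{k}\) scaling concrete end-to-end: simulate i.i.d. daily returns, measure volatility at several horizons by resampling, and compare against the \(\sqrt{k}\) prediction. The seed, sample length, and drift/vol parameters below are arbitrary illustration choices:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
dates = pd.bdate_range("2000-01-03", periods=252 * 25)                      # 25 years of business days
daily = pd.Series(rng.normal(0.0004, 0.01, size=len(dates)), index=dates)   # i.i.d. daily returns

daily_vol = daily.std()
# "ME"/"YE" are month-end / year-end aliases (pandas >= 2.2; older versions use "M"/"A")
for label, rule, k in [("weekly", "W", 5), ("monthly", "ME", 21), ("annual", "YE", 252)]:
    # Compound daily returns within each period, then measure that period's volatility directly
    measured = ((1 + daily).resample(rule).prod() - 1).std()
    predicted = daily_vol * np.sqrt(k)
    print(f"{label:8s} measured {measured:.4f}   sqrt(k) predicts {predicted:.4f}")
```

On genuinely i.i.d. data the two columns agree closely; on real returns, volatility clustering makes the agreement only approximate.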
Pitfalls When Annualizing
Three mistakes recur even among experienced practitioners.
Pitfall 1 — Wrong calendar factor. The factor \(252\) is for U.S. equities. International markets vary (\(245\) for the UK, \(246\) for Japan, \(250\) for the Eurozone in some conventions). FX trades around the clock and conventions differ: some shops use \(252\), some use \(260\). Crypto is genuinely \(365\). Mixing conventions across asset classes will produce volatilities that are off by several percent.
Pitfall 2 — Annualizing a number that is not a daily return. If you accidentally annualize a monthly return series with \(\sqrt{252}\), you will multiply by approximately \(15.87\) instead of \(\sqrt{12} \approx 3.46\). The resulting “annualized volatility” of \(300\%\) on a normal equity portfolio is a giveaway, but more subtle mistakes — using a \(\sqrt{252}\) factor on weekly data, for instance — produce numbers that look superficially plausible.
Pitfall 3 — Annualizing volatility but not the mean, or vice versa. Means scale linearly with horizon, volatilities scale with \(\sqrt{T}\). Annualizing mean return with a factor of \(252\) and volatility with a factor of \(\sqrt{252}\) is correct. Using \(252\) for both, or \(\sqrt{252}\) for both, is a common bug in homework solutions and student code.
The annualized Sharpe ratio for a well-diversified equity portfolio sits roughly between \(0.3\) and \(0.7\) over long horizons. If your annualized mean is \(25\%\) and your annualized volatility is \(5\%\), giving a Sharpe of \(5.0\), you have almost certainly mis-annualized one of the two numbers.
Drawdowns and the Running Maximum
Where you’ll see this: when a fund manager hides behind “but my volatility is only 10%!” while their investors are panicking — that’s the gap drawdown reveals. A YouTube finance bro proudly showing his portfolio is at an all-time high tells you nothing about the 40% drop he might have lived through last year. Drawdowns are how you measure that pain.
A drawdown is how far below your previous best you currently are. If your portfolio peaked at $100, then fell to $70, your drawdown right now is -30%. The maximum drawdown over your whole history is the worst pain you ever endured — the question “how bad was the worst slump?” reduced to a single number.
Volatility tells you the average magnitude of daily wiggles. It says nothing about the worst experience an investor has actually lived through. For that, we need the drawdown — the percentage decline from a portfolio’s running peak (its previous best level).
Definition
Let \(W_t\) denote the cumulative wealth at time \(t\) — that is, the value of one dollar invested at time \(0\) and grown by the realized returns through time \(t\):
\[ W_t = \prod_{s=1}^{t}(1 + r_s). \]
The running maximum is the largest value of wealth attained up to and including time \(t\):
\[ M_t = \max_{1 \le s \le t} W_s. \]
The drawdown at time \(t\) is the percentage shortfall from that running peak:
\[ D_t = \frac{W_t - M_t}{M_t} = \frac{W_t}{M_t} - 1. \]
By construction \(D_t \le 0\) for all \(t\): drawdown is either zero (the portfolio is at a new all-time high) or negative (the portfolio is below its previous best). The maximum drawdown is the worst of these values across the sample:
\[ \text{MDD} = \min_{1 \le t \le T} D_t. \]
A maximum drawdown of \(-0.45\) means: “at the worst point of the sample, the portfolio had lost \(45\%\) of its peak value.” For context: SPY’s drawdown from October 2007 to March 2009 (the global financial crisis) was about \(-55\%\); QQQ’s drawdown from March 2000 to October 2002 (the dot-com bust) was about \(-83\%\) — meaning a tech investor who bought at the peak had to wait fifteen years to break even.
Why Drawdown Is the Risk Metric Investors Actually Feel
Volatility is a property of the return distribution. Drawdown is a property of the price path. Two return distributions with identical means and variances can produce wildly different drawdown experiences if the timing of the negative returns differs. A long run of small losses concentrated together produces a deep drawdown; the same losses scattered randomly across the sample produce a much milder one.
End investors — especially retail investors and pension fund trustees — almost always react to drawdown, not volatility. When the news headline reads “Portfolio down 30% from peak,” redemptions follow. When it reads “Portfolio volatility 18% annualized,” nothing happens. This asymmetry is what makes drawdown the single most important number on a fund tear sheet, even though academic finance has spent fifty years developing more sophisticated risk measures.
Computing Drawdowns in Pandas
The cummax() method on a pandas Series returns the running maximum (cummax = “cumulative maximum” — for each row \(t\), it gives you the largest value seen so far, i.e. from row 1 through \(t\)). This is exactly the \(M_t\) we need. The full computation is three lines:
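A minimal sketch, assuming `ret` is a pandas Series of daily returns:

```python
# W_t, M_t, D_t in three lines
wealth = (1 + ret).cumprod()            # W_t: growth of $1 invested at time 0
running_max = wealth.cummax()           # M_t: best level seen so far
drawdown = wealth / running_max - 1     # D_t: shortfall from the peak, <= 0

print("Current drawdown:", round(drawdown.iloc[-1], 4))     # where you ended
print("Max drawdown:    ", round(drawdown.min(), 4))        # the worst slump
print("Trough date:     ", drawdown.idxmin())               # when it happened
print("At new high?     ", drawdown.iloc[-1] == 0)          # have you recovered?
```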
What this gave us: four numbers that together describe the painful side of the track record — where you ended, how deep the worst slump was, when it happened, and whether you’ve recovered. .iloc[-1] is pandas slang for “the last row” (negative indices count from the end, just like in plain Python lists).
Anatomy of a drawdown calculation
A drawdown is built in three stages, each producing its own Series. The diagram below stacks the three Series vertically — equity curve on top, running max in the middle, drawdown on the bottom — and labels what each step does to the row above it.
Read the diagram top-to-bottom. The green dots on the equity curve mark dates where \(W_t\) ties its own running peak — those are the only dates where the drawdown is exactly zero. The middle red step function is the non-decreasing envelope: it only ever ratchets up, never down. The bottom red filled region is the gap between the two — exactly the drawdown — and its lowest point is the maximum drawdown. Three Series, two lines of pandas, one risk number.
The drawdown series is also useful in its own right, not just its minimum. Plotting \(D_t\) over time produces an “underwater chart” — a visualization beloved by hedge-fund allocators because it shows simultaneously how deep and how long the drawdown was.
Visualizing the Underwater Chart
The underwater chart visualizes two distinct dimensions of pain: the depth (how far below water did we go?) and the duration (how long did it take to climb back to a new high?). A shallow but multi-year drawdown is a different animal from a deep but quickly-recovered one, and they require different conversations with investors. The duration of the longest drawdown — sometimes called the time underwater — is a separate metric that institutional allocators frequently compute alongside the max drawdown.
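A sketch of both dimensions, continuing from the `drawdown` Series computed above (the plotting style and the run-length trick are illustrative choices):

```python
import matplotlib.pyplot as plt

# Depth: shade the drawdown series below zero
fig, ax = plt.subplots(figsize=(10, 3))
ax.fill_between(drawdown.index, drawdown, 0, color="firebrick", alpha=0.4)
ax.set_ylabel("Drawdown from peak")
ax.set_title("Underwater chart")

# Duration: the longest run of consecutive days spent below the prior peak
underwater = drawdown < 0
run_length = underwater.groupby((~underwater).cumsum()).cumsum()
print("Longest time underwater:", int(run_length.max()), "days")
```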
Maximum drawdown is a minimum-order statistic over the sample. It is therefore extremely sensitive to the sample endpoints. Reporting “max drawdown of \(-12\%\)” on a five-year sample that happens to omit 2008 and 2020 is technically true and economically misleading. When you read a tear sheet, the first thing to check is the sample period.
The Sharpe Ratio
Where you’ll see this: every fund pitch deck contains a line like “we achieved a Sharpe ratio of 2.5!” — and most of those claims are either over-fitted, computed on too-short a sample, or use leverage that the volatility hides. By the end of this section you’ll know exactly how to compute it yourself, what numbers are plausible, and what counts as a red flag.
The Sharpe ratio is “reward divided by risk”. If two funds both made 10% last year, but Fund A’s volatility was 5% while Fund B’s was 30%, Fund A is the better fund — it earned the same reward with less stomach-churning. Sharpe makes that comparison numerical: higher Sharpe = more return per unit of risk.
From Two Numbers to One
So far we have produced two numbers from a return series: a mean \(\bar{r}\) (average return) and a volatility \(\hat{\sigma}\) (the wiggle size). Neither alone tells you whether the investment was worth it. A \(20\%\) return is excellent if it came with \(10\%\) volatility; it is mediocre at best if it came with \(40\%\) volatility, because at that risk level you could have built a leveraged Treasury position — a borrowed-money bet on safe US government bonds — that produced the same expected return with cleaner downside behavior.
William Sharpe’s 1966 paper proposed the cleanest one-number summary of this trade-off: the Sharpe ratio, defined as the excess return per unit of volatility. In the formula below, \(\bar{r}\) is your investment’s average return, \(r_f\) is what you could have earned doing nothing risky (the “risk-free rate” — see below), and \(\hat{\sigma}\) is the volatility we just defined:
\[ \text{SR} = \frac{\bar{r} - r_f}{\hat{\sigma}}. \]
Here \(r_f\) is the risk-free rate — the return on essentially safe assets like short-term US government bills, your “do nothing risky” benchmark — measured over the same horizon as the returns. The numerator measures how much you earned over and above that safe alternative. The denominator scales by how much fluctuation you had to live through to earn it. The ratio is a slope: the steeper the slope, the more reward per unit of risk.
For decades, Sharpe ratio has been the single most cited performance metric in finance. It appears in every fund factsheet, every consultant evaluation, every robo-advisor dashboard. Sharpe himself shared the 1990 Nobel Memorial Prize in part for this insight.
Annualizing the Sharpe Ratio
Because mean returns scale linearly with horizon and volatilities scale with \(\sqrt{T}\), the Sharpe ratio scales with \(\sqrt{T}\):
\[ \text{SR}_{\text{annual}} = \text{SR}_{\text{daily}} \cdot \sqrt{252}. \]
The derivation is a one-line calculation. If daily excess return has mean \(\mu_d\) and standard deviation \(\sigma_d\), then annual excess return has mean \(252 \mu_d\) and standard deviation \(\sigma_d \sqrt{252}\), so
\[ \text{SR}_{\text{annual}} = \frac{252 \mu_d}{\sigma_d \sqrt{252}} = \frac{\mu_d}{\sigma_d} \cdot \sqrt{252} = \text{SR}_{\text{daily}} \cdot \sqrt{252}. \]
This is why a daily Sharpe ratio of \(0.05\) is actually excellent (\(0.05 \cdot \sqrt{252} \approx 0.79\) annualized), and a daily Sharpe of \(0.01\) is mediocre (\(\approx 0.16\) annualized). The raw daily number looks tiny because \(\sqrt{252}\) is large.
Benchmarks for Sharpe Ratios
Calibration is everything. The following table summarizes roughly what to expect across asset classes and strategies, based on long-run U.S. data:
| Strategy | Typical annualized Sharpe |
|---|---|
| Cash / T-bills (the risk-free leg) | \(0.00\) by construction |
| U.S. equities (SPY, long-only) | \(0.4\) – \(0.5\) |
| U.S. aggregate bonds (AGG) | \(0.3\) – \(0.5\) |
| 60/40 balanced portfolio | \(0.5\) – \(0.7\) |
| Diversified hedge-fund composite | \(0.5\) – \(0.8\) |
| Top-decile quant equity market-neutral | \(1.0\) – \(1.5\) |
| High-frequency market-making (private) | \(3.0\) – \(10.0\)+ |
A claim of annualized Sharpe above \(2\) on a long-only equity strategy (one that only buys stocks, no shorting, no derivatives) should be treated with deep skepticism — either the sample is too short, the strategy is using leverage (borrowed money) that the reported volatility ignores, or there is a methodological error somewhere. Sharpe ratios above \(1\) are rare and almost always come from short-horizon, high-turnover strategies — i.e. ones that trade many times a day and can only manage a small amount of money before their edge disappears.
Computing the Sharpe Ratio
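A minimal sketch, assuming `ret` is a pandas Series of daily returns and a constant 3% annual risk-free rate (the rate is an assumption for illustration):

```python
import numpy as np

rf_daily = 0.03 / 252                   # constant annual risk-free rate, spread over 252 days
excess = ret - rf_daily                 # excess returns

mean_daily = ret.mean()
vol_daily = ret.std()
sharpe_daily = excess.mean() / excess.std()
sharpe_annual = sharpe_daily * np.sqrt(252)

print(f"mean (daily):    {mean_daily:.5f}")
print(f"vol (daily):     {vol_daily:.5f}")
print(f"Sharpe (daily):  {sharpe_daily:.3f}")
print(f"Sharpe (annual): {sharpe_annual:.3f}")
```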
What this gave us: the four numbers you’d put on a one-page strategy summary — average return, volatility, daily Sharpe (which always looks tiny), and the annualised Sharpe that everyone actually quotes.
A few real-world details that matter:
- Use excess returns, not raw returns, when computing the standard deviation in the denominator. (Excess return = your return minus the risk-free rate.) In practice, the difference is tiny at daily frequencies because \(r_f\) is small and nearly constant, but the correct definition uses the std of the excess series.
- Be honest about \(r_f\). A 5-year sample spanning 2020–2024 saw the U.S. risk-free rate vary from near \(0\%\) to over \(5\%\). Using a single average is fine for a rough number; using a time-varying \(r_f\) from the FRED 3-month T-bill series (FRED is the free macro database run by the St. Louis Fed; T-bill = short-term Treasury bill, the canonical “safe asset”) is what a real performance-attribution system does.
- The Sharpe ratio is itself a statistic with sampling error — meaning that if you re-ran history with different luck, you’d get a different Sharpe even from the same strategy. The standard error (a measure of how much that “luck wobble” affects the number) of an annualized Sharpe ratio computed from \(T\) daily observations is approximately \(\sqrt{(1 + 0.5 \cdot \text{SR}_{\text{daily}}^2)/T} \cdot \sqrt{252}\), where \(\text{SR}_{\text{daily}}\) is the Sharpe at the daily frequency. For a five-year sample (\(T = 1260\)) and a true annualized Sharpe of \(0.5\), that standard error is around \(0.45\) (see the sketch below). Two strategies with reported Sharpes of \(0.6\) and \(0.9\) are not reliably distinguishable on a five-year sample. Most casual comparisons of Sharpe ratios ignore this.
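A sketch of that standard-error formula under the i.i.d. approximation (the function name is illustrative):

```python
import numpy as np

def sharpe_standard_error(sr_annual: float, n_days: int) -> float:
    """Approximate standard error of an annualized Sharpe estimated from n_days daily returns."""
    sr_daily = sr_annual / np.sqrt(252)
    se_daily = np.sqrt((1 + 0.5 * sr_daily**2) / n_days)   # i.i.d. approximation
    return se_daily * np.sqrt(252)

print(round(sharpe_standard_error(0.5, 252 * 5), 2))        # ~0.45 on a five-year sample
```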
Why the Sharpe Ratio Alone Is Not Enough
The Sharpe ratio compresses a return distribution into two moments — mean and volatility (in statistics, “moments” are summary numbers like mean, variance, skewness, kurtosis that describe the shape of a distribution) — and discards everything else. This is a feature when those two moments are sufficient (i.e. when returns are approximately normally distributed, the classic bell-curve shape). It is a bug when they are not, which is essentially always in finance.
Tails. Real return distributions have fatter tails than the normal distribution — meaning extreme events (big crashes, big rallies) happen much more often than a bell curve would predict. A strategy that earns small positive returns most days and occasionally suffers a catastrophic loss — selling out-of-the-money options (insurance-like contracts that pay zero most of the time but can lose huge amounts in a crash) is the canonical example — can have a beautiful Sharpe ratio for years and blow up in a single afternoon. The 1998 collapse of Long-Term Capital Management, a hedge fund run by Nobel laureates that lost $4.6 billion in months, followed exactly this profile. Volatility does not see tail risk, so Sharpe does not see tail risk.
Skewness. Two strategies with the same mean and same volatility can have very different skewness — the lopsidedness of the return distribution. Insurance-like strategies have negative skew (many small wins, occasional large losses — think of selling earthquake insurance: you collect premiums until the big one hits). Lottery-like strategies have positive skew (many small losses, occasional large wins). Investors strongly prefer positive skew, all else equal, but Sharpe is blind to skew.
Drawdown blindness. Sharpe says nothing about how losses are clustered in time. A strategy with steady drip-drip losses concentrated in a six-month window has the same Sharpe as one whose losses are scattered uniformly through the sample — but the first one terrifies investors. This is precisely the case we examined in the previous section.
This is why the modern performance report displays Sharpe alongside drawdown, alongside skewness and kurtosis (kurtosis = “how fat are the tails of the distribution” — higher kurtosis means more outliers than a bell curve), and increasingly alongside a tail-risk measure such as conditional value-at-risk (CVaR — the average loss on your worst-5% days, a more honest summary of crash risk than vol alone). Sharpe is the entry ticket. It is not the full performance picture.
The six portfolios are simulated tracks calibrated to occupy different corners of the risk plane. The “High-Sharpe trap” is the cautionary tale: its day-to-day volatility is low, so the Sharpe ratio is flattering, yet a single concentrated stress event pulls its drawdown deep into the red. A practitioner who screens funds on Sharpe alone would have ranked it near the top; an allocator who also checked the drawdown column would have rejected it on sight. Reading risk in two dimensions — Sharpe and drawdown — is the minimum defensible standard.
A high Sharpe ratio is necessary but not sufficient evidence of a good strategy. Always pair Sharpe with maximum drawdown and a glance at the return distribution shape (skewness, kurtosis, worst-day return) before drawing conclusions.
Cross-Asset Correlation
Why this matters: “diversification reduces risk” is the most repeated cliché in personal finance. But how much it reduces risk depends entirely on a single number: the correlation between the things you bought. If you own AAPL and MSFT, you’re not very diversified — they march together. Correlation tells you, precisely, what counts as diversified.
Correlation is a number between −1 and +1 that measures how synchronously two assets move. +1 means “they march in lockstep”. 0 means “they have nothing to do with each other”. −1 means “when one goes up, the other goes down by a proportional amount”. Diversification benefits come from correlations below +1.
From One Asset to Two
Every statistic so far has been a property of a single return series. Once we hold two assets, a new question becomes the dominant one: how do they move together? Correlation is the answer.
Recall from Chapter 3 (and from any prior statistics course) that the Pearson correlation between two return series \(\{r^{(1)}_t\}\) and \(\{r^{(2)}_t\}\) is:
\[ \rho_{1,2} = \frac{\text{Cov}(r^{(1)}, r^{(2)})}{\sigma_1 \, \sigma_2} = \frac{\sum_{t=1}^T (r^{(1)}_t - \bar{r}^{(1)})(r^{(2)}_t - \bar{r}^{(2)})}{\sqrt{\sum_{t=1}^T (r^{(1)}_t - \bar{r}^{(1)})^2 \cdot \sum_{t=1}^T (r^{(2)}_t - \bar{r}^{(2)})^2}}. \]
The metric lives in \([-1, +1]\). A value near \(+1\) says the two assets march in lock-step; near \(-1\) says they move oppositely; near \(0\) says they are uncorrelated, at least in the linear sense. The pairwise correlations across \(N\) assets form an \(N \times N\) symmetric matrix with ones on the diagonal — the correlation matrix, which is the input every portfolio optimizer ever written consumes.
Typical Cross-Asset Correlations
Some long-run correlations from U.S. data are worth committing to memory because they organize how you think about diversification:
- SPY and large-cap U.S. stocks (e.g. AAPL, MSFT): \(\rho \approx 0.6\) to \(0.8\). Large-caps inherit most of their daily variation from the market.
- SPY and small-cap U.S. stocks (IWM): \(\rho \approx 0.85\) to \(0.95\) — closer than people expect.
- SPY and developed international equities (EFA): \(\rho \approx 0.7\) to \(0.9\).
- SPY and U.S. aggregate bonds (AGG): \(\rho \approx -0.1\) to \(+0.3\), highly regime-dependent. The post-2000 stock-bond correlation was generally negative; the post-2022 regime flipped it back to positive.
- SPY and gold: \(\rho \approx -0.1\) to \(+0.2\), weakly negative on average.
- SPY and Bitcoin: \(\rho \approx 0.2\) to \(0.5\) since 2020, despite the marketing.
- Two random S&P 500 stocks in the same sector: \(\rho \approx 0.5\) to \(0.7\).
- Two random S&P 500 stocks in different sectors: \(\rho \approx 0.3\) to \(0.5\).
The stock-bond correlation deserves its own warning: it is not a constant. Investors who built 60/40 portfolios in 1995–2020 on the assumption of \(\rho \approx -0.3\) discovered in 2022 that the correlation can flip to \(+0.5\) in a regime shift, and the portfolio’s diversification benefit largely evaporates in that regime.
Correlation Matrix in Pandas
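A minimal sketch, assuming `ret` is a DataFrame of daily returns with one column per asset:

```python
corr = ret.corr()            # N x N Pearson correlation matrix, ones on the diagonal
print(corr.round(2))

# At larger scale a heatmap is easier to read, e.g.:
# import seaborn as sns; sns.heatmap(corr, vmin=-1, vmax=1, cmap="RdBu_r", annot=True)
```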
The diagonal is identically one. The off-diagonal entries are the pairwise correlations. For visual analysis at larger scale, you would pass the matrix to a heatmap (e.g. seaborn.heatmap); the matrix itself is the object every risk model consumes.
Rolling Correlation
A single correlation number computed over the entire sample masks an important fact: correlations move. A rolling correlation computed over a 60-day window reveals when two assets are coupling and when they are decoupling. This is the diagnostic that flagged the 2022 stock-bond correlation flip months before consensus caught up.
The window length is a modeling choice: 20 days is reactive but noisy, 252 days (one year) is stable but slow to detect regime change. Sixty days is a common compromise. In any case, plotting the rolling correlation is usually more informative than reporting the full-sample number.
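A sketch of the rolling computation, assuming a return DataFrame `ret` with SPY and AGG columns as in the worked example later in the chapter:

```python
# 60-day rolling Pearson correlation between the two return series
roll_corr = ret["SPY"].rolling(60).corr(ret["AGG"])
roll_corr.plot(title="60-day rolling SPY/AGG correlation")   # requires matplotlib
```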
A correlation of zero does not mean two assets are independent. It means there is no linear association. Two assets can have \(\rho = 0\) and still co-move strongly via a quadratic relationship (e.g. both move sharply when a third variable moves, regardless of direction). In finance, the most important non-linear effect is the tail correlation jump: pairs of assets that look uncorrelated in normal markets often become highly correlated during crises. The 2008 financial crisis is the canonical case — virtually every risky asset class became highly correlated as liquidity dried up.
Diversification: The Two-Asset Portfolio Variance
Why this matters: when a robo-advisor builds a “balanced portfolio” for you, this formula is doing the work behind the scenes. It is also the single insight Harry Markowitz won a Nobel Prize for in 1990 — so it’s worth understanding rather than just trusting the app.
Suppose you hold two assets. The portfolio’s return is just a weighted average of the two returns (linear, intuitive). But the portfolio’s risk is not — it has an extra term that depends on how the two assets move together. When they’re less than perfectly correlated, that term makes the combined risk smaller than the average of the two individual risks. That gap is the “free lunch” of diversification.
Why Correlation Pays the Bills
Now the punchline. Correlation is not just a descriptive statistic — it directly determines how much risk reduction you get from holding more than one asset. This is the mathematical core of diversification, and it was the insight for which Harry Markowitz shared the 1990 Nobel Prize.
Consider a portfolio of two assets with weights \(w_1\) and \(w_2 = 1 - w_1\). A weight is just the fraction of your money in each asset — if you put 60% in stocks and 40% in bonds, then \(w_1 = 0.6\) and \(w_2 = 0.4\). They have to add up to 1 because you can’t allocate more than 100% of what you have (without borrowing). Let the assets have expected returns \(\mu_1, \mu_2\) (Greek letter “mu” — the standard symbol for “mean”), volatilities \(\sigma_1, \sigma_2\), and correlation \(\rho\) (Greek letter “rho”). Portfolio return is the weighted average:
\[ r_p = w_1 r_1 + w_2 r_2, \qquad \mathbb{E}[r_p] = w_1 \mu_1 + w_2 \mu_2. \]
Portfolio variance, by contrast, is not a weighted average. It includes a cross-term:
\[ \sigma_p^2 = w_1^2 \sigma_1^2 + w_2^2 \sigma_2^2 + 2 w_1 w_2 \rho \sigma_1 \sigma_2. \]
The volatility is the square root:
\[ \sigma_p = \sqrt{w_1^2 \sigma_1^2 + w_2^2 \sigma_2^2 + 2 w_1 w_2 \rho \sigma_1 \sigma_2}. \]
The entire diversification story is in the third term, \(2 w_1 w_2 \rho \sigma_1 \sigma_2\).
Three Special Cases
Case 1: \(\rho = +1\) (perfect positive correlation). The variance formula collapses to \(\sigma_p^2 = (w_1 \sigma_1 + w_2 \sigma_2)^2\), so \(\sigma_p = w_1 \sigma_1 + w_2 \sigma_2\). Volatility is a weighted average; there is no diversification benefit at all. The two assets are economically the same risk in different packaging.
Case 2: \(\rho = 0\) (uncorrelated). The cross-term vanishes. Variance is the weighted sum of variances: \(\sigma_p^2 = w_1^2 \sigma_1^2 + w_2^2 \sigma_2^2\). For an equal-weighted portfolio with \(\sigma_1 = \sigma_2 = \sigma\), this gives \(\sigma_p = \sigma / \sqrt{2}\) — volatility drops by roughly \(30\%\) just from holding two equal-volatility uncorrelated assets instead of one.
Case 3: \(\rho = -1\) (perfect negative correlation). With the right weights, \(\sigma_p\) can be driven all the way to zero. Specifically, setting \(w_1 = \sigma_2 / (\sigma_1 + \sigma_2)\) produces \(\sigma_p = 0\) — a risk-free portfolio out of two risky assets. This is the theoretical limit. In practice, no two real assets have \(\rho = -1\), but pairs with \(\rho\) near \(-0.5\) exist (long-duration Treasuries vs. equities in some regimes) and produce meaningful risk reduction.
Why Less-Than-Perfect Correlation Is Free Money
This is the punchline of modern portfolio theory: as long as \(\rho < 1\), combining two risky assets always produces a portfolio whose volatility is strictly less than the weighted average of the two individual volatilities. The reduction is proportional to \((1 - \rho)\). Diversification is a free lunch in volatility terms — the only free lunch most academics will admit exists in finance.
The intuition: when one asset zigs and the other zags, their fluctuations partially cancel, leaving a smoother combined path. The deeper the cancellation (more negative \(\rho\)), the smoother the path. Even small reductions in \(\rho\) produce measurable risk reduction; this is why a portfolio of \(50\) stocks from different sectors has dramatically lower volatility than a single stock, even though each individual stock might have \(\rho \approx 0.5\) with each other.
A Numerical Illustration
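A sketch of the two-asset variance formula at a 50/50 weight, assuming two assets with \(18\%\) volatility each (the same parameters as Exercise 4):

```python
import numpy as np

sigma1 = sigma2 = 0.18                        # equal volatilities (illustrative assumption)
w1 = w2 = 0.5                                 # equal weights
naive_avg = w1 * sigma1 + w2 * sigma2         # the "no diversification" weighted average

for rho in [1.0, 0.5, 0.0, -0.5, -1.0]:
    var_p = w1**2 * sigma1**2 + w2**2 * sigma2**2 + 2 * w1 * w2 * rho * sigma1 * sigma2
    sigma_p = np.sqrt(var_p)
    print(f"rho = {rho:+.1f}  portfolio vol = {sigma_p:.3f}  reduction = {1 - sigma_p / naive_avg:.1%}")
```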
The pattern: at \(\rho = +1\) there is zero reduction; at \(\rho = 0\) the portfolio volatility falls by roughly \(29.3\%\) relative to the weighted average; at \(\rho = -0.5\) it falls by about \(50\%\).
Each curve plots portfolio volatility as a function of the weight in the first asset, for a different correlation. At \(\rho = +1\) the curve coincides with the naive weighted-average dotted line — there is no diversification at all. As \(\rho\) falls, the curve sags below that line, and the gap is the diversification benefit. At \(\rho = -1\) the curve touches zero at the 50/50 point, which is the mathematical limit of two-asset diversification. Real cross-asset pairs sit in the \(\rho \in [-0.3, 0.5]\) band, well inside this picture.
Beyond Two Assets
For \(N\) assets, the portfolio variance formula generalizes to:
\[ \sigma_p^2 = \sum_{i=1}^N \sum_{j=1}^N w_i w_j \rho_{ij} \sigma_i \sigma_j = \mathbf{w}^\top \Sigma \mathbf{w}, \]
where \(\Sigma\) is the \(N \times N\) covariance matrix. The structure is identical — it is just a quadratic form. As \(N\) grows, the average pairwise correlation \(\bar{\rho}\) becomes the dominant driver of portfolio volatility; the individual variances matter less. This is why, when an equity portfolio grows to \(30\) or \(50\) stocks, additional names produce diminishing risk reduction: you have already absorbed most of the diversification you can get given the average pairwise correlation in the asset class.
Volatility Drag and the CAGR Approximation
Where you’ll see this: “this strategy averaged 15% per year!” — sounds great, except average can mean two different things, and which one the marketer uses changes the actual wealth you end up with by a lot. This section explains the silent gap that swallows real investor money.
If a stock goes -50% one year and +50% the next, the arithmetic average is 0%. But you actually ended at $0.75 from each $1, i.e. a loss of 25%. The arithmetic mean is misleading whenever the asset is volatile, and the size of the lie grows with the square of the volatility. This invisible tax is called volatility drag.
The Arithmetic-Geometric Gap
There is one more piece of vocabulary every investor needs: the gap between arithmetic mean return (the everyday “add them up and divide by N” average) and geometric (compound) mean return (the per-period rate that actually generates the observed final wealth). They are not the same number, and the difference — the volatility drag — grows with volatility.
The arithmetic mean of a return series is what we have been computing all along:
\[ \bar{r}_{\text{arith}} = \frac{1}{T} \sum_{t=1}^T r_t. \]
The geometric mean is what an investor actually earned per period, accounting for compounding:
\[ \bar{r}_{\text{geom}} = \left( \prod_{t=1}^T (1 + r_t) \right)^{1/T} - 1. \]
The geometric mean is what is sometimes called the CAGR (Compound Annual Growth Rate) when computed at the annual frequency — it’s the “if I plug a single constant growth rate into a compound interest formula, what rate would reproduce the actual final wealth?” number.
The Volatility Drag Approximation
For “small” returns (which daily and even monthly returns are), there is a beautiful approximation that connects the two:
\[ \bar{r}_{\text{geom}} \approx \bar{r}_{\text{arith}} - \frac{\sigma^2}{2}. \]
The correction term \(\sigma^2/2\) is the volatility drag. It is non-negative — volatility always pulls the realized geometric return below the arithmetic mean. The intuition is elementary: a \(-50\%\) return followed by a \(+50\%\) return averages to \(0\%\) arithmetically, but leaves you at \(75\%\) of your starting capital (because the second year’s +50% applies to your reduced $0.50, not your original $1). The volatility forced a permanent loss of capital that the arithmetic mean cannot see.
The derivation comes from a second-order Taylor expansion of \(\log(1 + r)\) around \(r = 0\):
\[ \log(1 + r) \approx r - \frac{r^2}{2}. \]
Taking expectations on both sides and recognizing that the geometric mean of \((1 + r_t)\) is \(\exp(\mathbb{E}[\log(1 + r_t)])\):
\[ \mathbb{E}[\log(1 + r)] \approx \mu - \frac{\mathbb{E}[r^2]}{2} = \mu - \frac{\mu^2 + \sigma^2}{2} \approx \mu - \frac{\sigma^2}{2}, \]
where the last step drops \(\mu^2/2\) as a higher-order term when \(\mu\) is small.
Why This Matters in Practice
For an equity portfolio with annualized arithmetic mean \(10\%\) and annualized volatility \(20\%\), the volatility drag is:
\[ \frac{\sigma^2}{2} = \frac{0.20^2}{2} = 0.02 = 2\%. \]
That is the gap between the headline arithmetic mean (\(10\%\)) and the CAGR the investor actually compounds at (\(\approx 8\%\)). Two percent per year, compounded over a 40-year career, is the difference between \(1.10^{40} \approx 45.3\times\) and \(1.08^{40} \approx 21.7\times\) — roughly half the terminal wealth. Volatility drag is not a footnote. It is one of the largest line items in a long-horizon investor’s lifetime P&L.
Two operational consequences follow:
Always quote CAGR, not arithmetic mean, when reporting long-horizon returns. A fund that advertises “12% average annual return” while running a 30% volatility is misleading its investors. The actual compound rate is closer to \(12\% - 0.30^2/2 = 7.5\%\).
Risk reduction has direct return consequences. Cutting volatility from \(30\%\) to \(20\%\) (via diversification) recovers \(0.30^2/2 - 0.20^2/2 = 2.5\%\) of geometric return per year, even if the arithmetic mean is unchanged. This is the most overlooked argument for diversification: it does not just reduce risk, it raises long-run compound growth.
\(\text{CAGR} \approx \text{Arithmetic mean} - \frac{1}{2} \sigma^2\).
For most equity portfolios you encounter, \(\sigma^2/2\) is in the range of \(1\%\) to \(3\%\) per year. That is the volatility tax you pay every year for taking on risk.
Computing CAGR and Drag
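A minimal sketch, assuming `ret` is a pandas Series of daily returns:

```python
import numpy as np

n_years = len(ret) / 252

arith = ret.mean() * 252                            # annualized arithmetic mean
vol = ret.std() * np.sqrt(252)                      # annualized volatility
drag = vol**2 / 2                                   # volatility drag, sigma^2 / 2
cagr_approx = arith - drag                          # the rule-of-thumb approximation
cagr_exact = (1 + ret).prod() ** (1 / n_years) - 1  # the rate that reproduces final wealth

for name, x in [("arith mean", arith), ("vol", vol), ("drag", drag),
                ("approx CAGR", cagr_approx), ("exact CAGR", cagr_exact)]:
    print(f"{name:12s} {x:.4f}")
```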
What this gave us: five numbers that, side by side, show the volatility-drag formula working in practice: arithmetic mean minus drag really is approximately CAGR, validating the rule of thumb.
The “approx” line and the exact CAGR should agree to within a few basis points. When they disagree by more than that, the issue is usually a long sample over which the approximation \(\mu^2/2 \approx 0\) breaks down.
Worked Example: A 60/40 SPY/AGG Portfolio
Why this matters: if you’ve ever heard the phrase “balanced portfolio” — in a personal-finance article, in an MBA class, from your parents’ financial advisor — they’re almost always talking about something close to 60/40. It is the default retirement portfolio for hundreds of millions of people, and this final worked example shows you every diagnostic you’d run on it before recommending it.
We now assemble every tool in this chapter into a single, end-to-end analysis of a classic balanced portfolio: \(60\%\) U.S. equities (proxied by SPY) and \(40\%\) U.S. investment-grade bonds — bonds rated as low-default-risk by credit agencies — proxied by AGG (an ETF tracking the broad US bond market). The \(60/40\) allocation has been the default mix for U.S. pensions and individual retirement accounts for forty years. Understanding its risk-return profile is foundational.
For reproducibility in the browser, we simulate plausible daily return paths for SPY and AGG with statistical properties calibrated to long-run historical data: SPY at roughly \(10\%\) annual mean return and \(16\%\) annual volatility, AGG at \(4\%\) mean and \(5\%\) volatility, with a mildly negative correlation of \(-0.2\).
Step 1: Generate the Return Data
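A minimal sketch of that simulation, using the parameters stated above; the seed, the ten-year sample length, and the start date are assumptions made here for reproducibility:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
n_days = 252 * 10                                   # ten years of daily data

# Annualized parameters from the text, converted to daily
mu = np.array([0.10, 0.04]) / 252                   # SPY, AGG mean
vol = np.array([0.16, 0.05]) / np.sqrt(252)         # SPY, AGG volatility
rho = -0.2
cov = np.array([[vol[0]**2,             rho * vol[0] * vol[1]],
                [rho * vol[0] * vol[1], vol[1]**2]])

dates = pd.bdate_range("2015-01-01", periods=n_days)
ret = pd.DataFrame(rng.multivariate_normal(mu, cov, size=n_days),
                   index=dates, columns=["SPY", "AGG"])
```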
Step 2: Build the 60/40 Portfolio
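Continuing from Step 1, a sketch consistent with the description below:

```python
weights = np.array([0.6, 0.4])        # 60% SPY, 40% AGG

# Row-wise dot product with the weight vector: 0.6 * SPY return + 0.4 * AGG return each day
ret["60/40"] = ret @ weights
```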
What this gave us: a new column "60/40" whose value each day is just 0.6 × SPY return + 0.4 × AGG return. The @ symbol in ret @ weights is Python’s matrix-multiplication operator — here it’s just a slick way to compute the weighted sum without a loop.
The portfolio return on day \(t\) is \(r_{p,t} = 0.6 \cdot r_{\text{SPY},t} + 0.4 \cdot r_{\text{AGG},t}\). That is the entire portfolio-construction step; the daily rebalancing assumption (rebalancing = adjusting the holdings back to the target 60/40 split after they drift) is the only thing keeping the weights fixed at \(60/40\) over the sample. In real work, monthly or quarterly rebalancing is closer to industry practice — daily rebalancing produces nearly identical results when transaction costs are ignored.
Step 3: Volatility, Mean, and Sharpe for Each Series
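A sketch of the per-series statistics, assuming the constant 3% annual risk-free rate used throughout this example:

```python
rf_daily = 0.03 / 252

ann_mean = ret.mean() * 252
ann_vol = ret.std() * np.sqrt(252)
sharpe_ann = (ret.mean() - rf_daily) / ret.std() * np.sqrt(252)

print(pd.DataFrame({"ann_mean": ann_mean, "ann_vol": ann_vol, "Sharpe": sharpe_ann}).round(3))
```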
The expected pattern: the \(60/40\) portfolio has volatility much closer to AGG than to SPY (about \(10\%\) versus SPY’s \(16\%\) and AGG’s \(5\%\)), and its Sharpe ratio is higher than either component’s. This is the entire point of diversification — the portfolio’s risk-adjusted return exceeds that of its parts.
Step 4: Drawdowns and the Underwater Chart
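A sketch of the drawdown computation applied to all three columns at once:

```python
wealth = (1 + ret).cumprod()            # W_t: equity curve for each series
running_max = wealth.cummax()           # M_t: running peak
drawdown = wealth / running_max - 1     # D_t: underwater series, <= 0

print(drawdown.min().round(3))          # maximum drawdown per series

# Underwater chart, one panel per series (requires matplotlib):
# drawdown.plot(subplots=True, title="Drawdown from running peak")
```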
Expect the \(60/40\) portfolio’s maximum drawdown to be deeper than AGG’s but substantially shallower than SPY’s — typically around half of SPY’s. That is the risk-reduction benefit of the bond sleeve, made visible.
Step 5: Volatility Drag and CAGR
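A sketch comparing the drag approximation with the exact CAGR for each column:

```python
n_years = len(ret) / 252

arith = ret.mean() * 252                            # annualized arithmetic mean
vol = ret.std() * np.sqrt(252)
drag = vol**2 / 2                                   # volatility drag, sigma^2 / 2
cagr = (1 + ret).prod() ** (1 / n_years) - 1        # exact compound annual growth rate

print(pd.DataFrame({"arith": arith, "drag": drag,
                    "arith - drag": arith - drag, "CAGR": cagr}).round(4))
```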
The volatility drag for SPY (around \(1.3\%\) annually) is roughly ten times that of AGG (about \(0.13\%\)). The \(60/40\) portfolio’s drag sits in between but is much closer to the lower end because portfolio volatility is non-linear in the weights — diversification disproportionately cuts the drag.
Step 6: Putting It All Together
The full picture for this simulated sample:
| Series | Annualized mean | Annualized vol | Sharpe | Max DD | CAGR |
|---|---|---|---|---|---|
| SPY | ~10% | ~16% | ~0.44 | ~-25% to -35% | ~8.7% |
| AGG | ~4% | ~5% | ~0.20 | ~-7% to -10% | ~3.9% |
| 60/40 | ~7.6% | ~10% | ~0.46 | ~-13% to -18% | ~7.1% |
Three things to notice:
- The \(60/40\) Sharpe ratio is higher than either SPY or AGG individually. This is diversification at work — the portfolio has earned more return per unit of risk than any single component could.
- The \(60/40\) maximum drawdown is roughly half of SPY’s. The bond sleeve cushions equity crashes (assuming the negative or near-zero stock-bond correlation regime holds).
- The \(60/40\) CAGR is closer to SPY than to AGG, but the path it took to get there was much smoother. For an investor with finite tolerance for drawdown — i.e. every investor — that smoother path is the entire reason to hold AGG at all.
Diversification raises Sharpe, reduces drawdown, and improves compounding — simultaneously. It is the closest thing to a free lunch in finance, and the entire portfolio-management industry exists to exploit it. The only price is that the diversifying asset (here, AGG) must have \(\rho < 1\) with the core asset. When that correlation flips — as it did in 2022 — the diversification benefit shrinks dramatically, and 60/40 portfolios deliver worse drawdowns than expected. Always monitor the rolling correlation.
Exercises
Exercise 1 — Annualizing volatility correctly
A research analyst sends you the following statistics for an emerging-markets equity fund:
- Mean return: \(0.08\%\) per trading day
- Standard deviation of returns: \(1.4\%\) per trading day
A second analyst sends you the equivalent statistics from a monthly dataset for the same fund:
- Mean return: \(1.7\%\) per month
- Standard deviation of returns: \(6.2\%\) per month
Compute the annualized mean and annualized volatility from each dataset (using \(252\) trading days and \(12\) months per year). They should approximately agree. Do they? If not, what is the most likely explanation? Hint: think about the assumptions underlying \(\sqrt{T}\) scaling.
Exercise 2 — Building a drawdown function
Write a Python function drawdown_stats(returns) that takes a pandas Series of daily returns and returns a dictionary containing:
- The maximum drawdown (a negative number).
- The date (or integer index) at which the maximum drawdown was reached.
- The peak date (or index) that preceded the maximum drawdown — i.e. where the wealth was highest before the worst loss.
- The recovery date — the first date after the trough at which wealth returned to the previous peak. If the recovery has not occurred by the end of the sample, return `None` for this field.
Test your function on the simulated SPY series from the 60/40 worked example.
Exercise 3 — Sharpe under a time-varying risk-free rate
In the worked example we assumed a constant \(3\%\) annual risk-free rate. In reality, the U.S. 3-month T-bill rate moved from approximately \(0.05\%\) in 2021 to over \(5.3\%\) in 2023. Simulate a daily risk-free rate path that linearly rises from \(0.0001\) (about \(2.5\%\) annual) to \(0.0002\) (about \(5\%\) annual) over a 5-year sample, and recompute the Sharpe ratio of a fixed return series under (a) the average \(r_f\) and (b) the time-varying \(r_f\). How much does the choice matter? Under what circumstances would it matter more?
Exercise 4 — Diversification under regime change
Consider two assets with annualized volatilities \(\sigma_1 = 0.18\) and \(\sigma_2 = 0.18\), equal weights \(w_1 = w_2 = 0.5\), and correlation \(\rho\). Compute portfolio volatility for \(\rho \in \{-0.5, 0, 0.3, 0.6, 0.9, 1.0\}\). Plot portfolio volatility against \(\rho\). By what percentage does portfolio volatility increase when \(\rho\) rises from \(0\) to \(0.6\) (a typical “all assets going down together” regime)? Discuss the implication for a fund manager whose stress tests assume \(\rho = 0\).
Exercise 5 — Volatility drag over a long horizon
A portfolio has arithmetic annual mean \(12\%\) and annual volatility \(\sigma\). The investor holds it for \(30\) years.
Compute the terminal wealth from \(\$1\) starting capital using the CAGR approximation \(\bar{r}_{\text{geom}} \approx \bar{r}_{\text{arith}} - \sigma^2/2\), for \(\sigma \in \{0.10, 0.20, 0.30, 0.40\}\).
The investor’s financial advisor is using the arithmetic mean of \(12\%\) to project terminal wealth (i.e. ignoring volatility drag entirely). For each \(\sigma\), compute the projection error — the ratio of the advisor’s predicted wealth to the true expected compound wealth.
At what level of \(\sigma\) does the advisor’s projection overstate terminal wealth by a factor of \(2\) or more? Comment on the implications for retirement planning under high-volatility strategies.
You now have the four core diagnostics — volatility, drawdown, Sharpe, correlation — and the two derived ideas — diversification benefit and volatility drag — that govern essentially every conversation in performance evaluation, asset allocation, and risk management. Chapter 5 lifts these tools from a single portfolio to cross-sectional questions: how do we compare hundreds of assets at once? How do we screen, rank, and build portfolios systematically? The statistics of this chapter become the inputs to the optimization machinery of the next.