Discrete-continuous choice: the Iskhakov et al. (2017) retirement model

This notebook illustrates the central numerical challenge of discrete-continuous dynamic choice models on the deterministic retirement model of Iskhakov, Jørgensen, Rust & Schjerning (2017), “The endogenous grid method for discrete-continuous dynamic choice models with (or without) taste shocks”, Quantitative Economics 8(2), 317–365, Iskhakov et al. (2017). The model ships as lcm_examples.iskhakov_et_al_2017 and has a closed-form solution that pylcm’s test suite uses as an analytical oracle.

A worker chooses consumption and whether to keep working or to retire; retirement is absorbing. The discrete retirement choice destroys the concavity of the value function and produces the paper’s signature saw-tooth consumption function, which we plot below — first by brute-force grid search, then side by side with the DC-EGM solver, which needs no consumption grid at all.

import functools

import jax.numpy as jnp
import plotly.graph_objects as go

from lcm import AgeGrid, LinSpacedGrid, Model
from lcm_examples.iskhakov_et_al_2017 import (
    RegimeId,
    dead,
    get_model,
    get_params,
    retirement,
    working_life,
)

Solving the model¶

We use the paper’s analytical-solution parametrization: log utility, work disutility $\delta = 1$ , wage 20, discount factor 0.98, zero interest. With n_periods=6 the agent makes decisions in five periods and is dead in the last one.

Grid search resolves the kinks of the consumption function only up to the resolution of its grids, so for crisp figures we customize the packaged regimes (via .replace()) with grids finer than the defaults. The consumption policy in the first period is read off a forward simulation that starts one agent at each point of a dense wealth grid.

PAPER_PARAMS = {
    "discount_factor": 0.98,
    "disutility_of_work": 1.0,
    "interest_rate": 0.0,
    "wage": 20.0,
}


# Cached so repeat uses of the same horizon below reuse one model object and
# hence one set of compiled kernels - re-solves are execution-only.
@functools.cache
def build_model(n_periods, n_wealth=500, n_consumption=2500):
    wealth_grid = LinSpacedGrid(start=1, stop=400, n_points=n_wealth)
    consumption_grid = LinSpacedGrid(start=1, stop=400, n_points=n_consumption)
    ages = AgeGrid(start=40, stop=40 + (n_periods - 1) * 10, step="10Y")
    last_age = ages.exact_values[-1]
    return Model(
        regimes={
            "working_life": working_life.replace(
                states={"wealth": wealth_grid},
                actions={
                    **dict(working_life.actions),
                    "consumption": consumption_grid,
                },
                active=lambda age, la=last_age: age < la,
            ),
            "retirement": retirement.replace(
                states={"wealth": wealth_grid},
                actions={"consumption": consumption_grid},
                active=lambda age, la=last_age: age < la,
            ),
            "dead": dead,
        },
        ages=ages,
        regime_id_class=RegimeId,
    )


def first_period_consumption(model, params, wealth):
    V_arrs = model.solve(params=params, log_level="warning")
    result = model.simulate(
        params=params,
        initial_conditions={
            "age": jnp.full(wealth.size, model.ages.values[0]),
            "wealth": wealth,
            "regime_id": jnp.full(
                wealth.size, model.regime_names_to_ids["working_life"]
            ),
        },
        period_to_regime_to_V_arr=V_arrs,
        log_level="warning",
    )
    df = result.to_dataframe()
    return df.query("period == 0").sort_values("wealth")


WEALTH = jnp.linspace(1.0, 120.0, 600)
policy = first_period_consumption(
    build_model(n_periods=6), get_params(6, **PAPER_PARAMS), WEALTH
)

The saw-tooth consumption function¶

The first-period consumption function of a worker is not smooth — it jumps downward at a sequence of wealth thresholds:

fig = go.Figure(
    go.Scatter(
        x=policy["wealth"],
        y=policy["consumption"],
        mode="lines",
        line={"color": "steelblue"},
    )
)
fig.update_layout(
    title="First-period consumption of a worker (5 decision periods)",
    xaxis_title="first-period wealth",
    yaxis_title="consumption",
    template="simple_white",
)
fig.show()

Each tooth corresponds to a different optimal retirement age. At low levels of initial wealth, the agent works in every period in which working can still pay off (income arrives one period later, so the final decision period is always spent retired) and consumes a fraction of lifetime wealth each period. At an initial wealth of about 32.4 in this parametrization, retiring one period earlier becomes optimal: the marginal value of leisure exceeds the marginal utility of the consumption that the extra wage would buy. Lifetime wealth is therefore one period’s income smaller than just below the threshold, and the optimal level of consumption drops. The same happens again at initial wealth levels of about 50.4, 68.3, and 86.4; beyond 86.5 the agent does not work at all.

How many teeth there are depends on how many retirement dates are in play, which the horizon comparison below makes visible:

With a single decision period there is no kink at all — and nobody works. In this model, income earned by working arrives in the following period’s budget, so with no period left to enjoy it, working is pure disutility and retiring is optimal at every wealth level. The consumption function is smooth.
With two decision periods the work choice is genuinely in play and the first kink appears: below the threshold the agent works today (high lifetime income, high consumption), above it they retire today and consumption drops.
Every additional decision period adds future retirement dates whose value-function crossings propagate backward as further secondary kinks.

fig = go.Figure()
horizons = [2, 3, 4, 5, 6]
colors = ["#cccccc", "#aaaaaa", "#888888", "#666666", "steelblue"]
for n_periods, color in zip(horizons, colors, strict=True):
    pol = first_period_consumption(
        build_model(n_periods), get_params(n_periods, **PAPER_PARAMS), WEALTH
    )
    n_decisions = n_periods - 1
    label = "decision period" if n_decisions == 1 else "decision periods"
    fig.add_trace(
        go.Scatter(
            x=pol["wealth"],
            y=pol["consumption"],
            mode="lines",
            line={"color": color},
            name=f"{n_decisions} {label}",
        )
    )
fig.update_layout(
    title="More remaining work-life, more teeth",
    xaxis_title="first-period wealth",
    yaxis_title="first-period consumption",
    template="simple_white",
)
fig.show()

Grid resolution matters near the kinks¶

Grid search can distort the teeth. At the packaged default resolution (100 wealth points, 500 consumption points), the sharp drop near a retirement threshold can show up as a two-step drop with a spurious intermediate plateau: linear interpolation smears the kink of next-period’s value function across a whole wealth-grid cell, and the consumption grid quantizes the policy, so the argmax flips between the two nearly-optimal plans over a band of wealth instead of at a single point. Refining the grids shrinks that band back to the single sharp jump of the analytical solution:

ZOOM_WEALTH = jnp.linspace(25.0, 40.0, 400)
zoom_default = first_period_consumption(
    get_model(6), get_params(6, **PAPER_PARAMS), ZOOM_WEALTH
)
zoom_fine = first_period_consumption(
    build_model(6), get_params(6, **PAPER_PARAMS), ZOOM_WEALTH
)

fig = go.Figure()
fig.add_trace(
    go.Scatter(
        x=zoom_default["wealth"],
        y=zoom_default["consumption"],
        mode="lines",
        line={"color": "#bbbbbb"},
        name="default grids (100 × 500)",
    )
)
fig.add_trace(
    go.Scatter(
        x=zoom_fine["wealth"],
        y=zoom_fine["consumption"],
        mode="lines",
        line={"color": "steelblue"},
        name="fine grids (500 × 2500)",
    )
)
fig.update_layout(
    title="Zoom on the first retirement threshold (5 decision periods)",
    xaxis_title="first-period wealth",
    yaxis_title="first-period consumption",
    template="simple_white",
)
fig.show()

Brute force vs DC-EGM¶

The figures above come from pylcm’s brute-force solver: consumption is chosen from a dense grid, so each tooth is resolved only up to the grid’s resolution — as the zoom shows, getting the kinks right means paying for ever-finer grids, and the cost of the solve scales with their size.

This is exactly the situation the DC-EGM algorithm of Iskhakov et al. (2017) is built for. Instead of searching a consumption grid, it inverts the Euler equation on an exogenous end-of-period savings grid, locates the kinks with an upper-envelope scan, and handles the borrowing constraint in closed form. Selecting it is a per-regime solver=DCEGM(...) declaration plus three regime functions (resources, savings, and inverse_marginal_utility) and a wealth transition written in terms of savings; the borrowing constraint is dropped because DC-EGM enforces the budget identity and the savings-grid lower bound intrinsically. get_dcegm_model builds exactly this variant of the model above — get_params works unchanged.

To compare accuracy, we treat the fine-grid brute-force solution (500 × 2500) as the reference and look at the period-0 value function of the working regime on the default 100-point wealth grid.

import time

from lcm_examples.iskhakov_et_al_2017 import get_dcegm_model

PARAMS_6 = get_params(6, **PAPER_PARAMS)
WEALTH_NODES = jnp.linspace(1, 400, 100)  # the default wealth grid

V_default = get_model(6).solve(params=PARAMS_6, log_level="warning")
V_dcegm = get_dcegm_model(6).solve(params=PARAMS_6, log_level="warning")
V_fine = build_model(6).solve(params=PARAMS_6, log_level="warning")

reference = jnp.interp(
    WEALTH_NODES, jnp.linspace(1, 400, 500), V_fine[0]["working_life"]
)
err_default = jnp.abs(V_default[0]["working_life"] - reference)
err_dcegm = jnp.abs(V_dcegm[0]["working_life"] - reference)

for label, model in [
    ("brute force, default grids (100 × 500)", get_model(6)),
    ("DC-EGM, no consumption grid", get_dcegm_model(6)),
]:
    start = time.perf_counter()
    model.solve(params=PARAMS_6, log_level="off")  # warm: compilation cached above
    print(f"{label}: {time.perf_counter() - start:.3f}s per solve")

brute force, default grids (100 × 500): 1.712s per solve

DC-EGM, no consumption grid: 1.715s per solve

fig = go.Figure()
fig.add_trace(
    go.Scatter(
        x=WEALTH_NODES,
        y=err_default,
        mode="lines",
        line={"color": "#bbbbbb"},
        name="brute force, default grids (100 × 500)",
    )
)
fig.add_trace(
    go.Scatter(
        x=WEALTH_NODES,
        y=err_dcegm,
        mode="lines",
        line={"color": "steelblue"},
        name="DC-EGM, no consumption grid",
    )
)
fig.update_layout(
    title="Distance to the fine-grid solution (500 × 2500), period 0, working life",
    xaxis_title="wealth",
    yaxis_title="|V − V_fine|",
    yaxis_type="log",
    template="simple_white",
)
fig.show()

DC-EGM tracks the fine-grid reference much more closely than the equally-sized brute-force solve — without any consumption grid — and the contrast is sharpest at low wealth, where the borrowing constraint binds: DC-EGM computes the constrained segment in closed form, while grid search has to approximate it from feasible grid points. At this resolution the comparison is limited by the reference’s own grid error, so the DC-EGM curve is best read as an upper bound on its distance to the exact solution.

Two practical notes. The value functions DC-EGM returns live on the same wealth grid as the brute-force ones, so everything downstream of solve() is unchanged, and the two solvers can be mixed freely across regimes. Forward simulation works; simulated consumption is chosen from the consumption grid (so the policy figures above look the same under either solver), while the solve itself needs no consumption grid at all.

References¶

Iskhakov, F., Jørgensen, T. H., Rust, J., & Schjerning, B. (2017). The endogenous grid method for discrete-continuous dynamic choice models with (or without) taste shocks. Quantitative Economics, 8(2), 317–365. 10.3982/QE643