Forecasting Engine

A from-scratch Holt-Winters additive triple exponential smoothing engine for SKU-level demand forecasting, with parameter tuning, batch inference, and an experimental ML residual-correction layer.

Role: Sole engineer
Timeline: 2025 – present
Status: In progress
Stack: Python, NumPy, Pandas, PyTorch, scikit-learn

Problem

Stockman's reorder suggestions need a defensible demand model. Pulling a library off the shelf was tempting, but I wanted to know exactly what was happening inside the forecast — both because inventory decisions are real money, and because the moment something goes wrong I want to be able to point at a specific component.

So I built it. Holt-Winters additive triple exponential smoothing, from scratch, with parameter tuning, holdout evaluation, and an optional ML residual-correction layer for the tail.

Architecture

The model decomposes a series into level, trend, and seasonality, then projects forward.

Diagram: Forecasting pipeline. Stages: Demand series, Level + trend, Seasonality, Forecast, Residual check.

Level, trend, seasonality, and residuals are evaluated before a reorder signal is emitted.

\ell_t = \alpha(y_t - s_{t-m}) + (1-\alpha)(\ell_{t-1} + b_{t-1})

b_t = \beta(\ell_t - \ell_{t-1}) + (1-\beta)\, b_{t-1}

s_t = \gamma(y_t - \ell_{t-1} - b_{t-1}) + (1-\gamma)\, s_{t-m}

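\hat{y}_{t+h} = \ell_t + h\, b_t + s_{t+h-m(k+1)}, \quad k = \lfloor (h-1)/m \rfloor

Here m is the seasonal period and h the forecast horizon. The integer k wraps the seasonal index, so horizons longer than one season reuse the most recent seasonal estimates instead of reaching for values that have not been computed yet.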
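A minimal NumPy sketch of the recursions above, for illustration only. The function name `holt_winters_additive` and the initialization (first-season mean for the level, average change between the first two seasons for the trend, first-season deviations for the seasonal terms) are assumptions; the write-up does not say how the engine initializes its state. For horizons beyond one season the sketch wraps the seasonal index back onto the last full season.

```python
import numpy as np


def holt_winters_additive(y, m, alpha, beta, gamma, horizon):
    """Run the additive Holt-Winters recursions, then forecast `horizon` steps.

    Assumed initialization (hypothetical, not taken from the engine):
    level = mean of the first season, trend = average per-step change between
    the first two seasons, seasonal terms = first-season deviations.
    """
    y = np.asarray(y, dtype=float)
    n = y.size
    if n < 2 * m:
        raise ValueError("need at least two full seasons to initialize")

    level = y[:m].mean()
    trend = (y[m:2 * m].mean() - y[:m].mean()) / m
    season = list(y[:m] - level)  # season[i] holds s_i

    for t in range(m, n):
        s_prev = season[t - m]  # s_{t-m}
        new_level = alpha * (y[t] - s_prev) + (1 - alpha) * (level + trend)
        new_trend = beta * (new_level - level) + (1 - beta) * trend
        # The seasonal update uses the previous level and trend, as in s_t above.
        season.append(gamma * (y[t] - level - trend) + (1 - gamma) * s_prev)
        level, trend = new_level, new_trend

    # Forecast: level + h * trend + latest seasonal estimate for that phase.
    h = np.arange(1, horizon + 1)
    k = (h - 1) // m  # number of whole seasons to step back
    idx = (n - 1) + h - m * (k + 1)
    return level + h * trend + np.asarray(season)[idx]
```

On a purely periodic series with no trend, the state is a fixed point of the recursions, so the forecast simply repeats the season regardless of the smoothing parameters. That makes a convenient sanity check before fitting real data.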

The pipeline is:

Seasonal period detection

Spectral peak + autocorrelation cross-check; defaults to 7 if signals disagree.

Parameter optimization

Grid search over α/β/γ, then L-BFGS refinement. Loss is RMSE on a rolling-origin holdout, not in-sample fit.

Batch inference

Vectorized across SKUs. CPU is sufficient for inference; GPU only helped on the cross-product parameter sweep.

ML residual layer (experimental)

A small gradient-boosted model on the residuals, with calendar and price features. Improves long-tail RMSE; off by default.
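The seasonal period detection step in the pipeline above can be sketched roughly as follows. Only the idea (a spectral peak cross-checked against the autocorrelation, with a fallback of 7 on disagreement) comes from the description; the function name `detect_period`, the linear detrending, and the "highest local autocorrelation peak" rule are assumptions made for the illustration.

```python
import numpy as np


def detect_period(y, max_period=None, default=7):
    """Estimate the seasonal period; return `default` if the two signals disagree."""
    y = np.asarray(y, dtype=float)
    n = y.size
    if n < 8:
        return default
    if max_period is None:
        max_period = n // 2

    # Assumed preprocessing: remove a linear trend so it does not dominate.
    t = np.arange(n)
    resid = y - np.polyval(np.polyfit(t, y, 1), t)
    if np.allclose(resid, 0.0):
        return default  # nothing periodic to detect

    # Spectral candidate: period of the strongest non-DC frequency.
    power = np.abs(np.fft.rfft(resid)) ** 2
    freqs = np.fft.rfftfreq(n)
    k = 1 + int(np.argmax(power[1:]))
    spectral = int(round(1.0 / freqs[k]))

    # Autocorrelation candidate: the highest local peak at lag >= 2.
    acf = np.correlate(resid, resid, mode="full")[n - 1:]
    acf = acf / acf[0]
    peaks = [lag for lag in range(2, max_period + 1)
             if acf[lag] > acf[lag - 1] and acf[lag] >= acf[lag + 1]]
    if not peaks:
        return default
    acf_period = max(peaks, key=lambda lag: acf[lag])

    return spectral if spectral == acf_period else default
```

Requiring exact agreement is deliberately strict. A tolerance of one lag, or a check for harmonics (for example 7 versus 14), would be a reasonable variation, but either one changes how often the fallback fires.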

Evaluation

Twelve-week holdout against a naive seasonal baseline. Metrics: MAE, RMSE, R². Holt-Winters beats the baseline at 0.71x its RMSE; the residual layer adds a small but real lift on SKUs with sparse, lumpy demand, reaching 0.66x on long-tail SKUs.
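For reference, here is a sketch of a naive seasonal baseline and the three metrics. The names `seasonal_naive`, `score`, and `relative_rmse` are hypothetical, and the baseline is assumed to repeat the last observed season, which is the usual meaning of "naive seasonal"; the write-up does not spell out its exact definition.

```python
import numpy as np


def seasonal_naive(train, m, horizon):
    """Forecast by repeating the last observed season of length m."""
    train = np.asarray(train, dtype=float)
    reps = -(-horizon // m)  # ceiling division
    return np.tile(train[-m:], reps)[:horizon]


def score(actual, predicted):
    """Return MAE, RMSE, and R^2 for one holdout window."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    err = actual - predicted
    ss_tot = np.sum((actual - actual.mean()) ** 2)
    return {
        "mae": float(np.mean(np.abs(err))),
        "rmse": float(np.sqrt(np.mean(err ** 2))),
        # R^2 is undefined when the holdout is constant.
        "r2": float(1.0 - np.sum(err ** 2) / ss_tot) if ss_tot > 0 else float("nan"),
    }


def relative_rmse(actual, model_pred, baseline_pred):
    """Model RMSE as a multiple of baseline RMSE (values below 1.0 beat the baseline)."""
    return score(actual, model_pred)["rmse"] / score(actual, baseline_pred)["rmse"]
```

Reporting RMSE as a ratio to the baseline keeps SKUs with very different volumes comparable, which is how figures such as 0.71x read.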

What I'd do differently

Build the rolling-origin evaluator before the model. I built it second, and ate a few weeks of false positives.
Treat seasonal period detection as a first-class problem, not a heuristic. The places where the model was wrong were almost always places where the period was wrong.
Keep the residual layer in its own repo. It changes shape every two weeks; the base engine doesn't.
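As an illustration of the first point, a rolling-origin evaluator can be small enough to write before any model exists. This is a sketch under assumptions: the name `rolling_origin_rmse`, the expanding training window, and the `forecast_fn(train, horizon)` callback signature are all hypothetical, not the engine's actual interface.

```python
import numpy as np


def rolling_origin_rmse(y, forecast_fn, initial, horizon, step=None):
    """Mean RMSE over expanding-window folds.

    forecast_fn(train, horizon) must return `horizon` predictions.
    The first fold trains on y[:initial]; each later fold adds `step` points.
    """
    y = np.asarray(y, dtype=float)
    step = step or horizon
    fold_rmse = []
    origin = initial
    while origin + horizon <= y.size:
        train, test = y[:origin], y[origin:origin + horizon]
        pred = np.asarray(forecast_fn(train, horizon), dtype=float)
        fold_rmse.append(float(np.sqrt(np.mean((test - pred) ** 2))))
        origin += step
    if not fold_rmse:
        raise ValueError("series too short for a single fold")
    return float(np.mean(fold_rmse)), fold_rmse


# Usage with two trivial forecasters on a purely periodic series.
series = np.tile([1.0, 2.0, 3.0, 4.0], 6)
repeat_season = lambda train, h: np.tile(train[-4:], -(-h // 4))[:h]
last_value = lambda train, h: np.full(h, train[-1])
seasonal_mean, seasonal_folds = rolling_origin_rmse(series, repeat_season, initial=8, horizon=4)
flat_mean, _ = rolling_origin_rmse(series, last_value, initial=8, horizon=4)
```

Because the evaluator only sees a callback, the same harness can score the naive baseline, the smoothing model, and any residual-corrected variant on identical folds.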

Outcomes

Baseline RMSE (naive seasonal): 1.00x
Holt-Winters (vs baseline): 0.71x
+ ML residuals (long-tail SKUs): 0.66x
Inference (batch over 3k SKUs): CPU-fast