DEMAND PLANNING AGENT
Your demand plan updates itself.
Every 6 hours.
Sara Jr monitors 847 SKUs across every channel simultaneously, forecasting at ±4.2% error on a continuous 6-hour cycle. Confident decisions execute automatically. Uncertain ones reach your team with a full briefing.
Every capability below runs continuously, every 6 hours, without anyone asking.
Most demand planning tools read one or two sources. Sara Jr reads all of them, simultaneously, every 6 hours.
Five steps. Most invisible. The only one you see is the output and, occasionally, the escalation.
These documents and signals are produced autonomously and published directly into your ERP, planning systems, and agent network.
A pre-recorded simulation Sara Jr ran on a semiconductor company facing a demand spike. Watch the agent process, decide, and act. Then book a session to see it run on your data.
30 minutes. Manish runs the simulation live using your actual demand data. You see exactly what Sara Jr would do, what it would output, and what the financial impact would be at your scale.
Powered by the VYUH Neural Network. Configure your organisation for responses specific to your supply chain.
This paper presents Sara Jr, an autonomous demand planning agent within the VYUH Supply Chain Neural Network. Sara Jr employs a Temporal Fusion Transformer (TFT) architecture to generate calibrated probabilistic demand forecasts across 847 stock-keeping units at six-hour intervals, achieving a mean absolute percentage error (MAPE) of 4.2% against an industry baseline of 18%. We describe the complete technical architecture including the data ingestion pipeline, feature engineering methodology, TFT model specification, confidence gating mechanism, escalation protocol, inter-agent signalling framework, and self-improving learning loop. Sara Jr operates at a 96.7% autonomous decision rate, escalating only forecasting decisions where model confidence falls below a calibrated threshold. At £10B annual revenue, deployment has demonstrated £280M in working capital savings and a 23% reduction in annual stockout frequency.
Demand planning represents one of the most consequential functions in modern supply chain management. Errors in demand forecasting propagate upstream to procurement and production and downstream to inventory positioning and customer service levels. The consequences of systematic forecast error are well documented: excess inventory carrying costs, stockout-driven revenue loss, and the organisational friction of reactive, crisis-driven operations.
Traditional demand planning processes rely on statistical methods applied by human analysts within enterprise resource planning (ERP) systems, typically operating on weekly or monthly cycles. This approach suffers from three structural limitations. First, temporal resolution: human-driven processes create inherent lag between demand signal detection and operational response. Second, signal scope: human analysts can realistically monitor a limited number of data sources, leaving significant predictive signal unconsumed. Third, scalability: as product portfolios grow, the cognitive load of managing individual SKU forecasts increases non-linearly, forcing analysts towards portfolio-level abstractions that sacrifice granularity.
Sara Jr addresses these limitations through autonomous, continuous forecasting powered by a Temporal Fusion Transformer operating across all SKUs simultaneously, with confidence-gated autonomous execution and structured escalation for uncertain cases.
Sara Jr is implemented as a six-layer autonomous agent within the VYUH Neural Network framework. The architecture comprises: (1) a real-time data ingestion layer, (2) a feature engineering pipeline, (3) a Temporal Fusion Transformer inference engine, (4) a calibrated uncertainty quantification module, (5) a confidence-gated decision execution layer, and (6) a continuous learning loop. The agent operates on a six-hour primary cadence, with real-time event processing for anomaly detection. Total end-to-end processing time, from data ingestion to forecast publication, averages 4.2 minutes for a full 847-SKU run.
The diagram below illustrates the full processing pipeline from raw data inputs through to autonomous execution and inter-agent signalling.
Sara Jr ingests signals from eight primary source categories totalling 340+ individual features: ERP transactional data (SAP S/4HANA, Oracle Fusion, Microsoft Dynamics 365), point-of-sale and e-commerce platforms (EDI retail feeds, Shopify API, Amazon Selling Partner API), weather and climate data (NOAA, Met Office), promotional calendars (internal marketing system integration), market intelligence (news APIs, competitor price monitoring), customer collaborative forecasts (EDI 830 demand signals), live inventory positions from Cho (VYUH Inventory Agent), and macroeconomic indicators (World Bank API, PMI data feeds).
All ingested data passes through a validation pipeline performing: schema validation and type coercion; outlier detection using modified z-score with adaptive thresholds per source; missing value imputation using forward-fill with exponential decay weighting; and data freshness monitoring with automatic source degradation flags. Sources flagged as stale are down-weighted rather than excluded, preserving partial signal value whilst preventing quality issues from propagating into the forecast.
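Two of the validation steps above can be sketched in a few lines of standard-library Python. This is a minimal illustration under stated assumptions, not Sara Jr's implementation: the function names are hypothetical, the 3.5 cut-off is the conventional default for the modified z-score rather than one of the adaptive per-source thresholds, and "exponential decay weighting" is read here as a reliability weight that shrinks with each consecutive imputed value.

```python
from statistics import median


def modified_z_scores(values):
    """Modified z-score: 0.6745 * (x - median) / MAD (median absolute deviation)."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # No spread around the median: treat every point as unremarkable.
        return [0.0 for _ in values]
    return [0.6745 * (v - med) / mad for v in values]


def flag_outliers(values, threshold=3.5):
    """Flag points whose absolute modified z-score exceeds the threshold.

    threshold=3.5 is an assumed default; the source describes adaptive
    thresholds per data source without giving values.
    """
    return [abs(z) > threshold for z in modified_z_scores(values)]


def forward_fill_with_decay(values, decay=0.5):
    """Forward-fill None gaps and return (filled, weights).

    Observed values get weight 1.0. Each consecutive imputed value has its
    weight multiplied by `decay` (an assumed parameter), so long gaps
    contribute progressively less signal.
    """
    filled, weights = [], []
    last, age = None, 0
    for v in values:
        if v is not None:
            last, age = v, 0
        else:
            age += 1
        filled.append(last)
        weights.append(decay ** age if last is not None else 0.0)
    return filled, weights
```

Returning the weights alongside the filled series keeps the down-weighting decision separate from the imputation itself, so a downstream consumer can choose how strongly to discount stale values.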
Raw data inputs are transformed into a structured feature set before TFT inference. The pipeline constructs 340+ features per SKU across five categories.
Lag features at intervals of 1, 2, 4, 8, 13, and 52 weeks capture autocorrelative demand structure. Rolling mean and standard deviation windows of 4, 8, and 13 weeks provide moving average baselines. Fourier-transformed seasonality indices capture weekly, monthly, quarterly, and annual periodicity. Trend decomposition using STL (Seasonal and Trend decomposition using Loess) provides deseasonalised trend signals.
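The lag and rolling-window features described above might be built as follows. This is a hedged sketch using only the Python standard library: the function and key names are hypothetical, and the sine/cosine harmonics are one common way to encode periodicity, shown as an assumption rather than as the exact seasonality indices Sara Jr computes. STL decomposition is omitted.

```python
import math
from statistics import mean, stdev

LAGS = (1, 2, 4, 8, 13, 52)   # lag intervals in weeks, as listed in the text
WINDOWS = (4, 8, 13)          # rolling window lengths in weeks


def temporal_features(weekly_demand, t):
    """Build lag and rolling features for week index t.

    Only observations strictly before t are used, so no future demand
    leaks into the features. Features without enough history are None.
    """
    feats = {}
    for lag in LAGS:
        feats[f"lag_{lag}"] = weekly_demand[t - lag] if t - lag >= 0 else None
    for w in WINDOWS:
        window = weekly_demand[max(0, t - w):t]
        full = len(window) == w
        feats[f"roll_mean_{w}"] = mean(window) if full else None
        feats[f"roll_std_{w}"] = stdev(window) if full else None
    return feats


def fourier_terms(t, period, order=1):
    """Sine/cosine harmonics for a given period (e.g. 52 for annual, weekly data)."""
    out = {}
    for k in range(1, order + 1):
        angle = 2 * math.pi * k * t / period
        out[f"sin_{period}_{k}"] = math.sin(angle)
        out[f"cos_{period}_{k}"] = math.cos(angle)
    return out
```

For example, `temporal_features(demand, 52)` on a series with at least 52 prior weeks returns all six lags and all three rolling windows populated, while early weeks return None for any feature that lacks history.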
Weather variables are mapped to category-specific demand elasticity coefficients. Promotional uplift curves are constructed per event type using a log-normal decay model fitted to historical promotion response data. Macroeconomic indicators are incorporated as long-horizon covariates with a 12-week lead time assumption.
The Temporal Fusion Transformer (TFT), introduced by Lim et al. (2021), was selected following evaluation against N-BEATS, DeepAR, WaveNet, Prophet, and traditional statistical methods (ETS, ARIMA family). TFT's advantages for this application are threefold. First, multi-horizon forecasting without autoregressive error accumulation. Second, interpretability through variable importance scores and attention weights, which is critical for analyst trust and escalation reasoning. Third, mixed covariate handling: TFT natively processes static, observed historical, and known future covariates within a single model architecture.
The deployed TFT uses: encoder/decoder LSTM hidden state dimension of 160; 4 attention heads; 2 transformer layers; gated residual network (GRN) hidden dimension of 160; dropout rate 0.1; quantile outputs at P10, P50, P90. The model contains approximately 12M parameters and is implemented in PyTorch using the PyTorch Forecasting library. Model inference on the full 847-SKU portfolio completes in under 90 seconds on an NVIDIA A100 GPU.
The model trains on a minimum of 104 weeks of historical data per SKU using walk-forward validation with 13-week test windows. The loss function is quantile loss (pinball loss) summed across P10, P50, and P90, directly optimising for calibrated probabilistic output. Training uses the Adam optimiser with learning rate 1e-3, cosine annealing, and gradient clipping at 0.1. Batch size is 64 time series segments of 52-week length.
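To make the training objective and validation scheme concrete, the sketch below implements the pinball loss summed across the three quantiles and a walk-forward split generator. It is illustrative only: the function names are hypothetical, the loss is averaged over samples before summing across quantiles, and the expanding training window is an assumption (the text specifies only the 104-week minimum and the 13-week test windows).

```python
QUANTILES = (0.1, 0.5, 0.9)  # P10, P50, P90


def pinball_loss(y_true, y_pred, q):
    """Pinball (quantile) loss for one observation at quantile q.

    Under-forecasts are penalised by q, over-forecasts by (1 - q).
    """
    diff = y_true - y_pred
    return max(q * diff, (q - 1) * diff)


def summed_quantile_loss(y_true, preds_by_quantile):
    """Mean pinball loss per quantile, summed across all quantiles.

    preds_by_quantile maps each quantile (e.g. 0.1) to a list of
    predictions aligned with y_true.
    """
    total = 0.0
    for q, preds in preds_by_quantile.items():
        total += sum(pinball_loss(y, p, q) for y, p in zip(y_true, preds)) / len(y_true)
    return total


def walk_forward_splits(n_weeks, min_train=104, test_window=13):
    """Yield (train_range, test_range) pairs of week indices.

    Assumes an expanding training window: each fold trains on all weeks
    before the test window, then the window advances by test_window.
    """
    start = min_train
    while start + test_window <= n_weeks:
        yield range(0, start), range(start, start + test_window)
        start += test_window
```

The asymmetry is the point of the loss: at P90 an under-forecast of 10 units costs 9.0, while an over-forecast of 10 units costs about 1.0, which pushes the P90 output towards the upper end of the demand distribution.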
A core design principle of Sara Jr is that autonomous action occurs only when model confidence is sufficiently high. The mechanism implements a three-tier decision framework.
The confidence score is a composite of three signals: (1) Prediction Interval Width: normalised P10–P90 interval width relative to P50; (2) Feature Reliability Score: weighted measure of data source availability and freshness; (3) Historical Accuracy: rolling 13-week MAPE for the specific SKU.
Autonomous execution (≥85%): Forecast published directly to downstream systems. ERP inventory targets updated, production signals sent to Becci, procurement signals to Ari. No human review. Accounts for 96.7% of all production decisions.
Flagged review (65–85%): Forecast published with review flag. Analyst notified with confidence breakdown and recommended action. System proceeds unless overridden within review window.
Escalation (<65%): Forecast withheld from automatic publication. Structured escalation briefing generated containing: plain-language situation summary, three alternative forecast scenarios with probability weighting, primary uncertainty driver, supporting data, and recommended course of action.
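The three-tier gate above can be expressed as a small routing function. The tier boundaries (85% and 65%) come from the text; everything else in this sketch is an assumption made for illustration, including the names, the way each signal is mapped onto a 0 to 1 scale, and the 0.4/0.3/0.3 weights, since the source does not state how the three signals are combined.

```python
from dataclasses import dataclass


@dataclass
class ConfidenceInputs:
    p10: float
    p50: float
    p90: float
    feature_reliability: float  # 0..1, availability and freshness of sources
    rolling_mape: float         # rolling 13-week MAPE as a fraction, e.g. 0.04


def interval_score(p10, p50, p90):
    """Map relative P10-P90 width to 0..1; narrower intervals score higher."""
    if p50 <= 0:
        return 0.0
    relative_width = (p90 - p10) / p50
    return max(0.0, 1.0 - relative_width)


def confidence_score(x, weights=(0.4, 0.3, 0.3)):
    """Weighted composite of the three signals (weights are assumed)."""
    reliability = min(1.0, max(0.0, x.feature_reliability))
    accuracy = max(0.0, 1.0 - x.rolling_mape)
    parts = (interval_score(x.p10, x.p50, x.p90), reliability, accuracy)
    return sum(w * p for w, p in zip(weights, parts))


def decide(score):
    """Route a forecast using the tier thresholds given in the text."""
    if score >= 0.85:
        return "autonomous"
    if score >= 0.65:
        return "flagged_review"
    return "escalation"
```

With P10=90, P50=100, P90=110, full feature reliability, and a 4% rolling MAPE, the composite under these assumed weights is about 0.908, which routes to autonomous execution.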
Sara Jr operates as the primary demand signal source within the VYUH multi-agent framework. Forecast outputs are published to a shared agent state layer subscribed to by Ari, Becci, Cho, and Tom. A demand signal update from Sara Jr propagates across the network within a single cycle: typically, each dependent agent has updated its own plans within 8 minutes of Sara Jr's forecast publication.
The inter-agent communication protocol uses a structured signal format with standardised fields: SKU identifier, forecast horizon, point forecast, confidence interval bounds, confidence score, signal type (routine/anomaly/escalation), and downstream action recommendation.
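One way to picture the signal format is as a typed record with light validation. The field list follows the text; the class name, field names, JSON serialisation, and the choice of weeks as the horizon unit are assumptions for this sketch, not the actual VYUH wire format.

```python
import json
from dataclasses import asdict, dataclass
from enum import Enum


class SignalType(str, Enum):
    ROUTINE = "routine"
    ANOMALY = "anomaly"
    ESCALATION = "escalation"


@dataclass(frozen=True)
class DemandSignal:
    sku_id: str
    horizon_weeks: int          # forecast horizon (unit assumed to be weeks)
    point_forecast: float
    interval_lower: float       # confidence interval bounds
    interval_upper: float
    confidence_score: float     # 0..1
    signal_type: SignalType
    recommended_action: str     # downstream action recommendation

    def __post_init__(self):
        # Reject internally inconsistent signals before they are published.
        if not self.interval_lower <= self.point_forecast <= self.interval_upper:
            raise ValueError("point forecast must lie within the interval bounds")
        if not 0.0 <= self.confidence_score <= 1.0:
            raise ValueError("confidence score must be between 0 and 1")

    def to_json(self):
        """Serialise to a JSON string with the signal type as plain text."""
        payload = asdict(self)
        payload["signal_type"] = self.signal_type.value
        return json.dumps(payload, sort_keys=True)
```

Validating at construction time means a subscribing agent never has to handle a signal whose point forecast falls outside its own interval.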
Sara Jr performs less effectively in four documented edge cases: new product introductions with no demand history; black swan demand events outside the training distribution; highly intermittent demand (Croston-class) SKUs handled by a specialist sub-module; and data source outages degrading more than 40% of features, which trigger automatic escalation regardless of TFT output confidence.
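The outage rule in the last edge case acts as an override that sits in front of the confidence gate. A minimal sketch, assuming a hypothetical per-feature status map in which True marks a degraded feature:

```python
def degraded_share(feature_degraded):
    """Fraction of features currently marked as degraded.

    feature_degraded maps feature name -> bool (True means degraded).
    """
    if not feature_degraded:
        return 0.0
    return sum(1 for bad in feature_degraded.values() if bad) / len(feature_degraded)


def must_escalate(feature_degraded, limit=0.40):
    """True when more than `limit` of features are degraded.

    The 40% limit is from the text; when this returns True the forecast
    is escalated regardless of the model's own confidence.
    """
    return degraded_share(feature_degraded) > limit
```

Because the text says "more than 40%", the comparison is strict: with 340 features, 136 degraded features do not trigger the override, but 137 do.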