TECHNICAL WHITE PAPER · VYUH NEURAL NETWORK

Sara Jr: Architecture and Methodology of an Autonomous Demand Planning Agent for Enterprise Supply Chains

VYUH Research — Manish Varma Datla · Unchaining Supply Chains · unchainingsupplychains.com

ABSTRACT

This paper presents Sara Jr, an autonomous demand planning agent within the VYUH Supply Chain Neural Network. Sara Jr employs a Temporal Fusion Transformer (TFT) architecture to generate calibrated probabilistic demand forecasts across 847 stock-keeping units (SKUs) at six-hour intervals, achieving a mean absolute percentage error (MAPE) of 4.2% against an industry baseline of 18%. We describe the complete technical architecture, including the data ingestion pipeline, feature engineering methodology, TFT model specification, confidence gating mechanism, escalation protocol, inter-agent signalling framework, and self-improving learning loop. Sara Jr operates at a 96.7% autonomous decision rate, escalating only those forecasting decisions where model confidence falls below a calibrated threshold. At a scale of £10B in annual revenue, deployment has demonstrated £280M in working capital savings and a 23% reduction in annual stockout frequency.

1. Introduction

Demand planning represents one of the most consequential functions in modern supply chain management. Errors in demand forecasting propagate upstream to procurement and production and downstream to inventory positioning and customer service levels. The consequences of systematic forecast error are well documented: excess inventory carrying costs, stockout-driven revenue loss, and the organisational friction of reactive, crisis-driven operations.

Traditional demand planning processes rely on statistical methods applied by human analysts within enterprise resource planning (ERP) systems, typically operating on weekly or monthly cycles. This approach suffers from three structural limitations. First, temporal resolution: human-driven processes create inherent lag between demand signal detection and operational response. Second, signal scope: human analysts can realistically monitor a limited number of data sources, leaving significant predictive signal unconsumed. Third, scalability: as product portfolios grow, the cognitive load of managing individual SKU forecasts increases non-linearly, forcing analysts towards portfolio-level abstractions that sacrifice granularity.

Sara Jr addresses these limitations through autonomous, continuous forecasting powered by a Temporal Fusion Transformer operating across all SKUs simultaneously, with confidence-gated autonomous execution and structured escalation for uncertain cases.

2. Architecture Overview

Sara Jr is implemented as a six-layer autonomous agent within the VYUH Neural Network framework. The architecture comprises: (1) a real-time data ingestion layer, (2) a feature engineering pipeline, (3) a Temporal Fusion Transformer inference engine, (4) a calibrated uncertainty quantification module, (5) a confidence-gated decision execution layer, and (6) a continuous learning loop. The agent operates on a six-hour primary cadence, with real-time event processing for anomaly detection. Total time from data ingestion to forecast publication averages 4.2 minutes for a full 847-SKU run; this end-to-end figure is distinct from the model inference time reported in Section 5.2.

2.1 Architecture Diagram

[Figure: Sara Jr processing pipeline, from raw data inputs through to autonomous execution and inter-agent signalling.]

3. Data Ingestion Layer

3.1 Connected Data Sources

Sara Jr ingests signals from eight primary source categories totalling 340+ individual features: ERP transactional data (SAP S/4HANA, Oracle Fusion, Microsoft Dynamics 365), point-of-sale and e-commerce platforms (EDI retail feeds, Shopify API, Amazon Selling Partner API), weather and climate data (NOAA, Met Office), promotional calendars (internal marketing system integration), market intelligence (news APIs, competitor price monitoring), customer collaborative forecasts (EDI 830 demand signals), live inventory positions from Cho (VYUH Inventory Agent), and macroeconomic indicators (World Bank API, PMI data feeds).

3.2 Data Quality Management

All ingested data passes through a validation pipeline performing: schema validation and type coercion; outlier detection using modified z-score with adaptive thresholds per source; missing value imputation using forward-fill with exponential decay weighting; and data freshness monitoring with automatic source degradation flags. Sources flagged as stale are down-weighted rather than excluded — preserving partial signal value whilst preventing quality issues from propagating into the forecast.
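Two of the validation steps above can be sketched in a few lines. This is a minimal illustration, not the production pipeline: the function names, the 3.5 cut-off (the conventional default for the modified z-score), and the half-life parameter are assumptions. "Exponential decay weighting" is read here as a reliability weight that shrinks the longer a value has been carried forward, which is one plausible interpretation of the description.

```python
import statistics
from typing import List, Optional, Tuple


def modified_z_scores(values: List[float]) -> List[float]:
    """Modified z-score: 0.6745 * (x - median) / MAD."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        # No spread around the median: nothing can be scored as an outlier.
        return [0.0 for _ in values]
    return [0.6745 * (v - med) / mad for v in values]


def flag_outliers(values: List[float], threshold: float = 3.5) -> List[bool]:
    """Flag points whose |modified z| exceeds a per-source threshold.

    The threshold is a parameter so each source can carry its own value
    (assumed mechanism for the "adaptive thresholds per source").
    """
    return [abs(z) > threshold for z in modified_z_scores(values)]


def forward_fill_with_decay(
    values: List[Optional[float]], half_life: float = 2.0
) -> Tuple[List[Optional[float]], List[float]]:
    """Carry the last observation forward and attach a decaying weight.

    Observed points get weight 1.0; a value carried forward for `gap`
    steps gets weight 0.5 ** (gap / half_life). Leading gaps stay None
    with weight 0.0.
    """
    filled: List[Optional[float]] = []
    weights: List[float] = []
    last: Optional[float] = None
    gap = 0
    for v in values:
        if v is not None:
            last, gap = v, 0
            filled.append(v)
            weights.append(1.0)
        elif last is None:
            filled.append(None)
            weights.append(0.0)
        else:
            gap += 1
            filled.append(last)
            weights.append(0.5 ** (gap / half_life))
    return filled, weights
```

The same weight could also serve to down-weight stale sources rather than excluding them, in the spirit of the paragraph above.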

4. Feature Engineering Pipeline

Raw data inputs are transformed into a structured feature set before TFT inference. The pipeline constructs 340+ features per SKU across five categories.

4.1 Temporal Features

Lag features at intervals of 1, 2, 4, 8, 13, and 52 weeks capture autocorrelative demand structure. Rolling mean and standard deviation windows of 4, 8, and 13 weeks provide moving average baselines. Fourier-transformed seasonality indices capture weekly, monthly, quarterly, and annual periodicity. Trend decomposition using STL (Seasonal and Trend decomposition using Loess) provides deseasonalised trend signals.
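As a rough sketch of how the lag, rolling-window, and seasonality features might be assembled for one SKU, the example below works on a plain list of weekly demand values. The feature names and the choice of two annual harmonics are assumptions; sine/cosine Fourier terms stand in for the seasonality indices, and STL decomposition is omitted.

```python
import math
import statistics
from typing import Dict, List

LAGS = (1, 2, 4, 8, 13, 52)   # weeks, as listed in Section 4.1
WINDOWS = (4, 8, 13)          # weeks, as listed in Section 4.1


def temporal_features(weekly_demand: List[float], t: int) -> Dict[str, float]:
    """Build features for week index t using only weeks strictly before t."""
    if t < max(LAGS) or t > len(weekly_demand):
        raise ValueError("need at least 52 weeks of history before t")
    feats: Dict[str, float] = {}
    for lag in LAGS:
        feats[f"lag_{lag}w"] = weekly_demand[t - lag]
    for w in WINDOWS:
        window = weekly_demand[t - w:t]
        feats[f"roll_mean_{w}w"] = statistics.fmean(window)
        feats[f"roll_std_{w}w"] = statistics.pstdev(window)
    # Annual seasonality as Fourier terms (52-week period, 2 harmonics assumed).
    for k in (1, 2):
        angle = 2.0 * math.pi * k * t / 52.0
        feats[f"sin_annual_{k}"] = math.sin(angle)
        feats[f"cos_annual_{k}"] = math.cos(angle)
    return feats
```

Restricting every window to weeks before t keeps the features free of look-ahead leakage, which matters when the same code is used for walk-forward validation.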

4.2 External Covariates

Weather variables are mapped to category-specific demand elasticity coefficients. Promotional uplift curves are constructed per event type using a log-normal decay model fitted to historical promotion response data. Macroeconomic indicators are incorporated as long-horizon covariates with a 12-week lead time assumption.
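The paper does not give the functional form of the promotional uplift curve. One plausible reading of a "log-normal decay model" is an uplift that rises quickly after the promotion starts and then decays with a long tail, shaped like a log-normal density. The sketch below follows that reading; the function name and the parameters `peak_uplift`, `mu`, and `sigma` are hypothetical and would be fitted per event type.

```python
import math


def lognormal_uplift(days_since_start: float, peak_uplift: float,
                     mu: float, sigma: float) -> float:
    """Fractional demand uplift shaped like a log-normal density.

    The curve is rescaled so that its maximum, reached at the log-normal
    mode exp(mu - sigma**2), equals peak_uplift.
    """
    if days_since_start <= 0:
        return 0.0

    def pdf(x: float) -> float:
        return math.exp(-((math.log(x) - mu) ** 2) / (2.0 * sigma ** 2)) / (
            x * sigma * math.sqrt(2.0 * math.pi)
        )

    mode = math.exp(mu - sigma ** 2)
    return peak_uplift * pdf(days_since_start) / pdf(mode)


def promo_adjusted(baseline: float, days_since_start: float,
                   peak_uplift: float, mu: float, sigma: float) -> float:
    """Apply the uplift multiplicatively to a baseline forecast."""
    return baseline * (1.0 + lognormal_uplift(days_since_start, peak_uplift, mu, sigma))
```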

Feature vector per SKU:

F_t = [lag_features(1, 2, 4, 8, 13, 52 wk), rolling_stats(4, 8, 13 wk), seasonality_indices(Fourier), promo_flags, weather_covariates, macro_covariates, cross_sku_signals, known_future_covariates]

Dimensionality: 340+ features × 847 SKUs × 6-hour cadence

5. Model Architecture: Temporal Fusion Transformer

5.1 Model Selection Rationale

The Temporal Fusion Transformer (TFT), introduced by Lim et al. (2021), was selected following evaluation against N-BEATS, DeepAR, WaveNet, Prophet, and traditional statistical methods (ETS, ARIMA family). TFT's advantages for this application are threefold. First, multi-horizon forecasting without autoregressive error accumulation. Second, interpretability through variable importance scores and attention weights — critical for analyst trust and escalation reasoning. Third, mixed covariate handling — TFT natively processes static, observed historical, and known future covariates within a single model architecture.

5.2 Architecture Specification

The deployed TFT uses: encoder/decoder LSTM hidden state dimension of 160; multi-head attention with 4 heads; 2 transformer layers; gated residual network (GRN) hidden dimension of 160; dropout rate of 0.1; quantile outputs at P10, P50, and P90. The model contains approximately 12M parameters and is implemented in PyTorch using the PyTorch Forecasting library. Inference on the full 847-SKU portfolio completes in under 90 seconds on an NVIDIA A100 GPU.

5.3 Training Methodology

The model trains on a minimum of 104 weeks of historical data per SKU using walk-forward validation with 13-week test windows. The loss function is quantile loss (pinball loss) summed across P10, P50, and P90, which directly optimises for calibrated probabilistic output. Training uses the Adam optimiser with an initial learning rate of 1e-3, cosine annealing of the learning rate, and gradient clipping at 0.1. Batch size is 64 time series segments of 52-week length.
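The loss and the validation scheme described above can be illustrated without any deep learning framework. The sketch below is a plain-Python rendering, not the training code: the function names are illustrative, and an expanding training window is assumed for the walk-forward splits because the paper does not say whether the window expands or slides.

```python
from typing import Dict, Iterator, List, Tuple

QUANTILES = (0.1, 0.5, 0.9)  # P10, P50, P90


def pinball_loss(y_true: float, y_pred: float, q: float) -> float:
    """Quantile (pinball) loss for a single observation at quantile q."""
    diff = y_true - y_pred
    return max(q * diff, (q - 1.0) * diff)


def summed_quantile_loss(y_true: List[float],
                         preds: Dict[float, List[float]]) -> float:
    """Pinball loss summed across quantiles, averaged over observations."""
    total = 0.0
    for q in QUANTILES:
        for y, p in zip(y_true, preds[q]):
            total += pinball_loss(y, p, q)
    return total / len(y_true)


def walk_forward_splits(n_weeks: int, min_train: int = 104,
                        test_window: int = 13) -> Iterator[Tuple[range, range]]:
    """Yield (train_idx, test_idx) pairs with an expanding training window."""
    start = min_train
    while start + test_window <= n_weeks:
        yield range(0, start), range(start, start + test_window)
        start += test_window
```

Under-forecasting is penalised more heavily at P90 and over-forecasting more heavily at P10, which is what pushes the three outputs apart into a prediction interval.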

6. Confidence Gating Mechanism

A core design principle of Sara Jr is that autonomous action occurs only when model confidence is sufficiently high. The mechanism implements a three-tier decision framework.

6.1 Confidence Score Construction

The confidence score is a composite of three signals: (1) Prediction Interval Width — normalised P10–P90 interval width relative to P50; (2) Feature Reliability Score — weighted measure of data source availability and freshness; (3) Historical Accuracy — rolling 13-week MAPE for the specific SKU.

Confidence_Score = w₁ × (1 − normalised_PI_width) + w₂ × feature_reliability + w₃ × (1 − rolling_MAPE / baseline_MAPE)

Weights: w₁ = 0.45, w₂ = 0.30, w₃ = 0.25, calibrated on the validation set to minimise escalation error rate.
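A minimal sketch of the composite score follows, using the published weights. Several details are assumptions made for illustration: the interval width is normalised by dividing the relative width (P90 − P10) / P50 by a hypothetical cap `max_rel_width`, each term is clamped to [0, 1], and `baseline_mape` defaults to the 18% industry baseline quoted elsewhere in the paper.

```python
from typing import Tuple


def _clamp(x: float) -> float:
    """Restrict a value to the closed interval [0, 1]."""
    return min(1.0, max(0.0, x))


def confidence_score(
    p10: float,
    p50: float,
    p90: float,
    feature_reliability: float,
    rolling_mape: float,
    baseline_mape: float = 0.18,          # assumed: the 18% baseline
    max_rel_width: float = 1.0,           # hypothetical normalisation cap
    weights: Tuple[float, float, float] = (0.45, 0.30, 0.25),
) -> float:
    """Composite confidence in [0, 1] from interval width, data quality, accuracy."""
    # Relative P10-P90 width; treat a non-positive median as maximally uncertain.
    rel_width = (p90 - p10) / p50 if p50 > 0 else max_rel_width
    norm_width = _clamp(rel_width / max_rel_width)
    accuracy = _clamp(1.0 - rolling_mape / baseline_mape)
    w1, w2, w3 = weights
    return (
        w1 * (1.0 - norm_width)
        + w2 * _clamp(feature_reliability)
        + w3 * accuracy
    )
```

Because the weights sum to 1 and each term is clamped, the score stays within [0, 1] and can be compared directly against the percentage thresholds in Section 6.2.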

6.2 Decision Thresholds

Autonomous execution (≥85%): Forecast published directly to downstream systems. ERP inventory targets updated, production signals sent to Becci, procurement signals to Ari. No human review. Accounts for 96.7% of all production decisions.

Flagged review (≥65% and <85%): Forecast published with review flag. Analyst notified with confidence breakdown and recommended action. System proceeds unless overridden within review window.

Escalation (<65%): Forecast withheld from automatic publication. Structured escalation briefing generated containing: plain-language situation summary, three alternative forecast scenarios with probability weighting, primary uncertainty driver, supporting data, and recommended course of action.
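The three tiers reduce to a simple threshold check. In the sketch below the enum and function names are illustrative; the boundary handling follows the stated thresholds, so a score of exactly 85% executes autonomously and a score of exactly 65% goes to flagged review.

```python
from enum import Enum


class Tier(Enum):
    AUTONOMOUS = "autonomous_execution"
    FLAGGED = "flagged_review"
    ESCALATION = "escalation"


AUTONOMOUS_MIN = 0.85  # >= 85%: publish with no human review
FLAGGED_MIN = 0.65     # >= 65% and < 85%: publish with review flag


def route(score: float) -> Tier:
    """Map a confidence score in [0, 1] to a decision tier."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("confidence score must lie in [0, 1]")
    if score >= AUTONOMOUS_MIN:
        return Tier.AUTONOMOUS
    if score >= FLAGGED_MIN:
        return Tier.FLAGGED
    return Tier.ESCALATION
```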

7. Integration with VYUH Agent Network

Sara Jr operates as the primary demand signal source within the VYUH multi-agent framework. Forecast outputs are published to a shared agent state layer subscribed to by Ari, Becci, Cho, and Tom. A demand signal update from Sara Jr propagates across the network within a single cycle: typically, each dependent agent has updated its own plans within 8 minutes of Sara Jr's forecast publication.

The inter-agent communication protocol uses a structured signal format with standardised fields: SKU identifier, forecast horizon, point forecast, confidence interval bounds, confidence score, signal type (routine/anomaly/escalation), and downstream action recommendation.
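The listed fields map naturally onto a small typed record. The following is a sketch only: the class name, field names, the horizon unit (weeks), and the JSON serialisation are assumptions, since the paper specifies the fields but not their wire format.

```python
import json
from dataclasses import asdict, dataclass
from enum import Enum


class SignalType(Enum):
    ROUTINE = "routine"
    ANOMALY = "anomaly"
    ESCALATION = "escalation"


@dataclass(frozen=True)
class DemandSignal:
    sku_id: str
    horizon_weeks: int          # forecast horizon (unit assumed)
    point_forecast: float       # P50
    interval_lower: float       # P10
    interval_upper: float       # P90
    confidence_score: float     # composite score in [0, 1]
    signal_type: SignalType
    recommended_action: str     # downstream action recommendation

    def __post_init__(self) -> None:
        # Basic consistency checks before the signal is published.
        if not self.interval_lower <= self.point_forecast <= self.interval_upper:
            raise ValueError("point forecast must lie within the interval bounds")
        if not 0.0 <= self.confidence_score <= 1.0:
            raise ValueError("confidence score must lie in [0, 1]")

    def to_json(self) -> str:
        """Serialise to JSON, writing the enum as its string value."""
        payload = asdict(self)
        payload["signal_type"] = self.signal_type.value
        return json.dumps(payload, sort_keys=True)
```

A subscriber could then parse the payload with `json.loads` and branch on `signal_type`.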

8. Performance Benchmarks

MAPE (Mean Absolute Percentage Error): 4.2% at P50 across all SKUs — vs 18.0% baseline
WAPE (Weighted Absolute Percentage Error): 3.8%, weighted by revenue value
Forecast Bias: +0.3% (marginal upward bias, within acceptable tolerances)
Autonomous Decision Rate: 96.7% of decisions executed without human intervention
Escalation Accuracy: 94.2% of escalations were correctly identified as requiring human input
Cycle Time: 6 hours vs 3-week baseline
Working Capital Impact: £280M freed per £10B revenue
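For reference, the three accuracy metrics can be computed as in the sketch below, which uses their standard definitions. Two choices are assumptions: periods with zero actual demand are skipped in MAPE, and revenue weighting in WAPE is implemented by weighting each SKU's errors with a per-unit price.

```python
from typing import List, Optional


def mape(actuals: List[float], forecasts: List[float]) -> float:
    """Mean absolute percentage error, skipping zero-demand observations."""
    pairs = [(a, f) for a, f in zip(actuals, forecasts) if a != 0]
    if not pairs:
        raise ValueError("MAPE is undefined when all actuals are zero")
    return sum(abs(a - f) / abs(a) for a, f in pairs) / len(pairs)


def wape(actuals: List[float], forecasts: List[float],
         weights: Optional[List[float]] = None) -> float:
    """Weighted absolute percentage error: sum(w*|a-f|) / sum(w*|a|)."""
    if weights is None:
        weights = [1.0] * len(actuals)
    num = sum(w * abs(a - f) for a, f, w in zip(actuals, forecasts, weights))
    den = sum(w * abs(a) for a, w in zip(actuals, weights))
    if den == 0:
        raise ValueError("WAPE is undefined when weighted actuals sum to zero")
    return num / den


def forecast_bias(actuals: List[float], forecasts: List[float]) -> float:
    """Signed bias: positive values indicate over-forecasting."""
    total_actual = sum(actuals)
    if total_actual == 0:
        raise ValueError("bias is undefined when actuals sum to zero")
    return (sum(forecasts) - total_actual) / total_actual
```

WAPE is less sensitive than MAPE to low-volume SKUs, where small absolute errors produce large percentage errors.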

9. Limitations and Edge Cases

Sara Jr performs less effectively in four documented edge cases: new product introductions with no demand history; black swan demand events outside the training distribution; highly intermittent demand (Croston-class) SKUs, which are handled by a specialist sub-module; and data source outages that degrade more than 40% of features, which trigger automatic escalation regardless of TFT output confidence.

10. References

Lim, B., Arık, S. Ö., Loeff, N., & Pfister, T. (2021). Temporal Fusion Transformers for interpretable multi-horizon time series forecasting. International Journal of Forecasting, 37(4), 1748–1764.

Makridakis, S., Spiliotis, E., & Assimakopoulos, V. (2022). M5 accuracy competition: Results, findings, and conclusions. International Journal of Forecasting, 38(4), 1346–1364.

Chopra, S., & Meindl, P. (2016). Supply Chain Management: Strategy, Planning, and Operation (6th ed.). Pearson.

Salinas, D., Flunkert, V., Gasthaus, J., & Januschowski, T. (2020). DeepAR: Probabilistic forecasting with autoregressive recurrent networks. International Journal of Forecasting, 36(3), 1181–1191.

Silver, E. A., Pyke, D. F., & Thomas, D. J. (2017). Inventory and Production Management in Supply Chains (4th ed.). CRC Press.
