The Infrastructure-Model Symbiosis: Rethinking Platform Architecture for Billion-Scale Prediction Workloads
Abstract
The clean separation between ML model development and infrastructure engineering, a principle borrowed from traditional software, breaks down at billion-scale prediction workloads. At this scale, deployment constraints do not merely affect performance; they determine whether a model can exist in production at all. This article introduces the concept of infrastructure-model symbiosis, demonstrating how production systems at hyperscale platforms achieve transformative performance improvements through co-design that treats infrastructure as a first-class architectural constraint. Detailed investigation of music streaming recommendations, product recommendation systems, and video streaming platforms reveals that models optimized purely for offline accuracy metrics often fail catastrophically when confronted with production realities. The framework of infrastructure-aware model design encompasses evaluation criteria including memory footprint, serialization overhead, distributed inference coordination costs, and cacheability. Architectural patterns such as two-tower neural networks, hierarchical model cascades, tiered feature materialization pipelines, and dynamic batching mechanisms demonstrate how alignment between model architecture and serving topology enables dramatic improvements in both performance and cost efficiency. Economic analysis establishes that optimal model selection requires balancing accuracy against serving costs at production scale, transforming model development from a purely technical optimization into a strategic resource allocation discipline. Feature infrastructure emerges as a critical bottleneck, often consuming more computational resources than model inference itself and necessitating sophisticated materialization strategies and tiered caching hierarchies. Serving orchestration patterns reconcile the conflicting requirements of batch efficiency and low-latency response through adaptive mechanisms that adjust dynamically to traffic patterns. The synthesis of these elements establishes infrastructure-model co-design as essential for advancing production machine learning systems beyond current barriers.
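The adaptive batching mechanism referenced above can be illustrated with a minimal sketch. This is a hypothetical, simplified implementation, not taken from any production system described in the article: requests accumulate until either a batch-size cap or a latency deadline is reached, trading batch efficiency against low-latency response. All names (`DynamicBatcher`, `max_batch_size`, `max_wait_ms`) are illustrative assumptions.

```python
import time
from collections import deque


class DynamicBatcher:
    """Sketch of adaptive request batching for model serving.

    Requests queue up until the batch is full or the oldest request
    has waited past a deadline, at which point a single batched
    inference call is made. Illustrative only.
    """

    def __init__(self, model_fn, max_batch_size=32, max_wait_ms=5.0):
        self.model_fn = model_fn            # batched inference callable
        self.max_batch_size = max_batch_size
        self.max_wait_ms = max_wait_ms
        self.queue = deque()                # (arrival_time, request) pairs

    def submit(self, request):
        """Enqueue one request with its arrival timestamp."""
        self.queue.append((time.monotonic(), request))

    def maybe_flush(self):
        """Run batched inference if the size cap or the latency
        deadline is hit; otherwise return None and keep waiting."""
        if not self.queue:
            return None
        oldest_age_ms = (time.monotonic() - self.queue[0][0]) * 1000.0
        if len(self.queue) >= self.max_batch_size or oldest_age_ms >= self.max_wait_ms:
            batch = [req for _, req in self.queue]
            self.queue.clear()
            return self.model_fn(batch)
        return None
```

Under low traffic the deadline dominates and requests are served nearly individually; under heavy traffic the size cap dominates and the system amortizes inference cost across full batches, which is the dynamic adjustment to traffic patterns the abstract describes.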