Optimization of MLOps Processes for Product Recommendation Systems under High Load

Dmitrii Timoshenko

Citation: Dmitrii Timoshenko, "Optimization of MLOps Processes for Product Recommendation Systems under High Load", Universal Library of Engineering Technology, Volume 03, Issue 01.

Copyright: This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

This article examines the comprehensive optimization of MLOps processes for high-load product recommendation systems, in which stringent latency SLAs and terabyte-scale embeddings coexist with rapid drift in user preferences and intense business pressure to maximize commercial KPIs. The relevance of the study stems from the fact that classical batch-oriented MLOps practices do not provide the required feature consistency, stable model quality, or predictability of revenue, conversion, and retention metrics under peak loads typical of e-commerce and media services. The study aims to develop a holistic engineering and product-oriented approach to the design of data, inference, and training architectures, encompassing a Feature Store with streaming aggregations, hierarchical parameter servers, algorithmic embedding compression, dynamic batching and concurrent model execution, vector search over embeddings, as well as drift monitoring loops and continuous (online) training. The scientific contribution lies in integrating hardware-oriented optimizations and process-centric MLOps methodologies into a unified RecSys design standard that simultaneously improves throughput and reduces latency while maintaining recommendation quality and stabilizing key business metrics. The article is intended for data and MLOps engineers, recommendation system architects, and technical leaders of digital products and marketing teams.

Keywords: MLOps, Recommendation Systems, DLRM, High-Load Systems.