A Survey of Distributed Caching Patterns for High-Throughput Python Applications

Mykhaylo Kurtikov

Citation: Mykhaylo Kurtikov, "A Survey of Distributed Caching Patterns for High-Throughput Python Applications", Universal Library of Engineering Technology, Volume 02, Issue 04.

Copyright: This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

This article surveys distributed-caching patterns tailored to high-throughput Python applications. Its relevance stems from ever-growing data volumes and the need to cut access latencies without sacrificing consistency. The novelty lies in synthesizing findings from ten recent studies, from the linearizably consistent schemes of Repin & Sidorov [9] to the learnable GL-Cache of Yang et al. We describe architectural topologies (peer-to-peer, hierarchical, sharding), compare eviction algorithms (LRU, LFU, ARC, TLRU, and machine-learning-driven approaches), and evaluate key metrics such as hit ratio, latency, and throughput. Special attention is paid to CPython's limitations when implementing cache layers, as well as to the advantages of ProxyStore, Acorn, and fine-grained RDataFrame caching. Our goal is to offer practitioners clear guidance on selecting the right caching pattern for different load profiles. To that end, we employ comparative analysis, content analysis, and analytical synthesis. In conclusion, we present actionable recommendations for distributed-system architects, data engineers, and researchers optimizing the Python stack in production.
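To make the compared notions concrete: LRU is the baseline eviction policy against which the surveyed alternatives (LFU, ARC, TLRU, learned policies) are typically measured, and hit ratio is the fraction of lookups served from the cache. The sketch below is an illustrative minimal LRU cache with hit-ratio accounting; it is not an implementation from any of the surveyed systems, and the class and method names are the author's own for illustration.

```python
from collections import OrderedDict


class LRUCache:
    """Minimal LRU cache: evicts the least recently used entry when full."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._store: OrderedDict = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self._store:
            self._store.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self._store[key]
        self.misses += 1
        return None

    def put(self, key, value):
        if key in self._store:
            self._store.move_to_end(key)
        self._store[key] = value
        if len(self._store) > self.capacity:
            self._store.popitem(last=False)  # drop the least recently used item

    @property
    def hit_ratio(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

Distributed variants layer sharding, replication, and a consistency protocol on top of exactly this per-node behavior, which is why the single-node eviction policy remains a meaningful unit of comparison.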


Keywords: Distributed Caching; Python; Hit Ratio; GL-Cache; Raft Consistency; Machine Learning; CPython Performance; Spark SQL Caching; Edge Cache; Sharding.

DOI: https://doi.org/10.70315/uloap.ulete.2025.0204008