Data-Centric AI In Distributed Systems: Bridging the Gap Between Performance and Scalability
Abstract
Maximizing performance and achieving scale in large distributed and cloud systems is a difficult task that requires novel data- and heuristics-driven solutions. As data volumes grow ever faster and node workloads become more flexible, conventional approaches to managing distributed systems cannot keep up with the demands for higher performance and workload flexibility. This paper discusses the central role of data-centric AI in addressing these challenges, with special emphasis on intelligent data partitioning, adaptive data indexing, and efficient caching. By integrating machine learning models with distributed-systems concepts, the study puts forward an architecture that adapts to workload changes in real time and optimizes its data-handling strategy accordingly. This approach helps improve system scalability, manage resources more effectively, reduce latency, and increase overall system performance. Based on the findings of the study, further improvements in system and resource utilization are identified, underlining the role of data-centric AI in redesigning DC-based distributed cloud systems. These conclusions highlight the need for intelligent, adaptive, scale-out data strategies in future cloud architectures that must handle current and emerging applications.