Performing Initiative Data Prefetching in Distributed File Systems for Cloud Computing (Sep-2017)
The main aim of the project is for the storage servers of a cloud distributed file system to push prefetched data to the relevant client machines proactively. To make this possible, information about client nodes is piggybacked onto the real client I/O requests and forwarded to the relevant storage server.
Data prefetching mechanisms have been proposed to hide the latency that network communication and disk operations cause in distributed file systems. In conventional prefetching mechanisms, the client file system is supposed to predict future access by analyzing the history of past I/O accesses, without any application intervention. The client file system may then send I/O requests to the storage servers to read the relevant data.
The proposed mechanism instead performs data prefetching on the storage servers. It first analyzes disk I/O tracks to predict future disk I/O accesses, so that the storage servers can fetch data in advance and then forward the prefetched data to the relevant client file systems for potential future use. Client disk I/O access operations are modeled and classified into two kinds of access patterns. To predict the future I/O accesses that belong to each pattern as accurately as possible, two prediction algorithms have been proposed, one for each pattern: the chaotic time series prediction algorithm and the linear regression prediction algorithm. Client file systems do not intervene, except for piggybacking their information onto the relevant I/O requests sent to the storage server. The storage servers are supposed to log disk I/O accesses and classify access patterns after modeling disk I/O, and they proactively forward the prefetched data to the relevant client file systems to satisfy future application requests.
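As a rough illustration of the server-side flow described above, the following minimal Python sketch logs read requests together with the piggybacked client identifier and uses an ordinary least-squares line to guess the next read offset for one client. The record layout (`IORecord`), the function names, and the choice to regress the offset on the request sequence number are all assumptions made for this example; the text names linear regression as one of the two prediction algorithms but does not specify its inputs, and the chaotic time series predictor and the access-pattern classifier are not shown.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class IORecord:
    """One logged read request (hypothetical layout, not from the source)."""
    client_id: str  # client information piggybacked onto the I/O request
    offset: int     # starting byte offset of the read
    length: int     # number of bytes requested


def predict_next_offset(offsets: List[int]) -> Optional[int]:
    """Fit offset = a + b * i by least squares and extrapolate to i = n."""
    n = len(offsets)
    if n < 2:
        return None  # not enough history to fit a line
    mean_x = (n - 1) / 2
    mean_y = sum(offsets) / n
    sxx = sum((i - mean_x) ** 2 for i in range(n))
    sxy = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(offsets))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return round(intercept + slope * n)


def plan_prefetch(log: List[IORecord], client_id: str) -> Optional[IORecord]:
    """Return the read a server could issue in advance for one client."""
    history = [r for r in log if r.client_id == client_id]
    if not history:
        return None
    predicted = predict_next_offset([r.offset for r in history])
    if predicted is None or predicted < 0:
        return None
    # Assumption: the next request has the same length as the last one.
    return IORecord(client_id, predicted, history[-1].length)
```

For a client that has read at offsets 0, 4096 and 8192, `predict_next_offset` extrapolates the fitted line to 12288, and `plan_prefetch` returns a record the server could use to fetch that block and forward it to the client before the request arrives.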