Performing Initiative Data Prefetching in Distributed File Systems for Cloud Computing (Sep-2017)


The main aim of the project is for the storage servers of a cloud distributed file system to send prefetched data to the relevant client machines proactively. The information about client nodes is piggybacked onto the real client I/O requests and then forwarded to the relevant storage server.
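As a rough illustration of the piggybacking idea, the sketch below attaches client information to an ordinary read request so that the storage server learns which client issued it. The names `ClientInfo`, `ReadRequest`, and `StorageServer`, and the fields they carry, are hypothetical and chosen only for this example; the project description does not specify a message format.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ClientInfo:
    """Hypothetical client description piggybacked onto each request."""
    client_id: str
    address: str


@dataclass(frozen=True)
class ReadRequest:
    """An ordinary read request that also carries the client's information."""
    path: str
    offset: int
    length: int
    client: ClientInfo  # the piggybacked part


class StorageServer:
    """Toy server: records who asked for what, then serves the read."""

    def __init__(self) -> None:
        self.access_log: List[ReadRequest] = []
        self.known_clients: Dict[str, ClientInfo] = {}

    def handle_read(self, request: ReadRequest) -> bytes:
        # Remember the client so prefetched data can later be pushed to it.
        self.known_clients[request.client.client_id] = request.client
        # Log the access for later pattern analysis.
        self.access_log.append(request)
        # Placeholder payload; a real server would read from disk here.
        return bytes(request.length)


server = StorageServer()
client = ClientInfo(client_id="node-1", address="10.0.0.5")
data = server.handle_read(ReadRequest("/data/file.bin", 0, 4096, client))
```

The client does no extra work beyond filling in the `client` field, which matches the idea that clients are not otherwise involved in prefetching.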

Proposed system:

Data prefetching mechanisms have been proposed to hide the latency in distributed file systems caused by network communication and disk operations. In conventional prefetching mechanisms, the client file system is supposed to predict future access by analyzing the history of past I/O access, without any application intervention. The client file system may then send the relevant I/O requests to the storage servers to read the relevant data.

The proposed prefetching mechanism first analyzes disk I/O tracks to predict future disk I/O access, so that the storage servers can fetch data in advance and then forward the prefetched data to the relevant client file systems for potential future use. Disk I/O access operations are modeled and classified into two kinds of access patterns. To predict the future I/O access belonging to each pattern as accurately as possible, two prediction algorithms have been proposed, one for each pattern: the chaotic time series prediction algorithm and the linear regression prediction algorithm. Data prefetching takes place on the storage servers, without any intervention from client file systems except for piggybacking their information onto the relevant I/O requests to the storage server. The storage servers log disk I/O access and classify access patterns after modeling disk I/O. They then proactively forward the prefetched data to the relevant client file systems to satisfy future application requests.
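To make the linear regression step more concrete, here is a minimal sketch that fits a straight line to the offsets of recently logged reads and extrapolates the next offset to prefetch. It is an illustration under assumptions, not the project's implementation: the function names `fit_line` and `predict_next_offset`, the use of the access index as the regressor, and the sample offsets are all hypothetical. The chaotic time series algorithm is not shown.

```python
from typing import Sequence, Tuple


def fit_line(ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept,
    where x is the position (0, 1, 2, ...) of each logged access."""
    n = len(ys)
    if n < 2:
        raise ValueError("need at least two logged accesses to fit a line")
    mean_x = (n - 1) / 2
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_next_offset(offsets: Sequence[int]) -> int:
    """Extrapolate the fitted line one step past the last logged access."""
    slope, intercept = fit_line(offsets)
    return round(slope * len(offsets) + intercept)


# Hypothetical log of read offsets from one client, 4 KiB apart.
logged_offsets = [0, 4096, 8192, 12288]
next_offset = predict_next_offset(logged_offsets)
```

For this evenly spaced log the fitted line passes through every point, so the predicted offset is simply the next 4 KiB step. A server following this approach would read that region ahead of time and push it to the client recorded in the piggybacked request information.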