Keeping critical data safe and accessible from several locations has become a global concern, whether the data is personal, organizational, or generated by applications. In response, on-line storage services have emerged. In addition, the new paradigm of Cloud Computing brings new ideas for building services that allow users to store their data and run their applications in the Cloud. Smart and efficient management of the storage behind these services can improve the quality of service offered and optimize the usage of the infrastructure on which the services run. This management becomes even more critical and complex when the infrastructure is composed of thousands of nodes running several virtual machines and sharing the same storage. Eliminating redundant data from these services' storage can simplify and enhance this management. This review study presents a solution to detect and eliminate duplicated data between virtual machines that run on the same physical host and write their virtual disk data to a shared storage. Finally, it describes a study that compares the efficiency of two different approaches to eliminating redundant data in a personal data set.
Keywords: Cloud Computing; Data Redundancy; Virtual Machines.