Centre for Artificial Intelligence of UNL

Probabilistic deduplication for cluster-based storage systems

Main information

By: Konstantinos Kloudas (CITI, Universidade Nova de Lisboa)

Date: Wednesday, 29th of January 2014, 14h00

Location: FCT/UNL, Seminar Room (Ed. II)

Abstract

The need to back up huge quantities of data, which keep growing by the minute, has led to the development of a number of distributed deduplication techniques that aim to reproduce the operation of centralized, single-node backup systems in a cluster-based environment. At one extreme, stateful solutions rely on indexing mechanisms to maximize deduplication. However, the cost of these strategies in computation and memory makes them unsuitable for large-scale storage systems. At the other extreme, stateless strategies place data blocks based only on their content, without taking previous placement decisions into account, which reduces the cost but also the effectiveness of deduplication.

In this talk, I will present Produck, a stateful yet lightweight cluster-based backup system that provides deduplication rates close to those of a single-node system at a very low computational cost and with minimal memory overhead. To achieve this, Produck introduces two novel mechanisms: i) a lightweight probabilistic node-assignment mechanism and ii) a new bucket-based load-balancing strategy. The former allows Produck to quickly identify the servers that can provide the highest deduplication rates for a given data block. The latter spreads the load evenly among the nodes.
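
The contrast between stateless and stateful placement can be illustrated with a small sketch. This is not Produck's algorithm, which the abstract does not spell out. It is a toy stand-in under stated assumptions: chunks are identified by SHA-1 fingerprints, each node keeps a sampled summary of its fingerprints (the hypothetical `sample_rate` parameter), and ties are broken in favour of the least-loaded node. All names below (`Node`, `stateless_node`, `stateful_node`) are invented for illustration.

```python
import hashlib


def fingerprint(chunk: bytes) -> int:
    """Content fingerprint of a chunk (SHA-1 digest as an integer)."""
    return int.from_bytes(hashlib.sha1(chunk).digest(), "big")


def stateless_node(chunk: bytes, num_nodes: int) -> int:
    """Stateless routing: the node depends only on the chunk's content."""
    return fingerprint(chunk) % num_nodes


class Node:
    """Toy storage node that keeps a sampled summary of what it stores."""

    def __init__(self, sample_rate: int = 4):
        self.sample_rate = sample_rate  # assumption: keep roughly 1 in sample_rate fingerprints
        self.stored = set()             # every fingerprint held by this node
        self.summary = set()            # small sampled subset used for routing

    def _sample(self, fps):
        return {fp for fp in fps if fp % self.sample_rate == 0}

    def estimated_overlap(self, fps) -> int:
        """Estimate how many of these fingerprints the node already holds."""
        return len(self._sample(fps) & self.summary) * self.sample_rate

    def store(self, fps) -> int:
        """Store a group of fingerprints; return how many were actually new."""
        new = set(fps) - self.stored
        self.stored |= new
        self.summary |= self._sample(new)
        return len(new)


def stateful_node(fps, nodes) -> int:
    """Stateful routing: prefer the highest estimated overlap, then the lowest load."""
    def score(i):
        return (nodes[i].estimated_overlap(fps), -len(nodes[i].stored))
    return max(range(len(nodes)), key=score)


if __name__ == "__main__":
    nodes = [Node(sample_rate=1) for _ in range(3)]
    backup = [fingerprint(b"chunk-%d" % i) for i in range(50)]
    nodes[1].store(backup)
    # A second backup with the same content is sent to the node that already holds it.
    target = stateful_node(backup, nodes)
    print(target, nodes[target].store(backup))
```

In this sketch the stateless router needs no memory at all, while the stateful router trades a small per-node summary for the ability to send similar data to the node that already holds it.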

Short-bio

I am currently a Post-Doctoral Researcher at Nova University of Lisbon in the group of Prof. Rodrigo Rodrigues, which I joined in June 2013. Before that, I completed my PhD at INRIA Rennes in March 2013 under the supervision of Anne-Marie Kermarrec. The title of my thesis was “Leveraging Content Properties to Optimize Distributed Storage Systems”. I received my diploma in Electrical and Computer Engineering from the National Technical University of Athens, Greece, in June 2009.

Centre for Artificial Intelligence of UNL
Departamento de Informática, FCT/UNL
Quinta da Torre 2829-516 CAPARICA - Portugal
Tel. (+351) 21 294 8536 FAX (+351) 21 294 8541
