
Autonomic Computing

Participants

David, Cécile Germain-Renaud, Balázs Kégl, Michèle Sebag

Former participants

Tamas Elteto, Xiangliang Zhang, Julien Pérez


Research Themes

Autonomic computing (Kephart & Chess 2003) targets self-optimization, self-configuration, self-healing, and self-protection of computing systems, mostly distributed. As a specialization to the grid context, our research targets self-regulation and maintenance under the constraints of production grids.

The motivation for modelling the grid through statistical analysis, machine learning and, to some extent, data mining is the demonstrated complexity of its individual components, which is amplified by their interactions. Given the wide range of dynamic resources (basic hardware and middleware, but also software usage rights, sensors, etc.), the collective behaviour of e-science users, and the resource-sharing paradigm on which the grid concept is founded, such a large distributed system cannot be modelled through a priori analysis alone.

The operational basis for this research is the Grid Observatory activity, chaired by C. Germain and created within the EGEE-III project. The goal of this activity is to integrate the collection and publication of data on the behaviour of grid users, exploiting the rich monitoring infrastructure of EGEE, with the development of models and of an ontology for the domain knowledge.

Concise representations of the grid behaviour (filtering, dimensionality reduction, possibly clustering) are necessary in order to make both analysis and publication manageable. Collaborations with research projects have already started (CoreGrid, KDUbiq) in order to develop the relevant concepts and schemas for each domain.

The second step is to characterize the quantities that are used internally in on-line grid management, such as job duration or the volume of data transfers, and those that are critical to dimensioning, such as cluster and queue loads. Both intrinsic descriptions (the requests) and middleware-dependent metrics must be addressed. Preliminary studies (Julien's poster) have shown that, even for these simple components, the characterization is far from trivial, long-range dependence being the norm.
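As an illustration of what testing for long-range dependence can involve, the sketch below estimates the Hurst exponent H of a series with the aggregated-variance method (H near 0.5 suggests short memory, H approaching 1 suggests strong persistence). This is only one of several possible estimators and is not claimed to be the one used in the preliminary studies; the function name, the block sizes, and the synthetic "job durations" are assumptions made for the example.

```python
import math
import random
import statistics


def aggregated_variance_hurst(series, block_sizes=(4, 8, 16, 32, 64, 128)):
    """Estimate the Hurst exponent H with the aggregated-variance method.

    For each block size m the series is cut into non-overlapping blocks and
    the variance of the block means is computed. For a long-range dependent
    series this variance decays like m**(2H - 2), so H is read off the slope
    of a least-squares fit in log-log coordinates.
    """
    log_m, log_var = [], []
    for m in block_sizes:
        n_blocks = len(series) // m
        if n_blocks < 2:
            continue  # not enough blocks to estimate a variance
        means = [sum(series[i * m:(i + 1) * m]) / m for i in range(n_blocks)]
        var = statistics.variance(means)
        if var > 0:
            log_m.append(math.log(m))
            log_var.append(math.log(var))
    if len(log_m) < 2:
        raise ValueError("series too short for the chosen block sizes")
    # Ordinary least-squares slope of log(variance) against log(block size).
    mean_x = sum(log_m) / len(log_m)
    mean_y = sum(log_var) / len(log_var)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(log_m, log_var))
             / sum((x - mean_x) ** 2 for x in log_m))
    return 1.0 + slope / 2.0


rng = random.Random(0)
# Hypothetical stand-in for a trace of job durations: independent draws,
# so the estimate is expected to stay close to 0.5.
iid_durations = [rng.expovariate(1.0) for _ in range(8192)]
h_iid = aggregated_variance_hurst(iid_durations)
print(f"H estimate for independent data: {h_iid:.2f}")
```

On a real trace, an estimate well above 0.5 across a range of block sizes would be consistent with long-range dependence, although a single estimator is rarely conclusive on its own.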

The next step is to go beyond this collection of profiles by characterizing, and finally explaining, their interactions. Integrating the grid-specific concept of Virtual Organization (VO) might be a key point. Grid users are organized into VOs, which define the qualitative (granted or denied) as well as the quantitative (share) software and hardware access rights. Correlated activity (computation, file access, database requests) is created by the common timelines of related institutions or individuals; these correlations are both temporal (deadlines, “interesting” experimental events) and spatial (the researchers, the data, and the available computing power of a VO are not uniformly distributed over the grid). The characteristics of complex systems also appear in the VO structure, with small-world graphs (Foster).

The two main application areas are scheduling on one hand, and real-time fault diagnosis and prediction on the other hand.

Scheduling

  • Optimization is required to assess the effectiveness of grid scheduling policies. Post-mortem analysis, where all information is derived from the analysis of traces, is just an instance of a classical optimization problem, for which the grid scale will require efficient approximation methods.
  • In an on-line setting, and in relation with the reactive grid framework (reference to the section on acting), the scheduler should learn on-line an optimal policy that maximizes both long-term expected productivity and utility (Jensen, Vengerov) as a function of the current state of the system; describing that state is the goal of the above-mentioned models.
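To make the on-line setting concrete, here is a minimal sketch of a scheduler that learns a policy by tabular Q-learning with epsilon-greedy exploration. It is a toy under stated assumptions, not the method of the papers listed below: the two-state model, the utility values, the arrival probability, and all names are hypothetical.

```python
import random

# Toy model, all values hypothetical: state 1 means an interactive job is
# waiting next to the batch backlog, state 0 means only batch jobs wait.
STATES = (0, 1)
ACTIONS = (0, 1)  # 0 = run a batch job, 1 = run an interactive job


def utility(state, action):
    """Immediate utility of one scheduling decision (assumed values)."""
    if action == 0:
        return 0.3  # batch work: modest utility, always available
    # Interactive slot: high utility if a job waits, otherwise an idle slot.
    return 1.0 if state == 1 else 0.0


def q_learning(steps=20000, alpha=0.05, gamma=0.5, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    state = 0
    for _ in range(steps):
        # Epsilon-greedy action selection.
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: q[(state, a)])
        reward = utility(state, action)
        # Assumed arrival process: an interactive job is waiting at the next
        # step with probability 0.5, independently of the action taken.
        next_state = 1 if rng.random() < 0.5 else 0
        best_next = max(q[(next_state, a)] for a in ACTIONS)
        # Standard Q-learning update toward the bootstrapped target.
        q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
        state = next_state
    return q


q_table = q_learning()
policy = {s: max(ACTIONS, key=lambda a: q_table[(s, a)]) for s in STATES}
print(policy)
```

In this toy the learned policy should serve interactive jobs whenever one is waiting and fall back to batch work otherwise. A realistic state would summarize queue lengths, resource shares and workload forecasts, which is where the models discussed above come in.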

Fault diagnosis and prediction

Efficient end-to-end probing, where commands or transactions are sent from highly reliable sources and their results analyzed on-line, requires an adaptive and hierarchical probing scheme (Rish). An alternative approach is to exploit the actual production activity of the grid as probes (Schuster). The final step is to integrate both approaches, typically by treating the optimal balance of probes and passive analysis as a multi-criteria optimization problem.

The following challenges can be envisioned.

  • The size of the datasets, even after filtering, will require scaling the standard techniques usually used on small or moderately large sets. In an interesting interplay between the grid as an object of research and the grid as a tool, it might be possible to use the grid itself as a computational resource to speed up the analysis.
  • It will be necessary to adapt the algorithms so that they can deal with the structured nature of the data.
  • Since the ultimate goal of this research project is to contribute to the understanding of the grid, it is very desirable to create models that can be interpreted by human experts.
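Returning to the balance between active probes and passive analysis mentioned above, a multi-criteria formulation typically looks for the set of non-dominated trade-offs rather than a single optimum. The sketch below filters a Pareto front over two criteria; the candidate plans, their numbers, and the units are invented for illustration only.

```python
from typing import NamedTuple


class ProbingPlan(NamedTuple):
    name: str
    probe_load: float       # extra probe jobs per hour (assumed unit)
    detection_delay: float  # expected minutes before a fault is noticed (assumed unit)


def dominates(a, b):
    """True if plan a is no worse than b on both criteria and strictly better on one."""
    no_worse = a.probe_load <= b.probe_load and a.detection_delay <= b.detection_delay
    better = a.probe_load < b.probe_load or a.detection_delay < b.detection_delay
    return no_worse and better


def pareto_front(plans):
    """Keep only the plans that no other plan dominates."""
    return [p for p in plans if not any(dominates(q, p) for q in plans)]


# Hypothetical candidate mixes of active probes and passive trace analysis.
candidates = [
    ProbingPlan("passive only", 0.0, 45.0),
    ProbingPlan("light probing", 10.0, 20.0),
    ProbingPlan("wasteful mix", 30.0, 25.0),
    ProbingPlan("heavy probing", 60.0, 5.0),
]
front = pareto_front(candidates)
print([p.name for p in front])
```

Here the "wasteful mix" is discarded because "light probing" is better on both criteria; choosing among the remaining plans depends on how operators weigh probe overhead against detection delay.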

Related projects

The Grid Observatory project is supported by the EU project EGEE, by the DIGITEO foundation, by the DEMAIN (des DonnéEs Massives Aux INterprétations) project of Université Paris-Sud, and by CNRS under the PEPS scheme.



Publications

Xiangliang Zhang, Cyril Furtlehner, Michele Sebag and Cecile Germain-Renaud. Grid Monitoring by Online Clustering. In: 4th EGEE User Forum. Catania, Sicily, Italy. 2009. Poster.

Xiangliang Zhang, Michele Sebag and Cecile Germain-Renaud. Multi-scale Real-time Grid Monitoring with Job Stream Mining. In: IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2009). Paper.

Germain-Renaud C., Perez J., Kégl B., Loomis C.
Grid Differentiated Services: a Reinforcement Learning Approach.
In 8th IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2008) <http://hal.inria.fr/docs/00/28/78/26/PDF/RLccg08.pdf>

Julien Perez, Cecile Germain-Renaud, Balázs Kégl, C. Loomis
Utility-based Reinforcement Learning for Reactive Grids. In The 5th IEEE International Conference on Autonomic Computing (2008) <http://hal.inria.fr/docs/00/28/73/54/PDF/RLICAC08.pdf&docid=287354>

Xiangliang Zhang, Cyril Furtlehner, Michele Sebag
Distributed and Incremental Clustering Based on Weighted Affinity Propagation. In the fourth European Starting AI Researcher Symposium (STAIRS) (2008) <http://hal.inria.fr/docs/00/28/73/78/PDF/STAIRS08_vfinal.pdf&docid=287378>

Xiangliang Zhang, Cyril Furtlehner, Michele Sebag
Frugal and Online Affinity Propagation. In Conférence francophone sur l'Apprentissage (CAP) (2008) inria-00287381, version 1
<http://hal.inria.fr/docs/00/28/73/81/PDF/v5_for_final.pdf&docid=287381>

Xiangliang Zhang, Michele Sebag, Cecile Germain
Le modelage des travaux d'un Systeme de Grille. In 16th congrès francophone AFRIF-AFIA Reconnaissance des Formes et Intelligence Artificielle (RFIA) (2008)
<http://hal.inria.fr/docs/00/28/73/84/PDF/RFIA2008.pdf&docid=287384>

Xiangliang Zhang, Cyril Furtlehner, Michele Sebag
Data Streaming with Affinity Propagation. In European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (2008)
<http://hal.inria.fr/docs/00/29/04/25/PDF/ECML08_final3.pdf&docid=290425>

Resources


Events

We have initiated the Grids Meet Autonomic Computing (GMAC) workshop, a companion to ICAC'09. GMAC will be held in Barcelona, Spain, on 15 June 2009.

Project for a collaboration UPC-UPS

Papers

The AC Related Papers page will list papers that we considered useful or intriguing. Classical and introductory papers are out of scope.



Contributors to this page: cecile , evomarc , rros , xlzhang , sebag and kegl .
Page last modified on Tuesday 13 of January, 2015 16:28:38 CET by cecile.