Difference between revisions of "Alberto Gonzalez"

 
(9 intermediate revisions by one other user not shown)
Line 6: Line 6:
 
| align="left"  | Alberto Gonzalez
 
| align="left"  | Alberto Gonzalez
 
|-
 
|-
| align="right" | '''Title:'''
+
| align="right" | '''Title:'''-
 
| align="left"  | Adaptive Real-time Monitoring for Large-scale Networked Systems
 
| align="left"  | Adaptive Real-time Monitoring for Large-scale Networked Systems
 
|-
 
|-
Line 19: Line 19:
 
|-
 
|-
 
| align="right" | '''Committee:'''
 
| align="right" | '''Committee:'''
| align="left"  | Prof. George Pavlou (Opponent), University College London, UK. Dr. Alexander Clemm, Cisco Systems, USA,
+
| align="left"  | Prof. George Pavlou, University College London, UK (Opponent). <br>Dr. Alexander Clemm, Cisco Systems, USA. <br>Dr. Olivier Festor, INRIA, France. Assoc. <br>Prof. Mads Dam, KTH, Sweden.
Dr. Olivier Festor, INRIA, France,
 
Assoc. Prof. Mads Dam, KTH, Sweden.
 
  
 
|-
 
|-
Line 36: Line 34:
 
<DIV style="text-align:justify">
 
<DIV style="text-align:justify">
 
== Biography ==
 
== Biography ==
 +
Alberto Gonzalez is a postdoctoral researcher at KTH, the Royal Institute of Technology in Stockholm, Sweden.
 +
He received his M.Sc. in Electrical Engineering from the Universidad Politecnica de Cataluña, Spain and his Ph.D. in Telecommunication from KTH in 2008. His research interests include autonomic network management, management of large-scale networked systems, distributed network algorithms and real-time monitoring. He was an intern with NEC Network Laboratories (Heidelberg, Germany) in 2001, and with AT&T Research (Florham Park, NJ, USA) in 2007.
  
 
== PhD project description ==
 
== PhD project description ==
We pursue a design that enables a large set of nodes in a networked environment to achieve a goal through cooperation. Given a goal, the nodes must determine and execute the appropriate actions (considering their capabilities) that lead them to the achievement of their goal. The design must be adaptive, controllable, and have low and controllable computational overhead. Adaptability means that the design must be able to react to changes in networking conditions so that, despite these changes, it achieves its goal. Control is a must in every management system. While the node must be autonomous in its decision-making process, the administrator must have the capacity to guide or control its behavior. The administrator must be allowed to set limits to the node behavior. This control can be expressed as a set of forbidden actions and/or forbidden states. In order to be practically feasible, the computational cost of the design must be low and controllable. Low cost is a must for real-time management. It is crucial for achieving timely adaptability. Controllability permits vendors to bound the processing resources required by management tasks running on their devices. A key aspect is the coordination among nodes. If the nodes do not coordinate their actions appropriately, it might be impossible to achieve the goal. For instance, two uncoordinated nodes may perform opposed actions that cancel each other. Another example is an erratic node jeopardizing the goal achievement. Our research has been centered in the context of continuous monitoring of large-scale networks. Specifically, on the monitoring of network-wide metrics computed from device counters using aggregation functions, such as SUM, AVERAGE and MAX. Examples of such metrics include the total number of VoIP flows and the maximum link utilization in a network domain.
+
Large-scale networked systems, such as the Internet and server clusters, are omnipresent today. They increasingly deliver services that are critical to both businesses and the society at large, and therefore their continuous and correct operation must be guaranteed. Achieving this requires the realization of adaptive management systems, which continuously reconfigure such large-scale dynamic systems, in order to maintain their state near a desired operating point, despite changes in the networking conditions.
We present A-GAP, a novel protocol for continuous monitoring of network state variables, which aims at achieving a given monitoring accuracy with minimal overhead. Network state variables are computed from device counters using aggregation functions, such as SUM, AVERAGE and MAX. The accuracy objective is expressed as the average estimation error. A-GAP is decentralized and asynchronous to achieve robustness and scalability. It executes on an overlay that interconnects management processes on the devices. On this overlay, the protocol maintains a spanning tree and updates the network state variables through incremental aggregation. It dynamically configures local filters that control whether an update is sent towards the root of the tree. We evaluate A-GAP through simulation using real traces and two different types of topologies of up to 650 nodes. The results show that we can effectively control the trade-off between accuracy and protocol overhead, and that the overhead can be reduced by almost two orders of magnitude for allowing small errors. The protocol quickly adapts to a node failure and exhibits short spikes in the estimation error for a fraction of a second. Lastly, it can provide an accurate estimate of the error distribution in real-time.
+
The focus of this thesis is continuous real-time monitoring, which is essential for the realization of adaptive management systems in large-scale dynamic environments. Real-time monitoring provides the necessary input to the decision-making process of network management, enabling management systems to perform self-configuration and self-healing tasks.
 +
We have developed, implemented, and evaluated a design for real-time continuous monitoring of global metrics with performance objectives, such as monitoring overhead and estimation accuracy. Global metrics describe the state of the system as a whole, in contrast to local metrics, such as device counters or local protocol states, which capture the state of a local entity. Global metrics are computed from local metrics using aggregation functions, such as SUM, AVERAGE and MAX.
 +
Our approach is based on in-network aggregation, where global metrics are incrementally computed using spanning trees. Performance objectives are achieved through filtering updates to local metrics that are sent along that tree. A key part in the design is a model for the distributed monitoring process that relates performance metrics to parameters that tune the behavior of a monitoring protocol. The model allows us to describe the behavior of individual nodes in the spanning tree in their steady state. The model has been instrumental in designing a monitoring protocol that is controllable and achieves given performance objectives.
 +
We have evaluated our protocol, called A-GAP, experimentally, through simulation and testbed implementation. It has proved to be effective in meeting performance objectives, efficient, adaptive to changes in the networking conditions, controllable along different performance dimensions, and scalable. We have implemented a prototype on a testbed of commercial routers. The testbed measurements are consistent with simulation studies we performed for different topologies and network sizes. This proves the feasibility of the design, and, more generally, the feasibility of effective and efficient real-time monitoring in large network environments.
  
 
</DIV>
 
</DIV>
  
 
== References ==
 
== References ==
 +
# A. Gonzalez Prieto and R. Stadler, “Adaptive Real-time Monitoring for Large-scale Networked Systems”, Dissertation Digest Session, 11th IFIP/IEEE International Symposium on Integrated Network Management (IM 2009), New York, USA, June 2009.
 +
# A. Gonzalez Prieto and R. Stadler, "Controlling Performance Trade-offs in Adaptive Network Monitoring", 11th IFIP/IEEE International Symposium on Integrated Network Management (IM 2009), New York, USA, June 2009.
 
# A. Gonzalez Prieto, R.Stadler &quot;A-GAP: An Adaptive Protocol for Continuous Network Monitoring with Accuracy Objectives&quot;, IEEE Transactions on Network and Service Management (TNSM), Vol. 4, No. 1, June 2007
 
# A. Gonzalez Prieto, R.Stadler &quot;A-GAP: An Adaptive Protocol for Continuous Network Monitoring with Accuracy Objectives&quot;, IEEE Transactions on Network and Service Management (TNSM), Vol. 4, No. 1, June 2007
 
# A. Gonzalez Prieto and R.Stadler, &quot;Monitoring Flow Aggregates with Controllable Accuracy&quot;, submitted to the 10th IFIP/IEEE International Conference on Management of Multimedia and Mobile Networks and Services (MMNS 2007), San José, California, USA, October 2007
 
# A. Gonzalez Prieto and R.Stadler, &quot;Monitoring Flow Aggregates with Controllable Accuracy&quot;, submitted to the 10th IFIP/IEEE International Conference on Management of Multimedia and Mobile Networks and Services (MMNS 2007), San José, California, USA, October 2007
Line 61: Line 66:
 
# A. Gonzalez Prieto, R. Cosenza and R.Stadler &quot;Policy-based Congestion  Management for an SMS Gateway &quot;, in the Proceedings of the IEEE 5th International Workshop on Policies for Distributed Systems and Networks (POLICY 2004), Yorktown Heights, New York, June 7-9, 2004
 
# A. Gonzalez Prieto, R. Cosenza and R.Stadler &quot;Policy-based Congestion  Management for an SMS Gateway &quot;, in the Proceedings of the IEEE 5th International Workshop on Policies for Distributed Systems and Networks (POLICY 2004), Yorktown Heights, New York, June 7-9, 2004
  
== Additional information ==
 
  
 +
== External links ==
 +
* [http://www.ee.kth.se/~gonzalez Homepage] of Alberto Gonzalez
  
 
== External links ==
 
* [_URL_ Homepage] of Alberto Gonzalez
 
* Publications of Alberto Gonzalez, as [http://www.informatik.uni-trier.de/~ley/db/indices/a-tree/_XXXX_  indexed by DBLP]
 
  
 
[[Category:PhD students]]
 
[[Category:PhD students]]
 
[[Category:People]]
 
[[Category:People]]

Latest revision as of 08:45, 18 May 2010

Summary
Student: Alberto Gonzalez
Title: Adaptive Real-time Monitoring for Large-scale Networked Systems
e-mail: gonzalez@ee.kth.se
Affiliation: KTH
Supervisor: Rolf Stadler
Committee: Prof. George Pavlou, University College London, UK (Opponent).
Dr. Alexander Clemm, Cisco Systems, USA.
Dr. Olivier Festor, INRIA, France.
Assoc. Prof. Mads Dam, KTH, Sweden.
Start: 2002
End: 2008
Funding: EU Ambient Networks Project, EU 4WARD Project

Biography

Alberto Gonzalez is a postdoctoral researcher at KTH, the Royal Institute of Technology in Stockholm, Sweden. He received his M.Sc. in Electrical Engineering from the Universidad Politecnica de Cataluña, Spain and his Ph.D. in Telecommunication from KTH in 2008. His research interests include autonomic network management, management of large-scale networked systems, distributed network algorithms and real-time monitoring. He was an intern with NEC Network Laboratories (Heidelberg, Germany) in 2001, and with AT&T Research (Florham Park, NJ, USA) in 2007.

PhD project description

Large-scale networked systems, such as the Internet and server clusters, are omnipresent today. They increasingly deliver services that are critical to both businesses and society at large, and therefore their continuous and correct operation must be guaranteed. Achieving this requires adaptive management systems that continuously reconfigure such large-scale dynamic systems in order to keep their state near a desired operating point, despite changes in the networking conditions.

The focus of this thesis is continuous real-time monitoring, which is essential for the realization of adaptive management systems in large-scale dynamic environments. Real-time monitoring provides the necessary input to the decision-making process of network management, enabling management systems to perform self-configuration and self-healing tasks.

We have developed, implemented, and evaluated a design for real-time continuous monitoring of global metrics with performance objectives, such as monitoring overhead and estimation accuracy. Global metrics describe the state of the system as a whole, in contrast to local metrics, such as device counters or local protocol states, which capture the state of a local entity. Global metrics are computed from local metrics using aggregation functions, such as SUM, AVERAGE and MAX.

Our approach is based on in-network aggregation, where global metrics are incrementally computed using spanning trees. Performance objectives are achieved by filtering the updates to local metrics that are sent along that tree. A key part of the design is a model for the distributed monitoring process that relates performance metrics to the parameters that tune the behavior of a monitoring protocol. The model allows us to describe the behavior of individual nodes in the spanning tree in their steady state. It has been instrumental in designing a monitoring protocol that is controllable and achieves given performance objectives.

We have evaluated our protocol, called A-GAP, experimentally, through simulation and testbed implementation. It has proved to be effective in meeting performance objectives, efficient, adaptive to changes in the networking conditions, controllable along different performance dimensions, and scalable. We have implemented a prototype on a testbed of commercial routers. The testbed measurements are consistent with simulation studies we performed for different topologies and network sizes. This demonstrates the feasibility of the design and, more generally, the feasibility of effective and efficient real-time monitoring in large network environments.
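
The idea of in-network aggregation with filtered updates can be illustrated with a small sketch. It is not the A-GAP protocol itself: it assumes a fixed, hand-chosen filter width per node, whereas the description above refers to filters that are tuned through a model of the monitoring process. The names `Node`, `filter_width`, `run` and `UPDATES`, the four-node tree and the update sequence are all hypothetical and chosen only for illustration. Each node keeps the partial SUM of its subtree and reports it to its parent only when it has moved by more than the filter width since the last report.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Node:
    """A node in the aggregation tree (hypothetical, simplified model)."""
    name: str
    filter_width: float = 0.0          # assumption: fixed, not adapted at runtime
    parent: Optional["Node"] = None
    local_value: float = 0.0           # local metric, e.g. a device counter
    child_partials: Dict[str, float] = field(default_factory=dict)
    last_reported: float = 0.0         # partial aggregate last sent to the parent
    messages_sent: int = 0

    def partial_sum(self) -> float:
        # SUM over this node's subtree, as far as this node knows.
        return self.local_value + sum(self.child_partials.values())

    def set_local(self, value: float) -> None:
        self.local_value = value
        self._maybe_report()

    def receive(self, child: str, partial: float) -> None:
        # Incremental aggregation: store the child's partial, then re-evaluate.
        self.child_partials[child] = partial
        self._maybe_report()

    def _maybe_report(self) -> None:
        if self.parent is None:        # the root only holds the global estimate
            return
        current = self.partial_sum()
        # Local filter: suppress the update if the change is small enough.
        if abs(current - self.last_reported) > self.filter_width:
            self.last_reported = current
            self.messages_sent += 1
            self.parent.receive(self.name, current)


def run(filter_width: float,
        updates: List[Tuple[str, float]]) -> Tuple[float, float, int]:
    """Replay updates on the tree root -> {a -> {c}, b}.

    Returns (root estimate, true sum, total messages sent).
    """
    root = Node("root")
    a = Node("a", filter_width, parent=root)
    b = Node("b", filter_width, parent=root)
    c = Node("c", filter_width, parent=a)
    nodes = {n.name: n for n in (a, b, c)}
    for name, value in updates:
        nodes[name].set_local(value)
    true_sum = sum(n.local_value for n in nodes.values())
    messages = sum(n.messages_sent for n in nodes.values())
    return root.partial_sum(), true_sum, messages


# Hypothetical sequence of local metric changes: (node name, new value).
UPDATES = [("c", 10), ("b", 5), ("c", 11), ("a", 2), ("b", 5.5), ("c", 12)]

if __name__ == "__main__":
    for width in (0.0, 1.5):
        estimate, truth, messages = run(width, UPDATES)
        print(f"width={width}: estimate={estimate}, true={truth}, messages={messages}")
```

With this update sequence, a filter width of 0 gives the exact sum at the root at the cost of more messages, while a width of 1.5 sends fewer messages and leaves a small estimation error. This is the accuracy versus overhead trade-off that the design aims to control.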

References

  1. A. Gonzalez Prieto and R. Stadler, "Adaptive Real-time Monitoring for Large-scale Networked Systems", Dissertation Digest Session, 11th IFIP/IEEE International Symposium on Integrated Network Management (IM 2009), New York, USA, June 2009.
  2. A. Gonzalez Prieto and R. Stadler, "Controlling Performance Trade-offs in Adaptive Network Monitoring", 11th IFIP/IEEE International Symposium on Integrated Network Management (IM 2009), New York, USA, June 2009.
  3. A. Gonzalez Prieto and R. Stadler, "A-GAP: An Adaptive Protocol for Continuous Network Monitoring with Accuracy Objectives", IEEE Transactions on Network and Service Management (TNSM), Vol. 4, No. 1, June 2007.
  4. A. Gonzalez Prieto and R. Stadler, "Monitoring Flow Aggregates with Controllable Accuracy", submitted to the 10th IFIP/IEEE International Conference on Management of Multimedia and Mobile Networks and Services (MMNS 2007), San José, California, USA, October 2007.
  5. A. Gonzalez Prieto and R. Stadler, "Real-time Network Monitoring Supporting Percentile Error Objectives", 14th HP Software University Association (HP-SUA) Workshop, Munich, Germany, July 8-11, 2007.
  6. A. Gonzalez Prieto, R. Stadler, P. Kersch, R. Szabo, G. Nunzi, M. Brunner and S. Schuetz, "Distributed Network Management for AN", 16th IST Mobile and Wireless Communication Summit, Budapest, Hungary, July 1-5, 2007.
  7. R. Ocampo, L. Cheng, K. Jean, A. Gonzalez Prieto, A. Galis and Z. Lai, "Towards a Context Monitoring System for Ambient Networks", Chinacom, Beijing, China, October 25-27, 2006.
  8. A. Gonzalez Prieto and R. Stadler, "Adaptive Distributed Monitoring with Accuracy Objectives", in the Proceedings of ACM SIGCOMM workshop on Internet Network Management (INM 06), Pisa, Italy, September 11, 2006.
  9. A. Gonzalez Prieto, "Adaptive Management for Networked Systems", KTH, Licentiate Thesis, June 2006.
  10. J. Nielssen, Z. Lajos Kis, A. Gonzalez Prieto, R. Stadler and M. Brunner, "Pattern-based Network Management for Ambient Networks", 15th IST Mobile and Wireless Communication Summit, Myconos, Greece, June 4-6, 2006.
  11. A. Gonzalez Prieto and R. Stadler, "Distributed Real-Time Monitoring with Accuracy Objectives", in the Proceedings of IFIP Networking, Coimbra, Portugal, May 15-19, 2006.
  12. A. Gonzalez Prieto and R. Stadler, "Design and Implementation of Performance Policies for SMS Systems", in the Proceedings of the 16th IFIP/IEEE Distributed Systems: Operations and Management (DSOM 2005), Barcelona, Spain, October 24-26, 2005.
  13. M. Brunner, A. Galis, L. Cheng, J. Andres Colas, B. Ahlgren, A. Gunnar, H. Abrahamsson, R. Szabo, S. Csaba, J. Nielssen, A. Gonzalez Prieto, R. Stadler and G. Molnar, "Towards Ambient Networks Management", Second International Workshop on Mobility Aware Technologies and Applications (MATA 2005), Montreal, Canada, October 17-19, 2005.
  14. A. Gonzalez Prieto and R. Stadler, "Scalable Policy Distribution for Ambient Networks", 14th IST Mobile and Wireless Communication Summit, Dresden, Germany, June 19-23, 2005.
  15. A. Gonzalez Prieto and R. Stadler, "Policy-based Management for SMS Systems", RVK 2005, Linköping, Sweden, June 14-16, 2005.
  16. A. Gonzalez Prieto and R. Stadler, "Evaluating a Congestion Management Architecture for SMS Gateways", 9th IFIP/IEEE International Symposium on Integrated Network Management (IM 2005), Nice, France, May 15-19, 2005 (poster).
  17. M. Brunner, A. Galis, L. Cheng, J. Andres Colas, B. Ahlgren, A. Gunnar, H. Abrahamsson, R. Szabo, S. Csaba, J. Nielssen, A. Gonzalez Prieto, R. Stadler and G. Molnar, "Ambient Networks Management Challenges and Approaches", First International Workshop on Mobility Aware Technologies and Applications (MATA 2004), Florianópolis, Brazil, October 20-22, 2004.
  18. A. Gonzalez Prieto, R. Cosenza and R. Stadler, "Policy-based Congestion Management for an SMS Gateway", in the Proceedings of the IEEE 5th International Workshop on Policies for Distributed Systems and Networks (POLICY 2004), Yorktown Heights, New York, June 7-9, 2004.


External links