|Title:||Adaptive Real-time Monitoring for Large-scale Networked Systems|
|Committee:||Opponent: Prof. George Pavlou, University College London, UK.
Grading committee: Dr. Alexander Clemm, Cisco Systems, USA, Dr. Olivier Festor, INRIA, France, Assoc. Prof. Mads Dam, KTH, Sweden.
|Funding:||EU Ambient Networks Project, EU 4WARD Project|
PhD project description
We pursue a design that enables a large set of nodes in a networked environment to achieve a goal through cooperation. Given a goal, the nodes must determine and execute the actions, within their capabilities, that lead to its achievement. The design must be adaptive and controllable, and its computational overhead must be low and controllable.

Adaptability means that the design reacts to changes in networking conditions so that it still achieves its goal despite them. Control is essential in every management system: while each node is autonomous in its decision making, the administrator must be able to guide its behavior and set limits on it. This control can be expressed as a set of forbidden actions and/or forbidden states.

To be practically feasible, the computational cost of the design must be low and controllable. Low cost is required for real-time management and is crucial for timely adaptation. Controllability permits vendors to bound the processing resources required by management tasks running on their devices.

A key aspect is coordination among nodes. If the nodes do not coordinate their actions appropriately, it may be impossible to achieve the goal. For instance, two uncoordinated nodes may perform opposing actions that cancel each other out, or an erratic node may jeopardize the achievement of the goal.

Our research has centered on the continuous monitoring of large-scale networks, specifically on the monitoring of network-wide metrics computed from device counters using aggregation functions such as SUM, AVERAGE and MAX. Examples of such metrics include the total number of VoIP flows and the maximum link utilization in a network domain.
We present A-GAP, a novel protocol for continuous monitoring of network state variables, which aims at achieving a given monitoring accuracy with minimal overhead. Network state variables are computed from device counters using aggregation functions, such as SUM, AVERAGE and MAX. The accuracy objective is expressed as the average estimation error. A-GAP is decentralized and asynchronous to achieve robustness and scalability. It executes on an overlay that interconnects management processes on the devices. On this overlay, the protocol maintains a spanning tree and updates the network state variables through incremental aggregation. It dynamically configures local filters that control whether an update is sent towards the root of the tree. We evaluate A-GAP through simulation using real traces and two different types of topologies of up to 650 nodes. The results show that we can effectively control the trade-off between accuracy and protocol overhead, and that the overhead can be reduced by almost two orders of magnitude when small errors are allowed. The protocol adapts quickly to a node failure, exhibiting only short spikes in the estimation error that last a fraction of a second. Lastly, it can provide an accurate estimate of the error distribution in real time.
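The idea of incremental aggregation with local filters can be illustrated with a minimal Python sketch. It is not the A-GAP protocol itself: the `Node` class, its fields, and the fixed `filter_width` are hypothetical names chosen for illustration, the aggregation function is SUM only, and the filter widths are set by hand, whereas A-GAP configures them dynamically to meet the accuracy objective. The sketch only shows how a filter suppresses an update when a node's partial aggregate has moved by no more than the filter width since its last report.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Node:
    """One management process on the aggregation tree (illustrative only)."""
    name: str
    parent: Optional["Node"] = None
    filter_width: float = 0.0   # assumption: fixed by hand, not adapted
    local: float = 0.0          # value of the local device counter
    child_partials: Dict[str, float] = field(default_factory=dict)
    last_sent: float = 0.0      # partial aggregate last reported to the parent
    messages_sent: int = 0

    def partial(self) -> float:
        """SUM over this node's subtree, as far as this node knows."""
        return self.local + sum(self.child_partials.values())

    def set_local(self, value: float) -> None:
        """Record a new local counter value and report it if needed."""
        self.local = value
        self._maybe_report()

    def receive(self, child_name: str, partial: float) -> None:
        """Store a child's reported partial aggregate and propagate if needed."""
        self.child_partials[child_name] = partial
        self._maybe_report()

    def _maybe_report(self) -> None:
        """Send an update towards the root only if the filter is exceeded."""
        if self.parent is None:
            return  # the root holds the network-wide estimate
        current = self.partial()
        if abs(current - self.last_sent) > self.filter_width:
            self.last_sent = current
            self.messages_sent += 1
            self.parent.receive(self.name, current)


if __name__ == "__main__":
    root = Node("root")
    a = Node("a", parent=root, filter_width=2.0)
    b = Node("b", parent=root, filter_width=2.0)
    a.set_local(5.0)   # change of 5 exceeds the filter: reported
    a.set_local(6.0)   # change of 1 is within the filter: suppressed
    b.set_local(3.0)   # change of 3 exceeds the filter: reported
    print("estimate at root:", root.partial())
    print("true sum:", a.local + b.local)
    print("messages:", a.messages_sent + b.messages_sent)
```

In this toy setting a wider filter means fewer messages and a larger possible gap between the root's estimate and the true sum, which is the accuracy versus overhead trade-off described above.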
- A. Gonzalez Prieto and R. Stadler, "A-GAP: An Adaptive Protocol for Continuous Network Monitoring with Accuracy Objectives", IEEE Transactions on Network and Service Management (TNSM), Vol. 4, No. 1, June 2007
- A. Gonzalez Prieto and R. Stadler, "Monitoring Flow Aggregates with Controllable Accuracy", submitted to the 10th IFIP/IEEE International Conference on Management of Multimedia and Mobile Networks and Services (MMNS 2007), San José, California, USA, October 2007
- A. Gonzalez Prieto and R. Stadler, "Real-time Network Monitoring Supporting Percentile Error Objectives", 14th HP Software University Association (HP-SUA) Workshop, 8-11 July 2007, Munich, Germany
- A. Gonzalez Prieto, R. Stadler, P. Kersch, R. Szabo, G. Nunzi, M. Brunner and S. Schuetz, "Distributed Network Management for AN", 16th IST Mobile and Wireless Communication Summit, Budapest, Hungary, July 1-5, 2007
- R. Ocampo, L. Cheng, K. Jean, A. Gonzalez Prieto, A. Galis, Z. Lai, "Towards a Context Monitoring System for Ambient Networks", Chinacom, Beijing, China, October 25-27, 2006
- A. Gonzalez Prieto and R. Stadler, "Adaptive Distributed Monitoring with Accuracy Objectives", in the Proceedings of ACM SIGCOMM workshop on Internet Network Management (INM 06), Pisa, Italy, September 11, 2006
- A. Gonzalez Prieto, "Adaptive Management for Networked Systems", KTH, Licentiate Thesis, June 2006
- J. Nielssen, Z. Lajos Kis, A. Gonzalez Prieto, R. Stadler, M. Brunner, "Pattern-based Network Management for Ambient Networks", 15th IST Mobile and Wireless Communication Summit, Myconos, Greece, June 4-6, 2006
- A. Gonzalez Prieto and R. Stadler, "Distributed Real-Time Monitoring with Accuracy Objectives", in the Proceedings of IFIP Networking, Coimbra, Portugal, May 15-19, 2006
- A. Gonzalez Prieto and R. Stadler, "Design and Implementation of Performance Policies for SMS Systems", in the Proceedings of the 16th IFIP/IEEE Distributed Systems: Operations and Management (DSOM 2005), Barcelona, Spain, October 24-26, 2005
- M. Brunner, A. Galis, L. Cheng, J. Andres Colas, B. Ahlgren, A. Gunnar, H. Abrahamsson, R. Szabo, S. Csaba, J. Nielssen, A. Gonzalez Prieto, R. Stadler, G. Molnar, "Towards Ambient Networks Management", Second International Workshop on Mobility Aware Technologies and Applications (MATA 2005), Montreal, Canada, October 17-19, 2005
- A. Gonzalez Prieto and R. Stadler, "Scalable Policy Distribution for Ambient Networks", 14th IST Mobile and Wireless Communication Summit, Dresden, Germany, June 19-23, 2005
- A. Gonzalez Prieto and R. Stadler, "Policy-based Management for SMS Systems", RVK 2005, Linköping, Sweden, June 14-16, 2005
- A. Gonzalez Prieto and R. Stadler, "Evaluating a Congestion Management Architecture for SMS Gateways", 9th IFIP/IEEE International Symposium on Integrated Network Management (IM 2005), Nice, France, May 15-19, 2005 (poster)
- M. Brunner, A. Galis, L. Cheng, J. Andres Colas, B. Ahlgren, A. Gunnar, H. Abrahamsson, R. Szabo, S. Csaba, J. Nielssen, A. Gonzalez Prieto, R. Stadler, G. Molnar, "Ambient Networks Management Challenges and Approaches", First International Workshop on Mobility Aware Technologies and Applications (MATA 2004), Florianópolis, Brazil, October 20-22, 2004
- A. Gonzalez Prieto, R. Cosenza and R. Stadler, "Policy-based Congestion Management for an SMS Gateway", in the Proceedings of the IEEE 5th International Workshop on Policies for Distributed Systems and Networks (POLICY 2004), Yorktown Heights, New York, June 7-9, 2004
- [_URL_ Homepage] of Alberto Gonzalez
- Publications of Alberto Gonzalez, as indexed by DBLP