Performance and Dependability Evaluation of Scalable Massively Parallel Computer Systems with Conjoint Simulation

Axel HEIN and Mario DAL CIN

Axel HEIN
Zum Hahnenschrei 2
D-69168 Wiesloch, Germany
Tel.: +32 16 32-18-18
Axel.Hein@computer.org

Mario DAL CIN
IMMD III, University of Erlangen-Nuernberg
Martensstr. 3
D-91058 Erlangen, Germany
Tel.: +49 9131 85-7003
dalcin@immd3.informatik.uni-erlangen.de


ACM Transactions on Modeling and Computer Simulation
vol. 8, no. 4 (October 1998)


Abstract

Computer systems are becoming more and more a part of our daily life; business and industry rely on their service, and the health of human beings depends on their correct functioning. Computer systems used for critical tasks have to be carefully designed and tested during the early design stage, the prototype phase, and the operational life. Methods and tools are required to support and facilitate this vital task.

In this paper, we tackle the issue of system-level performance and dependability analysis of fault-tolerant scalable computer systems. A modeling methodology called "Conjoint Simulation" is presented, which is based on the partitioning of the system model and the combination of several modeling techniques. Object-oriented model construction and process-based simulation are applied for architecture and workload modeling, while timed Petri nets are the core modeling technique to represent the failure scenarios and repair policies. Splitting the overall model and exploiting appropriate modeling techniques ease the development, maintenance, and extensibility of large-scale and complex simulation models. Furthermore, techniques are provided for hierarchical model construction, object-oriented workload modeling, and simulated error injection in order to perform combined performance and dependability analysis.
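
As a rough illustration of the partitioning described above, the following Python sketch couples a two-place timed Petri net (failure and repair of a single node) with a trivial workload model that only makes progress while the node is up. It is not the authors' tool or model: the class `TimedPetriNet`, the function `simulate`, the place names `up` and `down`, the exponentially distributed firing delays, and the parameters `mttf`, `mttr`, and `job_time` are all assumptions chosen for illustration.

```python
import random


class TimedPetriNet:
    """Minimal timed Petri net (illustrative): places hold tokens,
    each transition has input places, output places, and a mean delay."""

    def __init__(self, marking, transitions):
        self.marking = dict(marking)      # place name -> token count
        self.transitions = transitions    # name -> (inputs, outputs, mean_delay)

    def enabled(self):
        # A transition is enabled when every input place holds a token.
        return [name for name, (inputs, _, _) in self.transitions.items()
                if all(self.marking[p] >= 1 for p in inputs)]

    def fire(self, name):
        inputs, outputs, _ = self.transitions[name]
        for p in inputs:
            self.marking[p] -= 1
        for p in outputs:
            self.marking[p] += 1


def simulate(horizon, mttf, mttr, job_time, seed=0):
    """Return (availability, completed_jobs) for one simulated node.

    Assumptions: exponential firing delays, and no work is lost
    when a failure interrupts a job.
    """
    rng = random.Random(seed)
    net = TimedPetriNet(
        marking={"up": 1, "down": 0},
        transitions={
            "fail": (["up"], ["down"], mttf),
            "repair": (["down"], ["up"], mttr),
        },
    )
    now, up_time = 0.0, 0.0
    while now < horizon:
        name = net.enabled()[0]           # exactly one transition is enabled here
        mean_delay = net.transitions[name][2]
        next_time = min(now + rng.expovariate(1.0 / mean_delay), horizon)
        if net.marking["up"] == 1:
            up_time += next_time - now    # workload progresses only while up
        now = next_time
        net.fire(name)
    return up_time / horizon, int(up_time // job_time)


if __name__ == "__main__":
    availability, jobs = simulate(horizon=100_000.0, mttf=100.0, mttr=10.0,
                                  job_time=2.0, seed=1)
    print(f"availability={availability:.3f}, completed jobs={jobs}")
```

In this sketch the dependability part (the Petri net) and the performance part (the job counter) are kept separate and meet only at the `up` place, which is the kind of model split the abstract refers to. A realistic model would replace the job counter with a process-based architecture and workload simulation.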


General Terms

object-oriented modeling, process-based simulation, hierarchical model design, fault-tolerant and large-scale computer systems, timed Petri nets

