The following is a tentative list of topics to be covered in the course:
• Introduction
  • Definition of fault tolerance
  • Redundancy
  • Applications of fault tolerance
• Fundamentals of dependability
  • Attributes: reliability, availability, safety
  • Impairments: faults, errors and failures
  • Means: fault prevention, removal and forecasting
• Dependability evaluation
  • Common measures: failure rate, mean time to failure, mean time to repair, etc.
  • Reliability block diagrams
  • Markov processes
• Hardware redundancy
  • Redundancy schemes
  • Evaluation and comparison
  • Applications
• Information redundancy
  • Codes: linear, Hamming, cyclic, unordered, arithmetic, etc.
  • Encoding and decoding techniques
  • Applications
• Time redundancy
• Software fault tolerance
  • Specific features
  • Software fault tolerance techniques: N-version programming, recovery blocks, self-checking software, etc.
The aim of this course is to give doctoral students the knowledge necessary to develop dependable systems. As our society becomes more and more reliant on computers, software and embedded systems, the dependability of these systems becomes a critical issue. In airplanes, chemical plants or heart pacemakers, a system failure can cost lives or cause an environmental disaster. After completing the course, students should be able to demonstrate the knowledge and skills required to implement and evaluate various fault-tolerant approaches. More specifically, upon completion, students will be able to:
• Describe state-of-the-art fault-tolerant design techniques. Justify their target applications and limitations. Describe how dependability is assured in an example application.
• Describe dependability means, attributes and impairments. Apply the knowledge to select a suitable set of attributes for a specific application scenario.
• Analyze and critically assess the trade-offs among system dependability, performance, and cost. Give examples of the trade-offs available to designers of electronic and embedded systems.
• Explain the need for different redundancy techniques. Weigh the pros and cons of different redundancy techniques and select a suitable one for a specific application.
• Apply the knowledge to design a small electronic or embedded system with enhanced dependability. Explain how dependability is assured in the system.