
Fault Tolerant Computing

Fault Tolerant Computing is an activity of the PARALLEL TEMPUS project.

Available kinds of materials:

Overview

Documents
  1. [Document(ps.gz)] Basic terminology by István Majzik (9 pages Gzipped Postscript 40k)
    [Transparencies(ps.gz)] Motivation by dr. András Pataricza (12 slides Gzipped Postscript 34k)
    [Transparencies(ps.gz)] Software Based Fault Tolerance by István Majzik (75 slides Gzipped Postscript 90k)
    1. The dependability concept
    2. On the impairments of dependability
    3. On the attributes of dependability
    4. On the means of dependability
    5. Redundancy techniques
  2. [Document(ps.gz)] Modelling of fault tolerant systems by György Csertán (40 pages Gzipped Postscript 126k)
    [Transparencies(ps.gz)] Fault Modeling by dr. András Pataricza (15 slides Gzipped Postscript 41k)
    [Transparencies(ps.gz)] Automatic Test Generation by dr. András Pataricza (21 slides Gzipped Postscript 61k)
    1. Introduction
    2. The modeling approach
      1. Dataflow networks
      2. Informal presentation of the model
      3. Formalism of dataflow networks
      4. Fault modeling
      5. Uncertainty modeling
      6. Features of the approach
      7. An example
    3. Model refinement
      1. Approaches to model refinement
      2. Dataflow refinement
      3. Domain refinement
      4. Structure refinement
      5. Refinement of the example
    4. Appendix
  3. [Document(ps.gz)] Concurrent error detection by István Majzik (28 pages Gzipped Postscript 95k)
    [Transparencies(ps.gz)] Watchdog Processors by dr. András Pataricza (24 slides Gzipped Postscript 62k)
    [Transparencies(ps.gz)] Dependable Multiprocessors by dr. András Pataricza (9 slides Gzipped Postscript 24k)
    1. Basics of error detection
    2. Classification of errors in microprocessor systems
    3. Concurrent error detection techniques
    4. Watchdog processors
    5. Control flow checking using derived signatures
    6. Control flow checking using assigned signatures
    7. Watchdog processors in parallel systems
    8. Conclusion
  4. [Document(ps.gz)] Master-checker mode by dr. András Pataricza (9 pages Gzipped Postscript 31k)
    [Transparencies(ps.gz)] Master-Checker Setup by dr. András Pataricza (14 slides Gzipped Postscript 35k)
    [Transparencies(ps.gz)] CPU Testing by dr. András Pataricza (12 slides Gzipped Postscript 82k)
    1. The master-checker principle
    2. Fault coverage
    3. Error latency
    4. Comparison of the different self-test techniques
  5. [Document(ps.gz)] Memory protection by dr. András Pataricza (15 pages Gzipped Postscript 40k)
    [Transparencies(ps.gz)] Memory Testing by dr. András Pataricza (17 slides Gzipped Postscript 54k)
    1. Introduction
    2. Information vs. time redundancy
    3. Error detecting codes in compact testing
  6. [Document(ps.gz)] Overview of RAID by András Petri (14 pages Gzipped Postscript 113k)
    [Document(ps.gz)] Overview of RAID by András Petri (8 pages Gzipped Postscript 85k)
    1. Overview
    2. Disk array basics
    3. Data striping and redundancy
    4. Basic RAID organizations
    5. Performance and cost comparisons
    6. Comparisons
    7. Reliability
    8. Implementation considerations
  7. [Document(ps.gz)] Statistical techniques for analyzing fault-tolerant systems by András Petri (8 pages Gzipped Postscript 70k)
    1. Introduction
    2. Statistical techniques
    3. Parameter estimation
    4. Distribution characterization
  8. [Document(ps.gz)] Software fault tolerance by István Majzik (13 pages Gzipped Postscript 62k)
    [Transparencies(ps.gz)] Software Based Fault Tolerance by István Majzik (75 slides Gzipped Postscript 90k)
    1. Exception handling
    2. The recovery block scheme
    3. The N-version programming scheme
    4. The N-self-checking programming scheme
    5. Self-configuring optimistic programming
    6. Language support for software fault tolerance
    7. Hardware architecture and software fault tolerance
    8. Comparison of the schemes
  9. [Document(ps.gz)] System-Level Fault Diagnosis by Tamás Bartha (80 pages Gzipped Postscript 188k)
    [Transparencies(ps.gz)] Integrated Diagnostics by dr. András Pataricza (29 slides Gzipped Postscript 98k)
    1. Introduction
    2. System model
    3. Fault models
    4. Testing models
    5. Diagnostic algorithms
    6. Classification of diagnostic algorithms
    7. Deterministic algorithms
    8. Probabilistic algorithms
  10. AI based diagnosis
Additional transparencies
  1. [Transparencies(ps.gz)] Components of Dependable Distributed Systems by Tamás Bartha
    Transparencies (27 Slides Gzipped Postscript 93k)
    Abstract (HTML)
  2. [Transparencies(ps.gz)] Concept and Practice of Fault-Tolerant Distributed Systems by Tamás Bartha
    Transparencies (34 Slides Gzipped Postscript 159k)
    Abstract (HTML)

Literature

  1. S. Mishra and R. D. Schlichting
    Abstractions for Constructing Dependable Distributed Systems
    TR 92-19, University of Arizona, 1992.
  2. F. Cristian
    Understanding Fault-Tolerant Distributed Systems
    Comm. of the ACM, vol. 34, No. 2, pp. 57-78, Feb. 1991.
