12:20 - 12:55
Paper 3
Design Patterns for High Availability Systems
Dr David Kalinsky
This second part of a two-part series of talks begins with a discussion of basic hardware N-plexing and voting issues for high availability systems and software design. These approaches are then generalized to cluster-based system and software designs. This is followed by a survey of a number of backward error recovery fault tolerance techniques, including static N-version programming, Checkpoint-Rollback, Process Pairs, and Recovery Blocks. The talk concludes with several forward error recovery techniques.