Information and Telecommunications Engineering Department, The Ming Chuan University, Taipei 11120, Taiwan, ROC Received 4 September 2001; Accepted 2 October 2002. Available online 13 February 2004.
In a typical distributed computing system (DCS), nodes consist of processing elements, memory units, shared resources, data files, and programs. For a distributed application, programs and data files are distributed among many processing elements, which may exchange data and control information via communication links. The reliability of a DCS can be expressed through the analysis of distributed program reliability (DPR) and distributed system reliability (DSR). This paper introduces two reliability measures, Markov-chain distributed program reliability (MDPR) and Markov-chain distributed system reliability (MDSR), to model the reliability of a DCS accurately. A discrete-time Markov chain with one absorbing state is constructed for this problem. Its transition probability matrix gives the probability of moving from one state to another in one unit of time. In addition to a mathematical method for evaluating MDPR and MDSR, a simulation result is presented to verify the method's correctness.
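To make the idea of a discrete-time Markov chain with one absorbing state concrete, the following is a minimal sketch and not the paper's MDPR or MDSR formulation. It assumes a hypothetical three-state chain in which the absorbing state stands for failure; the abstract does not say what the absorbing state represents, and the matrix entries are invented for illustration. The sketch computes the probability of not being absorbed after a given number of time units by repeated multiplication with the transition probability matrix, and checks that value against a simple Monte Carlo simulation.

```python
import random

# Hypothetical chain (illustrative numbers, not taken from the paper):
# states 0 and 1 are transient, state 2 is the single absorbing state,
# assumed here to represent failure.
P = [
    [0.90, 0.08, 0.02],
    [0.10, 0.85, 0.05],
    [0.00, 0.00, 1.00],
]
ABSORBING = 2


def step_distribution(dist, matrix):
    """Advance a state distribution by one time unit: dist' = dist * matrix."""
    n = len(matrix)
    return [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]


def survival_probability(matrix, start, steps):
    """Probability that the chain has not been absorbed after `steps` units."""
    dist = [0.0] * len(matrix)
    dist[start] = 1.0
    for _ in range(steps):
        dist = step_distribution(dist, matrix)
    return 1.0 - dist[ABSORBING]


def simulate_survival(matrix, start, steps, trials, seed=0):
    """Monte Carlo estimate of the same survival probability."""
    rng = random.Random(seed)
    states = list(range(len(matrix)))
    survived = 0
    for _ in range(trials):
        state = start
        for _ in range(steps):
            # Draw the next state from the current state's row of the matrix.
            state = rng.choices(states, weights=matrix[state])[0]
            if state == ABSORBING:
                break
        if state != ABSORBING:
            survived += 1
    return survived / trials


if __name__ == "__main__":
    exact = survival_probability(P, start=0, steps=10)
    estimate = simulate_survival(P, start=0, steps=10, trials=20000)
    print(f"matrix method: {exact:.4f}  simulation: {estimate:.4f}")
```

Under these assumptions the two numbers should agree to within sampling error, which mirrors the abstract's pairing of a mathematical method with a simulation check.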