Mechanism for fail-over notification

5117352

Inventors

Falek, Louis H.

Application #

424903

Filed

Oct-20-1989

Published

May-26-1992

Current US Class

714/4

International Classes

G06F 011/20; G06F 015/16

Field of Search

364/228.3 364/230.3 364/268.1 364/268.9 364/270.7 371/9.1 371/11.3 371/16.1 371/16.5

Assignee

Digital Equipment Corporation (Maynard, MA)

Examiners

Atkinson; Charles E.

Attorney, Agent or Firm

Kenyon & Kenyon

US Patent References

4399504   Method and means...
4480304   Method and means...
4646298   Self testing data pr...
4660201   Failure notice syste...
4665520   Optimistic recovery...
4768150   Application progra...
4803683   Method and appar...
4815076   Reconfiguration ad...
4827411   Method of maintain...
4965719   Method for lock ma...

Abstract
An automatic failure notification mechanism for use in a computer network wherein several co-operating parts of an application program are each running on a different node of the computer network. The mechanism comprises a set of linked subroutines called by each part of the application program and operating through the use of a distributed lock manager to designate one part as a part to receive failure notification and to link and reverse link the selected part and the other parts of the application program. The failure notification mechanism utilizes the link and reverse link to initiate a failure communication upon the failure of a node executing one part of the application program.
 
Claims
What is claimed is:

1. A method for providing failure notification in a computer network having a plurality of nodes wherein a predetermined number of the nodes are each processing one part of an application program, which method comprises the steps of:

for each part of the application program, making a first request for an exclusive access privilege to a preselected resource name;

for each part of the application program, making a second request for a preselected access privilege to a corresponding resource name having a name based upon the respective part;

granting the first request for an exclusive access privilege to one and only one part of the application program;
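
The three steps recited above can be illustrated with a minimal sketch. It is not the patented implementation: `ToyLockManager` is a hypothetical in-process stand-in for a distributed lock manager, the resource names `APP_MASTER` and `APP_PART_<part>` are invented for illustration, and the "preselected access privilege" of the second request is assumed here to be exclusive. The final loop in `start_parts`, in which the designated part queues a request behind each other part's resource so that a later grant signals a failure, is likewise an assumption about how the link could be used and is not confirmed by the claim text shown here.

```python
from collections import defaultdict, deque


class ToyLockManager:
    """Hypothetical in-process stand-in for a distributed lock manager."""

    def __init__(self):
        self.holder = {}                  # resource name -> part holding it exclusively
        self.queue = defaultdict(deque)   # resource name -> parts waiting for it

    def request_exclusive(self, resource, part):
        """Grant the resource if it is free; otherwise queue the request."""
        if resource not in self.holder:
            self.holder[resource] = part
            return True
        self.queue[resource].append(part)
        return False

    def fail(self, part):
        """Simulate a node failure: drop the part's locks and queued requests.

        Returns a list of (resource, new_holder) pairs for requests that were
        granted as a result of the failure.
        """
        for waiting in self.queue.values():
            while part in waiting:
                waiting.remove(part)
        grants = []
        for resource in [r for r, h in self.holder.items() if h == part]:
            del self.holder[resource]
            if self.queue[resource]:
                new_holder = self.queue[resource].popleft()
                self.holder[resource] = new_holder
                grants.append((resource, new_holder))
        return grants


MASTER_RESOURCE = "APP_MASTER"  # hypothetical "preselected resource name"


def part_resource(part):
    """Hypothetical resource name based upon the respective part."""
    return f"APP_PART_{part}"


def start_parts(dlm, parts):
    """Run the first and second requests for every part; return the designated part."""
    designated = None
    for part in parts:
        # First request: exclusive access to the preselected resource name.
        # Only one part is granted it; the others remain queued.
        if dlm.request_exclusive(MASTER_RESOURCE, part):
            designated = part
        # Second request: access to the resource named after this part
        # (assumed exclusive in this sketch).
        dlm.request_exclusive(part_resource(part), part)
    # Assumption, not from the claim text shown: the designated part queues
    # behind every other part's resource, so a grant means that part failed.
    for part in parts:
        if part != designated:
            dlm.request_exclusive(part_resource(part), designated)
    return designated


dlm = ToyLockManager()
designated = start_parts(dlm, ["A", "B", "C"])
notifications = dlm.fail("B")  # the node running part "B" crashes
```

In this sketch the grant returned by `fail` plays the role of the failure notification: the designated part learns which part was lost from the name of the resource it has just been granted.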



Description
FIELD OF THE INVENTION

The invention relates to failure notification in a computer network wherein several co-operating parts of an application program are each running on a different node of the computer network.

BACKGROUND OF THE INVENTION

In modern computer data processing, improved efficiency in the execution of an application program is often achieved by separating the program into several cooperating parts and running each part on a different CPU within a computer network. The several parts of the application program are each run as a detached process on a specific CPU. The several parts may be active one at a time with the other inactive parts in a "standby" mode or they can all be active at the same time as cooperating parts of the overall data processing operation.

For reliable execution of the entire application program, each CPU running one of the parts of the application must function properly throughout the entire processing of the part. If one part of the application program fails due to a CPU crash, it is imperative that notification of the failure be made to enable a network manager to implement appropriate corrective actions. For example, the network manager can transfer the failed part of the application program to another CPU on the network for execution.