Oracle Fusion Middleware Oracle WebLogic Server API Reference 11g Release 1 (10.3.5) Part Number E13941-05
java.lang.Object
  weblogic.cluster.singleton.SimpleLeasingBasis
    weblogic.cluster.singleton.ReplicatedLeasingBasis
public class ReplicatedLeasingBasis
extends SimpleLeasingBasis
A LeasingBasis that delegates to replicated remote instances.
Lampson and others advocate that high-performance,
high-availability distributed systems utilize hierarchical lease
managers. The reasoning is that consensus algorithms are generally
slow and costly (this is true of both Paxos and our
DatabaseLeasingBasis) and therefore cannot accommodate fast
failover. Instead, the primordial lease manager (PLM) is identified
using Paxos or some other consensus mechanism with long leases,
and is used to manage short leases for another master lease
manager, which we shall call the hierarchical lease manager (HLM).
The general principle is that identifying the PLM is slow but
does not rely on singleton state. The HLM is essentially singleton
state that is managed through leases rather than consensus and can
therefore be fast. The HLM owns the state for the duration of its
lease. It may replicate that state elsewhere, or write it to stable
storage, for fault tolerance, but it remains the owner for the
duration of the lease. Other parties with an interest in the
state agree to abide by the lease interval, which also avoids
split-brain syndrome.
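The ownership rule described above can be sketched as a small lease table. This is a minimal illustration only, not the WebLogic implementation: the class `SketchLeaseTable`, its method signatures, and the explicit `nowMillis` clock parameter are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical lease table; names and signatures are illustrative only. */
class SketchLeaseTable {
    /** One granted lease: who holds it and when it lapses. */
    static final class Lease {
        final String owner;
        final long expiresAtMillis;

        Lease(String owner, long expiresAtMillis) {
            this.owner = owner;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Lease> leases = new HashMap<>();
    private final long durationMillis;

    SketchLeaseTable(long durationMillis) {
        this.durationMillis = durationMillis;
    }

    /** Grants or renews the lease unless another owner holds an unexpired one. */
    synchronized boolean acquire(String name, String candidate, long nowMillis) {
        Lease current = leases.get(name);
        boolean heldByOther = current != null
                && !current.owner.equals(candidate)
                && nowMillis < current.expiresAtMillis;
        if (heldByOther) {
            return false;
        }
        leases.put(name, new Lease(candidate, nowMillis + durationMillis));
        return true;
    }

    /** Returns the current owner, or null if the lease is absent or expired. */
    synchronized String findOwner(String name, long nowMillis) {
        Lease current = leases.get(name);
        return (current != null && nowMillis < current.expiresAtMillis) ? current.owner : null;
    }
}
```

Passing the clock in as a parameter keeps the sketch deterministic; a real lease manager would also have to account for clock skew between servers.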
For failover of the HLM to occur, the failure must be detected and a new HLM elected, and the new HLM needs access to the old HLM's state. The HLM renews its lease at short intervals, so missed heartbeats can be detected relatively quickly. A new HLM can be elected by having the other candidate HLMs constantly try to acquire the HLM lease from the PLM. The candidate HLMs can obtain the primary HLM's state through a replication framework.
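As a rough sketch of this election, the example below has a candidate poll a stand-in PLM until the primary's lease lapses. `SketchPrimordialManager`, `tryAcquireHlm`, and the timings are hypothetical names and values chosen for illustration; state replication is left out.

```java
/** Hypothetical single-lease grantor standing in for the PLM. */
class SketchPrimordialManager {
    private String hlmOwner;          // current HLM, or null if none has been elected
    private long hlmExpiresAtMillis;  // instant at which the HLM lease lapses
    private final long leaseMillis;

    SketchPrimordialManager(long leaseMillis) {
        this.leaseMillis = leaseMillis;
    }

    /** Called by the primary to renew and by candidates to attempt a takeover. */
    synchronized boolean tryAcquireHlm(String server, long nowMillis) {
        boolean expired = hlmOwner == null || nowMillis >= hlmExpiresAtMillis;
        if (expired || hlmOwner.equals(server)) {
            hlmOwner = server;
            hlmExpiresAtMillis = nowMillis + leaseMillis;
            return true;
        }
        return false;
    }

    synchronized String currentHlm() {
        return hlmOwner;
    }
}

class SketchFailoverDemo {
    public static void main(String[] args) {
        SketchPrimordialManager plm = new SketchPrimordialManager(1_000);
        plm.tryAcquireHlm("primary", 0);
        // The primary renews once at t=500 and then crashes.
        plm.tryAcquireHlm("primary", 500);
        // The candidate keeps polling; it wins once the last renewal has lapsed.
        for (long t = 500; t <= 2_000; t += 500) {
            if (plm.tryAcquireHlm("candidate", t)) {
                System.out.println("candidate became HLM at t=" + t);
                break;
            }
        }
    }
}
```

In this sketch the failover delay is bounded by the lease length plus the candidate's polling interval, which is why short HLM leases give fast failover.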
In the caching scenario the HLM is not really a lease manager; instead it holds partition location information. A client manipulates a partitioned map by first asking the HLM where a partition resides. Once the location is known, the client contacts the partition directly. To provide fault tolerance, the partition state needs to be synchronously replicated and the HLM needs to determine which copy is the copy-of-record. This can again be achieved by having the primary and secondary lease against the HLM, which is exactly the scenario just described for the HLM itself. One might wonder why we cannot collapse the partition leasing into the HLM, and indeed we can; the issue is one of scalability and fault tolerance.
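The lookup-then-contact flow can be sketched as follows. Everything here is an assumption for illustration: `SketchPartitionDirectory` plays the HLM role, in-memory maps stand in for remote partition servers, and every partition is assumed to have been assigned before use.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the HLM: records which server holds each copy. */
class SketchPartitionDirectory {
    private final Map<Integer, String> primary = new HashMap<>();
    private final Map<Integer, String> secondary = new HashMap<>();

    void assign(int partition, String primaryServer, String secondaryServer) {
        primary.put(partition, primaryServer);
        secondary.put(partition, secondaryServer);
    }

    String primaryOf(int partition) {
        return primary.get(partition);
    }

    String secondaryOf(int partition) {
        return secondary.get(partition);
    }

    /** On primary failure the secondary's copy becomes the copy-of-record. */
    void promoteSecondary(int partition) {
        primary.put(partition, secondary.remove(partition));
    }
}

/** Client-side view of the partitioned map; assumes all partitions are assigned. */
class SketchPartitionedMap {
    private final SketchPartitionDirectory directory;
    private final int partitionCount;
    // Server name -> local store; an in-memory stand-in for remote partition servers.
    private final Map<String, Map<String, String>> stores = new HashMap<>();

    SketchPartitionedMap(SketchPartitionDirectory directory, int partitionCount) {
        this.directory = directory;
        this.partitionCount = partitionCount;
    }

    int partitionOf(String key) {
        return Math.floorMod(key.hashCode(), partitionCount);
    }

    private Map<String, String> storeOn(String server) {
        return stores.computeIfAbsent(server, s -> new HashMap<>());
    }

    /** Writes to the primary and, synchronously, to the secondary if one exists. */
    void put(String key, String value) {
        int partition = partitionOf(key);
        storeOn(directory.primaryOf(partition)).put(key, value);
        String backup = directory.secondaryOf(partition);
        if (backup != null) {
            storeOn(backup).put(key, value);
        }
    }

    /** Reads only from the copy-of-record, as named by the directory. */
    String get(String key) {
        return storeOn(directory.primaryOf(partitionOf(key))).get(key);
    }
}
```

Because writes reach the secondary before `put` returns, promoting the secondary leaves previously written values readable.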
Let us first consider the case of the partition owners leasing directly against the PLM. Any cache operation must first determine the partition location and then perform the operation. Determining the partition location means that either (a) the caller contacts the PLM to get it, or (b) the partition table is replicated to all servers. Option (a) is clearly a single point of failure: the partition table would have to be maintained in stable storage, and the failover time would equal the server reboot time for the PLM. Option (b) does not help: the state can be replicated, but it cannot be consulted, since all copies apart from that of the PLM are not authoritative. For the replicated copies to be authoritative, the consensus algorithm would have to be run on the partition table itself, which is a costly operation.
Let us now consider the case of the partition owners leasing against the HLM which in turn leases against the PLM. Partition location information would be determined by contacting the HLM. How does a server know where the HLM resides? In the first instance the server gets this from the PLM. This is not a lease, merely a bootstrap to the HLM. Once the HLM (and the HLM's secondary) is located a server does not need to contact the PLM again unless both the HLM and HLM secondary fail simultaneously. All well and good, but suppose the PLM fails instead. The master HLM's lease will expire and it will be unable to renew it. In this instance a failure detector can be used to determine the outcome of its current lease.
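The bootstrap behaviour can be sketched as a small locator that caches the HLM and its secondary and goes back to the PLM only when neither responds. The class `SketchHlmLocator`, the `Supplier` used for the PLM lookup, and the `Predicate` used as a reachability check are hypothetical stand-ins, not part of the WebLogic API.

```java
import java.util.function.Predicate;
import java.util.function.Supplier;

/** Hypothetical client-side locator for the HLM. */
class SketchHlmLocator {
    private final Supplier<String[]> plmLookup;  // returns {hlm, hlmSecondary}
    private final Predicate<String> reachable;   // true if the named server responds
    private String[] cached;                     // last answer obtained from the PLM
    private int plmContacts = 0;

    SketchHlmLocator(Supplier<String[]> plmLookup, Predicate<String> reachable) {
        this.plmLookup = plmLookup;
        this.reachable = reachable;
    }

    /** Prefers the cached HLM, then its secondary; asks the PLM only as a last resort. */
    String locate() {
        if (cached != null) {
            for (String server : cached) {
                if (reachable.test(server)) {
                    return server;
                }
            }
        }
        // First call, or both the HLM and its secondary are unreachable.
        plmContacts++;
        cached = plmLookup.get();
        return cached[0];
    }

    int plmContacts() {
        return plmContacts;
    }
}
```

The PLM lookup here is only a bootstrap, matching the text: no lease is taken, and the PLM is out of the request path for as long as either cached server responds.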
Failure detectors can be implemented in practice by having the failure detector probe each process regularly; an unresponsive process p is placed on the list of suspected processes and a broadcast message is sent to all processes (including p) announcing its death. If p has not actually crashed, it will eventually refute its death announcement. Chandra and Toueg show that this weak and unreliable model of failure detectors allows the consensus problem to be solved.
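A minimal sketch of such a detector is shown below. The class `SketchFailureDetector` and the `Predicate` standing in for a network probe are assumptions for illustration; the broadcast is reduced to a comment, and only the suspect list and refutation are modelled.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.function.Predicate;

/** Hypothetical unreliable failure detector with a refutable suspect list. */
class SketchFailureDetector {
    private final Set<String> suspects = new LinkedHashSet<>();

    /** Probes each process once; unresponsive ones join the suspect list. */
    void probeAll(Iterable<String> processes, Predicate<String> responds) {
        for (String process : processes) {
            if (!responds.test(process)) {
                // A real detector would broadcast the death announcement here.
                suspects.add(process);
            }
        }
    }

    /** A process that is still alive refutes its own death announcement. */
    void refute(String process) {
        suspects.remove(process);
    }

    boolean isSuspected(String process) {
        return suspects.contains(process);
    }
}
```

The detector may be wrong at any given moment; what matters in this model is that a wrongly suspected process can eventually clear itself.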
In our case the master HLM should assume that the PLM has failed and announce this to interested parties. If the PLM has not failed, it will refute the claim; if the PLM is not contactable by the master but is contactable by the secondary, the secondary will refute the claim. In this case the master should relinquish its lease, since the secondary will already have obtained the lease. If no one refutes the claim, the master can continue to hold its lease until the consensus algorithm has elected a new PLM. With N HLMs we can tolerate N-1 failures. Thus if we want more reliability we need more replicas.
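The master's decision can be written down as a small function. This sketch rests on one reading of the paragraph above, namely that any refutation, whether from the PLM or from the secondary, leads the master to relinquish its lease; the class, enum, and method names are hypothetical.

```java
/** Hypothetical decision logic for a master HLM that cannot renew its lease. */
class SketchMasterHlm {
    enum Action { RELINQUISH_LEASE, HOLD_LEASE_UNTIL_NEW_PLM }

    /**
     * Decides what to do after the master has announced that the PLM is dead.
     * Assumption: a refutation from either party means the master steps down.
     */
    static Action onPlmSuspected(boolean refutedByPlm, boolean refutedBySecondary) {
        if (refutedByPlm || refutedBySecondary) {
            return Action.RELINQUISH_LEASE;
        }
        return Action.HOLD_LEASE_UNTIL_NEW_PLM;
    }

    /** With N HLM replicas, N-1 failures can be tolerated. */
    static int toleratedFailures(int hlmReplicas) {
        return hlmReplicas - 1;
    }
}
```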
Nested Class Summary |
---|
Nested classes/interfaces inherited from class weblogic.cluster.singleton.SimpleLeasingBasis |
---|
SimpleLeasingBasis.LeaseEntry |
Field Summary | |
---|---|
static String | BASIS_NAME |
Constructor Summary | |
---|---|
ReplicatedLeasingBasis(String leaseType) | |
Method Summary | |
---|---|
protected static Map | getReplicatedMap(String leaseType) |
Methods inherited from class weblogic.cluster.singleton.SimpleLeasingBasis |
---|
acquire, findExpiredLeases, findOwner, findPreviousOwner, getLeaseTable, release, renewAllLeases, renewLeases |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
public static final String BASIS_NAME
Constructor Detail |
---|
public ReplicatedLeasingBasis(String leaseType) throws IOException
Throws: IOException
Method Detail |
---|
protected static Map getReplicatedMap(String leaseType) throws IOException
Throws: IOException
Copyright 1996, 2011, Oracle and/or its affiliates. All rights reserved. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners.