04-08-2008, 06:02 PM
Joe Weinstein
 
Re: Does Sybase have something like Oracle's RAC for hot failover?



DA Morgan wrote:

> Rob Verschoor wrote:
>
>> Sybase currently has active-active failover with its HA solution for ASE,
>> which does pretty much what you describe.
>> ASE's 'Shared disk clusters' version is one (big) step further and
>> that's in the works -- expected next year.
>>
>> HTH,
>>
>> Rob
>> -------------------------------------------------------------
>> Rob Verschoor

>
>
> RAC is not a Sybase-like HA failover and there is no relationship
> between the Sybase capability and that of Oracle's RAC. Sybase HA
> failover is more equivalent to Oracle's DataGuard failover.
>
> With the Sybase solution you do not have shared-everything. With
> Sybase you do not have load balancing. With Sybase there is a
> primary node and the balance are secondary. With Oracle that concept
> does not exist. All nodes stand on an equal footing.
>
> With Sybase, after failover, all existing client connections are lost.
> With Oracle all existing client connections are maintained. The end-user
> experience is seamless with a DML statement restarted on another live
> node.


I understand now that RAC != Sybase HA (yet). However, RAC is not nirvana,
and the above seems to conflate Oracle RAC with Oracle TAF.
With RAC alone, connections to a failing RAC node are simply dead.
There is a connect-time option for the Oracle client code to try
a named series of RAC nodes to find one that is currently running
(failover and/or load-balancing). When a RAC connection dies, any
ongoing local transaction, or global transaction that has not yet reached
the prepared state, is gone. Transactions in the prepared state will
be known to other nodes, so these can be completed according to XA
during recovery (needed because of the failure of the original connection).
However, there are bugs and problems: these in-doubt transactions will
not be known *and in a recoverable state* for upwards of a minute, during
which time Oracle will return these TX IDs to the coordinator, but
if the coordinator then says to commit/roll back these IDs, the DBMS
will send failures. This is worked around by retrying the resolution
calls for a while. Also, in a weird turn-about, Oracle uses PAGE_LEVEL
LOCKING like old Sybase! All data that is (must be) locked regarding these
in-doubt transactions is locked in pages, which also locks logically
unrelated data, so until these in-doubt transactions are resolved,
Oracle will fail other new unrelated transactions!
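
To make the retry workaround concrete, here is a rough Python sketch of
the idea, not anything taken from Oracle or from a real transaction
manager. The names NotYetRecoverable, FakeNode and resolve_with_retry are
all made up for illustration; the only thing taken from the description
above is "keep retrying the commit/rollback call until the node is ready".

```python
import time


class NotYetRecoverable(Exception):
    """Hypothetical error: the DBMS listed the TX ID but cannot resolve it yet."""


def resolve_with_retry(resolve, tx_id, attempts=10, delay=5.0, sleep=time.sleep):
    """Call resolve(tx_id) until it succeeds or the attempts run out.

    `resolve` stands in for the coordinator's commit or rollback call.
    Returns the number of calls that were needed.
    """
    for attempt in range(1, attempts + 1):
        try:
            resolve(tx_id)
            return attempt
        except NotYetRecoverable:
            if attempt == attempts:
                raise  # give up and let the coordinator decide what to do
            sleep(delay)


class FakeNode:
    """Stand-in for a surviving node that needs a few tries before it is ready."""

    def __init__(self, failures_before_ready):
        self.remaining = failures_before_ready
        self.committed = []

    def commit(self, tx_id):
        if self.remaining > 0:
            self.remaining -= 1
            raise NotYetRecoverable(tx_id)
        self.committed.append(tx_id)


if __name__ == "__main__":
    node = FakeNode(failures_before_ready=3)
    calls = resolve_with_retry(node.commit, "tx-1", sleep=lambda seconds: None)
    print(calls, node.committed)
```

The attempt count and delay are arbitrary; given the "upwards of a minute"
window, the total retry time would presumably need to cover at least that.
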
TAF is different. It is "Transparent Application Failover", and
purports to make connections fail over in flight, so applications
can just continue to do their work. This is a noble goal, but TAF is
a cynical misnomer for most real clients. Below is a link to
Oracle documents which partially admit to the computational and
session state that is lost over the 'transparent failover', which
boils down to a connection which can be worse than useless for anything
but read-only applications. In the case of JDBC clients, all
existing prepared statements are dead, all packages are wiped clean,
current queries may be corrupted, and all user-set session state is
gone. In the case of some ongoing query-data reading, Oracle will try
to re-establish the cursor state so the results can keep coming, but it
does so by re-querying and stepping through the data *by row count* to
get the cursor back to the same place. In cases where the data
is different in the new node (such as the query including data that
the user had not yet committed in the previous state of the pre-prepared
transaction), the new cursor will be at the wrong place...
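
A toy Python simulation of that row-count repositioning problem, in case
it helps. This is only a model of the behaviour described above, not
Oracle's code; the function name and the sample rows are invented.

```python
def fetch_with_reposition(rows_before, rows_after, fail_after):
    """Simulate a fetch whose connection fails over after `fail_after` rows.

    rows_before: result set as seen on the original node
    rows_after:  result set the same query returns on the new node
    The cursor is rebuilt by re-running the query and skipping
    `fail_after` rows, i.e. repositioning purely by row count.
    """
    delivered = list(rows_before[:fail_after])
    delivered.extend(rows_after[fail_after:])
    return delivered


if __name__ == "__main__":
    # On the old node the session could see its own uncommitted row.
    old_node = ["a", "b (uncommitted)", "c", "d"]
    # After failover that transaction is gone, so the row is gone too.
    new_node = ["a", "c", "d"]
    print(fetch_with_reposition(old_node, new_node, fail_after=2))
```

With these inputs the client receives "a", "b (uncommitted)" and "d": row
"c" is silently skipped because the new result set is one row shorter.
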

(http://metalink.oracle.com/metalink/...p_id= 97926.1)



5.1.3 TAF - SESSION state after failing over


If a session fails over to the BACKUP then there are some important
restrictions on statements either in progress at the time of fail-over,
or statements issued after the failover. These restrictions are
documented in the Oracle8i documentation and include:

- PL/SQL package state is lost after failover
- ALTER SESSION statements are lost
- In-progress transactions must be rolled back
- Continuing work on existing cursors may raise an error (eg: ORA-25401 "cannot continue fetches")
- Failed over selects may take time to re-position (when FAILOVER_TYPE=SELECT)
- Failed over selects may raise an error
- In the case of instance or node failure there may be a delay before the
  BACKUP instance can do any user work (due to time taken for lock
  remastering and any instance recovery)
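
Since ALTER SESSION settings are among the things lost, an application
that wants to survive a failover would presumably have to remember and
re-issue them itself. A minimal Python sketch of that bookkeeping follows;
SessionSettings and the `execute` callable are hypothetical, and how the
application detects that a failover happened is left out entirely.

```python
class SessionSettings:
    """Remembers session-level statements so they can be replayed later."""

    def __init__(self):
        self._statements = []

    def apply(self, execute, statement):
        """Run a session-level statement and record it for later replay."""
        execute(statement)
        self._statements.append(statement)

    def replay(self, execute):
        """Re-issue every recorded statement, in order, on a new session."""
        for statement in self._statements:
            execute(statement)
        return len(self._statements)


if __name__ == "__main__":
    sent = []  # stands in for a real connection's execute method
    settings = SessionSettings()
    settings.apply(sent.append, "ALTER SESSION SET NLS_DATE_FORMAT = 'YYYY-MM-DD'")
    # ... failover happens, session state is gone ...
    settings.replay(sent.append)
    print(sent)
```

This only covers the ALTER SESSION item; package state, open cursors and
in-progress transactions from the list above would each need their own
handling.
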


>
> HTH

