From: Jim.Johnson@software-exploration.nospam.com
Sent: Thursday, August 17, 2000 1:32 PM
To: Info-VAX@Mvb.Saic.Com
Subject: Re: Cluster transaction processing paradigm

Other replies have suggested RTR. That will certainly work, but if all your activity is intra-cluster, then it is definite overkill. Use what the connection manager gives you.

The issues you're dealing with are much like those that DECdtm had to deal with in its structure (especially around logging and recovery). We came up with a fairly simple and robust design with attributes that I believe could be applied to your case. I don't know how much to go into the details here -- it was written up in a DTJ article years ago.

Our first simplification was to say that TIDs only had to be unique. There are no ordering or sourcing implications. That allowed us to use UUIDs, which can be generated locally with no inter-node communication.

Second, we ran a log server on every active node, with a different log for each node. That minimized cache coherency traffic (by eliminating it). We used a locking system wherein all log servers were attempting to get exclusive access to each log, with different modes for the 'primary' node, and 'informational' locks that advertised which log server had an exclusive lock (this turned out to be faster and somewhat more reliable than $GETLKI...). This meant that each client could generate its own TIDs and efficiently work out which server should be contacted for a given log file, and in case of log server failure/recovery there was a simple and reliable locking protocol to work out when the log had been failed over/failed back and where it now was. This protocol can be easily adjusted to support on-demand movement of log file ownership.

This gets you a shared-nothing application data protocol on a shared-everything device substrate, thus providing a high data throughput rate with minimal MTTR.

If you're interested in discussing this further, please feel free to email me.
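FWIW, the two ideas above (locally minted UUID TIDs, plus an advertised exclusive-lock owner per log that fails over) can be sketched in a few lines. This is only an illustrative model, not the DECdtm code; the class name `LogOwnershipTable` and the dict standing in for the distributed lock manager are my inventions.

```python
# Sketch of the scheme described above, under stated assumptions:
# - TIDs need only be unique, so a locally generated UUID suffices
#   (no inter-node round trip to a sequence server).
# - A toy in-memory table models the lock protocol: one exclusive
#   owner per log, advertised so clients can route requests, and
#   released on failure so another server can take over.
import uuid

def new_tid() -> str:
    """Generate a TID locally; uniqueness is the only requirement."""
    return str(uuid.uuid4())

class LogOwnershipTable:
    """Toy stand-in for the distributed lock manager's per-log locks."""
    def __init__(self):
        self.owner = {}  # log name -> node currently holding exclusive access

    def try_acquire(self, log: str, node: str) -> bool:
        # Exclusive grant only if the log is unowned (previous server
        # failed or released it); the grant doubles as the advertisement.
        if log not in self.owner:
            self.owner[log] = node
            return True
        return False

    def release(self, log: str, node: str) -> None:
        # Failover / on-demand movement: drop the lock so another
        # log server's pending acquire succeeds.
        if self.owner.get(log) == node:
            del self.owner[log]

    def server_for(self, log: str):
        # A client reads the advertised owner instead of polling
        # the lock database ($GETLKI-style).
        return self.owner.get(log)

table = LogOwnershipTable()
table.try_acquire("LOG$NODE_A", "NODEA")
table.release("LOG$NODE_A", "NODEA")       # NODEA goes down
table.try_acquire("LOG$NODE_A", "NODEB")   # NODEB takes the log over
```

In the real protocol the "release" happens implicitly when the connection manager removes the failed node, and the blocked acquires on the surviving servers complete.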
Just drop the .nospam.

Jim.

On Wed, 16 Aug 2000 17:08:02 -0400, JF Mezei wrote:

>I am trying to think of an application architecture which would not only be
>cluster friendly (easy) but also make maximum use of clustering to distribute
>processing and provide transparent operation if a node fails.
>
>Transactions go into an indexed file in a random fashion.
>
>Transactions are extracted from the file in an ordered fashion and sent to a
>third party (by modem) in batches of say 30 transactions at a time (or less).
>There are various third parties involved, so the reception of 30 transactions
>might mean one phone call if they are all destined to the same destination,
>or multiple phone calls.
>
>FEEDING TRANSACTIONS TO THE QUEUE-FILE:
>
>To feed the transactions into the queue file, I see 3 paradigms:
>
>-client application itself opens the queue file and writes to it. However,
>this still requires that the client contact some server to get a unique
>transaction number.
>
>-client application uses ICC to contact a single server process on the
>cluster which provides the transaction number and it stores the transaction
>in the queue file. However, if the node on which that server runs fails, then
>you're cooked.
>
>-client application submits to any number of servers on the cluster. Each
>server then provides a unique transaction number (with its node name as part
>of it) and writes to the single queue file.
>
>Question: what technologies exist right now that would allow multiple
>processes (one per node) to advertise themselves as being available to
>receive transactions from any process on any node of the cluster?
>
>SERVER COORDINATION
>
>If I have multiple server processes on multiple nodes on the cluster, is
>there some technology which allows these to talk to each other in a broadcast
>fashion? (eg: server 1 sends a single message that is received by all the
>servers.) And if a new server joins, then it too starts to receive those
>"broadcast" messages
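The third paradigm quoted above (each server mints numbers with its node name embedded, so uniqueness needs no cluster-wide counter) can be sketched as follows. This is a hypothetical illustration; the class name `TxnNumberSource` and the `NODE::sequence` format are assumptions, not anything from the original post.

```python
# Sketch of paradigm 3: per-node transaction numbers. Each server keeps
# only a local counter; qualifying it with the node name makes the
# result unique cluster-wide without contacting any other node.
import itertools

class TxnNumberSource:
    def __init__(self, node_name: str):
        self.node = node_name
        self.seq = itertools.count(1)   # per-node monotonic counter

    def next_txn(self) -> str:
        # NODE::sequence is unique across the cluster as long as
        # node names are unique (which the cluster guarantees).
        return f"{self.node}::{next(self.seq):010d}"

a = TxnNumberSource("NODEA")
b = TxnNumberSource("NODEB")
ids = {a.next_txn(), a.next_txn(), b.next_txn()}
```

Note this gives up global ordering, which is exactly the simplification in the DECdtm design: if the numbers only need to be unique, no coordination is required at all.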
>(and ideally, when one leaves, all others receive a message of it leaving).
>
>TRANSACTION CONSUMPTION/LOAD SHARING
>
>Ideally, I'd like to have the multiple servers intelligently share the
>transactions. While one server has "reserved" some 30 transactions to one
>remote partner, the other server reserves the next 30 transactions (to some
>other or perhaps the same partner), and so on. This way, the throughput is
>maximised, and if one node fails, then the servers on the other nodes
>continue.
>
>However, how can one distribute transactions amongst those servers without
>having a single server coordinate this (which represents a single point of
>failure if that node fails)?
>
>For instance, during a slow period, if a single transaction is received,
>which server will get to pick that one up and deliver it as fast as
>possible?
>
>Are there any technologies which do this type of stuff (or papers that
>discuss this)?

Jim Johnson
Software Exploration, Ltd.
Software Navigation and Discovery Tools
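One coordinator-free answer to the load-sharing question quoted above is to let servers race for a lock per batch (in the spirit of the lock-based log ownership described earlier): whoever acquires the lock owns those 30 transactions, losers move on to the next batch. A minimal single-process sketch, with a mutex-guarded set standing in for the distributed lock manager (the name `BatchReservation` is invented for illustration):

```python
# Sketch: servers reserve batches of up to 30 transactions by winning a
# per-batch lock, so no single server has to hand out work. A local set
# plus a mutex models what a cluster lock manager would provide.
import threading

class BatchReservation:
    def __init__(self):
        self._mutex = threading.Lock()
        self._reserved = {}   # batch key -> server that won it

    def try_reserve(self, batch_key, server: str) -> bool:
        # First server to claim batch_key wins; everyone else sees
        # False and tries the next batch, so work spreads naturally.
        with self._mutex:
            if batch_key in self._reserved:
                return False
            self._reserved[batch_key] = server
            return True

queue = [f"txn{i:04d}" for i in range(90)]
batches = [tuple(queue[i:i + 30]) for i in range(0, len(queue), 30)]

res = BatchReservation()
mine = [b for b in batches if res.try_reserve(b, "SERVER1")]
late = [b for b in batches if res.try_reserve(b, "SERVER2")]
```

The slow-period case falls out for free: a batch of one transaction is still a batch, and whichever server's acquire completes first delivers it. If the winning node fails, its locks are released by the cluster and the surviving servers' pending acquires succeed.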