Clustering Patches
List of Patches to Jabberd to Support Clustering
sm: Allow Multiple Session Managers to Serve One Domain
The Jabberd2 session manager is a large chunk of fairly robust code. Rather than recreate the session manager to properly support clustering, we're going to take the easier route of allowing multiple instances of the session manager to serve the same domain (across multiple machines; still one session manager per domain per machine).
To do this, we're going to insert an extra component, the cluster component, between the c2s/s2s instances and the session manager. The cluster component binds to the local router as the domain(s) being clustered, so the session manager will need to bind to the router under a different ID (and be handed packets by the cluster component).
This requires a few small patches to the session manager, such that it can bind to the router under a separate ID but still think that it is handling the clustered domain. Yes, the intent is to lie to the session manager, and have it consider the cluster component to be a local c2s instance. This patch is listed as Ticket #25, and is controlled by a <router_id> tag in the sm.xml configuration file.
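The binding arrangement can be sketched with a toy router. This is only an illustration, not Jabberd2 code: the Router class, the domain example.org and the router ID sm.node-a are all made-up names, and the real router protocol is considerably richer.

```python
class Router:
    """Toy router: delivers a packet to whichever component bound the name."""

    def __init__(self):
        self.bindings = {}

    def bind(self, name, handler):
        # Only one component may own a given name on this router.
        if name in self.bindings:
            raise ValueError(f"{name!r} is already bound")
        self.bindings[name] = handler

    def deliver(self, name, packet):
        return self.bindings[name](packet)


router = Router()

# Hypothetical names: the clustered domain and the session manager's router ID.
DOMAIN = "example.org"
SM_ROUTER_ID = "sm.node-a"


def session_manager(packet):
    # The session manager still believes it is serving DOMAIN.
    return f"sm handled packet for {packet['to']} as {DOMAIN}"


def cluster_component(packet):
    # The cluster component owns the domain name on the router and hands
    # local traffic on to the session manager under its separate ID.
    return router.deliver(SM_ROUTER_ID, packet)


router.bind(SM_ROUTER_ID, session_manager)
router.bind(DOMAIN, cluster_component)

result = router.deliver(DOMAIN, {"to": "johndoe@example.org"})
```

Because the domain name is already taken by the cluster component, a second attempt to bind it fails, which is why the session manager needs its own ID.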
sm: Lose concept of being completely authoritative for the Domain
The preceding patch allows the cluster component to sit between the c2s/s2s instances and the session manager, and thus divert packets intended for other nodes to the proper destination, rather than have them hit the session manager and be saved in offline storage. For example, if johndoe@… is connected to machineA and joebloggs@… is connected to machineB, the session manager on machineA doesn't know that joebloggs is connected, and would save a message for joebloggs in offline storage.
However, the issue of roster notifications still remains. If joebloggs@… was on johndoe@…'s roster, joebloggs would not get notified by the session manager on machineA, because as far as machineA is concerned, joebloggs is offline and doesn't need a roster notification.
So, we'd like the session manager to hand packets destined for 'offline' people in its own domain off to the local router, in case the cluster component (which the router will hand these packets to) knows that these JIDs are connected elsewhere. This is a slightly larger patch, as the session manager takes a number of shortcuts when dealing with its own domain.
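The intended change in behaviour might look roughly like the decision below. This is a sketch only: the function name, the cluster_enabled switch and the returned labels are invented for illustration and do not correspond to anything in the session manager source.

```python
def dispatch(to_jid, local_sessions, cluster_enabled):
    """Decide where a packet for a JID in our own domain should go.

    local_sessions is the set of JIDs this session manager knows are
    connected. The return values are illustrative labels only.
    """
    if to_jid in local_sessions:
        return "deliver-locally"
    if cluster_enabled:
        # Not connected here: hand the packet to the local router, since
        # the cluster component may know the JID is connected elsewhere.
        return "hand-to-router"
    # Unpatched behaviour: we are authoritative, so the JID is offline.
    return "store-offline"


sessions_on_machine_a = {"johndoe@example.org"}
```

With the patch in place, a packet for joebloggs@example.org arriving on machineA is handed to the router rather than written to offline storage.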
There is no Bug #Id for this patch as yet. (20050425 - Anzac Day)
sm: Explicitly notify the local cluster component of <presence> and <session>
So far, the cluster component has needed to understand some of the ins and outs of the session manager in order to pass notifications on to other cluster components. This adds excessive complexity to the cluster component, and circumvents any policy restrictions that the session manager may apply to a given JID (eg, joebloggs@… hasn't paid his bill, and is having his destinations restricted).
To avoid excessive complexity in the cluster component while still allowing the session manager to implement its policies (whatever they may be), it would be better for the session manager to explicitly tell the local cluster component when a given JID is considered 'connected' or not, as well as any priority changes.
This also allows the cluster component to take the correct actions when packets reach the session manager without passing through the cluster component.
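On the cluster component side, explicit notifications reduce the bookkeeping to something like the table below. The class and method names are hypothetical, and the real notification would be a packet from the session manager rather than a method call.

```python
class SessionTable:
    """Connected JIDs as reported by the local session manager."""

    def __init__(self):
        self._sessions = {}  # JID -> current priority

    def notify(self, jid, connected, priority=0):
        # The session manager has already applied its own policies, so
        # the cluster component simply records what it is told.
        if connected:
            self._sessions[jid] = priority
        else:
            self._sessions.pop(jid, None)

    def is_connected(self, jid):
        return jid in self._sessions

    def priority(self, jid):
        # None means the JID is not considered connected.
        return self._sessions.get(jid)


table = SessionTable()
table.notify("joebloggs@example.org", connected=True, priority=5)
table.notify("joebloggs@example.org", connected=True, priority=1)  # priority change
table.notify("johndoe@example.org", connected=True)
table.notify("johndoe@example.org", connected=False)
```

The cluster component no longer has to infer session state from the traffic it sees; it only has to relay what the table says to the other cluster components.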
There is no Bug #Id for this patch as yet. (20050425)
router: Accept route redirects for specific JIDs
Eagle-eyed watchers will have noted that the introduction of an extra component in the logical path between c2s/s2s and the session manager introduces a certain amount of double-handling of packets. Rather than have this be the normal case, it would be good for the cluster component to be able to divert high-volume streams to the appropriate destination, and simply not handle them.
For example, if alice@… and bob@… are connected to the same machine and are having a high-volume conversation, the cluster component has better things to do with its time than handle every packet. Ideally, the cluster component tells the router 'traffic for this JID should go directly between the c2s instance and the session manager' (again, policies on the session manager may need to be observed, so the c2s instance shouldn't shortcut it).
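A per-JID redirect in the router could be as small as a lookup table consulted before the normal route. The sketch below is an assumption about how such a feature might look, not a description of the existing router; the class, the method names and the component names are invented.

```python
class RedirectingRouter:
    """Toy next-hop lookup with per-JID overrides."""

    def __init__(self, default_target):
        # By default everything for the domain goes to the cluster component.
        self.default_target = default_target
        self.redirects = {}

    def add_redirect(self, jid, target):
        # Requested by the cluster component for a high-volume JID.
        self.redirects[jid] = target

    def remove_redirect(self, jid):
        self.redirects.pop(jid, None)

    def next_hop(self, jid):
        return self.redirects.get(jid, self.default_target)


r = RedirectingRouter(default_target="cluster")
r.add_redirect("alice@example.org", "sm.node-a")
hop_alice = r.next_hop("alice@example.org")
hop_bob = r.next_hop("bob@example.org")
r.remove_redirect("alice@example.org")
hop_alice_after = r.next_hop("alice@example.org")
```

Traffic for a redirected JID still passes through the session manager, so its policies continue to apply; only the cluster component is bypassed.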
There is no Bug #Id for this patch as yet, and probably won't be for some time. This is an optimisation that may be required in the future, after there are working cluster installations. (20050425)
c2s/sm: Be cluster aware and don't assume a consistent underlying database
All of the preceding patches make the assumption that the session manager (and c2s instances) are working from a consistent underlying database, such that password changes are copied to all nodes, and offline messages and roster changes are also copied.
In some cluster installations, this may be an invalid assumption. Ideally, either this information is shared between c2s and sm instances around the cluster via Jabber, to make sure that the underlying databases are indeed consistent, or each session manager canvasses the other session managers to see whether they have any offline messages for a given JID, and presents them to the JID as a seamless stream.
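The canvassing option amounts to merging several per-node offline queues into one ordered stream. The sketch below assumes each node stores offline messages as (timestamp, body) pairs already sorted by timestamp, and that timestamps are comparable across nodes; both are assumptions made for illustration, as is the function name.

```python
import heapq


def merged_offline(stores, jid):
    """Merge each node's offline queue for jid into one time-ordered list.

    stores is a list of dicts mapping JID -> list of (timestamp, body)
    pairs, each list sorted by timestamp (an assumption of this sketch).
    """
    queues = [store.get(jid, []) for store in stores]
    # heapq.merge performs a lazy k-way merge of already sorted inputs.
    return list(heapq.merge(*queues, key=lambda message: message[0]))


node_a = {"joebloggs@example.org": [(1, "hello"), (4, "still there?")]}
node_b = {"joebloggs@example.org": [(2, "ping")]}
node_c = {}

stream = merged_offline([node_a, node_b, node_c], "joebloggs@example.org")
```

In a real cluster the queues would have to be fetched from the other session managers over Jabber, and the messages removed from each node once delivered; neither step is shown here.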
This may end up being a cluster-specific module within the session manager. Like the redirect patch, this is an optimisation that may be required in the future, after there are working cluster installations. (20050425)