Clustering
Summary of Links
ClusteringPatches - List of Patches to the main Jabberd2 code to support Clustering
ClusteringComponent - Explanation of the workings of the Clustering Component
ClusteringRouting - Dry, Dusty explanation of the routing within a Clustered domain
Some thoughts on how to cluster a Jabberd2 domain
This assumes that you have read the Ideas page, in particular its section on Scalability.
The current issue with large installations of Jabberd2 is that there is a single choke point: the session manager for the domain, or jsm. While the grunt work of interfacing with clients (via multiple c2s instances) or other servers (via multiple s2s instances) can be spread over multiple machines, the work of the session manager can only be done by one component, on one machine.
Should that machine fail, the whole domain is impacted. Traffic for JIDs within that domain stops until the machine is restored, which may lead to some clients aggressively reconnecting.
Ideally, in a clustered domain, this choke point does not exist, and each machine servicing the domain is able to fully answer for the domain. The failure of a given machine simply means that clients directly connected to that machine must reconnect (hopefully to another machine), and does not result in the failure of the entire domain.
This document attempts to pose some solutions, and issues, to the problem of providing clustering services using jabberd2.
The first problem to overcome when clustering a given domain is to provide a consistent view of that domain. For example, if you have machineA and machineB both servicing the domain example.net, clients connected to machineA must be able to talk to clients connected to machineB.
A simple solution to this, without any changes to the session manager code, is the introduction of a cluster component. This component appears to the local router as another c2s instance, keeps the local session manager informed of example.net clients which are connected to the other machine(s), and passes packets destined for these clients to the appropriate machine.
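As a rough illustration of the bookkeeping such a component would need, here is a minimal Python sketch. It is a toy model, not jabberd2 code: the class name, method names and the in-memory "forwarded" list are all assumptions made for the example, and the real component would speak the router's component protocol rather than call methods.

```python
from dataclasses import dataclass, field


@dataclass
class ClusterComponent:
    """Toy model of the cluster component running on one machine (hypothetical)."""

    local_machine: str
    # JID -> name of the remote machine that client is connected to
    remote_sessions: dict = field(default_factory=dict)
    # (machine, jid, packet) tuples that would be sent over the cluster link
    forwarded: list = field(default_factory=list)

    def peer_session_started(self, jid: str, machine: str) -> None:
        """A peer reported that jid connected to it; remember where."""
        self.remote_sessions[jid] = machine

    def peer_session_ended(self, jid: str) -> None:
        """A peer reported that jid disconnected; forget it."""
        self.remote_sessions.pop(jid, None)

    def deliver(self, jid: str, packet: str) -> bool:
        """Forward packet to the machine holding jid; False if jid is unknown."""
        machine = self.remote_sessions.get(jid)
        if machine is None:
            return False
        self.forwarded.append((machine, jid, packet))
        return True


cluster_a = ClusterComponent("machineA")
cluster_a.peer_session_started("bob@example.net/home", "machineB")
cluster_a.deliver("bob@example.net/home", "<message/>")
```

Here machineA's component learns that bob is on machineB and queues the packet for that machine. How peers actually announce sessions to each other is left open by this page.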
The obvious caveat with this approach is that each session manager within the cluster thinks that it is the only session manager for that domain. As such, when each client connects (either locally or seemingly via the local cluster component), the local session manager will do its normal client-connecting things, such as sending offline messages to the client, informing the contacts on the client's roster that the client has connected, and so on. Since every session manager in the cluster does this, the client will receive duplicate offline messages, and JIDs on its roster will receive duplicate notifications.
The more subtle caveat with this approach is that each session manager must have sufficient memory to cope with each concurrently-connected client across the whole cluster. In really large domains, this may be an issue.
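Both caveats can be seen in a small Python model. This is only a sketch: NaiveSessionManager is a hypothetical stand-in for an unmodified session manager, and it assumes that the stored offline messages are still available when the second manager sees the session start.

```python
class NaiveSessionManager:
    """Toy session manager that believes it alone serves the domain."""

    def __init__(self, offline_store):
        self.offline_store = offline_store  # shared storage: jid -> list of messages
        self.sessions = set()               # every session this manager knows about

    def session_start(self, jid):
        """Called for local clients and for clients announced by the cluster component."""
        self.sessions.add(jid)
        # Each manager replays the same stored messages to the client.
        return list(self.offline_store.get(jid, []))


offline = {"alice@example.net": ["hello while you were away"]}
managers = [NaiveSessionManager(offline), NaiveSessionManager(offline)]

received = []
for sm in managers:
    received.extend(sm.session_start("alice@example.net"))
```

After the loop, received holds the same message twice, and both managers carry alice in their session table. With N machines, every session is held N times across the cluster.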
The second problem to overcome when clustering a given domain is connections with external domains. For example, machineA (serving example.net) may make a connection to example.com. Later, machineB makes the same connection. Depending on the implementation of the Jabber server servicing example.com, the second connection may or may not be permitted. Of course, the router on both machineA and machineB will send packets for example.com to the machines quite happily, and packets returning will be routed via the cluster to the appropriate destination.
A simple solution to this is for the cluster component to also handle packets destined for external domains, where other machines in the cluster have connections to said external domains.
The caveat with this approach is that packets for an external domain may be bounced around between several cluster machines before finding an exit. If the cluster mesh is a world-wide mesh (WWM?), then this may well result in suboptimal routing (e.g., sending a packet from Europe to Australia within the cluster, and then back to Europe externally).
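One possible exit-selection rule is sketched below in Python. The function name pick_exit and the s2s_links table are assumptions for illustration; this page does not specify how the cluster component would learn which machine holds which external connection.

```python
def pick_exit(domain, local_machine, s2s_links):
    """Return the machine whose s2s instance should carry packets for domain.

    s2s_links maps machine name -> set of external domains it is connected to.
    """
    # Prefer an existing local connection.
    if domain in s2s_links.get(local_machine, set()):
        return local_machine
    # Otherwise reuse a connection held by another machine in the cluster.
    for machine in sorted(s2s_links):
        if domain in s2s_links[machine]:
            return machine
    # Nobody is connected yet: the local s2s would open the connection.
    return local_machine


links = {"machineA": {"example.com"}, "machineB": set()}
exit_for_b = pick_exit("example.com", "machineB", links)
```

In this example exit_for_b is machineA, so machineB never opens a second connection to example.com. Note that the rule ignores where the machines are, which is exactly how the Europe to Australia detour described above can arise.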
The above are the two main problems facing a generic clustering solution. Now, we need to roll up our sleeves and think of how to implement this within a framework of several machines, each with their own router, c2s, s2s and jsm instances.
The caveats listed in the first problem, being the issue of duplicate messages and excessive memory usage, have their solutions in a small number of changes to the session manager.
These are listed in the ClusteringPatches page.
The first proposed change to the session manager is for it to bind to the local router under a different name. For example, it could bind simply as sm, but still support clients for example.net. Within the sm.xml file, this can be indicated as <id realm="example.net">sm</id> (see the c2s.xml file for an example). Code for this is in sm/main.c (config) and sm/sm.c ("bind" to a name).
The second proposed change to the session manager is for it to lose the assumption of being completely authoritative for the domain. This is actually a simple change, as it simply involves removing the checks for whether the given JID is within the local domain, thus passing packets for JIDs which are not known to be connected back to the router.
The third proposed change to the session manager is for it to pass copies of all <presence> and session packets to the local cluster component.
The fourth proposed change to the session manager is for it to support the concept of a route redirect for a given JID. This is explained in greater detail below.
With these four changes, each local session manager is then aware only of JIDs connected to local c2s instances, removing the issue of memory consumption and duplicate offline and roster messages. Messages for JIDs within the clustered domain, but not connected to the local session manager, are handed back to the cluster component.
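The resulting behaviour of a patched session manager can be modelled roughly as follows. This Python sketch is hypothetical throughout: the class, its method names and the two outgoing lists are illustration only, and it covers the second and third changes (not the bind name or the redirect).

```python
class ClusteredSessionManager:
    """Toy session manager that only knows about locally connected JIDs."""

    def __init__(self, realm):
        self.realm = realm
        self.local_sessions = set()
        self.to_cluster = []  # copies of session events for the cluster component
        self.to_router = []   # packets handed back to the router

    def session_start(self, jid):
        self.local_sessions.add(jid)
        self.to_cluster.append(("session-start", jid))

    def session_end(self, jid):
        self.local_sessions.discard(jid)
        self.to_cluster.append(("session-end", jid))

    def handle(self, to_jid, packet):
        """Deliver locally if possible, else return the packet to the router."""
        if to_jid in self.local_sessions:
            return "delivered-locally"
        # No check that to_jid is "ours": unknown JIDs simply go back out.
        self.to_router.append((to_jid, packet))
        return "returned-to-router"


sm = ClusteredSessionManager("example.net")
sm.session_start("alice@example.net/work")
local_result = sm.handle("alice@example.net/work", "<message/>")
remote_result = sm.handle("bob@example.net/home", "<message/>")
```

Here alice is delivered locally, while the packet for bob, who is connected elsewhere in the cluster, ends up in to_router for the cluster component to pick up.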
The cluster component then needs to pick up the slack. This component needs to bind to the local router as the clustered domain (example.net), and the machine's DNS name (so it can be reached by other machines in the cluster). It then will be in the chain between the local c2s instances and the local session manager, thus being able to pass knowledge of locally-connected JIDs off to other machines in the cluster.
However, this does introduce another choke point, with packets being double-handled between the c2s, the cluster, and the sm components. This is desired when a client initially connects, so that the cluster component can gather the information it needs to pass on to other machines in the cluster, but not desired when there is a lot of traffic being exchanged between locally-connected JIDs. The cluster component is intended to route traffic between other instances of the cluster, not handle local routing.
Thus, an addition is proposed to the Jabberd2 Component XEP, along with appropriate changes within the c2s, s2s and sm code. This addition is to support the concept of a route redirect on the full JID level, such that traffic between two locally connected JIDs travels c2s -> sm -> c2s rather than c2s -> cluster -> sm -> cluster -> c2s.
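The effect of such a redirect on the packet path can be sketched in Python. The Router class, its redirect table and the hops function are assumptions made for this example; the wire format of a redirect is not defined on this page.

```python
class Router:
    """Toy router holding per-full-JID redirects (hypothetical)."""

    def __init__(self):
        self.redirects = {}  # full JID -> component name

    def add_redirect(self, full_jid, component):
        self.redirects[full_jid] = component

    def remove_redirect(self, full_jid):
        self.redirects.pop(full_jid, None)

    def next_hop(self, full_jid, default="cluster"):
        # Without a redirect, the cluster component is the default route.
        return self.redirects.get(full_jid, default)


def hops(router, to_jid):
    """List the components a packet from a local client passes through."""
    path = ["c2s"]
    if router.next_hop(to_jid) == "cluster":
        path += ["cluster", "sm", "cluster"]
    else:
        path += ["sm"]
    path.append("c2s")
    return path


router = Router()
before = hops(router, "bob@example.net/home")
router.add_redirect("bob@example.net/home", "sm")
after = hops(router, "bob@example.net/home")
```

Before the redirect the path is the five-hop one through the cluster component; afterwards it is c2s, sm, c2s. Removing the redirect when the JID disconnects restores the default route.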
This change will also allow the cluster component to act as the default route, and then issue redirects (which may not be followed of course) when appropriate. However, the cluster component still needs to know when a given JID disconnects from the local cluster, so the other machines in the cluster can be informed, hence the third change to the session manager listed above.
In summary, clustering a jabberd2 domain can be done with a few minor changes to existing code, and a few not-so-minor changes (redirect) to prevent excessive overhead in passing message packets around. Most of the hard work lies in the implementation of this cluster component.
