Old TODO
Global
Protocol objects
We want a separate "object" (struct + methods) for each protocol "object" (a la pkt_t, but more like xdata_t - object parser, generator, and accessor methods). This way, we reduce the amount of DOM digging by the components to a minimum. This further separates the business logic from the protocol, and helps avoid the mess that c2s has become.
Common SX/IO callback
The SX and IO callbacks are largely the same across all components. Some have additions (c2s & s2s have keepalives, c2s & router have access controls, c2s & router have byterate limiting). It would be good if all this common code was broken out (including SASL/TLS negotiation) so that components only have to deal with packets, nothing more.
Asynchronous resolver
We need a resolver that can run on the back of MIO. DNS resolution should really be done by a single utility rather than an entire component. Jeremie Miller has some lightweight DNS code that will probably fit well (given the similar coding techniques).
Support for non-DNS sources
Some people want to be able to place server names in /etc/hosts. This presents a problem, as res_query() (currently used by the resolver to look up names) is a DNS-only resolver, and doesn't use the name service switch. The same would be true of any asynchronous resolver.
gethostbyname() could be used, but it blocks. The best way to solve this would probably be via a config option. By default, the async resolver would do SRV/A lookups like normal. If configured, this would change to no SRV lookup, just a straight gethostbyname(). This would cause serious pain if used against DNS, because of the blocking, but if NSS is configured to use a local source (/etc/hosts), it should be quick enough.
Zero-configuration networking
Zeroconf (or Rendezvous as Apple calls it) is a network discovery mechanism, allowing applications to find network services by sending multicast queries into the local network. There are two places that this could be used:
- The router could respond to zeroconf queries to let components know where it is and how to connect, allowing new components to be brought online without configuration.
- Apple's iChat product uses zeroconf queries and the Jabber protocol to do p2p chat on a local network. A component (or c2s/sm extension) that bridges SM users and iChat users would be great. We need protocol documentation for this - Julian Missig may be able to help.
The best way to do this is probably to write a generic component that provides components access to a Multicast-DNS responder.
Router mesh
It should be possible for several router instances to connect together to form a single logical router. This requires route information to be extended to list the router that the component is connected to. This information needs to be synchronised between router instances (probably via component presence). This is a mesh - ie, each router is connected to every other router. This ensures that no component is more than three hops away from another (C-R-R-C).
Config reloading
We need to make it so that components can reload portions of their config on the fly. The easiest way is probably to let parts of the code (modules etc) register a callback that gets called when config gets reloaded, so they can update internal structures, reconnect to databases, etc.
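The callback registration described above could look roughly like the following. This is only a sketch: reload_register(), reload_notify(), the fixed-size table and the example hook are all hypothetical names invented for illustration, not existing server APIs.

```c
#include <stddef.h>

#define MAX_RELOAD_HOOKS 16

/* Hypothetical callback type: invoked after the config has been re-read. */
typedef void (*reload_cb_t)(void *arg);

typedef struct reload_hook_st {
    reload_cb_t cb;
    void *arg;
} reload_hook_t;

typedef struct reload_registry_st {
    reload_hook_t hooks[MAX_RELOAD_HOOKS];
    int nhooks;
} reload_registry_t;

/* Register a hook. Returns 0 on success, -1 if cb is NULL or the table is full. */
int reload_register(reload_registry_t *reg, reload_cb_t cb, void *arg) {
    if (cb == NULL || reg->nhooks >= MAX_RELOAD_HOOKS)
        return -1;
    reg->hooks[reg->nhooks].cb = cb;
    reg->hooks[reg->nhooks].arg = arg;
    reg->nhooks++;
    return 0;
}

/* Call every hook in registration order. Returns the number of hooks called. */
int reload_notify(reload_registry_t *reg) {
    int i;
    for (i = 0; i < reg->nhooks; i++)
        reg->hooks[i].cb(reg->hooks[i].arg);
    return reg->nhooks;
}

/* Example hook: a module that just counts how often it was asked to reload. */
void example_count_hook(void *arg) {
    (*(int *)arg)++;
}
```

A module would call reload_register() once at startup, and whatever triggers the reload (a signal handler flag checked in the main loop, for instance) would call reload_notify() after the new config has been parsed.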
Admin interfaces
Once we have it set up so that config can be reloaded on the fly, it's a small step to provide a more generic administrative interface to all sorts of server functions. JEP-0050 documents the protocol for this.
Traffic stats
JEP-0039 documents a protocol that components can use to provide data on certain activities (eg the number of packets received and sent), much like SNMP.
Serialised NAD component protocol
Early in the j2 development cycle, there was talk of making it so that the component interconnect could pass serialised nads across the wire, decreasing the time required to parse and generate XML. There might still be value in this.
Better error reporting
If a component gets disconnected from the router for sending invalid XML, we should log the packet that broke it to make life easier for the admin to track the issue down (corrupt DB or something). There's probably other places where logging can be improved like this.
Storage / authreg
Break storage & authreg out into separate libs
There is value in making storage & authreg common subsystems to all components. Storage will be useful in (for example) c2s & router for doing packet queuing. Authreg will be useful to better do SASL authentication in the router, and to include it in s2s.
Merge storage & authreg
If it can be abstracted clearly, it would be good to merge authreg into storage. Since authreg is task-based (get zerok, check password, etc), there may need to still be a thin wrapper around the normal storage stuff. The hardest part is giving the storage API a comparison operator in a clean way.
Make storage & authreg asynchronous
This is currently most noticeable in authreg_ldap. The entire process (all connections) hangs while a storage or authreg operation is in progress.
The easiest way to make these operations asynchronous is to make it so that a callback is registered with each operation. Each operation is started (where asynchronous APIs exist), and then control is returned to the caller. Progress checks are triggered by either MIO or the main loop (probably a combination of both). When the operation completes, control goes to the registered callback.
This requires parts of the code that require storage or authreg to be rewritten into IRQ-style top and bottom halves.
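A minimal sketch of the top-half / bottom-half split described above. Every name here (async_op_t, async_op_start(), async_op_check(), the example driver functions) is an assumption made up for illustration. A real driver's poll function would wrap whatever non-blocking API the backend offers.

```c
#include <stddef.h>

typedef enum { OP_PENDING, OP_DONE } op_state_t;

typedef struct async_op_st async_op_t;
typedef op_state_t (*op_poll_fn)(async_op_t *op);        /* driver progress check */
typedef void (*op_done_cb)(async_op_t *op, void *arg);   /* bottom half */

struct async_op_st {
    op_poll_fn poll;
    op_done_cb done;
    void *arg;
    int result;
    int finished;
};

/* Top half: record the callback and return to the caller immediately. */
void async_op_start(async_op_t *op, op_poll_fn poll, op_done_cb done, void *arg) {
    op->poll = poll;
    op->done = done;
    op->arg = arg;
    op->result = 0;
    op->finished = 0;
}

/* Called from MIO or the main loop. Returns 1 if the operation completed
 * on this call (and the bottom half ran), 0 otherwise. */
int async_op_check(async_op_t *op) {
    if (op->finished)
        return 0;
    if (op->poll(op) != OP_DONE)
        return 0;
    op->finished = 1;
    if (op->done != NULL)
        op->done(op, op->arg);
    return 1;
}

/* Example driver: pretends the backend needs three progress checks. */
op_state_t example_poll(async_op_t *op) {
    op->result++;
    return op->result >= 3 ? OP_DONE : OP_PENDING;
}

/* Example bottom half: hands the result back to the original caller. */
void example_done(async_op_t *op, void *arg) {
    *(int *)arg = op->result;
}
```

The caller's existing function ends at async_op_start(); everything it used to do after the blocking call moves into the done callback.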
Storage drivers for config, ACI, etc
There's value in having it so that config, access control information, etc are retrieved through the normal storage interface. This would make it possible to do things like store ACI in LDAP, for example.
Storage driver for storing vCard info in LDAP
Rob Rankin is doing some work on this. The need for this may be moot if we get a decent profile spec in the near future. Of course, in that case, we'll probably want a way to map the new profiles to LDAP anyway.
Storage driver for SQLite (FR #2644) [2.1]
Embeddable SQL engine, essentially, but with a nice (hand-editable) file format. If it works well, this could be the default storage backend, as it would have no dependencies.
Common storage component [2.1 partial]
If storage and authreg could be broken out into a component, that would be cool, since we'd only need one lot of config. We need to expose the API via protocol, including capability checking. We need storage and authreg broken out from their current components, and we need calls to be asynchronous. We also need to think about redundancy - it should be possible to have (depending on DB type) more than one storage component, and have requests farmed out among them. Router load balancing will help with this.
Note that this is not XDB - we're not storing XML chunks, but simple triple-based objects.
Multi-valued fields
If a field could hold multiple values, then all roster groups for a roster item (for example) could be held in the same object internally. The storage module would have to be smarter to be able to separate them.
DB dump / restore utility
Like it says. Should be fairly easy - loop over a list of JIDs, load objects from one driver, store to another.
Utilities
Audit existing utilities
The existing utilities need to be audited. Things that aren't used should be removed, prototypes need to be cleaned up and made consistent (eg moving xht -> xhash_t), and interdependencies should be minimised.
Callback queue utility
Currently, anything that uses callbacks manages its own callback queue. As we have more and more parts of the code that require callbacks, it would be useful to have a utility that can do the grunt work.
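One possible shape for such a utility is a small FIFO of (function, argument) pairs. The cbq_* names and the two example callbacks below are hypothetical. This is a sketch of the idea rather than a proposed final API.

```c
#include <stdlib.h>

typedef void (*cbq_fn)(void *arg);

typedef struct cbq_item_st {
    cbq_fn fn;
    void *arg;
    struct cbq_item_st *next;
} cbq_item_t;

typedef struct cbq_st {
    cbq_item_t *head;
    cbq_item_t *tail;
} cbq_t;

/* Append a callback to the queue. Returns 0 on success, -1 on allocation failure. */
int cbq_push(cbq_t *q, cbq_fn fn, void *arg) {
    cbq_item_t *item = malloc(sizeof(*item));
    if (item == NULL)
        return -1;
    item->fn = fn;
    item->arg = arg;
    item->next = NULL;
    if (q->tail != NULL)
        q->tail->next = item;
    else
        q->head = item;
    q->tail = item;
    return 0;
}

/* Run and free every queued callback in FIFO order. Each item is unlinked
 * before it is called, so a callback may safely push new items.
 * Returns the number of callbacks run. */
int cbq_run(cbq_t *q) {
    int n = 0;
    while (q->head != NULL) {
        cbq_item_t *item = q->head;
        q->head = item->next;
        if (q->head == NULL)
            q->tail = NULL;
        item->fn(item->arg);
        free(item);
        n++;
    }
    return n;
}

/* Example callbacks, used only to show that ordering is preserved. */
void example_add_one(void *arg) { *(int *)arg += 1; }
void example_double(void *arg)  { *(int *)arg *= 2; }
```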
NAD work
NAD needs some work:
- nad_find_scoped_namespace() currently searches all elements from start to finish looking for the namespace. This works, but isn't really correct. It should take an elem index, and only search under that. It's also not really named correctly (what's the difference between nad_find_namespace() and nad_find_scoped_namespace(), at least by name?)
- New functions:
  - nad_insert_nad(nad_t src, int selem, nad_t dest, int delem): insert a nad (or part thereof) into another nad
  - nad_clear_elem(nad_t nad, int elem): remove this element and any subelements (like xmlnode_hide)
- Comparison macros: One of the most common things we do with nads is compare an element name, attribute value or cdata with some hardcoded value. We always have to check length first, and then strncmp. This results in very long lines. Macros to streamline this would be nice.
- Audit use of depths array and parent: See bug #792. Need to make sure that depths and parent remain consistent. There might even be value in keeping the depths array across serialisation.
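As an illustration of the comparison macros item, a macro along these lines would fold the length check and the strncmp into one expression. BUF_EQ is a hypothetical name, and the sketch works on a plain (pointer, length) pair. How the pointer and length are pulled out of a nad is left out on purpose.

```c
#include <string.h>

/* True if the len-byte buffer buf (not NUL-terminated) is exactly the
 * C string lit. The length is checked first, then the bytes are compared.
 * Note that lit is evaluated twice, so it should be a plain literal. */
#define BUF_EQ(buf, len, lit) \
    ((size_t)(len) == strlen(lit) && strncmp((buf), (lit), (size_t)(len)) == 0)
```

A check that currently needs a length test plus a strncmp on one long line would then read something like `if (BUF_EQ(name, namelen, "iq")) ...`.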
Managed I/O
Incremental descriptor allocation [2.1]
Currently the size of the file descriptor array is fixed at initialisation. This needs to be changed so that the array can grow as required. This should reduce memory usage and make things much easier for certain backends (Ben Schumacher wrote a BSD kqueue backend that didn't make it into 2.0 for these reasons - it was impossible for it to track memory usage).
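A rough sketch of what growing the array on demand might look like. The fd_table_t layout and fd_table_reserve() are invented for this example and do not reflect the actual MIO structures. The doubling strategy and the initial size of 16 are assumptions too.

```c
#include <limits.h>
#include <stdlib.h>
#include <string.h>

typedef struct fd_slot_st {
    int in_use;
    void *data;
} fd_slot_t;

typedef struct fd_table_st {
    fd_slot_t *slots;
    int nslots;
} fd_table_t;

/* Make sure a slot exists for fd, growing the array by doubling if needed.
 * New slots are zero-filled. Returns 0 on success, -1 on failure, in which
 * case the existing table is left untouched. */
int fd_table_reserve(fd_table_t *t, int fd) {
    int newsize;
    fd_slot_t *grown;

    if (fd < 0)
        return -1;
    if (fd < t->nslots)
        return 0;

    newsize = t->nslots > 0 ? t->nslots : 16;
    while (newsize <= fd) {
        if (newsize > INT_MAX / 2)
            return -1;
        newsize *= 2;
    }

    grown = realloc(t->slots, (size_t)newsize * sizeof(*grown));
    if (grown == NULL)
        return -1;
    memset(grown + t->nslots, 0, (size_t)(newsize - t->nslots) * sizeof(*grown));

    t->slots = grown;
    t->nslots = newsize;
    return 0;
}
```

Because realloc() may move the array, any code that keeps pointers into the slots across a call to fd_table_reserve() would need to switch to indexes.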
APR "backend"
A port of MIO to APR would make it much easier to port the server to systems that don't have a POSIX file descriptor model - Win32, for example. Richard Dobson has done some work on this already.
Replace MIO
It's a bit out there, but perhaps we should get rid of MIO altogether and use some other layer out there that works and is supported. Maybe APR, maybe Dan Kegel's Poller stuff, maybe something else.
Router
XPath-based routing
A component should be able to bind using some XPath expression, and have all packets that match that XPath be delivered to it. Support for the old name-based binding can be done by internally mapping it to an XPath expression like "/route[@to='somedomain']".
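The internal mapping from an old-style name to an XPath expression could be as small as the helper below. name_to_xpath() is a hypothetical function. Rejecting names that contain a single quote is an assumption made here to keep the generated expression well-formed.

```c
#include <stdio.h>
#include <string.h>

/* Write "/route[@to='<name>']" into buf.
 * Returns 0 on success, -1 if the name contains a single quote or the
 * result does not fit in buflen bytes. */
int name_to_xpath(const char *name, char *buf, size_t buflen) {
    int n;

    if (strchr(name, '\'') != NULL)
        return -1;

    n = snprintf(buf, buflen, "/route[@to='%s']", name);
    if (n < 0 || (size_t)n >= buflen)
        return -1;
    return 0;
}
```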
Load balancing
Two components should be able to bind with the same name (or XPath) and have traffic destined for that name distributed evenly between the two (or more) components. A round-robin algorithm is probably suitable here, though this could be made more complex if a single component can provide some sort of data that will weight the algorithm in certain directions (eg a component could ask that the likelihood of it being chosen for a packet be reduced if it's currently under high load).
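To make the weighting idea concrete, here is a sketch using smooth weighted round-robin. That particular algorithm is a choice made for this example, not something the router specifies, and lb_group_t / lb_pick() are hypothetical names. With equal weights it degenerates to plain round-robin. A component reporting high load would simply be given a lower weight.

```c
#define LB_MAX_TARGETS 8

typedef struct lb_target_st {
    int weight;   /* higher weight = chosen more often */
    int current;  /* running score, starts at 0 */
} lb_target_t;

typedef struct lb_group_st {
    lb_target_t targets[LB_MAX_TARGETS];
    int ntargets;
} lb_group_t;

/* Pick the next target for a packet. Every target's score is raised by its
 * weight, the highest score wins (first one on ties), and the winner's score
 * is lowered by the sum of all weights.
 * Returns the index of the chosen target, or -1 if the group is empty. */
int lb_pick(lb_group_t *g) {
    int i, best = -1, total = 0;

    for (i = 0; i < g->ntargets; i++) {
        g->targets[i].current += g->targets[i].weight;
        total += g->targets[i].weight;
        if (best < 0 || g->targets[i].current > g->targets[best].current)
            best = i;
    }
    if (best < 0)
        return -1;

    g->targets[best].current -= total;
    return best;
}
```

With weights 2 and 1, successive calls pick targets 0, 1, 0, 0, 1, 0 and so on, so the heavier target gets two packets out of every three without receiving them back to back more than necessary.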
Session manager
SM as component framework
All the protocol specifics of the SM are implemented in the modules, and in pkt.c. If we move pkt.c out of the SM (see "Protocol Objects"), then the SM core effectively becomes a protocol-agnostic toolbox, providing user and session management, presence tracking, and all the utilities. It should be possible to write several types of component using this framework (such as MUC or Pub/Sub).
Dynamic modules [2.1]
Modules should really be shared objects, and it should be possible to load/unload them at runtime. An apxs-like utility for doing builds outside of the source tree would be useful too.
Merge pkt-router and in-router
It's kinda silly to have two minimally-used chains that do largely the same job. Make them one.
Client-to-server
Offline packets on client disconnect
If a user session disappears unexpectedly (disconnect or timeout), any packets that c2s is currently waiting to deliver to them should be bounced back to the session manager. The session manager can then act appropriately on these (bounce them back to the sender or store them offline).
Restart sessions on SM bounce
If the session manager goes away, user sessions should be throttled. When the session manager comes back, c2s should issue session starts for its sessions to bring them back online. This will require the session manager to hold persistent state about current sessions (eg presence sent, roster sent, etc).
Multiple authentication realms per domain
It should be possible to specify a number of authentication realms for a single domain. For auth mechanisms that support this (ie DIGEST-MD5), they can offer several choices to the client. For those that don't (traditional auth, most other SASL mechs) we need to figure out how to choose a realm. Or, just maybe, cop out and say that multiple realms are only supported for SASL and only for mechanisms that can handle them.
Multiple auth/reg backends
There's no reason that multiple auth/reg backends couldn't be used, one for each domain.
Server-to-server
SASL/TLS for s2s connections [2.1]
XMPP-core requires that remote servers should be able to authenticate using SASL instead of dialback. Similarly, outgoing connections should use TLS when offered, while incoming should offer TLS.
Bind names rather than the default route
If s2s could bind one or more names rather than the default route, then it could be effectively used in an intranet environment where a different s2s must be used for routing depending on the destination.
SASL layer (scod)
DIGEST-MD5 channel encryption
Currently the DIGEST-MD5 mechanism only implements basic authentication. It should implement everything that DIGEST-MD5 has to offer.
Use a real SASL library
It seems that Cyrus can use an internal auth database. gSASL is also interesting. Look into dropping scod in favour of one of these.
Streams (sx)
Certificate authentication
When using SSL/TLS, a client should be able to provide a client certificate. If this certificate is confirmed valid, the server should offer the SASL EXTERNAL mechanism. If the client takes this, the authzid should be inferred from the presented certificate.
Codebase maintenance
Code documentation
Doxygen comments and a config file were added in 2.0b2. We should try to add meaningful comments to the entire codebase.
Header separation
Each .c file should have its own header, containing only the prototypes, defines and structures that actually belong to it. .c files should only include headers (system and local) that they absolutely need. This should reduce compile times.
Header #ifdef traps
Each header should have appropriate #ifdef traps to prevent it from being included more than once.
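For reference, the usual shape of such a trap is shown below. The file name, the guard macro name and the contents are placeholders. The only real requirement is that each header uses a guard name that no other header uses.

```c
/* example.h (hypothetical header) */
#ifndef INCL_EXAMPLE_H
#define INCL_EXAMPLE_H

/* Everything the header declares goes between the #define and the #endif,
 * so a second #include of the same file expands to nothing. */
typedef struct example_st {
    int id;
} example_t;

int example_id(const example_t *e);

#endif /* INCL_EXAMPLE_H */
```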
Configure script [2.0]
The current one is a mess. We've got to systematically check for every facility we use. win32 checks should be treated like just another platform (ie the platform-specific configure stuff has to go - the tests just need to be smarter).
