Thursday, September 22, 2011

OSGi Subsystems in Virgo

A public draft(*) of the OSGi subsystems RFC (152) should soon emerge from the OSGi Alliance. A subsystem is a multi-bundle application, not dissimilar to a PAR or plan in Virgo. IBM is leading the spec work and a number of other vendors, including SpringSource/VMware, are contributing. Quite a few projects have multi-bundle application constructs, so it makes sense to agree a standard form.

After going through the RFC again with my implementer's hat on, I listed the features necessary to support subsystems in Virgo. Most of the changes are in the Virgo kernel, although I hope to structure the support into non-subsystem specific generalisations of the kernel plus subsystem-specific code running in the user region.

Currently, it is not possible to deploy a plan which contain artefacts that are already deployed. This will need generalising so that we can support subsystems using a common data structure of deployed artefacts: a directed acyclic graph (DAG) rather than the current collection of trees.

The switch to a DAG has interesting implications for lifecycle management of shared subgraphs. With today's tree, when a node is stopped, started, or uninstalled, any subtrees are also stopped, started, or uninstalled, respectively. With a DAG, shared subgraphs need to be sensitive to all their parents. This boils down to keeping each shared subgraph at the maximum state required by any of its parents. States are ordered: ACTIVE > RESOLVED > UNINSTALLED.

Lifecycle management will also get interesting when a shared subgraph belongs to one or more atomic subgraphs as then lifecycle changes in the common subgraph will propagate to all the containing atomic subgraphs. I think that will just "work", but users might need to be careful in their use of atomic plans if they want to avoid management operations on one application affecting other applications.

I'm also considering using garbage collection as a means of uninstalling artefacts which are no longer needed. Given the number of types of dependency that are possible, this is likely to be more reliable than the alternative of maintaining reference counts.

I'm a little concerned about a possible race between garbage collection detecting an artefact as dead and a new dependency being created on the artefact just before garbage collection goes ahead and uninstalls the artefact, but there would be a similar concern for reference counting. The basic issue in this race is that a dead artefact may be found by a live bundle and a new dependency created before the dead artefact can be uninstalled. For instance, a dead bundle may be found by using the OSGi API to list all bundles. It may be possible to use some technique such as a special region in the region digraph to isolate dead bundles, although this issue is probably something to discuss among those working on the RFC.

Anyway, there's plenty of work to be getting on with. I haven't done detailed estimates of the features identified so far, but I guess there's a person year or so of effort needed, so I'm initially targeting Virgo 4.0. If you feel like lending a hand, please get in touch on virgo-dev.

* - RFC 152 has changed quite a bit since the version in the Enterprise spec early access draft dated 16 May 2011. A new draft is being prepared.

2 comments:

Neil Bartlett said...

I agree that the garbage collection approach is probably to be the right way to go. With regard to the race condition, it seems we can use the analogy of "stopping the world"... i.e prevent any new wires being added or bundles changing their states while the garbage collector is scanning.

Glyn said...

Possibly, although bear in mind that subsystems is being implemented strictly on top of the framework, so we'd need to use framework hooks to prevent/delay new wires etc and it would be very easy to introduce deadlocks. Anyway, the point in this context is that there is plenty of interesting, complex stuff to work on here...

Projects

OSGi (130) Virgo (59) Eclipse (10) Equinox (9) dm Server (8) Felix (4) WebSphere (3) Aries (2) GlassFish (2) JBoss (1) Newton (1) WebLogic (1)