One of the goals I have with the FTS library is that users should be able to write down their desired graphs in a language that is close to their domain. The transformation between that and the
FTS XML specification language could then be done, for example, by an XSLT stylesheet.
One of the first applications for that is a finite-state-machine (FSM) declaration, and I immediately ran into a problem with using XSLT for graph manipulation. The gist is that the usual way to write down an FSM is by specifying the transitions. Obviously, one node may appear many times, in as many rules. However, the FTS XML language specifies the nodes, and when a node is given twice, it is created twice, which is at best inefficient and at worst produces wrong results.
So, to transform one into the other, I need some way of grouping by attribute value (in this case) and choosing only one node per distinct value. XSLT 1.0 does not have a grouping construct as such (XSLT 2.0 adds xsl:for-each-group, but the key-based approach described here works in both). Furthermore, I cannot use the XPath 2.0 distinct-values function, because I need the original node, not just its value, as the context item. Last but not least, I do not know in advance which attribute values might occur.
The solution, as I discovered, is the so-called "Muenchian method" (named after Steve Muench). What XSLT can do, and pretty efficiently at that, is build a map from a value to all nodes that have that value, using the <xsl:key> declaration. The gist of the Muenchian method is then that, for each element in the input, you look up whether that element is the first node in its list in that map. If it is, you process the node-set, and only this once. Thus, grouping ;-)
Btw, in the descriptions of this method I found on the web, the generate-id function is usually used to create a unique id for a node, resulting in a test like generate-id(.) = generate-id(key('index', @value)). I have found that, for XSLT 2.0 only, it works just as well to use the identity-based comparison operator "is". In XSLT 1.0, generate-id applied to a node-set is guaranteed to compute the id of the node that is first in document order (which may not be the first in the set), and that may be why some people use it. However, for grouping, it is not necessary to find the first node in document order. It is just necessary to make sure each group is processed only once, so identity comparison works just fine and may be more efficient (and clearer, IMHO).
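To make the comparison concrete, here is a small sketch that runs both predicates side by side. The input format shown in the comment, the key name "index" and the output element names are all assumptions made up for illustration; only the two predicates themselves come from the discussion above. The explicit [1] is added to the generate-id form so that it also behaves under XSLT 2.0, where generate-id expects at most one node.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only. Assumed input (hypothetical element and attribute names):
     <fsm>
       <transition currentState="idle"    nextState="running"/>
       <transition currentState="running" nextState="idle"/>
       <transition currentState="idle"    nextState="stopped"/>
     </fsm> -->
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- Map each currentState value to all transitions that carry it. -->
  <xsl:key name="index" match="transition" use="@currentState"/>

  <xsl:template match="/">
    <groups>
      <!-- Classic form: compare generated ids (also usable in XSLT 1.0). -->
      <by-generate-id>
        <xsl:for-each select="//transition[generate-id(.) = generate-id(key('index', @currentState)[1])]">
          <state name="{@currentState}"/>
        </xsl:for-each>
      </by-generate-id>
      <!-- XSLT 2.0 form: compare node identity directly. -->
      <by-identity>
        <xsl:for-each select="//transition[. is key('index', @currentState)[1]]">
          <state name="{@currentState}"/>
        </xsl:for-each>
      </by-identity>
    </groups>
  </xsl:template>
</xsl:stylesheet>
```

With the assumed input, both wrapper elements should end up with exactly one state element for "idle" and one for "running", even though "idle" appears in two transitions.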
In my case, the xsl:key statement looks like this:
<xsl:key name="states" match="transition" use="@currentState"/>
and the apply-templates call like so:
<xsl:apply-templates select="//transition[. is key('states', @currentState)[1]]">
<xsl:with-param name="type">state</xsl:with-param>
</xsl:apply-templates>
Btw, the only reason for the param is that I use the same template multiple times. It is not necessary if you can dispatch to different templates straight away.
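For context, the two fragments above might sit in a complete stylesheet roughly like the following. This is only a sketch: the output elements "graph" and "node" are placeholders and not the real FTS XML vocabulary, and the body of the shared template is a guess at how the "type" parameter could be used.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- Index every transition by the state it starts from. -->
  <xsl:key name="states" match="transition" use="@currentState"/>

  <xsl:template match="/">
    <!-- "graph" and "node" are placeholder names, not the FTS XML language. -->
    <graph>
      <!-- Select only the first transition of each currentState group. -->
      <xsl:apply-templates select="//transition[. is key('states', @currentState)[1]]">
        <xsl:with-param name="type">state</xsl:with-param>
      </xsl:apply-templates>
    </graph>
  </xsl:template>

  <!-- Shared template: the caller says which kind of node to emit. -->
  <xsl:template match="transition">
    <xsl:param name="type"/>
    <xsl:choose>
      <xsl:when test="$type = 'state'">
        <node kind="state" name="{@currentState}"/>
      </xsl:when>
      <xsl:otherwise>
        <!-- Hypothetical guard for types this sketch does not handle. -->
        <xsl:message terminate="yes">Unknown type: <xsl:value-of select="$type"/></xsl:message>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```

Because the predicate filters before the template is applied, the template itself never needs to know about the grouping; it just sees one representative transition per state.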
P.S. See this mail to xsl-list for an intro to the Muenchian method.