Friday, July 6, 2007

Recursive Zen

Recursion, in mathematics and computer science, is a method of defining functions in which the function being defined is applied within its own definition. Wikipedia
Recursion is one of the most powerful concepts in CS, it's a very cool tool to define and work with functions, structures, data, etc.
In the case of Monodoc documentation it uses recursion to define the structure of certain parts of the documentation. For example the "para" element is a mixed-content element, it indicates that the element can have text and in between the text other elements. In the case of "para" it can have for child elements "block", "see", "list", etc.

Here is an example markup:

<block>
<para>Foo is a cool method! Use it</para>
</block>

The "block" element is mixed and can have for child elements "list", "para", etc. So we have a mutual recursion between "para" and "block", even more "block" can contain other "block"elements so we come full circle.

<para>
<block>Bar is a cool method! Use it</block>
</para>

This recursive behavior is mainly contained in the "Docs" element in Monodoc, this affects the previous implementation using TextTag. I was using the Priority property of TextTag to represent the structure of the elements in the documentation, this property is a number from 0 to the size of the table of tags minus one.
The priority is automatically computed in the order the tag is inserted in the table so the last tag will have the highest priority, this value is used by TextView to resolve conflicts in text regions with tags with conflicting property values.
I also use it to give a parent element a lower priority in comparasion of its child elements. This defines a static order of the tags in the table, that is used to later help me get a valid XML document when I iterate over all the buffer.
Alas the recursive behavior of some elements prevents the use of this static order. For example lets say I give the tag representing a "block" a priority of 10 and give the tag for "para" a priority of 11, if I encountered markup like the second example I will interchange the order of the tags. If we swap the values then the first example will be serialized wrong.

To resolve this issue I came with the idea of creating and adding on demand the tags so
each tag will get a higher priority in relation of its
First I isolated the elements that where part of any recursive production, after they were identified I added the non-recursive elements to the table. The recursive elements I defined them as dynamic, that is that they will be created and added to the table on the fly.
Then I had to came with a suffix because each tag needs a unique name but that still indicates the element that is representing. Now we have "para#1" or "para#2" instead of "para", the number is the depth inside the XML the element was found. This "family" of "para" tags has the same properties so we for the view is like when the order was static.

Between the previous week and this one I finished implementing the dynamic system and making several optimizations in both the serialization and deserialization of the documentation.

This week I also started to work in some basic format to visualize the documentation to be edited:


I have some ideas of the way to present the document to the user but I don't have a final one right now, one idea I have right is to mark the areas that can be edited using [ and ] and/or using a light color background.

One of the immediate problems is to optimize the time to open and save the document. For example at one point it took nearly 10 secs to save Convert.xml from Mono Class Library (373K), right one after some optimizations I did it's now near 3 secs. I think part of the problem is that I was encountering some performance issues with in TextBuffer and TextView when handling one huge line of text as indicated in bugs #332057 and #172099

Another screenshot:

No comments: