Saturday, November 14, 2009

Sandcastle Workshop: Conceptual Templates

Sandcastle Workshop is our IDE for Sandcastle, and will be based on the Sandcastle Assist builder libraries and tools. This project is based on the SharpDevelop IDE, which we have scaled down with some usability improvements.
For the past 2-3 months, this project has taken up all our spare time. By God's grace, we have made progress with this effort and we are planning an alpha release on November 30th, 2009. You can read our recently announced road-map here.
We have just completed the file templates for the Sandcastle conceptual topics, and we want to present them in this post.

Conceptual Templates
Here is how it looks in our New File dialog (click for larger image):

As shown, we are providing two sets of conceptual topic templates:
  1. AML Topic Templates: The file extension is *.aml, first introduced by DocProject and now supported by SHFB. These topic files are wrapped in the topic tag.
  2. MAML Topic Templates: The file extension is *.maml. These topics are not wrapped in the topic tag, and this will be the recommended format for the Sandcastle Workshop; I explain why below.
The New File dialog will allow you to provide author information, topic title etc., and will then generate both the main topic file and the companion file.
The companion file holds information about the topic (its metadata). We will use it extensively, including for information not directly required by the Sandcastle compiler, such as your topic TO-DO items and other memos.

Why the *.maml topic files?
The topic tag is not defined by the Sandcastle conceptual authoring schema, and VS.NET IntelliSense, for instance, will show at least two warnings in its Error List window for every such file opened.
Also, we think the topic identifier and the revision number are better kept as part of the topic's metadata in the companion file.
Wrapping the topic in the topic tag should be considered an intermediate step, and handled by the building tools.
The "tradition" of creating topics wrapped in the topic tag started with the Sandcastle conceptual example. So that the example could be compiled with a simple batch file, without any preprocessing by tools, it shipped with all the intermediate files, including topics already wrapped in the topic tag. We want to restore the normal way of doing this stuff!
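
To make "wrapping as an intermediate build step" a little more concrete, here is a minimal sketch in Python (it is not the Sandcastle Workshop code, which is .NET). It assumes the topic is a well-formed *.maml file and that the identifier and revision number are supplied from the companion file. The function name wrap_in_topic and the sample values are hypothetical; the id and revisionNumber attribute names follow the Sandcastle conceptual example.

```python
from xml.dom import minidom


def wrap_in_topic(maml_text, topic_id, revision=1):
    """Return the MAML document wrapped in a <topic> element.

    Illustration only: a build tool would do this on a temporary copy,
    leaving the authored *.maml file untouched.
    """
    source = minidom.parseString(maml_text)

    wrapped = minidom.Document()
    topic = wrapped.createElement("topic")
    topic.setAttribute("id", topic_id)
    topic.setAttribute("revisionNumber", str(revision))
    wrapped.appendChild(topic)

    # Deep-copy the authored root element (namespace declarations included)
    # into the new document, under the <topic> wrapper.
    topic.appendChild(wrapped.importNode(source.documentElement, True))
    return wrapped.toxml()


if __name__ == "__main__":
    sample = (
        '<developerConceptualDocument '
        'xmlns="http://ddue.schemas.microsoft.com/authoring/2003/5">'
        "<introduction><para>Hello</para></introduction>"
        "</developerConceptualDocument>"
    )
    # The identifier below is a made-up placeholder, not a real topic id.
    print(wrap_in_topic(sample, "00000000-0000-0000-0000-000000000000"))
```

The authored file stays schema-valid for the editor, and only the build sees the wrapped form.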

How we did it
We created the templates for all 19 topic types supported by Sandcastle, and then customized copies of these by adding Sandcastle Workshop-specific file template items.
The original empty templates will be available in the development branch of the source control on CodePlex for you to customize and enhance for your own IDE or applications.

On the user interface side, we had to rework parts of SharpDevelop's New File dialog, which has a nice, innovative feature: support for the property grid control.

We like that feature, but unfortunately, it was not as complete as we thought when we started working with it. We improved upon the idea, including support for read-only properties, which you can see for the topic creation date, and this post presents the result.


Friday, July 17, 2009

Reworking the Sandcastle Help 1.x Build

As part of the new Sandcastle Assist update we will be uploading soon, we have completely reworked the Help 1.x (CHM file) building process of Sandcastle, mainly for two reasons:
  • To improve its use of memory, and
  • To control the build process so that we can add more options.
For anyone new to the Sandcastle tools, I will give a little background so that you can understand why we wanted to satisfy the above requirements.

Background

Microsoft uses the Sandcastle tools to build its own documentation, and as we all know, this is mainly in Help 2.x format. Where Help 1.x is built, as for the Silverlight documentation, the outputs are not large.
Sandcastle, therefore, is designed to produce Help 2.x compatible HTML format, which includes keywords, attributes etc.
To build the Help 1.x compatible HTML format, Sandcastle uses two console tools:
  1. ChmBuilder.exe, which converts the HTML files to the Help 1.x format and generates the project, table of contents and keyword files. See Building CHM using CHMBuilder.
  2. DBCSFix.exe, which is used to fix localization issues with the compiler.
    See CHM Localization and Unicode issues.

Memory Issue

Memory is not an issue for Help 2.x, since that help format supports an easy plug-in system, which lets the developer split the help build into parts and so avoid large memory requirements. See Componentization - Building Assembly level HxS using Sandcastle.
Now, componentization is available in Help 1.x too, but it is not as easy and is not well supported by the current Sandcastle GUI front ends.
Another reason why this memory issue may not arise with Help 2.x is that the HTML files generated by Sandcastle already contain the keywords and attributes, so no separate keyword list file is created for that compiler.
In the case of Help 1.x, the memory requirement is high for the following reasons:
  • Both ChmBuilder.exe and DBCSFix.exe use the .NET Directory.GetFiles() method to retrieve all the files into an array for the conversion process, and for very large projects this array could be large.
    NOTE: Sandcastle's HxfGeneratorComponent could be used to avoid this, since it records the generated HTML files in a *.HxF file.
  • To generate the table of contents, ChmBuilder.exe uses a .NET Dictionary to map each topic to its title.
  • ChmBuilder.exe also uses a .NET List of structures to store all the keywords retrieved from the Help 2.x compatible HTML files during the conversion.
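
To illustrate the first point, here is a rough Python sketch of the difference between an eager listing, which builds the whole array before any file is processed, and a lazy iteration, which hands out one entry at a time. This is an analogy rather than the Sandcastle code: the function names are hypothetical, and Python's os.scandir stands in for the lazy approach (on Windows it is itself built on FindFirstFile/FindNextFile).

```python
import os
import tempfile


def list_html_eager(folder):
    """Eager: the complete list of paths exists in memory before use."""
    return [
        os.path.join(folder, name)
        for name in os.listdir(folder)
        if name.lower().endswith(".htm")
        and os.path.isfile(os.path.join(folder, name))
    ]


def iter_html_lazy(folder):
    """Lazy: yields one path at a time, so no full list is ever built."""
    with os.scandir(folder) as entries:
        for entry in entries:
            if entry.is_file() and entry.name.lower().endswith(".htm"):
                yield entry.path


if __name__ == "__main__":
    # Small demonstration on a throwaway folder.
    with tempfile.TemporaryDirectory() as folder:
        for name in ("topic1.htm", "topic2.htm", "notes.txt"):
            open(os.path.join(folder, name), "w").close()

        print(len(list_html_eager(folder)), "files listed eagerly")
        for path in iter_html_lazy(folder):
            print("converting", os.path.basename(path))
```

With a few thousand topics the difference is negligible, but for very large documentation sets the lazy form keeps the memory use independent of the number of files.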
Our solutions:
  • Use the Windows API functions FindFirstFile and FindNextFile to iterate over the files in an output directory. These are the same functions that the .NET Directory.GetFiles() uses before packing all the results into an array.
  • Use a file-based dictionary, a BTree implementation, to store the topic-file map (we hope it is faster than a complete database, and we will be testing this).
  • Allow the Help 1.x compiler to retrieve the keywords from the HTML files, just like the Help 2.x compiler, by rewriting the keywords in the format used by Microsoft (search for MS-HKWD and MS-HAID). The tests so far produce the same results as the ChmBuilder tool, but we are looking at ways to improve this.
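
The file-based dictionary idea can be sketched as follows. This is only an illustration in Python, not our BTree implementation: the standard shelve module stands in for the disk-backed map (its storage backend depends on the platform and is not necessarily a B-tree), and the function names and sample data are hypothetical.

```python
import os
import shelve
import tempfile


def build_topic_map(db_path, topics):
    """Write (topic file, title) pairs to a disk-backed map.

    'topics' may be any iterable, including a generator, so the pairs
    never have to be held in memory all at once.
    """
    with shelve.open(db_path, flag="n") as topic_map:  # "n": start empty
        for file_name, title in topics:
            topic_map[file_name] = title


def lookup_title(db_path, file_name, default=None):
    """Read a single title back without loading the whole map."""
    with shelve.open(db_path, flag="r") as topic_map:
        return topic_map.get(file_name, default)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        db_path = os.path.join(folder, "topics")
        # Made-up sample entries for the demonstration.
        sample = [
            ("html/topic1.htm", "Getting Started"),
            ("html/topic2.htm", "Build Options"),
        ]
        build_topic_map(db_path, sample)
        print(lookup_title(db_path, "html/topic2.htm"))
```

The trade-off is the usual one: each lookup costs some disk access, in exchange for a memory footprint that no longer grows with the number of topics.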

Control Over Output

We simply needed more control, and wanted to modify the output project files from the ChmBuilder tool, but these are not valid XML or XHTML files. We would have to either rewrite those files or write a parser to modify them.
We used the work on the memory issue above as the opportunity to address this :)

Conclusion

With the Sandcastle Assist, we are trying to find ways to improve upon the Sandcastle tools, and on memory requirements, this is just the first and easier step. The main issue is with the reference link resolving component, ResolveReferenceLinksComponent. This will involve a little bit of work and research; we are working on it.
We have supported grouping in the build process to enable you to separate the documentation project into parts for componentization. We will continue to work on this too.
Thanks for reading; we would love any input on these efforts. May God bless you.

In the beginning...

Sandcastle Assist is an open source effort to enhance and provide easier developer access to the Microsoft Sandcastle product, which is a documentation system for managed class libraries.
The project is currently hosted on CodePlex: Sandcastle Assist.
On this page we will provide information on the development of this library, along with tips and useful information on using Sandcastle in various projects.

Please join us.