Introduction

It’s been a little while since I last wrote an article for my .NET tooling series. The series up to this point has included the following steps:

I’ve finally gotten to writing up an important part of the build process: documentation. This is a big article, but it forms an important piece of the puzzle that is software development.

An important part of software projects is documentation; this includes end-user documentation and technical documentation. The latter usually includes API documentation, which is important for developers working to maintain or extend existing software. There are a number of large-scale API documentation repositories, including, but not limited to, the Java docs and MSDN.

Such documentation can be done within your code, in the form of comments. You can then use a tool to extract those documentation comments and convert them into other formats such as HTML or CHM. Since I’ve been on a .NET bent in recent years, I’ll show how to generate documentation from code written in C#.

Writing Documentation in Code

The first step of this process is to ensure your code is documented; more specifically, it should be documented with XML tags, such as:

  • <summary>
  • <exception>
  • <param>
  • <returns>

There is an overview of documentation comments on MSDN, as well as a full listing of tags that can be used for documentation. Incidentally, much of the .NET content on MSDN comes from auto-generated documentation via the process I am describing now.

I’ll start by saying that there are essentially two types of language comments for C#: the normal code comment (//) and the documentation comment (///) – note the extra slash. VB.NET has similar variations: ' and ''', respectively. Aside from the comment indicators, the tags used and the generation process remain the same between languages; I’ll be using C# for code samples.

I’ll now show a simple function, with before and after versions. The before version will not have any documentation comments, just a normal comment.

protected override IEnumerable<object> ResolveAll(Type servType)
{
    var matches = new List<object>();

    foreach (var pair in _items.Keys)
    {
        // skip the current item if the key is not the required type
        if (pair.Key != servType) continue;

        matches.Add(Resolve(pair.Key, pair.Value));
    }

    return matches;
}

The after version is essentially the same, but with the addition of the documentation block above the function. Note the different sets of slashes used to mark the documentation comments as opposed to the normal comment.

/// <summary>
/// Gets all instances for a given service.
/// </summary>
/// <param name="servType">The service type.</param>
/// <returns>A sequence of instances of the <paramref name="servType"/> service.</returns>
protected override IEnumerable<object> ResolveAll(Type servType)
{
    var matches = new List<object>();

    foreach (var pair in _items.Keys)
    {
        // skip the current item if the key is not the required type
        if (pair.Key != servType) continue;

        matches.Add(Resolve(pair.Key, pair.Value));
    }

    return matches;
}

You can document a class with a summary comment. This is usually the only class-level documentation needed.

/// <summary>
/// Defines a base implementation for the <see cref="IContainer" /> interface. Class contains virtual functions that must be implemented by a subclass.
/// </summary>
public abstract class ContainerBase : IContainer

For a given class, you can document fields, functions, constructors, and events, as well as the class itself. There is no in-code documentation of namespaces.
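Here is a rough sketch of what documenting the other member kinds might look like, including the <exception> tag from the list earlier. The ServiceRegistry class and all of its members are made up for illustration; they are not part of the container code shown above.

```csharp
using System;
using System.Collections.Generic;

/// <summary>
/// A hypothetical registry, used only to illustrate documentation comments.
/// </summary>
public class ServiceRegistry
{
    /// <summary>
    /// Maps each registered service type to its instance.
    /// </summary>
    private readonly Dictionary<Type, object> _items = new Dictionary<Type, object>();

    /// <summary>
    /// Occurs after a service has been registered.
    /// </summary>
    public event EventHandler Registered;

    /// <summary>
    /// Initializes a new, empty instance of the <see cref="ServiceRegistry"/> class.
    /// </summary>
    public ServiceRegistry()
    {
    }

    /// <summary>
    /// Registers an instance for a service type.
    /// </summary>
    /// <param name="servType">The service type.</param>
    /// <param name="instance">The instance to return for <paramref name="servType"/>.</param>
    /// <exception cref="ArgumentNullException">
    /// <paramref name="servType"/> or <paramref name="instance"/> is <c>null</c>.
    /// </exception>
    public void Register(Type servType, object instance)
    {
        if (servType == null) throw new ArgumentNullException("servType");
        if (instance == null) throw new ArgumentNullException("instance");

        _items[servType] = instance;

        // copy the delegate first so a late unsubscribe cannot null it mid-call
        var handler = Registered;
        if (handler != null) handler(this, EventArgs.Empty);
    }

    /// <summary>
    /// Gets the instance registered for a service type.
    /// </summary>
    /// <param name="servType">The service type.</param>
    /// <returns>The registered instance, or <c>null</c> if none is registered.</returns>
    /// <exception cref="ArgumentNullException"><paramref name="servType"/> is <c>null</c>.</exception>
    public object Resolve(Type servType)
    {
        if (servType == null) throw new ArgumentNullException("servType");

        object instance;
        return _items.TryGetValue(servType, out instance) ? instance : null;
    }
}
```

Note that the private field carries a documentation comment too; whether it shows up in the final output depends on the visibility settings of the documentation tool.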

GhostDoc

Writing all the comment tags gets tedious quickly. There is an add-on for Visual Studio, called GhostDoc, that simplifies the process.

When GhostDoc is installed and configured, you can put the text insertion point on a line within a function, and press GhostDoc’s shortcut key (for me, CTRL-SHIFT-D); GhostDoc will intervene and insert the appropriate documentation comment, with needed tags in place. You can then edit the tag contents as needed; this is usually necessary as GhostDoc can only infer from the code what the documentation is supposed to be. This issue aside, GhostDoc is a very useful tool for writing code documentation.

GhostDoc was originally developed by Roland Weigelt, and was acquired by SubMain mid-2009. Roland has written about the change, with an emphasis that SubMain will continue to produce GhostDoc and make it freely available.

Compiler Option

You’ve written your in-code documentation. Now what? Visual Studio provides an option to have the compiler produce an XML file for each project; the XML file is generated at compile-time, and contains a collation of XML documentation from classes within the project.

Figure: Setting the compiler option to generate XML

To set the option, go into the project’s properties, hit the Build tab, and check the XML documentation file checkbox. Ensure there is a path and filename in the adjoining textbox; convention has the XML file appearing in the same folder as the matching compiled project, and with the same name. Save and close the properties window. Repeat for each project to be documented.
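To give an idea of what the compiler writes out, the following sketch parses a small hand-written sample in the same shape as a generated file and lists the member IDs it contains. The MyContainer assembly and namespace name is an assumption I made up for the sample; the T: and M: prefixes are how the compiler marks types and methods, respectively.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class XmlDocPeek
{
    // Hand-written sample mimicking compiler output; "MyContainer" is a made-up name.
    public const string SampleXml = @"<?xml version=""1.0""?>
<doc>
  <assembly>
    <name>MyContainer</name>
  </assembly>
  <members>
    <member name=""T:MyContainer.ContainerBase"">
      <summary>Defines a base implementation.</summary>
    </member>
    <member name=""M:MyContainer.ContainerBase.ResolveAll(System.Type)"">
      <summary>Gets all instances for a given service.</summary>
      <param name=""servType"">The service type.</param>
    </member>
  </members>
</doc>";

    // Returns the name attribute of every <member> element, in document order.
    public static string[] MemberIds(string xml)
    {
        return XDocument.Parse(xml)
            .Descendants("member")
            .Select(m => (string)m.Attribute("name"))
            .ToArray();
    }
}
```

Calling XmlDocPeek.MemberIds(XmlDocPeek.SampleXml) gives back the two IDs in the sample. Each documented class, function, field, and so on gets one such member element, with your tags copied inside it.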

At this point, the tour of the code documenting process is complete. I’ve shown how to write in-code comments in a way that will be picked up by the documentation generation tool. The in-code comments are half of the process; the other half is the tool used to convert the in-code documentation to a separate format. Stay tuned.

Code Documentation Tools

NDoc

Back in the days of .NET 1.1, Kevin Downs wrote a tool, called NDoc, that would take the compiler’s output XML and convert it into a format similar to the MSDN documentation. NDoc became very popular, but unfortunately came to an end in 2006. The author of the tool, for varying reasons, stopped maintaining the project, and that was the end of it.

There have been some attempts at reviving the project, but apparently none have succeeded. A spinoff called NDoc 2005 appears to have been abandoned some time ago. Another one, NDoc Alpha, was an attempt at a .NET 2.0-compatible version, but it is unmaintained, has no available source, and is therefore a dead end.

NDoc was a popular tool for its time, but it is no longer maintained – nor are the spinoffs. The common solution today appears to be Sandcastle, which is the next subject of discussion.

Sandcastle

Microsoft first released Sandcastle in 2006, and has maintained it since on CodePlex; the project wasn’t open-source initially, but became so in 2008. Sandcastle is the tool Microsoft uses to generate their MSDN documentation; in fact, there are reports that the latest version of Sandcastle was used to generate the MSDN documentation for Visual Studio 2010 Beta 2. It makes sense to document .NET projects using Sandcastle, since the tool creates output in a format that many .NET developers are already familiar with.

The SDKs for Visual Studio 2005 and 2008 include older versions of Sandcastle; the latest release can be downloaded from CodePlex. The current version, at this time, is 2.4.10520, from May 2008. The installation is quite simple: a few clicks and it is done.

Sandcastle provides different output options; you can choose from CHM, a website, or Visual Studio-compatible help. You can use any one of these options, or any combination of them that you require.

Sandcastle is not a single application, but rather a set of console applications, config files, and XSL transformation files. A subset of the applications includes:

  • MRefBuilder
  • XslTransform
  • BuildAssembler
  • ChmBuilder

That’s just a sampling, but it gives an indication of how complex the process is. There’s a lot of reflection involved, along with converting reflection-generated XML to other formats and assembling the final outputs.

Sandcastle includes a very basic GUI. It doesn’t do much other than take some user input and delegate generation to the console tools. To automatically generate code documentation, one must use the console applications, which require multiple complex steps.

Having spent time online reading how to go through the process, I’ll put it this way: it isn’t friendly, not by a long shot. Have a look at the following links to see what I first found.

There are some batch files available to largely automate the work, but I wanted something that would integrate into my build process using NAnt. There’s some information on that as well, but it still involves multiple complex steps just to document even one assembly. I saw some solutions with custom compiled NAnt tasks, and others that involved fancy footwork directly within a NAnt build file. Some examples:

None of the options presented seemed ideal. Perhaps I’ve been spoiled by the previous tools I’ve written about, but I just like being able to have NAnt delegate tool-specific work to that tool. The proposed solutions have NAnt doing a lot of work. In addition, most of the solutions I read made certain assumptions, such as expecting Sandcastle to be installed. I like having my binary dependencies under source control alongside my code.

I then found out about another tool that got me moving again.

Sandcastle Help File Builder

Eric Woodruff wrote an article on CodeProject about an open-source GUI for Sandcastle, called Sandcastle Help File Builder (a bit long, so hereafter called SHFB). The project is available on CodePlex. The current version is 1.8.0.3, from January 2010.

SHFB is a supplement to Sandcastle; Eric created it to relieve some of Sandcastle’s annoyances – namely, the basic GUI, no automated documentation, and the lack of simple configuration. With the addition of SHFB to the toolkit, these issues are largely resolved. SHFB was intended to essentially replace NDoc, which was, at the time the CodeProject article was written, a dead project.

Before installing SHFB, ensure you have also installed the .NET Framework 3.5 SP1, and of course, Sandcastle itself. You’ll also need the .NET SDK, for the compilers and related tools. After the prerequisites are satisfied, SHFB can be installed.

The CodeProject article contains plenty of detail on SHFB, so I recommend you take a look at that for a good overview. It is dated, but the important concepts are there. The project’s site on CodePlex is kept current, and the SHFB distribution includes a help file. I’ll walk you through the SHFB GUI, pointing out specific details of interest along the way.

Touring SHFB

Figure: The interface of SHFB

SHFB provides an easy-to-use GUI, similar to that of NDoc, with a means to define a documentation project and the ability to generate documentation. SHFB is essentially a GUI wrapper – it delegates the work to Sandcastle. But the GUI is much more intuitive than Sandcastle itself.

Figure: Setting the documentation source, to tell SHFB where documentation is coming from

To be able to generate documentation, SHFB needs to know where the information is coming from. It can be pointed at a Visual Studio solution file, from which it will derive the necessary information. SHFB can also be directed to use individual project files, and even work directly off the compiled assemblies (.exe or .dll). In the latter case, Sandcastle expects a compiler-generated XML documentation file in the same location to match the assembly.
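If you go the compiled-assembly route, it can be handy to check the same-name, same-folder convention before starting a documentation run. The following is only a sketch of such a check, not something SHFB or Sandcastle provides; the DocFileCheck class and its FindUndocumented method are names I invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class DocFileCheck
{
    // Returns the assemblies in a folder that have no same-named .xml file beside them.
    public static List<string> FindUndocumented(string folder)
    {
        var missing = new List<string>();

        foreach (var pattern in new[] { "*.dll", "*.exe" })
        {
            foreach (var assembly in Directory.GetFiles(folder, pattern))
            {
                // MyLib.dll is expected to have MyLib.xml in the same folder
                var xmlPath = Path.ChangeExtension(assembly, ".xml");
                if (!File.Exists(xmlPath)) missing.Add(assembly);
            }
        }

        return missing;
    }
}
```

An empty result means every assembly in the folder has a matching XML file. Anything listed was most likely built without the XML documentation file option switched on.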

In the case of multiple documentation sources (multiple projects or assemblies) within a single SHFB project, the resulting documentation from each will be merged in the final output.

Note that if the assemblies being documented are dependent on other third-party assemblies, the additional assemblies are expected to be in the same location as the primary ones. This is just a rule of the documentation process, apparently. This rule does not apply to the framework assemblies; Sandcastle has other means for those.

I indicated earlier that Sandcastle supports various output formats. Since SHFB delegates to Sandcastle, it of course allows use of those same options. Within the SHFB GUI, there is an option labelled HelpFileFormat; you can choose just one of the options, or any combination that you need. I chose the Website option here, as that was all I needed.

Figure: SHFB provides a number of output options

You’ll also need to specify an output folder, via the OutputFolder property. This path can be absolute or relative; if relative, the path will be relative to the location of the SHFB project. The default value is ./Help/. Change it however you like; I changed mine to docs/html/. If the destination folder does not exist, Sandcastle will create it.

Note: when generating a documentation website, Sandcastle will empty the destination folder – without asking – and populate it with the output. So be careful which folder you choose to have the result appear in. This doesn’t apply to single help files, e.g. the CHM option.

There are a number of additional options of interest, listed below. There are many more available, listed on the CodeProject article, the project’s CodePlex site, or in SHFB’s included help file.

  • Visibility properties – you can specify which members, fields, or classes should appear in the documentation, based on their visibility levels.
  • Show missing tags – you can specify whether you want to be shown in the final output where you have missing tags in your XML documentation.
  • PresentationStyle – Sandcastle gives you the choice of different presentation styles for the final documentation. SHFB allows you to choose which you want, and Sandcastle will work accordingly. Sandcastle ships with three styles: vs2005, prototype, and hana.
  • NamingMethod – controls how help topic filenames are generated. The default option is Guid, but MemberName is better for accessibility, particularly in a generated website.
  • NamespaceSummaries – recall I said there is no in-code way to document namespaces. SHFB provides a feature to add a text summary for each namespace being documented, to appear in the final output.

I’d like to point out a nice usability feature. When you examine your project properties, the default values, if any, for the properties are shown in a regular weight; non-default values are shown in bold text. This is a helpful cue to options that have been customized.

You can save your documentation settings to a SHFB project file, which has a .shfbproj extension. This extension is registered to SHFB when the tool is installed, so double-clicking a project file will open it in the SHFB GUI. This is useful when you want to regenerate a project’s documentation without having to revisit all the possible settings.

That concludes the feature tour of SHFB; as mentioned before, there is much more information to be had, available on CodeProject, CodePlex, or the local help file. With the basics taken care of, I’ll get on to the most interesting part: generating documentation.

Generating Documentation

With a project file built up, containing desired options, you can have SHFB do a documentation run. The process is started via the top menu; open the Documentation menu, and click on Build Project. Alternatively, just like in Visual Studio, you can press the CTRL-SHIFT-B keys.

Figure: The documentation process is shown within SHFB

Once you start the process, a new tab will open in the GUI, showing the progress of the documentation process. It shouldn’t take very long to finish. My run-through with a single assembly clocks at a minute and a half. Below is a screenshot of what happens.

Assuming the process is successful, the progress display will print a success message at the bottom. You’ll probably want to see what the result looks like. You can reopen the Documentation menu, point to View Help File, and choose which one to view, depending on your selection of the output options. You can also press the CTRL-SHIFT-V keys.

Figure: One of Sandcastle’s output capabilities: a website

A log file – LastBuild.log – is generated during each documentation run, and is placed in the output folder. You can review it to see what happens during a generation process; it is also a good place to look if errors occur during a documentation run.

That concludes the introduction to SHFB. I’ve described some of the features, and shown how to use the tool to generate documentation. Using the GUI is fine for manually running the documentation task, but not so good for build process automation. I’ve been through this before in my previous tooling articles: I like automating my builds as much as possible. Time to look at how to do it.

Integrating into NAnt

Those of you who have read my previous articles on .NET tooling (anyone? Bueller?) will understand that I prefer to keep my build-related tools in source control, close to the code that they are intended to be used with. This maximizes the portability of a source repository, as a single checkout will provide all the tools needed – except the compiler, of course – to build the code and work with the results.

Older versions of SHFB included a console mode builder, in addition to the GUI application. More recent versions removed the separate console application. Since an SHFB project is saved in MSBuild format, it can be executed with MSBuild.

This is a bit tricky with both Sandcastle and SHFB, since their installations set environment variables and require certain files to be in just the right places. But I figured it out, and was able to get the tools into source control, check out the same repo on another computer which had never had those tools installed before, and do a build and document the result. I’ll show you how I did it.

Copy Files

One of the objectives of source control for projects is to keep your build-related dependencies close to your code. As I’ve demonstrated in the past, this is a matter of copying the tool directories to a single location within the source repo. This reduces the requirements of the build machine, since all it should need is the SDK with the compilers. A single checkout will bring the code and all the related tools with it.

Sandcastle and SHFB are not distributed as standalone applications; however, they do work standalone, as I’ve determined. Their installers add to the Program Files folder: Sandcastle is found in, unsurprisingly, Sandcastle, and SHFB is in, surprisingly, EWSoftware\Sandcastle Help File Builder. You can just copy those folders to your repo.

Having done so, my repo structure looks like the following (relevant bits only):

  • (repo_root)/
    • proj.shfbproj
    • tools/
      • sandcastle/
      • sandcastlehelpfilebuilder/

Various other tools (NAnt, NUnit, etc) have their own folders within tools. My src folder is within the root, alongside tools.

That’s one step done, on to the next.

Speed Bump

The Sandcastle distribution is big. The culprit is the Data\Reflection subfolder, which contains reflection data for the .NET framework. This is a local cache for info that is used when generating documentation; even when it is your code being documented, Sandcastle needs this framework data. On my computer, the files totaled 246 MB, which is a bit much for a single tool. All other tools in my repos are each well under 20 MB.

It turns out that these files, despite being needed by Sandcastle, need not be checked into source control. Why? Because Sandcastle has a means to regenerate them on demand. Hence, checking the files into the repo is a waste since they can be recreated when needed. You can safely delete the Data folder and its contents, and follow these instructions.

The Sandcastle folder has a subfolder, Examples, containing examples of use of the tool. The Examples/sandcastle subfolder has a particular file of interest, fxReflection.proj – copy this file to the Sandcastle root folder. Incidentally, you don’t really need the Examples folder either; it can be safely deleted as well.

fxReflection.proj is an MSBuild file; when executed by MSBuild, it will cause Sandcastle to regenerate the reflection cache. You can manually run the file, but I integrated it into my build process. That can be achieved with the following snippet within a documentation task in the NAnt build file.

<loadtasks assembly="${dir.tools}nantcontrib\NAnt.Contrib.Tasks.dll" />
<if test="${not directory::exists(dir.tools + 'sandcastle\Data')}">
    <setenv name="DXROOT" value="${dir.tools}sandcastle" verbose="true" />
    <msbuild project="${dir.tools}sandcastle\fxReflection.proj" verbosity="Normal" failonerror="true" verbose="false"></msbuild>
    <setenv name="DXROOT" value="" verbose="true" />
</if>

The first line brings in the MSBuild task from NAntContrib. This is followed with a conditional block; if the cache is already in place, there is no need to regenerate it. The first setenv line sets a temporary DXROOT environment variable that Sandcastle needs to run, while the second one removes the variable. The line in between has MSBuild execute the fxReflection.proj build file.

I will note that this task takes a while to execute, as Sandcastle is essentially reflecting over the whole .NET framework. Hence the reason for having the conditional block, since regenerating the framework reflection cache is not a quick process, and it doesn’t need to be regenerated every time you create your documentation.

With the unnecessary folders removed, Sandcastle is much lighter, at about 4MB. And the reflection cache can be regenerated with a single command.

At this point, the reflection cache hurdle has been cleared. Now it’s time to get to the good stuff: generating documentation with SHFB.

Generation Automation

As I mentioned earlier, Sandcastle sets the DXROOT environment variable when installed. Likewise, SHFB sets the SHFBROOT variable, of a similar nature to DXROOT. To do its work, SHFB needs to know where Sandcastle is located, and so also depends on the DXROOT variable.

There is an easy workaround for SHFB’s reliance on DXROOT. An SHFB project file has a <SandcastlePath> element; this can be edited to point to the repo copy of Sandcastle, which I did using a relative path:

<SandcastlePath>tools\sandcastle</SandcastlePath>

If this property is set, SHFB will use the value and won’t even bother with DXROOT. Be aware that the path is relative to the location of the SHFB project file, and not to SHFB itself.

On a related note, if you run the SHFB GUI app from your repo, and it is not otherwise installed, you’ll get a message about SHFB setting a temporary variable so it knows where it is working from. This is normal.

On to automating SHFB via NAnt. Following is part of the documentation task I added to my NAnt build file.

<loadtasks assembly="${dir.tools}nantcontrib\NAnt.Contrib.Tasks.dll" />

<property name="file.shfb.project" value="${dir.base}proj.shfbproj" />

<setenv name="SHFBROOT" value="${dir.tools}sandcastlehelpfilebuilder" verbose="true" />
<msbuild project="${file.shfb.project}" verbosity="Normal" failonerror="true" verbose="false"></msbuild>
<setenv name="SHFBROOT" value="" verbose="true" />

The file.shfb.project property is used to hold the location of the SHFB project file, and provides the value to MSBuild when needed.

The first setenv line is used to set the environment variable that SHFB needs; the second one clears the variable once the documentation is done. Something about leaving the environment at least as clean as you found it. The msbuild invocation (the reason for importing NAntContrib at the start) uses the previously defined file.shfb.project property, and sends the work to SHFB; with the SHFBROOT environment variable in place, SHFB happily gets to work. Recall that the project file contains the path to Sandcastle, so SHFB can delegate the documentation work.

Following is the complete documentation task, which I named docu. Note the inclusion of the framework cache regeneration snippet I showed previously.

<target name="docu" depends="compile">
    <loadtasks assembly="${dir.tools}nantcontrib\NAnt.Contrib.Tasks.dll" />

    <!-- if sandcastle's local reflection cache does not exist, will need to generate it first -->
    <if test="${not directory::exists(dir.tools + 'sandcastle\Data')}">
        <!-- temporarily set the DXROOT env variable, needed by Sandcastle -->
        <setenv name="DXROOT" value="${dir.tools}sandcastle" verbose="true" />
        <!-- Get the reflection cache generated -->
        <msbuild project="${dir.tools}sandcastle\fxReflection.proj" verbosity="Normal" failonerror="true" verbose="false"></msbuild>
        <!-- remove the temp. env variable -->
        <setenv name="DXROOT" value="" verbose="true" />
    </if>

    <!-- project file contains settings for doc generation -->
    <property name="file.shfb.project" value="${dir.base}proj.shfbproj" />

    <!-- help file builder needs the SHFBROOT environment variable giving path to binaries -->
    <setenv name="SHFBROOT" value="${dir.tools}sandcastlehelpfilebuilder" verbose="true" />
    <!-- generate the documentation -->
    <msbuild project="${file.shfb.project}" verbosity="Normal" failonerror="true" verbose="false"></msbuild>
    <!-- don't need the environment variable anymore, so clear it-->
    <setenv name="SHFBROOT" value="" verbose="true" />
</target>

With a single command, I can run the task:

build docu

Which (normally – see below) generates the following output:

Figure: The command-line output Sandcastle creates when running via NAnt

The normal case is that the framework cache is already in place; if it is not, that regeneration process will happen first, then the documentation will be created. In my experience, the former case takes a bit over one minute. In the latter case, the task takes around 10 minutes to finish. That lag is why I used the conditional block.

Generally speaking, the cache generation should only need to happen once per checkout. A freshly checked out repo will not have the cache, which will need to be generated. Once this is done, it generally does not need to be done again for the same checkout, unless the cache folder gets deleted.

Here is a shot of the final output, a documentation website:

Figure: A generated website

Additional Observations

Before I finish up, I’ve added some supplemental observations gathered from my time using Sandcastle and SHFB.

The documentation process is CPU-heavy. Reflection is heavily used to gather all documentation details from assemblies, as is XSL transformation to convert the generated XML into final output. The process also creates links to base framework types on MSDN, and verifies them, requiring an active internet connection. The point being, don’t run the documentation task if you are in a hurry!

I’ve said that Sandcastle and SHFB do not need to be installed to be usable during the build process. It can be useful to have SHFB installed on your computer if you will be using it frequently. I mentioned earlier that you can have your SHFB project file specify the repo copy of Sandcastle, so that is what SHFB will use when it loads a project. Plus, SHFB registers the .shfbproj extension, so you can double-click a SHFB project file to open it in the GUI.

SHFB is built on .NET 3.5, and must use that version of MSBuild to run. However, it can be used to generate documentation for projects going back to .NET 1.1, and already supports .NET 4.0. Evidently the version matters to Sandcastle, not so much to SHFB. 4.0 support requires pointing SHFB to your compiled assemblies, rather than the projects or solution they come from.

You can use the duo to document F# projects, with a caveat: SHFB will not recognize the .NET 4.0 version number in an F# project file, so you must point it at the compiled assemblies, rather than the projects or solution. Then it will work.

I’ve written this process based on generating a website to contain the resulting documentation. I also indicated that Sandcastle and SHFB support additional output formats, including CHM. For this to work, Sandcastle requires HTML Help Workshop to be installed. Following my pattern of having build-related tools in my repo, I attempted to make this work, but could not; I get an error about missing files, which are apparently only available if HHW is installed.

Conclusion

Hopefully I’ve made clear the usefulness of documenting your code, and how such work can reap benefits in the form of automatically-generated API documentation. After a little preparation, that is.

Documenting your code is just half the battle. You also need to set up and configure a tool to examine your code, or compiled result, and generate the documentation. I briefly discussed NDoc, and dismissed it as an option that is outdated and is not being maintained.

Following from NDoc, I presented a modern alternative: Sandcastle. Explanation was provided as to what the tool can do, along with an overview of how it works. And it was pointed out that it is a complicated tool to use.

In response to issues that Sandcastle presents, I introduced a useful front-end called Sandcastle Help File Builder. Links were provided to fuller sources of information, but there was also a brief tour through the SHFB GUI and the capabilities the tool provides. This stage ended with using SHFB, which in turn used Sandcastle, to generate documentation from a compiled project.

I then went into how to integrate the documentation process into a build process using NAnt. It was a matter of setting temporary variables the programs needed, and running the process via MSBuild. On the surface of it, not that complicated. But it took some doing to get figured out!