Developer Docs

This chapter provides an overview of how webhelp is implemented.

The table of contents and search panes are implemented as divs and rendered as if they were the left pane in a frameset. As a result, the page must save the state of the table of contents and the search in cookies when you navigate away from a page. When you load a new page, the page reads these cookies and restores the state of the table of contents tree and search. The result is that the help system behaves exactly as if it were a frameset.
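
The save-and-restore cycle can be sketched as follows. This is a minimal illustration under stated assumptions, not the WebHelp source: the cookie names webhelp-toc-open and webhelp-search-query and both helper functions are hypothetical. The helpers work on plain strings so that the sketch also runs outside a browser. In a page, the string returned by serializeCookie would be assigned to document.cookie, and parseCookies would be given the value of document.cookie.

```javascript
// Hypothetical helpers; the cookie names used below are assumptions.

// Build a "name=value" cookie string. The value is URI-encoded so that
// characters such as ";" and "=" survive the round trip.
function serializeCookie(name, value) {
  return encodeURIComponent(name) + "=" + encodeURIComponent(value) + "; path=/";
}

// Parse a cookie string such as document.cookie into a plain object.
function parseCookies(cookieString) {
  var result = {};
  cookieString.split(";").forEach(function (pair) {
    var index = pair.indexOf("=");
    if (index < 0) {
      return; // skip empty or malformed fragments
    }
    var name = decodeURIComponent(pair.slice(0, index).trim());
    var value = decodeURIComponent(pair.slice(index + 1).trim());
    result[name] = value;
  });
  return result;
}

// Before leaving a page: record which TOC nodes are open and the last query.
var saved = [
  serializeCookie("webhelp-toc-open", JSON.stringify([0, 3, 7])),
  serializeCookie("webhelp-search-query", "stemming; indexer")
];

// On the next page the browser exposes cookies as "a=1; b=2" without
// attributes such as path, so that string is simulated here.
var browserView = saved.map(function (c) { return c.split("; ")[0]; }).join("; ");
var restored = parseCookies(browserView);
var openNodes = JSON.parse(restored["webhelp-toc-open"]);
var lastQuery = restored["webhelp-search-query"];
```

After the round trip, openNodes and lastQuery hold the values that were saved, which is what lets a freshly loaded page look as if the navigation pane had never been reloaded.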

Design

An overview of webhelp page structure.

The DocBook WebHelp page structure is built entirely on a CSS-based design and does not use framesets. The overall page structure can be divided into three main sections:

  • Header: A separate div that includes the company logo, the navigation buttons (prev, next, etc.), the page title, and the heading of the parent topic.

  • Content: This includes the content of the documentation. This part is produced by the DocBook XSL chunking customization. Some further CSS styling is applied from positioning.css.

  • Left Navigation: This includes the table of contents and the search tab. It is customized using jQuery UI styling.

    • Tabbed Navigation: The navigation pane is organized into two tabs: the Contents tab and the Search tab. The tabbed output is achieved using the jQuery Tabs plugin.

    • Table of Contents (TOC) tree: When the chunked HTML is built from the DocBook file, the table of contents is generated as an unordered list (a list made from <ul> and <li> tags). When the page loads in the browser, styling is applied to this list to turn it into a collapsible tree. The styling for the TOC tree is done by a jQuery plugin called TreeView. The tree can be generated with the following JavaScript code:

      // Generate the tree
      $("#tree").treeview({
          collapsed: true,
          animated: "medium",
          control: "#sidetreecontrol",
          persist: "cookie"
      });
      

    • Search Tab: This includes the search feature.

Search

An overview of the design of the search mechanism.

Searching is implemented entirely on the client side; no server is involved. Search queries are processed by JavaScript inside the browser, which finds the matching results by comparing the query with a simplified 'index' that also resides in JavaScript. The search mechanism has two main parts:

  • Indexing: First, the content in the docs folder must be traversed and its words indexed. This is done by webhelpindexer.jar in the xsl/extensions/ folder. You can invoke it with the ant index command from the root of the webhelp directory. The source of webhelpindexer has been moved to its own location at trunk/xsl-webhelpindexer/. Check out the DocBook trunk svn directory to get this source. Then make your changes and recompile by running the ant command. My assumption is that the project can be opened in the NetBeans IDE with one click. If you are using IntelliJ IDEA, you can create a new project from existing sources. The indexer supports word scoring, stemming, and the languages English, German, and French. For CJK (Chinese, Japanese, Korean) languages, it uses bi-gram tokenizing to break up the text, since CJK languages do not put spaces between words.

    When ant index is run, it generates five output files:

    • htmlFileList.js - This contains an array named fl, which stores details of all the files indexed by the indexer. In addition, the doStem setting in this file defines whether stemming should be used. It defaults to false.

    • htmlFileInfoList.js - This includes some metadata about the indexed files in an array named fil: the file name, the (HTML) title, and a summary of the content. An entry looks like this: fil["4"]= "ch03.html@@@Developer Docs@@@This chapter provides an overview of how webhelp is implemented.";

    • index-*.js (three index files) - These three files store the actual index of the content. The index is added to an array named w.

  • Querying: Query processing happens entirely on the client side. The following JavaScript files handle it:

    • nwSearchFnt.js - This handles the user query and returns the search results. It tokenizes the query words, drops unnecessary punctuation and common words, applies stemming if the DocBook language supports it, and so on.

    • {$indexer-language-code}_stemmer.js - This includes the stemming library. nwSearchFnt.js calls the stemmer method in this file for stemming, for example: var stem = stemmer(foobar);
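
The query side can be sketched as follows. This is an illustration under stated assumptions, not the code in nwSearchFnt.js: the shape of the w index (a word mapped to a list of file ids), the stop-word list, the sample data, and the function names parseFileInfo and search are all hypothetical. Only the fil entry format (file@@@title@@@summary) and the stemmer(word) call come from the description above, and the stemmer here is an identity stand-in.

```javascript
// Sample data. The fil entry format follows the description above; the
// structure of w (word -> array of file ids) is an ASSUMPTION for this sketch.
var fil = {
  "4": "ch03.html@@@Developer Docs@@@This chapter provides an overview of how webhelp is implemented."
};
var w = {
  "webhelp": ["4"],
  "overview": ["4"]
};

// Hypothetical stop-word list; the real list may differ.
var stopWords = ["a", "an", "and", "how", "is", "of", "the"];

// Identity stand-in for the per-language stemmer(word) function.
function stemmer(word) {
  return word;
}

// Split a fil entry into its three "@@@"-separated fields.
function parseFileInfo(entry) {
  var parts = entry.split("@@@");
  return { file: parts[0], title: parts[1], summary: parts[2] };
}

// Tokenize the query, drop punctuation and stop words, optionally stem,
// then collect every file that contains at least one remaining word.
function search(query, doStem) {
  var words = query
    .toLowerCase()
    .replace(/[.,;:!?'"()\[\]]/g, " ")
    .split(/\s+/)
    .filter(function (word) {
      return word.length > 0 && stopWords.indexOf(word) < 0;
    });

  var seen = {};
  var hits = [];
  words.forEach(function (word) {
    var key = doStem ? stemmer(word) : word;
    // hasOwnProperty guards against inherited names such as "constructor".
    var fileIds = Object.prototype.hasOwnProperty.call(w, key) ? w[key] : [];
    fileIds.forEach(function (id) {
      if (!seen[id]) {
        seen[id] = true;
        hits.push(parseFileInfo(fil[id]));
      }
    });
  });
  return hits;
}

var results = search("How is WebHelp implemented?", false);
```

Here results holds one entry for ch03.html, because "webhelp" is the only query word that is present in the sample index once "how" and "is" are dropped as common words. The sketch does no scoring or ranking.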

New Stemmers

Adding new stemmers is straightforward.

Currently, only English, French, and German stemmers are integrated into WebHelp. However, the code is extensible, so you can add a new stemmer in a few steps.

What you need:

  • You'll need two versions of the stemmer: one written in JavaScript and one in Java. Fortunately, Snowball contains Java stemmers for a number of popular languages, and these are already included with the package. You can see the full list in Adding support for other (non-CJKV) languages. If your language is listed there, you then have to find a JavaScript version of the stemmer. New stemmers are generally added at the Snowball Stemmers in other languages location. If a JavaScript stemmer for your language is available, download it. Otherwise, you can write a new stemmer in JavaScript from the Snowball algorithm fairly easily. The algorithms are at Snowball.

  • Then, name the JS stemmer exactly like this: {$language-code}_stemmer.js. For example, for Italian (it), name it it_stemmer.js. Then copy it to the docbook-webhelp/template/search/stemmers/ folder. (I assume docbook-webhelp is the root folder for webhelp.)

    Note

    Make sure you change the webhelp.indexer.language property in build.properties to your language.

  • Now two easy changes are needed for the indexer.

    • Open docbook-webhelp/indexer/src/com/nexwave/nquindexer/IndexerTask.java in a text editor and add your language code to the supportedLanguages String Array.

      Example 2. Add new language to supportedLanguages array

      Change the array from:

      private String[] supportedLanguages= {"en", "de", "fr", "cn", "ja", "ko"}; 
          //currently extended support available for
          // English, German, French and CJK (Chinese, Japanese, Korean) languages only.
      

      to:

      private String[] supportedLanguages= {"en", "de", "fr", "cn", "ja", "ko", "it"}; 
        //currently extended support available for
        // English, German, French, CJK (Chinese, Japanese, Korean), and Italian languages only.
                          

    • Now, open docbook-webhelp/indexer/src/com/nexwave/nquindexer/SaxHTMLIndex.java and find the code where it initializes the stemmer (search for SnowballStemmer stemmer;). Add a branch there that initializes the stemmer object for your language, as in the example. The available class names are in docbook-webhelp/indexer/src/com/nexwave/stemmer/snowball/ext/.

      Example 3. Initialize correct stemmer based on the webhelp.indexer.language specified

            SnowballStemmer stemmer;
            if (indexerLanguage.equalsIgnoreCase("en")) {
                stemmer = new EnglishStemmer();
            } else if (indexerLanguage.equalsIgnoreCase("de")) {
                stemmer = new GermanStemmer();
            } else if (indexerLanguage.equalsIgnoreCase("fr")) {
                stemmer = new FrenchStemmer();
            } else if (indexerLanguage.equalsIgnoreCase("it")) { // If language code is "it" (Italian)
                stemmer = new italianStemmer(); // Initialize the stemmer to an italianStemmer object.
            } else {
                stemmer = null;
            }
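
On the JavaScript side, the new stemmer file has to provide the stemmer(word) function that nwSearchFnt.js calls. The following is only a placeholder that shows that interface. The suffix list is a toy assumption and is not the Snowball Italian algorithm; a real it_stemmer.js should implement the published algorithm instead.

```javascript
// Placeholder for it_stemmer.js. NOT the Snowball algorithm: the suffix list
// below is a toy assumption used only to show the expected interface.
var toySuffixes = ["mente", "zione", "are", "ere", "ire"];

// Global function with the stemmer(word) signature that the search code calls.
function stemmer(word) {
  var lower = word.toLowerCase();
  for (var i = 0; i < toySuffixes.length; i++) {
    var suffix = toySuffixes[i];
    // Keep at least three characters of stem so short words are untouched.
    if (lower.length - suffix.length >= 3 &&
        lower.slice(-suffix.length) === suffix) {
      return lower.slice(0, lower.length - suffix.length);
    }
  }
  return lower;
}

var stem = stemmer("Parlare");
```

Whatever the JavaScript stemmer does, it should produce the same stems as the Java stemmer used by the indexer. Otherwise the stemmed query words will not match the stemmed words in the index.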
      

That's all. Now run ant build-indexer to compile and build the Java code. Then run ant webhelp to generate the output from your DocBook file. For any questions, contact us or email the DocBook mailing list.