AscToHTM Documentation for AscToHTM conversion utility
This documentation can be downloaded in .zip format.

3 How AscToHTM works

3.1 The big assumption

AscToHTM makes one big assumption :-

Each text file has been laid out in a consistent manner by its author in a way that makes it easy for a human reader to understand.

Given this, AscToHTM tries to read the text file as a human being would and then mark it up in HTML accordingly. It does so by making two passes through the document: an analysis pass (see 3.2) and an output pass (see 3.3).


3.2 The analysis pass

During the analysis pass AscToHTM gathers the statistics that it deems relevant. For example, it records the distribution of line indentations and line lengths, together with the number and types of bullets, section headings and other structural features.
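
As an illustration only, the kind of statistics described above could be gathered along the following lines. This is a minimal sketch, not AscToHTM's actual code: the function name gather_statistics, the returned keys and the bullet test are all assumptions made for the example.

```python
from collections import Counter


def gather_statistics(lines):
    """Collect simple layout statistics from a list of text lines."""
    indents = Counter()   # indentation width -> number of lines
    lengths = Counter()   # line length -> number of lines
    bullets = Counter()   # bullet character -> number of lines
    for line in lines:
        text = line.rstrip()
        if not text:
            continue  # blank lines carry no indentation information
        stripped = text.lstrip(" ")
        indents[len(text) - len(stripped)] += 1
        lengths[len(text)] += 1
        # Assumed bullet test: a marker character followed by a space.
        if len(stripped) > 1 and stripped[0] in "-*+" and stripped[1] == " ":
            bullets[stripped[0]] += 1
    return {"indents": indents, "lengths": lengths, "bullets": bullets}
```

Calling gather_statistics on the lines of a file gives three frequency tables that a later step can inspect, for instance to find the most common indentation.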

Once this has been done, the program uses this data to determine how the author has structured the document. For example, are the section headings underlined, capitalised or numbered? If numbered, what style of numbering is used, and at what level of indentation is the heading placed?
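
A rough sketch of how such a decision might be made is shown here. It is not the program's real algorithm: the regular expressions, the 60-character limit and the function name guess_heading_style are assumptions chosen for illustration, and a real detector would also have to tell numbered headings apart from numbered list items.

```python
import re
from collections import Counter

# Assumed patterns: "3.2 Title" style numbers, and a run of underline characters.
NUMBERED = re.compile(r"^\s*\d+(\.\d+)*\.?\s+\S")
UNDERLINE = re.compile(r"^\s*([=\-~*])\1{2,}\s*$")


def guess_heading_style(lines):
    """Return the most frequently seen heading style, or None if none is seen."""
    votes = Counter()
    for i, line in enumerate(lines):
        text = line.strip()
        if not text:
            continue
        following = lines[i + 1] if i + 1 < len(lines) else ""
        if UNDERLINE.match(following) and not UNDERLINE.match(line):
            votes["underlined"] += 1
        elif NUMBERED.match(line) and len(text) < 60:
            votes["numbered"] += 1
        elif text.isupper() and len(text) < 60:
            votes["capitalised"] += 1
    return votes.most_common(1)[0][0] if votes else None
```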

This information is published as the "calculated document policy" for this document.

The user may, if they wish, supply their own document policy to override all or part of the calculated document policy.
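
The override behaviour can be pictured as a simple merge in which user-supplied values win. The sketch below assumes policies are held as name/value pairs; the policy names used in the usage note are invented for the example and are not real AscToHTM policy names.

```python
def effective_policy(calculated, user=None):
    """Return the calculated policy with any user-supplied values applied on top."""
    merged = dict(calculated)   # copy, so the calculated policy is left untouched
    merged.update(user or {})   # user values override all or part of it
    return merged
```

For example, merging a calculated policy of {"heading_style": "numbered", "indent_step": 2} with a user policy of {"indent_step": 4} keeps the calculated heading style and takes the indent step from the user.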


3.3 The output pass

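During the output pass AscToHTM generates the HTML, handles the contents list and, if requested, splits the output into several pages, as described in the following sections.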

3.3.1 Generating HTML

The HTML generated depends only on the original document, the calculated document policy, and any user policies supplied.

Section 5 describes this process in more detail.


3.3.2 Generating a contents list

AscToHTM can detect the presence of a contents list in the original document.

Alternatively, you can choose (see 6.3.3 and 6.3.4.2) to have AscToHTM generate a contents list for you, in which case any original list is omitted from the output HTML document.

Regardless of whether the original or generated contents list is used, AscToHTM will turn the contents list into hyperlinks that will take you to the correct HTML file and location.
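
To make the idea concrete, the following sketch turns contents entries into links that point at a file and a location within it. The anchor naming scheme, the function names and the shape of the inputs are assumptions made for the example, not a description of the markup AscToHTM actually emits.

```python
import html


def anchor_for(number):
    """Assumed anchor naming scheme: section "3.2" becomes "section_3_2"."""
    return "section_" + number.replace(".", "_")


def link_contents(entries, file_for):
    """Build one link per contents entry.

    entries  -- list of (section number, title) pairs
    file_for -- mapping from section number to the HTML file that holds it
    """
    links = []
    for number, title in entries:
        href = f"{file_for[number]}#{anchor_for(number)}"
        label = html.escape(f"{number} {title}")
        links.append(f'<a href="{html.escape(href)}">{label}</a><br>')
    return links
```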

The placement of the contents list varies as follows :-

3.3.3 Splitting the document into many HTML pages

By default AscToHTM creates a single .HTML file. However, through file organisation document policies (see 6.3.3) it is possible to:

  1. Split the document into a number of smaller .HTML files (see 6.3.3.6).

  2. Insert standard JavaScript into the <HEAD> ... </HEAD> section of each page (see 6.3.1.4).

  3. Add an HTML "header" to the top of each generated file (see 6.3.1.5).

  4. Add a navigation bar at the foot of each page with links to the Next/Previous .HTML page and the contents list (see 6.3.3.8).

  5. Add an HTML "footer" to the end of each generated page (see 6.3.1.6).
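
The options listed above, apart from the JavaScript insertion, can be sketched as follows. This is only an illustration under stated assumptions: the file naming scheme, the contents file name "contents.html" and both function names are invented, and the sections are taken to be ready-made HTML fragments.

```python
def nav_bar(index, names, contents="contents.html"):
    """Build a Prev | Next | Contents bar for the page at position index."""
    links = []
    if index > 0:
        links.append(f'<a href="{names[index - 1]}">Prev</a>')
    if index < len(names) - 1:
        links.append(f'<a href="{names[index + 1]}">Next</a>')
    links.append(f'<a href="{contents}">Contents</a>')
    return " | ".join(links)


def build_pages(sections, stem="doc", header="", footer=""):
    """Return a mapping of file name to page text, one page per section."""
    names = [f"{stem}_{i + 1}.html" for i in range(len(sections))]
    pages = {}
    for i, body in enumerate(sections):
        # Order on each page: header, section body, navigation bar, footer.
        parts = [header, body, nav_bar(i, names), footer]
        pages[names[i]] = "\n".join(part for part in parts if part)
    return pages
```

The first page gets no Prev link and the last page gets no Next link, so every link in the bar leads to a page that exists.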





© 1997 John A. Fotheringham