AscToHTM Documentation for AscToHTM conversion utility

7 Using the preprocessor

The preprocessor was introduced in version V1.05. It is intended to give users more flexibility in the HTML they generate, and as such moves AscToHTM towards being an HTML authoring tool, as opposed to a simple text conversion or migration tool. This wasn't AscToHTM's original purpose, and so, as yet, its functionality in this area is limited.

Likely future additions include the insertion of embedded HTML in the output code. This would allow pictures to be embedded.

The preprocessor looks for lines that begin with a special character sequence. Presently this is "$_$_", but this will become configurable in later versions.

Preprocessor lines are not normally output to the HTML generated. Instead they are used to modify AscToHTM's behaviour in a number of ways.
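
As a rough illustration of the line test described above, the following Python sketch separates preprocessor lines from ordinary text. The helper name split_directive is hypothetical, and ignoring leading white space is an assumption based on the indented examples in this chapter; neither is taken from AscToHTM itself.

```python
PREFIX = "$_$_"  # the marker described above; hard-coded in this sketch


def split_directive(line):
    """Return (command, argument) for a preprocessor line, else None.

    Hypothetical helper: leading white space is ignored, which is an
    assumption rather than documented AscToHTM behaviour.
    """
    stripped = line.strip()
    if not stripped.startswith(PREFIX):
        return None
    # Split the command name from the rest of the line, if there is any.
    parts = stripped[len(PREFIX):].split(None, 1)
    command = parts[0] if parts else ""
    argument = parts[1] if len(parts) > 1 else ""
    return command, argument


print(split_directive("$_$_SECTION Private"))
print(split_directive("Ordinary text"))
```

Ordinary text returns None here, so a caller can pass such lines straight on to the normal conversion.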

7.1 The SECTION command

This directive is used to divide the document up into named section types. Section type names can be repeated through the document, and by default text is assumed to belong to a section called "all", indicating that this text is always copied to the output file.

Section type names must contain no white space.

This has no effect unless the user supplies a policy file indicating that they wish to select only certain section types for output.

For example, if the text document looks like this

                Some text that'll always get copied, because it is in an
                "all" section type by default.

        $_$_SECTION Private

                Some text that will be copied either when the preprocessor
                is switched off, or when the user's policy file indicates
                that "private" section types are to be included.

        $_$_SECTION Other

                Likewise, this is an "other" section type.

        $_$_SECTION Private

                And here's some more "private" text.

        $_$_SECTION all

                Some text that will always get copied because it is explicitly
                in an "all" section type.

If the user then supplies a document policy file which includes the lines (see 6.3.5)

        [Preprocessor]

        Use Preprocessor           :  Yes

then the two section types marked "private" won't be copied into the converted file unless the line

        Include document section   :  Private

is added to the policy file. The same applies to the "other" section.

Note_1:
Strictly speaking the "use preprocessor" line above isn't needed as this is set to "yes" by default. This means that any $_$_SECTION lines will cause text to be omitted unless you supply an appropriate policy file.

Note_2:
Be aware that any sections omitted are also omitted from the analysis pass. This may have unexpected results as AscToHTM responds only to the input text that is to be included in the output.
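
The selection rule described in this section can be sketched in Python as follows. This is only a model of the behaviour with the preprocessor switched on: the function name select_sections is hypothetical, and comparing section names without regard to case is an assumption based on "Private" matching "private" in the example above.

```python
def select_sections(lines, included=()):
    """Return the text lines that would be copied to the output.

    Sketch only. `included` plays the role of the "Include document
    section" lines in a policy file.
    """
    wanted = {name.lower() for name in included} | {"all"}
    current = "all"  # text belongs to "all" until a SECTION line says otherwise
    kept = []
    for line in lines:
        words = line.split()
        if words and words[0] == "$_$_SECTION":
            if len(words) > 1:
                current = words[1].lower()
            continue  # the directive line itself is not output
        if current in wanted:
            kept.append(line)
    return kept


source = [
    "Always copied.",
    "$_$_SECTION Private",
    "Private text.",
    "$_$_SECTION Other",
    "Other text.",
    "$_$_SECTION all",
    "Copied again.",
]
print(select_sections(source))
print(select_sections(source, included=["Private"]))
```

With no sections included, only the two "all" lines survive; adding "Private" brings back the private text but still leaves out the "other" section.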


7.2 The PRE (pre-formatted text) commands

The BEGIN_PRE ... END_PRE directives are used to bracket user-formatted text. Such text is placed in <PRE> ... </PRE> markup, and minimal conversion is applied to it. This can be useful to preserve the layout of a text table.

Note:
AscToHTM does attempt to spot such user-formatted text automatically, but this is a difficult area and prone to error. Hence the use of these directives can reduce the error rate on such occasions.
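
A minimal Python sketch of the bracketing described above is shown here. It assumes that the "minimal conversion" amounts to escaping the characters &, < and >, which this chapter does not actually state; the function name convert_pre_blocks is likewise hypothetical.

```python
import html


def convert_pre_blocks(lines):
    """Wrap text between BEGIN_PRE and END_PRE lines in <PRE> markup.

    Sketch only: inside the block each line keeps its layout and is
    merely escaped. Lines outside the block are passed through, where
    the normal conversion would otherwise apply.
    """
    out = []
    in_pre = False
    for line in lines:
        marker = line.strip()
        if marker == "$_$_BEGIN_PRE":
            out.append("<PRE>")
            in_pre = True
        elif marker == "$_$_END_PRE":
            out.append("</PRE>")
            in_pre = False
        elif in_pre:
            out.append(html.escape(line, quote=False))
        else:
            out.append(line)
    return out


table = ["$_$_BEGIN_PRE", "Item    Cost", "Pens    < 1", "$_$_END_PRE"]
print("\n".join(convert_pre_blocks(table)))
```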


7.3 The CONTENTS commands

The BEGIN_CONTENTS ... END_CONTENTS directives are used to bracket a contents list in the source document. AscToHTM will attempt to automatically detect the presence and location of any contents list in the document, but this detection is not always reliable.

Use this markup only when the document contains a contents list that AscToHTM fails to detect correctly.


7.4 The HTML commands

      [Image: AscToHTM]

The BEGIN_HTML ... END_HTML directives are used to bracket actual HTML in the source document. The bracketed HTML will be transcribed to the output file unconverted.

This device will allow you to embed images, tables and other HTML constructs not normally generated by AscToHTM.

This is how the image to the right has been added.
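
The difference between bracketed HTML and ordinary text can be sketched in Python as below. The function name convert_html_blocks and the file name logo.gif are hypothetical, and simple escaping stands in for AscToHTM's normal conversion, which does far more than this.

```python
import html


def convert_html_blocks(lines):
    """Copy bracketed HTML through untouched; convert everything else.

    Sketch only: html.escape is a stand-in for the real text conversion.
    """
    out = []
    raw = False
    for line in lines:
        marker = line.strip()
        if marker == "$_$_BEGIN_HTML":
            raw = True
        elif marker == "$_$_END_HTML":
            raw = False
        elif raw:
            out.append(line)  # transcribed unconverted
        else:
            out.append(html.escape(line, quote=False))
    return out


source = ["1 < 2", "$_$_BEGIN_HTML", '<IMG SRC="logo.gif">', "$_$_END_HTML"]
print("\n".join(convert_html_blocks(source)))
```

The ordinary line has its "<" escaped, while the IMG tag inside the bracketed block reaches the output exactly as written.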


7.5 The TITLE command

This directive allows you to specify the <TITLE>...</TITLE> to be inserted into the <HEAD> section of the output page. This title will appear in the browser's frame title whenever the page is viewed, and will be the text shown in your browser's history.

The presence of a TITLE command overrides any title specified in a policy file (see 6.3.1.1).


7.6 The INCLUDE command

This directive allows you to specify the name of a source file to be included at this point. This is useful if you wish some standard text inserted into many related documents, or into the same documents at many locations.

The included file will be treated as though it were part of the original file during both the analysis and output passes.

The include will fail if the file cannot be found, and a test for recursive include files will be made.
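
One way to picture the include step and its recursion test is the Python sketch below. The function name expand_includes, the use of a dictionary in place of files on disk, and the choice to raise an error on failure are all assumptions made for illustration; this chapter does not say how AscToHTM reports either problem.

```python
def expand_includes(name, files, active=()):
    """Return the lines of `name` with every INCLUDE line expanded.

    `files` maps file names to lists of lines, standing in for the disk.
    `active` is the chain of files currently being expanded, which is
    what makes the recursion test possible.
    """
    if name not in files:
        raise FileNotFoundError(name)
    if name in active:
        chain = " -> ".join(active + (name,))
        raise ValueError("recursive include: " + chain)
    result = []
    for line in files[name]:
        words = line.split(None, 1)
        if len(words) == 2 and words[0] == "$_$_INCLUDE":
            # Splice the included file in at this point.
            result.extend(expand_includes(words[1].strip(), files, active + (name,)))
        else:
            result.append(line)
    return result


files = {
    "main.txt": ["Intro", "$_$_INCLUDE footer.txt"],
    "footer.txt": ["Standard footer"],
}
print(expand_includes("main.txt", files))
```

Because the included lines are spliced in before anything else happens, later passes see one continuous document, in keeping with the description above.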


7.7 The STYLE_SHEET command

new in V2.1

This directive allows you to specify the URL of a style sheet file, usually with a .css extension. Style sheet files are a new HTML feature that allow you to specify fonts and colours to be applied to your document.

The resulting HTML is inserted into the <HEAD> section of the output page(s) as follows :-

<LINK REL="STYLESHEET" HREF="URL" TYPE="text/css">

The presence of a STYLE_SHEET pre-processor command overrides any style sheet specified in a policy file (see 6.3.6.1).


7.8 The KEYWORDS command

new in V2.1

This directive allows you to specify keywords that are added to a META tag inserted into the <HEAD> section of the output page(s) as follows :-

<META NAME="keywords" VALUE="your list of keywords">

This tag is often used by search engines when indexing your HTML page. You should add here any relevant keywords possibly not contained in the text itself.

The presence of a KEYWORDS pre-processor command overrides any keywords specified in a policy file (see 6.3.1.2).


7.9 The DESCRIPTION command

new in V2.1

This directive allows you to specify a description of your document that is added to a META tag inserted into the <HEAD> section of the output page(s) as follows :-

<META NAME="description" VALUE="your description">

This tag is often used by search engines (e.g. AltaVista) as a brief description of the contents of your page. If omitted the first few lines may be shown instead, which is often less satisfactory.

The presence of a DESCRIPTION pre-processor command overrides any description specified in a policy file (see 6.3.1.3).
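
The override rule shared by the TITLE, STYLE_SHEET, KEYWORDS and DESCRIPTION commands can be sketched in Python as follows. The function name resolve_head_settings is hypothetical, and keying the policy values by command name is a simplification: real policy files use their own line names (see section 6.3). Writing each value on the same line as its command is also an assumption, made by analogy with the SECTION example.

```python
HEAD_COMMANDS = ("TITLE", "STYLE_SHEET", "KEYWORDS", "DESCRIPTION")


def resolve_head_settings(policy, lines):
    """Merge policy values with preprocessor commands found in the text.

    Sketch only: a command in the document replaces the matching policy
    value, and policy values with no matching command are kept.
    """
    settings = dict(policy)
    for line in lines:
        words = line.strip().split(None, 1)
        if len(words) != 2 or not words[0].startswith("$_$_"):
            continue
        command = words[0][len("$_$_"):]
        if command in HEAD_COMMANDS:
            settings[command] = words[1].strip()
    return settings


policy = {"TITLE": "Policy title", "KEYWORDS": "text, html"}
document = ["$_$_TITLE Document title", "Body text"]
print(resolve_head_settings(policy, document))
```

Here the title comes from the document while the keywords still come from the policy, since the document has no KEYWORDS command.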





© 1997 John A. Fotheringham