$_$_CHANGE_POLICY Convert TABLE X-refs to links : Yes
1 Introduction
------------------
[AscToTab] is a highly specific ASCII to HTML conversion tool. It
converts plain text files to HTML tables. That's all it does. It
doesn't convert tab separated lists, or comma separated lists, although
this may be added in the _very near future_ as it's *so* easy to do.
[AscToTab] evolved out of the development of [AscToHTM], the
general-purpose text to HTML conversion utility. AscToTab now
forms a subset of AscToHTM, and is offered as "as is" freeware.
As of V2.3, AscToTab now forms a complete subset of AscToHTM. This
includes the command line interface and the use of policy files. For
this reason, and to avoid duplication, reference is occasionally made
to the [A2hDoco].
From V2.3 onward, AscToTab version numbers simply match the
AscToHTM release they are a subset of, regardless of whether the
AscToTab part of the functionality has advanced significantly or
not. However only those releases of AscToTab that have significant
new functionality will be announced separately (via USENET) .
This document describes AscToTab V3.0, which is available as
postcardware (a big thanks to those that have sent in postcards, they're
much appreciated) from August 1998. [AscToHTM] is available as
shareware, and has been awarded 5 stars by ZDNet, the *only* text to
HTML converter to attain this award to date.
The HTML version of this document has been produced using AscToHTM, and
no post-processing has been done to the HTML pages produced.
It has been generated from a single source document and a few small
configuration files.
If you encounter a RTF version of this document, that will have been
produced by a text-to-RTF converter which we are developing using
the same analysis engine.
AscToTab is made available for download via the Internet from
[AscToTab download location].
2 Installation
------------------
AscToTab is downloadable as a .ZIP file from [AscToTab Download location].
You should download the version best suited to your needs.
Once downloaded, simply unzip the files and move them to a suitable
location.
AscToTab V3.0 runs as a console application ("DOS program") under
Windows 95/NT, and from the command line under OpenVMS.
Note: In the _very near future_ I plan to offer a full Windows version.
3 How AscToTab works
------------------------
AscToTab looks at the layout of your text file and tries to spot the
column boundaries in your table. It doesn't (yet) require or recognise
comma-delimited or tab-delimited values.
Having detected your column positions, it attempts to detect if your
table has a header.
Finally it outputs your table, paying attention to the following
- Data alignment. The alignment of a column is checked, and
where suitable, numerical values are right-aligned.
- Column-spanning. Where a value appears to span two or more columns
the COLSPAN attribute is used, and the alignment re-calculated.
If too many values appear to span columns, the columns are liable
to be merged.
- Table headers. Where the heading is underlined, this is detected
and the header row(s) are marked up using
markup.
- Cell entries that span multiple lines. Where possible, this is
detected and the entries are added together with inserted to
preserve the original layout.
- Blank lines. Usually omitted, unless they appear to be separators,
in which case this information is fed back into the cell analysis.
- Border. Added unless a number of user-supplied lines are detected
in which case these are shown, and the HTML border omitted.
In addition to it's automatic features, AscToTab can be customized to
give even better output. See [section 6] for details.
4 Running AscToTab
----------------------
4.1 Execution from a command line
From a command prompt (Windows or OpenVMS) you can type
AscToTab [] [/qualifiers]
Where
Name of file to be converted. The output will
be the same name with a ".html" extension. Wildcards are allowed.
If the is of the form "@", then
AscToTab will read the file line-by-line
and convert the files listed in that file.
and
Is a "policy" file used to customize the conversion see 6.1.
As of V2.3, the command line interface is in identical to that used
by [AscToHTM], although virtually none of the qualifiers are relevant.
4.1.1 Processing several files at once
4.1.1.1 Using wildcards
*Section added in V3.0*
You can convert multiple files at one time by specifying a wildcard
describing the files to be converted. The wildcard has to be meaningful
to the operating system you are using, and will be expanded in
alphabetical order.
At present we recommend that wildcards are only used on the contents
of a single directory.
Note, the same policies will apply to all files being converted. If you
wish different policies to apply, use a steering command file (see
4.1.1.2)
4.1.1.2 Using a steering command file
*Section added in V3.0*
You can convert several files at the same time in the order and manner
of your choosing. To do this use the command
AscToTab @List.file [rest of command line]
Where the file "list.file" is a steering file which contains a list of
AscToTab command arguments, and the "@" in front indicates it is a list
file, rather than a file to be converted.
An example list file might look like
$_$_BEGIN_PRE
! this is my first table... it's special
Table1.txt special_policy.pol
#
# These are my other tables. I don't want table2 converted
table3.TXT
table4.TXT
$_$_END_PRE
Note the use of "!" or "#" at the start of a line signifies it's a
comment line to be ignored.
Any qualifiers used on the original AscToTab line will be used as
defaults for each conversion, but will be overridden by any listed in
the list file. In this way it would be possible to specify a default
policy file for a bunch of similar conversions.
4.2 Drag'n'Drop execution
Create an Icon for AscToTab and simply drag'n'drop files onto it.
The results will be identical to those obtained by typing in the
filenames as described in 4.1.
4.3 Refining your results
If all goes well the resultant HTML will be satisfactory.
However, you can customize the conversion in two ways:-
- Use a policy file (see 6.1)
- Use pre-processor commands (see 6.2).
5 HTML markup produced
--------------------------
5.1
statement
5.1.1 BORDER=n attribute
AscToTab will default to a BORDER=2 unless
a) A BORDER preprocessor command is encountered (see 6.2.1)
b) It determines that the user has added their own lines
5.1.2 CELLPADDING=n attribute
AscToTab will only add CELLSPACING if a CELLSPACING
preprocessor command is encountered (see 6.2.2).
5.1.3 CELLPADDING=n attribute
AscToTab will add CELLPADDING if:-
a) It encounters a CELLPADDING command (see 6.2.2)
b) A BORDER is present. The default is CELLPADDING=4
5.1.4 BGCOLOR="colour" and BORDERCOLOR="colour" attributes
AscToTab will add these attributes if it encounters BGCOLOR
or BORDERCOLOR commands (see 6.2.3).
5.1.5 WIDTH= or WIDTH=
AscToTab will add this attribute if it encounters a WIDTH command
(see 6.2.7)
5.2
statement
AscToTab will add a caption if it encounters a CAPTION
command (see 6.2.4)
5.3
statements
AscToTab will use
..
markup whenever it determines that a
cell forms part of the header.
AscToTab will attempt to automatically detect headers by looking for
a single separator line near the top of the file.
Alternatively the HEADING_ROWS command (see 6.2.5) will be
used to specify the number of header lines.
AscToTab will set the ALIGN and COLSPAN attributes as best it can.
5.4
statements
AscToTab will use
..
markup for most of the cells in
the table.
If the HEADING_COLS command (see 6.2.6) is encountered,
the first few columns will additionally use ... markup.
AscToTab will set the ALIGN and COLSPAN attributes as best it can.
6 Customizing your conversions
----------------------------------
6.1 Policy files
Policy files are an AscToHTM feature that are supported as of the
integration between the two products that occurred in V2.3.
Not all of the policies recognised are relevant to AscToTab, but here's
a list of some that are :-
Descriptive text Values
-----------------------------------------------------------------
Active Link Colour HTML Colour
Background Colour HTML Colour
Background Image URL of image
Convert TABLE X-refs to links Yes/No
Default TABLE border colour HTML Colour
Default TABLE border size Number. 0 = "automatic"
Default TABLE caption Text String
Default TABLE cell padding Number. 0 = "none"
Default TABLE cell spacing Number. 0 = "none"
Default TABLE colour HTML Colour
Default TABLE header cols Number. 0 = "automatic"
Default TABLE header rows Number. 0 = "none"
Default TABLE width Table width in pixels or as a
percentage of page width
Document Style Sheet URL of style sheet file
Document description Text string
Document keywords Comma-separated list
Document title Text string
HTML footer file File name. File contains HTML commands
HTML header file File name. File contains HTML commands
Minimise HTML file size Yes/No
TAB size Number of characters
Text Colour HTML Colour
Unvisited Link Colour HTML Colour
Use .HTM extension Yes/No
Visited Link Colour HTML Colour
Policy files are simply text files with a .pol extension by default.
Each is placed on a separate line with the policy phrase, a colon (:)
and the value. The .pol file is then specified as an extra argument
on the command line (see 4.1).
An example policy file might look as follows:-
$_$_BEGIN_PRE
Background Colour : CCDD00
Default TABLE border size : 3
Default TABLE colour : White
Default TABLE width : 75%
Document title : This is a table I converted
Document keywords : Keywords, included, in, META, tag
$_$_END_PRE
Note, as of V3.0 it is possible to embed *any* policy line in the source
document using the $_$_CHANGE_POLICY pre-processor command (see 6.2.11).
6.1.1 HTML Colours
*Section added in V3.0*
These policies identifies the colours to be placed in the various
attributes of the tag. You can enter any value acceptable to
HTML. Normally a value is expressed as a 6-digit hexadecimal value in
the range 000000 (black) to FFFFFF (white), but certain colours such
as "white", "blue", "red" etc may also be recognised by HTML. AscToTab
simply transcribes your value into the output file.
The various policies control the colours of the foreground Text (TEXT),
the background (BGCOLOR), unvisited hyperlinks (LINK), visited hyperlinks
(VLINK) and active hyperlinks (ALINK).
A value of "none" signals the defaults are to be used. By default
AscToTab changes the background colour to be white, and omits all the
other tag attributes.
6.1.2 TABLE policies
*Section added in V3.0*
Most of the these policies are equivalent to pre-processor commands
described in section 6.2.
Default TABLE border colour 6.2.1
Default TABLE border size 6.2.1
Default TABLE cell padding 6.2.2
Default TABLE cell spacing 6.2.2
Default TABLE colour 6.2.3
Default TABLE caption 6.2.4
Default TABLE header rows 6.2.5
Default TABLE header cols 6.2.6
Default TABLE width 6.2.7
Minimum TABLE column separation 6.2.8
Expect Sparse tables 6.2.9
Convert TABLE X-refs to links 6.2.10
6.1.3 Document policies
6.1.3.1 Document Style Sheet
*Section added in V3.0*
This policy allows you to specify the URL of a style sheet file,
usually with a .css extension. Style sheet files are a new HTML feature
that allow you specify fonts and colours to be applied to your document.
The resulting HTML is inserted into the section of the output
page(s) as follows :-
6.1.3.2 Document keywords
*Section added in V3.0*
This policy allows you to specify keywords that are added to a
META tag inserted into the section of the output
page(s) as follows :-
This tag is often used by search engines when indexing your HTML page.
You should add here any relevant keywords possibly not contained in
the text itself.
6.1.3.3 Document description
*Section added in V3.0*
This policy allows you to specify a description of your document
that is added to a META tag inserted into the section of the
output page(s) as follows :-
This tag is often used by search engines (e.g. AltaVista) as a brief
description of the contents of your page. If omitted the first few
lines may be shown instead, which is often less satisfactory.
6.1.3.4 Document title
*Section added in V3.0*
AscToTab can calculate - or be told - the title of a document. This
will be placed in ... markup in the section of
each HTML page produced.
The Title is calculated as in the order shown below. If the first
algorithm returns a value, the subsequent ones are ignored.
1) If a $_$_TITLE pre-processor command is placed in the
source text, that value is used
2) If the "Document title" policy is set (see 6.3.1.1) then this value
is used.
Note: If this is the value you want, ensure the other policies
outlined above are disabled.
3) Finally, if none of the above result in a title the text
"Converted from " is used.
6.1.4 Other policies
6.1.4.1 HTML header
*Section added in V3.0*
This identifies the name of a text include file to be transcribed into
the HTML file at the top of the ... portion of the
generated HTML page.
This can be used to add standard headers, logos, contact addresses to
your HTML pages, and is especially useful to give a consistent "look
and feel" when converting many files.
6.1.4.2 HTML footer
*Section added in V3.0*
This identifies the name of a text include file to be transcribed into
the HTML file at the bottom of the ... portion of the
generated HTML page.
This can be used to add "return to home page" links, and contact
addresses to your HTML pages. Again, this helps to give a consistent
"look and feel" when converting many files.
6.1.4.3 TAB size
*New in V3.0*
This value can be used to specify the size of TABs in your source
document. AscToTab converts all tabs to space assuming using this tab
size. This becomes important only when comparing lines that use tabs to
lines that use spaces for alignment. If problems occur you may find
indentations appear strange, or tables are not quite right.
Note, text that is all tabs or all spaces should experience no such
problems.
If you know your source file uses a different TAB size (e.g. Notepad files
use a value of 4), try adjusting this policy.
6.1.4.4 Minimise HTML file size
*New in V3.0*
This policy may be used to reduce the size of the created HTML file.
By default AscToTab attempts to layout the created HTML code in an
easy-to-read manner. This was done so that the created HTML would be
easier to manually edit after creation.
To make the code easier to read, AscToTab inserts white space to indent
the code to match the output indentation levels. It also outputs each
cell of a TABLE on its own line.
All this white space adds up, particularly the indentation of
largely-empty cells in TABLES. If you select this option, all the
extra white space is eliminated.
Depending on the file contents, this can make the file 5-20% smaller (and
hence faster to download), at a cost of readability.
6.1.4.5 Use .HTM extension
*Section added in V3.0*
This policy specifies whether or not the generated HTML files should have
a .HTM extension. The default is to use a ".html" extension, unless
DOS-compatible files are requested.
6.2 Preprocessor commands
----------------------------
The preprocessor is a feature shared with [AscToHTM]. Essentially you
insert commands into your source file that tell AscToTab how you
want various aspects of your file converted.
The preprocessor looks for lines that begin with a special character
sequence "$_$_". All the AscToTab commands add "TABLE_" to this, making
the relevant prefix "$_$_TABLE_". This sequence *must* appear at the
start of the source line with no leading white space. Each command
must be wholly contained on a separate line.
Commands are best placed at the top of the source file.
6.2.1 The BORDER command
$_$_TABLE_BORDER 5
This command specifies the BORDER attribute.
A value of 0 means "none".
6.2.2 The CELLSPACING and CELLPADDING commands
$_$_TABLE_CELLSPACING 5
$_$_TABLE_CELLPADDING 5
These command specify the values of the CELLSPACING and CELLPADDING
attribute.
A value of 0 means "none".
6.2.3 The BGCOLOR and BORDERCOLOR commands
$_$_TABLE_BGCOLOR AntiqueWhite
$_$_TABLE_BORDERCOLOR #FF2345
These commands specify the values of the BGCOLOR and BORDERCOLOR
attributes.
6.2.4 The CAPTION command
$_$_TABLE_CAPTION Ooo! what a pretty table
This command specifies the value of
...
markup to
be added to the table.
6.2.5 The HEADING_ROWS command
$_$_TABLE_HEADING_ROWS 4
This command tells AscToTab how many lines of text are to be treated
as part of the header. This should be the number of lines as it
appears in the source file, including any blank lines.
6.2.6 The HEADING_COLS command
$_$_TABLE_HEADING_COLS 1
This command tells AscToTab how many columns (if any) at the start of
each line should be marked up in ... markup.
6.2.7 The WIDTH command
$_$_TABLE_WIDTH 500
$_$_TABLE_WIDTH 75%
This command specifies the value of the WIDTH attribute in pixels or
as a percentage of screen width
6.2.8 The MIN_COLUMN_SEPARATION command
$_$_TABLE_MIN_COLUMN_SEPARATION 2
This command specifies the minimum number of spaces that may be
interpreted as a column separator. The default value is 1, but this
occasionally gives rise to too many "columns" - particularly in short
tables, or columns whose data values are similar.
A larger value will lead to fewer columns.
6.2.9 The TABLE_MAY_BE_SPARSE command
*New in V3.0*
$_$_TABLE_MAY_BE_SPARSE
This command specifies that the table may be sparse. This fact will
then be used to adjust the analysis of the table.
Columns which appear to have little or no data in them are usually
eliminated by merging them with their more populated neighbours.
If you use this command this process is relaxed, meaning that you will get
more, emptier, columns rather than fewer, more filled ones.
6.2.10 The TABLE_CONVERT_XREFS command
*New in V3.0*
Although this affects table generation in AscToHTM, it's irrelevant in
AscToTab.
This specifies whether numbers in tables should be converted to hyperlinks
to numbered document sections. Since AscToTab deals with single-table
files, there can be no numbered sections elsewhere in the document.
6.2.11 The CHANGE_POLICY command
*New in V3.0*
This directive allows you set a policy in the document source.
This allows you to effectively embed a policy file at the start of
your source file.
The syntax of the command line is
$_$_CHANGE_POLICY
where is a policy line as it would appear in a policy file,
and (usually) as it appears in 6.1.
For example the following would be a valid directives
$_$_BEGIN_PRE
$_$_CHANGE_POLICY Background Colour : red
$_$_CHANGE_POLICY Document Title : My pretty table
$_$_END_PRE
7 Purchasing AscToTab
-------------------------
7.1 How do I purchase AscToTab (trick question)?
You can't. It's free. Or rather its postcardware. If you wish to be
notified of updates or request support you have to send me a postcard
with your email address. I'll accept enquiries via email, but I still
want my postcard.
Thanks to all those that have sent cards to date. Keep 'em coming.
It you really like the program, send a postcard to
John A Fotheringham
c/o Yezerski Roper
Applicon House
Exchange Street
Stockport
SK3 0ET
UK
You could also look at [AscToHTM], which is a superset of the AscToTab
functionality, i.e. it'll convert any document, using the AscToTab
software to convert any tables it finds.
[AscToHTM] is shareware in the Windows version, but is free to OpenVMS
users and FAQ maintainers.
8 Contacts on the Web
-------------------------
8.1 The home page
At time of writing [Yezerski Roper] (whom I work for) have graciously
allowed me to give [AscToTab] and [AscToHTM] a home page.
Yezerski Roper are the most intelligent software house it's
ever been my privilege to be associated with. We're based in the UK
and offer OpenVMS and Windows NT systems, and are currently
developing state-of-the-art products which will allow companies
to exploit the full communications potential of the Internet.
Oh yeah... and they pay me as well :)
AscToTab and AscToHTM are "hobbies".
If you have problems locating the home page and suspect it has moved,
go to [AltaVista] and enter
+"John A Fotheringham" +AscToTab
to locate any new home page.
8.2 E-mail
E-mail any feedback to jaf@yrl.co.uk. Sadly, we cannot guarantee any
replies.
8.3 Support
A limited amount of support is available by emailing jaf@yrl.co.uk.
Sadly, we cannot guarantee any replies, though we do try to be helpful.
9 Known problems
--------------------
None. (Ignorance is bliss)
10 Change History
--------------------
10.1 Version 1.00 (December '97)
Initial release of command line version as postcardware. Originally
the intention was to develope this as a separate shareware product, but
I've since decided to keep it postcardward (just so long as those
postcards keep coming).
10.2 Version 2.00 (February '98)
AscToTab is now integrated into [AscToHTM], and the bug fixes
and enhancements are released as V2.00 of AscToTab
10.3 Version 2.3 (April '98)
AscToTab is now totally subsumed in [AscToHTM]. This allows it full
access to the AscToHTM feature set. In particular the command line
interface is now the same, allowing wildcards and policy files to be
used.
New commands are added (see 6.2.7 and 6.2.8), and more improvements are
made to the algorithms.
From now on the AscToTab version numbers will indicate the release of
AscToHTM they are a subset of.
10.4 Version 3.0 (August '98)
Release synchronized with [AscToHTM] release V3.0.
10.4.1 Bug fixes
- The TABLE_HEADER_COLS directive only worked when there were header
rows as well.
- Use of emphasis inside a TABLE cell was not being detected as all. Now
it is detected if held on a single line. Phrases that are emphasised
over several lines inside a table cell may still not be detected.
10.4.2 New functions
- New "TAB size" policy (see 6.1.4)
- New "Expect sparse tables" policy and TABLE_MAY_BE_SPARSE pre-processor
command (see 6.1.2 and 6.2.9)
- New "Minimise HTML file size" policy (see 6.1.4)
- New "Convert TABLE X-refs to links" policy and TABLE_CONVERT_XREFS
pre-processor command (see 6.1.2 and 6.2.10)
- New CHANGE_POLICY pre-processor command (see 6.2.11)
10.4.3 Other changes
- Empty lines in a table cell now get an extra added, in addition
to the . This is to compensate for a bug in Internet Explorer 3
which would ignore the otherwise, leading to alignment errors.
- Improved handling of tables with long urls in them. Previously these
would not be recognised as part of a table. Increased "long line"
limit inside tables to 110 characters
- Improved detection of "mal-formed" tables. Previously this was
over-cautious, especially on short tables.