%Options: KBD=VOID; FMT=LATEX; LANG=ENGLISH; HYPHEN=LATEX;
%\documentstyle[12pt, export]{article}
\documentclass[12pt]{article}
\usepackage{export}
\begin{document}
\parskip=3pt
\def\TeXspider{TeXpider}

Thank you for using MicroPress {\TeXspider}, the exciting new program from
the leading publisher of \TeX-related software.

MicroPress {\TeXspider} converts your \LaTeX\ documents (articles, books, reports)
into professional Web sites quickly and easily. In the majority of cases, no changes
or only minimal changes to your source are required. To make things simpler still,
the {\TeXspider} integrates seamlessly into the V\TeX/Windows Shell, so that
a full Web site can be built with just a few keystrokes.

For impatient users, we start with a quick explanation of how to
run the {\TeXspider}; you can try it immediately on standard \LaTeX\ examples,
like {\tt small.tex} or {\tt sample.tex}. However, be sure to read the rest of the
documentation before trying anything more advanced.

\bigskip
\tableofcontents

\section{Quick Start: Using the {\TeXspider}}

The main {\TeXspider} program is the {\tt itexwb.exe} executable. This is an
enhanced version of the Big \TeX\ for Windows ({\tt vtexwb.exe}) present in the regular
V\TeX\ distribution.\footnote{At this time there is no intention to produce a Small
version of the program.} {\tt itexwb.exe} supports this additional switch:
\begin{itemize}
\item[-\$h] Generate HTML
%\item[-\$r] Generate RTF
\end{itemize}

For example, to generate HTML output from a \LaTeX\ document {\tt mypaper.tex},
you would generally write one of the following:
\begin{itemize}
\item[] {\tt C:> itexwb -\$h \&lt209w mypaper}
\item[] {\tt C:> itexwb -\$h \&latexw mypaper}
\end{itemize}

Notice the use of the format files {\tt \&lt209w} and {\tt \&latexw}. These formats are
essentially \LaTeX2.09 and \LaTeX2E, with minor modifications and customized font
assignments; the explanation and details can be found below.
If you are using Win95 or Win/NT, you can enter the above lines directly on the command line.
Under Win3.1, where there is no command line, you should enter the line at
the {\tt[File]/[Run]} prompt. In general, running a command-line application from Win3.1
is not going to be pleasant, and we strongly suggest running the {\TeXspider} from the
V\TeX\ shell. For your convenience we have provided two batch files,
{\tt l2etohtm.bat} and {\tt 209tohtm.bat}, for \LaTeX2E and \LaTeX2.09 respectively.

\medskip
Before actually compiling the document, you {\em must} make small changes to your source:
\begin{itemize}
\item [-] If you are using \LaTeX2.09, insert the {\tt export} style in the
{\tt\char92 documentstyle} line. This change will have no effect on normal \TeX\ compilation
but will provide structural guidelines to the {\TeXspider}.
\item [-] If you are using \LaTeX2E, add \verb+\usepackage{export}+ to your document.
\item [-] In both cases, if you are using the {\tt leqno} style, include the
{\tt expleqno} style instead.
\item [-] In almost all cases, {\tt export.sty} should be {\em the last} style included.
In \LaTeX2E, it is best to place \verb+\usepackage{export}+ just before
\verb+\begin{document}+.
\end{itemize}

\smallskip
The output of the HTML conversion ({\tt .htm} and {\tt .gif} files) is written
into the directory {\tt mypaper.i2h}.
%The output of the RTF conversion ({\tt .hpj}, {\tt .rtf}, {\tt .bmp}, and {\tt .h} files ) is written
%into the directory {\tt mypaper.i2r}.

\medskip
\hrule
\medskip

To convert a \LaTeX\ document within the V\TeX/Windows shell, use the
{\bf [Run]/[Convert to RTF/HTML]} menu option.

\section{Ideology}

There are several existing programs that convert \LaTeX\ code to HTML; they
take different approaches:
\begin{itemize}
\item Some contain their own full parser. For example, \LaTeX-to-HTML
(a Unix program written in Perl) is one.
The problem with this approach is that implementing a full \TeX-compatible parser
is a major undertaking. Most existing programs of this type therefore exhibit
weird syntax errors and do not support most of the \TeX\ language.
\item Some rework \TeX's .dvi output. The idea here is to let \TeX\ do the parsing
and then reprocess \TeX's output to build HTML. While this is acceptable in terms
of supporting the full syntax, the results cannot be good since too much is lost in
the .dvi output built by \TeX.
\end{itemize}

MicroPress {\TeXspider} works along a different model: it is actually a modified
(enhanced) \TeX\ compiler, capable of generating HTML and RTF code directly
(and still capable of generating .dvi files, of course).
By being a regular \TeX\ compiler, it behaves like \TeX\ and accepts the full
input syntax, including user-defined macros. By containing the HTML and RTF back-ends,
it can make optimal decisions in generating these types of output.

In designing the {\TeXspider}, we tried to minimize the changes made to the \TeX\ code
and to handle as much of the structural information as possible through changes in the
\LaTeX\ format. This simplifies customization and gives the user better control
over the formatting done by the {\TeXspider}.

\section{The format file and fonts}

The {\TeXspider} should normally be used with the {\tt lt209w/latexw} format files, or
similarly built formats. In this section we explain what {\tt lt209w/latexw} are;
this information will be of assistance if you wish to build a different format.

Since the {\TeXspider} output may be used on computers that do not have V\TeX\ running
(in fact, the HTML output can be used on non-Windows computers), you should
restrict yourself to the fonts that will definitely be present. This means:
Times, Helvetica, and Courier. The {\tt lt209w} format is essentially \LaTeX2.09,
except that it uses these common fonts. If you prefer \LaTeX2E, the format to use
is called {\tt latexw}.
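
As a minimal sketch of how the format choice pairs with the source, consider
the {\tt mypaper.tex} document from the Quick Start section. The section title and
body text below are purely illustrative assumptions; only the preamble lines and the
command lines in the comments come from this documentation.
\begin{verbatim}
% mypaper.tex for LaTeX2E; compile with:  itexwb -$h &latexw mypaper
\documentclass[12pt]{article}
\usepackage{export}   % loaded last, just before \begin{document}
\begin{document}
\section{Introduction}
Hello, Web.
\end{document}

% For LaTeX2.09 the first line would instead read
%   \documentstyle[12pt,export]{article}
% and the file would be compiled with:  itexwb -$h &lt209w mypaper
\end{verbatim}
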
Notice that if you are absolutely sure that your output will be used only on computers
where some other preferred fonts are {\em always} present, you can build another
format using these preferred fonts.
{\em These font assignments are done in the {\tt wintier.tex} source file.}

This restriction applies to the {\em text} fonts only. Since the {\TeXspider} renders
all math into graphics files, the math fonts should stay conventional \TeX\ fonts
({\em these font assignments are done in the {\tt expfonts.sty} file}).
Thus, in {\tt lt209w} the {\tt\char92 rm} command will invoke Times Roman in
text mode, but Computer Modern roman in math mode.

Under \LaTeX2E\ this has been set up with the NFSS; under \LaTeX2.09 we used the
MicroPress \verb+\aliasfont+ extension. However, the 2E compatibility
mode (which is used under 2E for documents that start with \verb+\documentstyle+)
does not really support NFSS, and we did not want to ``hack our way through'' or modify
the 2E sources. Thus, if your document does start with \verb+\documentstyle+, you should
do one of the following:
\begin{itemize}
\item Compile it under 2.09.
\item Change the command to \verb+\documentclass+ and compile it under 2E.
\item Compile it under 2E as is and live with the less-than-good results
(the use of Times in math will result in very low quality).
\end{itemize}

\section{Text, math, and the export directive}

In general, both HTML and RTF are inferior to \TeX\ in terms of typesetting flexibility.
For example, neither one will render mathematical formulas correctly. For this reason,
the {\TeXspider} divides the output into two piles:
\begin{itemize}
\item Output that will be formatted by the target language (HTML, for example) is saved
into the target language files ({\tt .htm}).
\item Output that cannot be well formatted by the target language is saved into
pre-built graphics files ({\tt .gif} for HTML).
\end{itemize}

The default assignment is as follows:
\begin{itemize}
\item Normal text, including headers, accents, footnotes, and tables, is given to
the target language.
\item Math is converted into graphics files.
\end{itemize}

There are some pitfalls here. Firstly, notice that in \LaTeX\ tables are actually always
done in math mode. However, HTML is capable of tables, albeit not with the same
sophistication as \TeX. Since the majority of simpler tables will work fine with HTML,
the {\TeXspider} treats tables as text.

Secondly, there are some (usually table-related) situations where the {\TeXspider}'s
interception of tables is not sufficient. For example, some people use math mode
in order to center a table on a page. In a case like this, the table will be built as a
bitmap. If this is not desirable, modify your \LaTeX\ source to use
\verb+\begin{center}..\end{center}+ or some other centering mechanism.

Finally, there will be situations where you would prefer to let \TeX\ do the formatting
of a fragment of the document. The new directive {\tt\char92 export\{...\}} converts its
argument to graphics. Some of the cases where this directive should be used are
\begin{itemize}
\item environments similar to the {\tt picture} environment
[{\tt picture} itself is trapped by {\tt export.sty}];
\item paragraphs formatted with {\tt\char92 parshape};
\item more complicated tables.
\end{itemize}

Conversely, if you would like a fragment of text which would normally be converted to
graphics to be formatted by the target language instead, set the
{\tt\char92 MathBSuppress} counter to a non-zero value before the box.
For example, you can type
\begin{itemize}
\item {\tt\char92 MathBSuppress=1 \$a+b\$}
\item[] {\tt\ \ \ \ or}
\item {\tt\char92 MathBSuppress=1 \$a \char92 over b\$}
\end{itemize}
Notice that in the first case you will get ``a+b'' in text mode; in the second, the
rendering will be wrong since neither HTML nor RTF supports fractions.
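
The two mechanisms above can appear side by side in one source. The following is
only a sketch: the table contents are invented for illustration, and it assumes
that a {\tt tabular} can be passed directly as the argument of {\tt\char92 export}.
\begin{verbatim}
% Let TeX typeset a table with partial rules and
% emit it as a single graphic:
\export{%
  \begin{tabular}{l|r}
    Item   & Count \\ \hline
    Apples & 3     \\
    Pears  & 12
  \end{tabular}%
}

% Keep a simple formula as text instead of a graphic:
\MathBSuppress=1 $a+b$
\end{verbatim}
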
Notice also that
\begin{itemize}
\item {\tt\char92 MathBSuppress} affects only the next formula.
\item This technique is used in {\tt export.sty} to modify the default behaviour of
the {\tt tabular} environment.
\end{itemize}

\section{OpMode and export (technical details)}

If you examine {\tt export.sty}, you will see that \verb+\export{}+ works
by temporarily incrementing the {\em new} counter \verb+\OpMode+.
OpMode is actually one of a few new {\tt internal} quantities added to \TeX. It is an
abbreviation for ``Operation Mode''; the defined values include
0 (normal .dvi-making \TeX), 10 (HTML-making), and 12 (RTF-making).
In general you should not alter this variable, except for temporarily incrementing it
by 1, which shifts the compiler to the corresponding {\em Bitmap} creation mode.
Thus, when OpMode is 11, the {\TeXspider} makes {\tt.gif}'s for HTML, and when it is 13,
the {\TeXspider} makes {\tt.bmp}'s for RTF.

Notice that this OpMode shift should be done on an entire \TeX\ box. The dimensions of
the box will become the dimensions of the bitmap image. This has an important corollary:
while in normal \TeX\ it is permissible to have boxes with elements sticking out of
them, this is a no-no for {\TeXspider} boxes that are converted to bitmapped images.
Some examples:
\begin{itemize}
\item The formula \verb+$\!\!=\!\!$+ will produce an equal sign (possibly touching the
surrounding text) in normal \TeX; it will produce just a colon in HTML export (most of the
equal sign being cut off).
\item It is fully permissible (but rarely useful) in \LaTeX\ to have a picture
environment box that is smaller than the included elements. Such boxes will get truncated.
\end{itemize}

\section{Page division, Footnotes}

The exporter ignores \TeX-generated page breaks. Instead, it breaks the document
into sections, using \LaTeX's {\tt\char92 section} and {\tt\char92 chapter}
commands. Notice that you can produce additional breaks with the new
{\tt\char92 Split} command.
If your document contains very large sections (say more than three printed pages),
you should use it.

\smallskip
Since footnotes are of use only in printed materials, the {\TeXspider} treats them as
endnotes, moving them to the end of the current section.

\section{Some additional commands}

The following additional commands may be of use in preparing HTML output:
\begin{itemize}
\item {\tt \char92 aref}: create an anchor reference (hyperlink). This command has two
arguments: the label name and the text that actually appears in the document.
\item {\tt \char92 aname}: create an anchor name (target for hyperlinks). This command has
one argument: the label name; it does not produce any visible output.
\item {\tt \char92 Split}: divide the document. This command has one argument:
the label name to be placed on top of the new section. For HTML, splitting the document
means opening a new {\tt .htm} output file; for RTF, it means opening a new
section within a single .RTF file.
\end{itemize}
These commands are described and used in the {\tt export.sty} file.

\section{Bitmapped fonts}

The {\TeXspider} can use either scalable ({\tt .IF4}) or bitmapped fonts
({\tt .PK}, {\tt .PXL}, {\tt .GF}). While we were one of the first companies
to advocate the use of scalable fonts (well before TrueType ever existed), with the
{\TeXspider} you may want to consider using bitmapped fonts. The rationale here
is that the {\TeXspider} builds the output only once and you want it to look its
best {\em on screen}. The bitmapped fonts will produce better quality images in all cases.

To use the bitmapped fonts, run the {\tt CGRASTER} utility program to
configure them (this is described in the V\TeX\ user guide) and then turn on the
{\tt [ITEX]/[bitmappedfonts]} setting in {\tt VTEX.INI}.
(Of course, you will also need the bitmapped fonts at the right resolutions: these
can be created using VMetaFont.)
Some common pre-generated bitmapped fonts for resolutions of 120 dpi and 160 dpi are
supplied.

\section{Additional options}

The following options can be placed into the {\tt VTEX.INI} file to control how the
output is generated.
\begin{itemize}
\item {\bf [Emitter]/[Resolution]}: the resolution, in dpi, to be used in rendering
graphics images. The default is 120 dpi. Increasing the resolution will produce larger
images.
\item {\bf [Emitter]/[DiBits]}: only the value of 1 is currently supported.
\item {\bf [ITEX]/[cachesize]}: the size of the font cache used in rendering fonts.
A cache value of 100000 (100KB) should be sufficient. A larger cache makes the
{\TeXspider} faster.
\item {\bf [ITEX]/[darken]}: the darkening coefficient used in generating fonts from the
{\tt .if4} outlines. The default is 1. Larger values will produce darker fonts.
\item {\bf [ITEX]/[bitmappedfonts]}: whether the bitmapped fonts should be used.
If so, the {\TeXspider} will use the information in {\tt RFONTS.SET} to locate the
bitmapped fonts. This value can be set to 0 (NO) or 1 (YES).
\end{itemize}

\section{Some No-No's}

The {\TeXspider} is capable of handling the majority of \LaTeX\ documents without user
intervention. Nevertheless, there are cases when you may have to modify your source to
accommodate the {\TeXspider}. Below are a few examples picked up from various sources
we tried the {\TeXspider} on. Most of them came from the {\tt AMSLATEX.TEX}
documentation.
\begin{itemize}
\item Things may go slightly (or very) wrong if you redefine \LaTeX\ commands or
environments within your document. For example, the {\tt AMSLATEX.TEX} documentation file
redefines the {\tt verbatim} environment {\em after} it has been intercepted by
{\tt export.sty}. This leads to extra spaces at the beginning of all {\tt verbatim}
blocks. The right way (if you really must redefine a standard environment) is to put the
redefinition into a style file and load this style {\em before} {\tt export.sty} is
loaded.
\item Do not use math mode to center tables and other non-math material. This
substantially increases the file size of the output since math is rendered into .GIF
files. This is another glitch in the {\tt AMSLATEX.TEX} documentation.
\item Do not use math symbols within text when it is not necessary. For example, the
\verb+\P+, \verb+\S+, and \verb+\copyright+ definitions in ``normal'' \TeX\ refer to math
characters. In {\tt export.sty} these were changed to refer to HTML characters. However,
the {\tt AMSLATEX.TEX} file also uses the math \verb+\langle+ and \verb+\rangle+
characters within text. This looks bad; the usual less-than and greater-than symbols
would have looked much nicer.
\item Ensure that math boxes are large enough to contain all their material. For example,
do not use constructs like ``\verb+&\!\!=\!\!&+'' within an aligned equation. Since the
{\TeXspider} will cut off anything that does not fit into the math box, the result
of the above expression will be an equal sign degraded into a colon.
\end{itemize}

\section{Other known problems}

One aggravation is HTML's inability to have tables that contain {\em some} of the
rules, but not all (or rules of different widths). The {\TeXspider} convention here
is that a table which contains {\em any rules} will become a table that has
{\em all rules} drawn; a table that has no rules will not have any. This is again
the best that can be done with HTML; if you want your table to appear the \TeX\ way,
\verb+\export{}+ it as explained above.

\section{Additional switches}

{\TeXspider}-specific switches are appended after the {\bf\verb+-$h+} switch.
{\verb+h+} should be followed by a left parenthesis, followed by the switch(es), each
with an optional {\verb-+-} or {\verb+-+} sign. Multiple switches are separated by
commas; no spaces are allowed.

The following additional switches are understood by the {\TeXspider}:
\begin{itemize}
\item {\bf \verb+(t-+}. Generate {\em non-transparent} .gif files.
While the majority of browsers handle transparent gifs correctly, Netscape
on Unix platforms is incapable of showing them on BW monitors.
With this switch on, the bitmaps are generated with the background of the
page. The disadvantage of doing this is another Netscape bug: in non-TrueColor
modes, the color mapping logic of Netscape will often translate the page and gif
backgrounds differently (even if the document source specifies the same number);
this will make the gif backgrounds visible.
Using this switch will make generated pages visible on {\em all}
hardware but will make them less attractive for many users. In general, if you
anticipate that Sun users may want to access your pages, you should
generate non-transparent gifs. Yet another solution is to generate two sets of pages
(with transparent and non-transparent gifs).
\item {\bf \verb+(h-+}. Suppress the {\tt Created by MicroPress {\TeXspider}} tag and
the hyperlink to MicroPress. By default, this tag is always generated on the first page
of your document.
\item {\bf \verb-(l+-}. Produce files with the extension {\tt.HTML} rather than
{\tt.HTM}.
\item {\bf \verb+(f-+}. Don't fix font names. By default, the {\TeXspider} will use the
names {\tt Courier}, {\tt Times} and {\tt Helvetica} even if under Windows these fonts
are known as {\tt Courier New}, {\tt Times New Roman} and {\tt Arial}. Most browsers
will correctly substitute the fonts when running on Windows platforms; on Unix or Mac
the substitution is unneeded, since the fonts {\tt Courier}, {\tt Times} and
{\tt Helvetica} are actually present. You should disable font name fixing only if you
are certain that {\tt Courier New}, {\tt Times New Roman} and {\tt Arial} will be
present on all machines from which your document may be browsed. It should not be done
if you are going to upload the resulting HTML to the Internet.
\end{itemize}\par
\smallskip
For example:
\begin{description}
\item[\tt -\$h(h-,t-,f+] means to suppress the hyperlink to MicroPress, generate .gif
files without transparency, and fix font names.
\end{description}

\section{Frames}

Version 1.5 of the {\TeXspider} supports generation of HTML frames out of a single
\LaTeX\ source. The relevant syntax is defined by the new environments
\verb+frameset+ and \verb+frame+.

The \verb+frameset+ environment has two arguments. The first one must be either
the letter \verb+c+ (columns) or the letter \verb+r+ (rows) and defines how the frames
should be laid out (horizontally or vertically, respectively). The second defines the
initial allocation of screen space for each frame, in HTML syntax. For example,
\begin{verbatim}
\begin{frameset}{r}{10\%,*}
\end{verbatim}
defines two frames, arranged vertically, with 10\% of the screen area going to the first
frame, and the rest to the second.
\begin{verbatim}
\begin{frameset}{c}{100,1*,2*}
\end{verbatim}
defines three column frames, arranged horizontally, with the first 100 pixels wide
and the third twice as wide as the second. For more details, please see the HTML
documentation.

The \verb+frame+ environment should be used only within \verb+frameset+. It delimits
the data of an individual frame. The number of frames within a frameset should be
exactly what is declared by the \verb+\begin{frameset}+ command.

The \verb+frame+ environment has one argument: the logical name of the frame. You should
make sure that the logical names are unique. The logical names are used in generating
frame-dependent hyperlinks.

[An interested user can find the source of these environments within the
{\tt export.sty} file. We welcome your comments and suggestions.]

\section{What's new in version 1.01}

We expect the {\TeXspider} to be updated many times; here is the first installment
of improvements, made in version 1.01:
\begin{itemize}
\item The smallcaps fonts are now emulated [in 1.00, the \verb+\sc+ font was rendered
as \verb+\rm+.]
\item The {\tt amsart.sty} style is now fully supported.
\item The gif files can now optionally be generated without a transparent background.
\item The \verb+url+ package (by Donald Arseneau) is now supported.
\item One bug in generating .gif files has been fixed.
\end{itemize}

\section{What's new in version 1.5}

This version contains two major improvements: user-defined HTML headers and Frame
support. In the previous versions, the HTML headers (pointers to the next, previous,
first, and last pages) were automatically supplied by the {\TeXspider}. In 1.5, they
are entirely defined in \LaTeX\ ({\tt export.sty}). The user thus has full control
over their appearance.

\section{What's new in version 1.51}

This version corrects two problems:
\begin{itemize}
\item The \verb+\eqno(...)+ command was incorrectly handled by the previous versions
of the software. This was not really a bug since \verb+\eqno(...)+ is technically
not \LaTeX; well, now it works.
\item A much more serious problem was the incorrect handling of capital Greek letters.
Previous versions of the software tried to get them from the HTML fonts, where they of
course were not present. This now works.
\end{itemize}
Both corrections were implemented in supporting \LaTeX\ files
(\verb+export.sty+ and \verb+fontmath.lv3+); it was not necessary to change the
converter itself.

\appendix
\section{The Text Mode Symbol Reference}
\par
\smallskip
\begin{center}
\begin{tabular}{|lc|}%
\hline
textasciicircum & \textasciicircum \\
textasciitilde & \textasciitilde \\
textasteriskcentered & \textasteriskcentered \\
textbackslash & \textbackslash \\
textbar & \textbar \\
textblacksquare & \textblacksquare \\
textbraceleft & \textbraceleft \\
textbraceright & \textbraceright \\
textbrokenbar & \textbrokenbar \\
textbullet & \textbullet \\
textcent & \textcent \\
%%% textcircled & {\tiny\it undefined} \\% \textcircled \\
textcurrency & \textcurrency \\
textdagger & \textdagger \\
textdaggerdbl & \textdaggerdbl \\
textdollar & \textdollar \\
textellipsis & \textellipsis \\
textemdash & \textemdash \\
textendash & \textendash \\
textexclamdown & \textexclamdown \\
textflorin & \textflorin \\
textgreater & \textgreater \\
textless & \textless \\
%%% textmalteseH & {\tiny\it undefined} \\% \textmalteseH \\
%%% textmalteseh & {\tiny\it undefined} \\% \textmalteseh \\
textonehalf & \textonehalf \\
textonequarter & \textonequarter \\
textparagraph & \textparagraph \\
textperiodcentered & \textperiodcentered \\
%%% textpeseta & {\tiny\it undefined} \\% \textpeseta \\
textquestiondown & \textquestiondown \\
textquotedbl & \textquotedbl \\
textquotedblbase & \textquotedblbase \\
textquotedblleft & \textquotedblleft \\
textquotedblright & \textquotedblright \\
textquoteleft & \textquoteleft \\
textquoteright & \textquoteright \\
textquotesinglbase & \textquotesinglbase \\
textquotesingle & \textquotesingle \\
textregistered & \textregistered \\
textsection & \textsection \\
textsterling & \textsterling \\
textthreequarters & \textthreequarters \\
texttrademark & \texttrademark \\
textunderscore & \textunderscore \\
textvisiblespace & \textvisiblespace \\
textyen & \textyen \\
\hline
\end{tabular}
\end{center}
\end{document}