TeX++

A modern and free C++ implementation of TeX based on CommonTeX

NOTE: this is a preliminary release; not everything described below has been implemented yet.

Why a new TeX implementation?

The current de-facto standard implementation of TeX for Unix, namely the one based on Web2c, has a number of shortcomings:
• It is written in Pascal (dressed up in Web), a language that few modern programmers are familiar with. Extensions to TeX are therefore very hard to get implemented. Moreover, because of the intermediate Pascal-to-C translation layer, the sources are extremely large (even though the resulting "`tex`" binary may be small). It is also a pain to install from sources (with TeX++ it's a matter of `./configure`, `make`, `make install` together with some trivial settings to make it pick up the fonts and style files).
• It still uses many ancient ideas from the original 1982 implementation of TeX (excessive emphasis on reduction of memory usage, for instance). A lot of things can be done much more cleanly when present-day computing power is taken into account.
• It uses the (in my opinion) ridiculous path searching mechanism of kpathsea, which is enormously bloated simply because it tries to be backwards compatible with every single TeX installation in existence. There are much cleaner and simpler solutions for pathname resolution.
• In general it does not conform to the Unix "filter" philosophy, in which input comes from stdin, output goes to stdout and diagnostic messages go to stderr.
TeX++ solves all of these issues, while at the same time staying as close as possible to the original TeX. Moreover, it has a number of interesting new features which make it possible to use TeX as a typesetting engine for a web browser. More on this at some later stage.

Why not implement a TeX-like program in a completely modern object-oriented fashion, from scratch (like NTS for instance, or ant)? The reason is simply that there is no point in throwing away things that work. Moreover, there are zillions of documents out there written in TeX, using the macro facilities of TeX in such a way that they will never translate properly into a subset of the language. It is therefore very hard to write a program that is not TeX yet processes all existing .tex files properly. CommonTeX is an excellent implementation which is easy to maintain and, more importantly, is completely functional now. A slow conversion to more and more "modern" programming styles is anticipated.

See also "Never rewrite code from scratch" (a rule which I certainly do not agree with in general, but which applies beautifully to this particular project).

I chose CommonTeX as the starting point for TeX++ because it was written from scratch in C, rather than being converted from the Pascal sources by an automatic converter (see TeX in C for an example of a TeX version in C which was based on an automatic conversion from the Pascal original). The automatically generated C sources are typically horrible to read, let alone extend.

Status

This is work in progress, but the full functionality of CommonTeX is already present (thanks to the excellent work of Pat Monardo 10 years ago). I have tested both plain TeX and LaTeX input files, the latter also with add-on packages like amsLaTeX. To preload plain.tex or latex.ltx, use the option `--plain` or `--latex` (you need the plain TeX or LaTeX input files for this to work). Note that TeX++ currently does not use format files, but on modern machines reading and processing plain.tex or latex.ltx typically takes only a small fraction of a second.

Current work focuses on getting the TeX++ specific features in place (margin settings and other page geometry) and cleaning up the I/O as well as the error messages and warnings. More information on recent progress can be found in the ChangeLog.

You need the following additional software in order to compile and use TeX++:
• A decent C++ compiler; gcc 3.x and higher will do. If you encounter compilation errors, mail me.

The current version is 1.27 dated 2003-09-14. The usual GNU installation instructions apply. See the file `COPYING` for license information (it's essentially the license of CommonTeX).

After installation, you should be able to go to the `tests` directory and run

`texpp --plain < tst.tex > out.dvi`
Run the program with the `--help` flag for a list of the options.

New features

TeX++ has the following new features (compared to Knuth's original TeX):
Command-line settings for page geometry
I intend to use TeX++ as an engine for a TeX document viewer. Such a viewer should typeset according to the size of its window, not according to the geometry settings in the input file. TeX++ can be invoked as
`texpp --hsize=20cm --vsize=25cm`
to produce pages which have been typeset to the requested size.
Standard Unix behaviour for input/output
TeX++ reads input from stdin, writes dvi output to stdout, and diagnostic messages to stderr. There is no .log file anymore, but instead an option `--verbose`, which makes the diagnostics on stderr more verbose. Unless your TeX file does `\open` calls, TeX++ does not produce any files, just output to stdout and stderr.
Removal of old-fashioned terminal i/o
TeX has a whole bunch of routines that allow you to edit the input file while the program is running and to do other types of interactive manipulation. This has been cut out completely, which makes the behaviour more robust: if there is an error, the TeX++ program prints a message on stderr and then exits with an error code.
Complexity reduction
Good Unix programs are nice single binaries that you can install anywhere and configure with a couple of environment variables.

Other anticipated features, to be added in the near future:

Checkpointing
The idea of "format files", which simply represent the internal memory of TeX after loading (many) macros, is very useful, but was implemented in a rather old-fashioned way in the original TeX. With TeX++ you will be able to dump the state of memory at any time in a readable format using the `\dump` command. Such dump files can then be read back with `\restore`.

The binary of TeX++ is a bit under 200Kb in size, linked to the standard C and C++ libraries, and that's it: no zillions of supporting programs and libraries, no huge directory trees with zillions of configuration files, but just a single nice Unix binary that you can install anywhere you like.

Error/warning messages

Error and warning messages have been cleaned up and are presented in a more structured way:
```
file.tex:345-387:Overfull hbox (4.5pt too wide)
another.tex:843:
```

Options

The following startup options are available:
• `--help`
• `--hsize`: set the initial value of the `\hsize` dimen
• `--vsize`: set the initial value of the `\vsize` dimen
• `--hoffset`: set the value of `\hoffset`
• `--voffset`: set the value of `\voffset`
• `--margin`: different way to set the margin size
• `--verbose`
• `--debug`
• `--version`
If you want a mode as compatible with Knuth's original TeX as possible, use the option `--verbose` (the `--compatible` option is now switched on by default and switches off automatically when you add margin options). Output, however, still goes to stdout instead of a .dvi file, and logging information goes to stderr instead of the .log file.

Path settings

TeX++ follows the simple but effective Unix philosophy: everything that belongs together should sit in the same directory. Therefore, path settings for TeX++ are extremely simple: use the environment variables
• TEXPP_TFM: location of .tfm files.
• TEXPP_STY: location of .sty and other input files.
These variables contain ':' separated lists of paths. Typically, you would have one system-wide `/usr/lib/tex/tfm` and one `/usr/lib/tex/sty` as well as per-user directories for these sitting in users' home directories.

If you don't like the "everything in one spot" idea, you should not be tempted to re-invent kpathsea. Instead, take a look at Graft for the right solution to the separation of packages in their own directories (hint: soft links are your friend). If you have a performance problem without kpathsea, then you probably should be thinking about dropping NFS in favour of more intelligent networked filesystems.

\$Id: index.html,v 1.18 2004/01/16 16:10:31 kp229 Exp \$