A modern and free C++ implementation of TeX based on CommonTeX

Kasper Peeters, <K.Peeters@damtp.cam.ac.uk>


NOTE: this is a preliminary release; not everything described below has been implemented yet.

Why a new TeX implementation?

The current de-facto standard implementation of TeX for Unix, namely the one based on Web2c, has a number of shortcomings. TeX++ solves all of these issues while staying as close as possible to the original TeX. Moreover, it has a number of interesting new features which make it possible to use TeX as a typesetting engine for a web browser. More on this at some later stage.

Why not implement a TeX-like program in a completely modern object-oriented fashion, from scratch (like NTS for instance, or ant)? The reason is simply that there is no point in throwing away things that work. Moreover, there are zillions of documents out there written in TeX, using the macro facilities of TeX in such a way that they will never translate properly into a subset of the language. It is therefore very hard to write a program that is not TeX yet processes all existing .tex files properly. CommonTeX is an excellent implementation which is easy to maintain and, more importantly, is completely functional now. A slow conversion to more and more "modern" programming styles is anticipated.

See also "Never rewrite code from scratch" (which I certainly do not agree with in general, but which applies beautifully to this particular project).

I chose CommonTeX as the starting point for TeX++ because it was written from scratch in C, rather than being converted from the Pascal sources by an automatic converter (see TeX in C for an example of a TeX version in C which was based on an automatic conversion from the Pascal original). The automatically generated C sources are typically horrible to read, let alone extend.

Status

This is work in progress, but the full functionality of CommonTeX is already present (thanks to the excellent work of Pat Monardo 10 years ago). I have tested both plain TeX and LaTeX input files, the latter also with add-on packages like amsLaTeX. To preload plain.tex or latex.ltx, use the options --plain or --latex; you need the plain TeX or LaTeX input files for this to work. Note that TeX++ currently does not use format files, but on modern machines reading and processing plain.tex or latex.ltx typically takes only a small fraction of a second.

Current work focuses on getting the TeX++-specific features in place (margin settings and other page geometry) and on cleaning up the I/O as well as the error messages and warnings. More information on recent progress can be found in the ChangeLog.

Download and install

You need the following additional software in order to compile and use TeX++:

Then download and install texpp.tar.gz.

The current version is 1.27 dated 2003-09-14. The usual GNU installation instructions apply. See the file COPYING for license information (it's essentially the license of CommonTeX).

After installation, you should be able to go to the tests directory and run

texpp --plain < tst.tex > out.dvi
Run the program with the --help flag for a list of the options.

New features

TeX++ has the following new features (compared to Knuth's original TeX):
Command-line settings for page geometry
I intend to use TeX++ as an engine for a TeX document viewer. This should typeset based on the viewer window, not on the geometry settings in the input file. TeX++ can be driven as
texpp --hsize=20cm --vsize=25cm
to produce pages which have been typeset to the requested size.
Standard Unix behaviour for input/output
TeX++ reads input from stdin, writes dvi output to stdout, and diagnostic messages to stderr. There is no .log file anymore, but instead an option --verbose, which makes the diagnostics on stderr more verbose. Unless your TeX file does \open calls, TeX++ does not produce any files, just output to stdout and stderr.
Removal of old-fashioned terminal i/o
TeX has a whole bunch of routines that allow you to edit the input file while running the program, modify it, and do other types of interactive manipulation. This has been cut out completely, which makes TeX++ more robust: if there is an error, TeX++ prints a message on stderr and then exits with an error code.
Complexity reduction
Good Unix programs are nice single binaries that you can install anywhere and configure with a couple of environment variables.
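
The geometry options and the stdin/stdout behaviour described above are what make it possible to drive TeX++ from another program, such as a viewer. What follows is only a minimal sketch in Python, not part of TeX++. The helper names texpp_command and typeset are hypothetical, and the sketch assumes that --plain or --latex can be combined with --hsize and --vsize and that a non-zero exit code signals an error.

```python
import subprocess


def texpp_command(hsize_cm, vsize_cm, preload="plain"):
    """Build an argument list using only options named on this page."""
    if preload not in ("plain", "latex"):
        raise ValueError("preload must be 'plain' or 'latex'")
    return [
        "texpp",
        f"--{preload}",
        f"--hsize={hsize_cm:g}cm",  # e.g. --hsize=20cm
        f"--vsize={vsize_cm:g}cm",  # e.g. --vsize=25cm
    ]


def typeset(source, hsize_cm, vsize_cm, preload="plain"):
    """Pipe TeX source (bytes) through texpp.

    Returns (dvi_bytes, diagnostics_text). Assumes texpp is on the PATH
    and behaves as a filter: input on stdin, dvi on stdout, messages on stderr.
    """
    result = subprocess.run(
        texpp_command(hsize_cm, vsize_cm, preload),
        input=source,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    diagnostics = result.stderr.decode("utf-8", errors="replace")
    if result.returncode != 0:
        # Assumption: a non-zero exit code means typesetting failed.
        raise RuntimeError(
            f"texpp exited with code {result.returncode}:\n{diagnostics}"
        )
    return result.stdout, diagnostics
```

A viewer could call typeset(source, 20, 25) again whenever its window is resized, passing the new dimensions and never touching the file system.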

Other anticipated features, to be added in the near future:

The idea of "format files", which simply represent the internal memory of TeX after loading (many) macros, is very useful, but was implemented in a rather old-fashioned way in the original TeX. With TeX++ you will be able to dump the state of memory at any time in a readable format using the \dump command. Such dump files can then be read back with \restore.

The binary of TeX++ is a bit under 200Kb in size, linked to the standard C and C++ libraries, and that's it: no zillions of supporting programs and libraries, no huge directory trees with zillions of configuration files, but just a single nice Unix binary that you can install anywhere you like.

Error/warning messages

Error and warning messages have been cleaned up and are presented in a more structured way:
file.tex:345-387:Overfull hbox (4.5pt too wide)
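
A message in this form is easy to pick apart in an editor or viewer. The Python sketch below is one way to do it, not part of TeX++. It assumes the layout is always filename, line range, message, separated by colons as in the example above. Accepting a single line number instead of a range is a further assumption, and the names Diagnostic and parse_diagnostic are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class Diagnostic(NamedTuple):
    filename: str
    first_line: int
    last_line: int
    message: str


# filename : first[-last] : message
_PATTERN = re.compile(
    r"^(?P<file>.+?):(?P<first>\d+)(?:-(?P<last>\d+))?:(?P<message>.*)$"
)


def parse_diagnostic(line: str) -> Optional[Diagnostic]:
    """Parse one stderr line; return None if it does not match the layout."""
    match = _PATTERN.match(line.strip())
    if match is None:
        return None
    first = int(match.group("first"))
    # Assumption: a message without a range refers to a single line.
    last = int(match.group("last") or first)
    return Diagnostic(match.group("file"), first, last, match.group("message"))
```

Lines that do not follow the layout yield None, so the function can be applied to every line of stderr without special-casing other output.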


The following startup options are available:
--hsize    set the initial value of the \hsize dimen
--vsize    set the initial value of the \vsize dimen
--hoffset  set the value of \hoffset
--voffset  set the value of \voffset
--margin   different way to set the margin size
If you want a mode as compatible with Knuth's original TeX as possible, use the option --compatible (this option is now switched on by default and switches off automatically when you add margin options). Output, however, still goes to stdout instead of a .dvi file, and logging information goes to stderr instead of the .log file.

Path settings

TeX++ uses the simple though effective Unix philosophy: everything that belongs together should sit in the same directory. Therefore, path settings for TeX++ are extremely simple: use the environment variables These variables contain ':' separated lists of paths. Typically, you would have one system-wide /usr/lib/tex/tfm and one /usr/lib/tex/sty as well as per-user directories for these sitting in users' home directories.
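
The lookup that such a ':' separated list implies can be sketched in a few lines of Python. This only illustrates the idea and is not the code TeX++ uses. The function names are hypothetical, and the variable name is passed in as a parameter because the actual variable names are not given here.

```python
import os


def find_in_path_list(filename, path_list):
    """Return the first existing file in a ':'-separated directory list, or None."""
    for directory in path_list.split(":"):
        if not directory:
            continue  # skip empty entries such as those produced by "a::b"
        candidate = os.path.join(directory, filename)
        if os.path.isfile(candidate):
            return candidate
    return None


def find_via_environment(filename, variable):
    """Look up `filename` using the directory list stored in an environment variable.

    `variable` is whatever name the program defines; an unset variable
    is treated as an empty list.
    """
    return find_in_path_list(filename, os.environ.get(variable, ""))
```

With a per-user directory listed before /usr/lib/tex/tfm, the per-user copy of a file wins, because the first match is returned.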

If you don't like the "everything in one spot" idea, you should not be tempted to re-invent kpathsea. Instead, take a look at Graft for the right solution to the separation of packages into their own directories (hint: soft links are your friend). If you have a performance problem without kpathsea, then you should probably be thinking about dropping NFS in favour of more intelligent networked filesystems.

  $Id: index.html,v 1.18 2004/01/16 16:10:31 kp229 Exp $