TeX
TEX, written as TeX in plain text, is a typesetting system created by Donald Knuth. It is popular in academia, especially in the mathematics, physics and computer science communities. It has largely displaced Unix troff, the other favored formatter, in many Unix installations.
TeX is generally considered to be the best way to typeset complex mathematical formulas, but, especially in the form of LaTeX and other template packages, is now also being used for many other typesetting tasks.
The name and its pronunciation(s)
A homage to Caltech, where Knuth received his doctorate, the name TeX is intended to be pronounced "tekh", where "kh" represents the sound at the end of Scottish loch or the name of the German composer Bach (IPA /x/). The X is meant to represent the Greek letter χ (chi). TeX is the abbreviation of τέχνη (technē), Greek for "art" and "craft", which is also the source word of technical. English speakers often pronounce it "tek", like the first syllable of technology.
The name is properly typeset with the "E" below the baseline; systems that do not support subscript layout use the approximation "TeX". Fans like to proliferate names from the word "TeX" — such as TeXnician (user of TeX software), TeXhacker (TeX programmer), TeXmaster (competent TeX programmer), TeXhax, and TeXnique.
History
Knuth began to write TeX because he had become annoyed at the declining quality of the typesetting in volumes I-III of his monumental The Art of Computer Programming. In a manifestation of the typical hackish urge to solve the problem at hand once and for all, he began to design his own typesetting language. He thought he would finish it on his sabbatical in 1978, but the language was not frozen until 1989, more than ten years later.
Guy Steele happened to be at Stanford during the summer of 1978, when Knuth was developing his first version of TeX. When Steele returned to MIT that fall, he rewrote TeX's I/O to run under ITS.
The first version of TeX was written in the SAIL programming language to run on a PDP-10 under Stanford's WAITS operating system. For later versions of TeX, Knuth invented the concept of literate programming, a way of producing compilable source code and high quality cross-linked documentation (typeset in TeX of course) from the same original file. The language used is called WEB and produces programs in Pascal.
TeX has an idiosyncratic version numbering system. Since version 3, updates have been indicated by adding an extra digit at the end of the decimal, so that the version number asymptotically approaches π. The current version is 3.141592. This is a reflection of the fact that TeX is now very stable, and only minor updates are anticipated. Knuth has stated that the "absolutely final change (to be made after my death)" will be to change the version number to π, at which point all remaining bugs will become features.
The typesetting system
TeX commands commonly start with a backslash and are grouped with curly braces. However, almost all of TeX's syntactic properties can be changed on the fly, which makes TeX input hard to parse by anything but TeX itself. TeX is a macro- and token-based language: many commands, including most user-defined ones, are expanded on the fly until only unexpandable tokens remain, which are then executed. Expansion itself is practically free of side effects. Tail recursion of macros takes no memory, and if-then-else constructs are available. This makes TeX a Turing-complete language even at the expansion level.
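The expansion behaviour described above can be illustrated with a small plain TeX sketch. The macro name \spell and the terminator \stop are invented for this illustration and are not part of plain TeX. The macro reads one token at a time, tests it with a conditional, and calls itself again; the \expandafter closes the conditional before the recursive call, so the recursion does not pile up unfinished conditionals.

```tex
% Hypothetical names for illustration: \spell and \stop are not plain TeX macros.
\let\stop\relax            % \stop only serves as an end marker
\def\spell#1{%
  \ifx#1\stop
    % end marker reached: expand to nothing and stop recursing
  \else
    #1-%                   % emit the token followed by a hyphen
    \expandafter\spell     % close the conditional, then recurse
  \fi}

\spell TeX\stop            % typesets T-e-X-
\message{\spell TeX\stop}  % the same expansion, written to the terminal
\bye
```

Because \message expands its argument completely before writing it, the second call shows that the whole loop runs by macro expansion alone, without any assignments.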
Processing can roughly be divided into four stages. In the first, characters are read from the input file and assigned a category code. A backslash (really, any character of category 0) followed by letters (characters of category 11) or by a single other character is replaced by a control sequence token. This stage resembles lexical analysis, although it does not form numbers from digits. In the second stage, expandable control sequences (such as conditionals or defined macros) are replaced by their replacement text. The input to the third stage is then a stream of characters, including ones with special meaning, and unexpandable control sequences, typically assignments and visual commands; here characters are assembled into a paragraph. TeX's paragraph-breaking algorithm works by optimizing breakpoints over the whole paragraph. In the fourth stage, once the paragraph has been broken into lines, the vertical list of lines and other material is broken into pages.
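As a rough plain TeX sketch of how category codes steer the first stage, the fragment below reassigns two of them. The macro name \my@name is made up for the example. It assumes the plain TeX defaults, in which both @ and | have category 12 ("other").

```tex
% Assumes plain TeX defaults: @ and | both have category 12 ("other").
\catcode`\@=11        % make @ a letter, so it may appear in control sequence names
\def\my@name{Knuth}   % \my@name is a hypothetical macro, read as one token
Hello, \my@name.

\catcode`\|=0         % give | the category of the backslash (escape character)
|TeX\ is now read exactly like \TeX.

\catcode`\@=12        % restore the defaults
\catcode`\|=12
\bye
```

The same mechanism is what makes TeX input hard to parse with other tools: which characters start a command can change in the middle of a file.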
The TeX system has precise knowledge of the sizes of all characters and symbols, and using this information, it computes the optimal arrangement of letters per line and lines per page. It then produces a DVI file (for "device independent") containing the final locations of all characters. This DVI file can be printed directly given an appropriate printer driver, or it can be converted to other formats. Nowadays, pdfTeX is often used, which bypasses DVI generation altogether.
Most functionality is provided by format files (predumped memory images of TeX after large macro collections have been loaded). Common formats are Knuth's original basic plain TeX, LaTeX (ubiquitous in the technical sciences), and ConTeXt (which is used primarily for Desktop Publishing).
The ultimate reference works for TeX are the first two volumes of Knuth's Computers and Typesetting, The TeXbook and TeX: The Program (which includes the complete documented source code for TeX).
TeX is usually distributed together with Metafont, a companion program also developed by Knuth which allows algorithmic description of fonts. The organisation of the directories in a TeX / Metafont installation is standardized in a tree called texmf.
License
The license allows free distribution and modification, but demands that any changed versions must not be called TEX, TeX, or anything confusingly similar. The American Mathematical Society has registered a trademark for TEX. A test suite called the TRIP test has been written to help test whether an implementation really is TEX.
Quality
TeX is written in WEB, a mixture of documentation written in TeX and a quite restricted Pascal subset. For example, TeX does all of its dynamic allocation itself from fixed-size arrays. As a result, TeX has been ported to almost all operating systems (usually by using the web2c converter).
Knuth offers monetary awards to people who find and report a bug in it. The award per bug started at $2.56 and doubled every year until it was frozen at its current value of $327.68. This has not made Knuth poor, however, as there have been very few bugs and in any case a cheque proving that the owner found a bug in TeX is usually framed instead of cashed.
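The two figures quoted above are consistent with exact doubling, which can be checked in plain TeX itself. The sketch below works in cents to stay within integer arithmetic; the register names \cents and \doublings are chosen for the example only.

```tex
% Hypothetical register names; amounts are in cents to keep the arithmetic integral.
\newcount\cents      \cents=256      % $2.56
\newcount\doublings  \doublings=0
\loop
  \ifnum\cents<32768                 % $327.68
  \multiply\cents by 2
  \advance\doublings by 1
\repeat
\message{\number\doublings\space doublings give \number\cents\space cents}
\bye
```

Running this writes "7 doublings give 32768 cents" to the terminal, so the frozen value corresponds to seven doublings of the initial award.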
Computer-science aspects of TeX
The TeX software incorporates several interesting algorithms, and has led to a number of theses of Knuth's students. For instance, a hyphenation algorithm (work by Frank Liang) is used that assigns priorities to breakpoints in letter groups. A list of hyphenation patterns can be generated automatically from a corpus of hyphenated words.
The line breaking algorithm is an example of dynamic programming. The problem of breaking a paragraph of n words into lines has a naive complexity of 2^n, but with dynamic programming a globally optimal layout can be derived in time proportional to the number of words and the number of words per line. A thesis by Michael Plass shows how the page breaking problem can be NP-complete because of the added complication of placing figures.
The companion program Metafont for character generation uses Bezier curves in a fairly standard way, but Knuth devotes lots of attention to the rasterizing problem on bitmapped displays. Another thesis, by John Hobby, further explores this problem of digitizing "brush trajectories". This term derives from the fact that Metafont describes characters as having been drawn by abstract brushes.
Derived works
Several document processing systems are based on TeX, notably:
- LaTeX (Lamport TeX), which incorporates document styles for books, letters, slides, etc., and adds support for referencing and automatic numbering of sections and equations,
- ConTeXt, written mostly by Hans Hagen at Pragma (http://www.pragma-ade.com) is a professional document designing tool based on TeX. It's much younger than LaTeX and therefore maybe less popular than its older brother, but much more powerful.
- AMS-TeX, produced by the American Mathematical Society, which has many more user-friendly commands that journals can alter to fit their house style. Most of the features of AMS-TeX can be used in LaTeX by using the AMS "packages"; this combination is referred to as AMS-LaTeX. The main AMS-TeX manual is entitled The Joy of TeX.
- jadeTeX which uses TeX as a backend for printing from James Clark's DSSSL Engine,
- Texinfo, the GNU documentation processing system.
- XeTeX is a new TeX engine that supports Unicode and the advanced Mac OS X font technologies.
Numerous extensions and companion programs for TeX exist, among them BibTeX for bibliographies (distributed with LaTeX), PDFTeX, which bypasses dvi and produces output in Adobe Systems' Portable Document Format, and Omega, which allows TeX to use the Unicode character set. All TeX extensions are available for free from CTAN, the Comprehensive TeX Archive Network.
Compatible tools
On UNIX-compatible systems (including Mac OS X), TeX is distributed in the form of teTeX. On Windows, there is the MiKTeX distribution and the fpTeX distribution.
The TeXmacs text editor is a WYSIWYG scientific text editor that is intended to be compatible with TeX. It uses Knuth's fonts, and can generate TeX output. LyX is a similar tool.
TeX and MediaWiki
As of 2003, the MediaWiki wiki software (as used on Wikipedia) implements TeX markup, using <math>...</math> tags enclosing blocks of TeX. This capability is implemented via Texvc which is basically a script that pipes the markup through TeX, then dvips to produce a PostScript file which Ghostscript renders into a PNG image. Due to the nature of the web environment, this is done in an efficient (cached) and security-conscious way – allowing third parties to pass unsanitised text through the standard TeX engine is a bad idea if you value your files.
The example fragments of TeX below are rendered using Texvc, and simple ones such as <math>a \over b</math> can be used to generate <math>a \over b</math>, although it is recommended that one write the HTML-rendered a/b instead.
Examples
A simple plain TeX example - Create a text file myfile.tex with the following content:
hello \bye
Then open a command line interpreter and type
tex myfile.tex
TeX then creates a file myfile.dvi. Use a viewer to look at the file. MiKTeX, for example, contains a viewer called yap:
yap myfile.dvi
The viewer shows hello on a page. \bye is a TeX command which marks the end of the file and is not shown in the final output.
The dvi file can either be printed directly from the viewer or converted to a more common format such as PostScript using the dvips program.
Alternatively PDF-files may be created directly, using pdfTeX:
pdftex myfile.tex
pdfTeX was originally created because converting generated PostScript into PDF resulted in poor font display, though printing performance was fine. This was because TeX natively uses bitmap fonts, which are only designed to display well at one particular size, whereas PostScript typically uses scalable Type 1 fonts.
It is now possible to make dvips output scalable fonts with a bit of tweaking (newer versions of Ghostscript support it), but direct conversion to PDF has other benefits: it is a one-step, not two-step process, and pdfTeX provides facilities such as bookmarks and hyperlinks not found in PostScript.
Mathematical examples
To see TeX further in action, look at its formatting of mathematical formulas. For example, to write the well-known quadratic formula, try entering
The quadratic formula is ${-b\pm\sqrt{b^2-4ac} \over {2a}}$ \bye
Use TeX as above, and you should get something that looks like
- The quadratic formula is <math>{-b\pm\sqrt{b^2-4ac} \over {2a}}</math>
Notice that the formula is laid out the way a person would write it by hand or a typesetter would set it. In a document, mathematics mode is entered with a $, followed by the formula in TeX syntax, and closed with another $. Display mathematics, that is, mathematics centered on a line of its own, is delimited by $$. For example, the quadratic formula above in display math:
The quadratic formula is $${-b\pm\sqrt{b^2-4ac} \over {2a}}$$ \bye
renders as
- The quadratic formula is <math>{-b\pm\sqrt{b^2-4ac} \over {2a}}</math> (centered on a line of its own)
LaTeX examples
LaTeX is a collection of macros written in TeX. There are many predefined templates (with predefined styles) one can use. It is much more structured than TeX, providing a set of macros and utilities for indexing, tables, lists and so forth. For example:
\documentclass[a4paper]{book}
\begin{document}
\section{ ... a title }
\subsection{ ... a subtitle}
%% The text goes here
\end{document}
To render the book as a PostScript file, use
latex myfile.tex
dvips myfile.dvi
Alternatively, one way to render the book as a PDF file is
pdflatex myfile.tex
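As a slightly fuller sketch of the structure LaTeX adds, the following document uses automatic numbering, cross-references and a list. The label names sec:roots and eq:quadratic are arbitrary choices for the example, and the article class is used here only to keep the output short.

```tex
% Label names are arbitrary; any unique strings would do.
\documentclass[a4paper]{article}
\begin{document}

\section{Roots}\label{sec:roots}
The roots of $ax^2+bx+c=0$ are given by
\begin{equation}\label{eq:quadratic}
  x = \frac{-b \pm \sqrt{b^2-4ac}}{2a}
\end{equation}
Equation~\ref{eq:quadratic} in Section~\ref{sec:roots} is numbered automatically.

\begin{itemize}
  \item Lists are provided as environments.
  \item So are tables and many other structures.
\end{itemize}

\end{document}
```

LaTeX records the label values in an auxiliary file, so the document generally has to be processed twice before the references resolve to the right numbers.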
References
- This article was originally based on material from the Free On-line Dictionary of Computing, which is licensed under the GFDL.
See also
- Metafont
- MetaPost
- List of document markup languages
- Comparison of document markup languages
- Plain TeX Quick Reference (PDF) (http://www.csit.fsu.edu/~mimi/tex/tex-refcard-letter.pdf)
External links
- MediaWiki User's Guide to editing mathematical formulae
- The TeX users group (http://www.tug.org/)
- The UK TeX Users' Group FAQ (http://www.tex.ac.uk/cgi-bin/texfaq2html?introduction=yes)
- Getting started with LaTeX (http://www.artofproblemsolving.com/LaTeX/AoPS_L_About.php) at Art of Problem Solving
- Simon Eveson, An Introduction to Mathematical Document Production Using AmSLaTeX (http://www.york.ac.uk/depts/maths/tex/texnotes.ps), a PostScript file
- Mac OS X TeX/LaTeX Web Site (http://www.esm.psu.edu/mac-tex/)
- Online LaTeX (http://sciencesoft.at/index.jsp?link=latex&lang=en) for converting online TeX to a PNG graphic
- ConTeXt: The ConTeXt wiki (http://contextgarden.net) and Homepage at Pragma (http://www.pragma-ade.com)
Software
- Comprehensive TeX Archive Network (http://www.ctan.org/): Repository of the TeX source and hundreds of add-ons and style files.
- Omega (http://omega.enstb.org/index.html) (16 bit version of TeX; includes lambda version of LaTeX)
- TeXnicCenter (http://www.toolscenter.org): a feature-rich integrated development environment (IDE) for developing LaTeX documents on Microsoft Windows (Windows 9x/ME, NT/2000/XP), freely available under the GPL.
- GNU TeXmacs Scientific Editor (http://www.texmacs.org/)
- Chikrii Softlab (http://www.chikrii.com) (Word2Tex and Tex2Word)
- The TeXLive (http://www.tug.org/texlive/) distribution is said to be an easy start for beginners. It includes a multiplatform DVD which contains basically all of CTAN. For Windows users it includes fpTeX (see below).
- TeXShop (http://www.uoregon.edu/~koch/texshop/texshop.html) - a free TeX editor for Mac OS X (with syntax coloring and Cocoa spellchecking)
- MiKTeX (http://www.miktex.org/) – MiKTeX (pronounced mick-tech) is an up-to-date implementation of TeX and related programs for Windows (all current variants) on x86 systems.
- fpTeX (http://www.fptex.org) – fpTeX is an up-to-date port of teTeX for Windows.
Books
- Donald E. Knuth, The TeXbook (Computers and Typesetting Volume A), Reading, Massachusetts: Addison-Wesley, 1984. ISBN 0201134489. The source code of the book in TeX (http://www.ctan.org/tex-archive/systems/knuth/tex/texbook.tex) is available online on CTAN. It is provided only as an example and its use to prepare a book like The TeXbook is not allowed.
- Victor Eijkhout, TeX by Topic (http://www.eijkhout.net/tbt/): Freely (as in beer) downloadable programmer's reference
- TeX for the Impatient (http://tug.org/ftp/tex/impatient/), a more tutorial book, now licensed under GFDL.
- Norman Walsh, Making TeX Work (http://makingtexwork.sourceforge.net/mtw/): Free online book
- Stefan Schwarz and Rudolf Potucek, TeXikon (http://texikon.artiverse.net/): (German) online reference work documenting over 1400 TeX and LaTeX commands. This website derives from an out-of-print book published by Addison-Wesley in 1996 as ISBN 3893196900.