A minimal LuaTeX setup on Windows (Part 4)

Time for a summary and review

My objective in writing this set of tutorials is to equip you with the basic information you need to create a truly minimal LuaTeX installation (running under Windows) for the purposes of experimenting with this amazing TeX engine. So, why might you want to do that? A good question, and here’s my personal viewpoint. LuaTeX is still rapidly evolving: it is in beta status (version 0.65 as I write this on 17 January 2011) and a stable production version is likely to be at least a year away. Despite the brilliant team behind it, LuaTeX, like all beta software, is going to have bugs that need to be fixed. In addition, the internals of the LuaTeX executable (APIs, libraries) may change as it evolves and matures. Running LuaTeX with an absolutely minimal setup gives you a controlled environment, of minimal complexity, which can save you time when tracking down any problems you encounter during experiments and testing. There is, as always in life, a flip side to the story.

  • You will not have access to all the wonderful tools, packages, fonts etc that distributions such as TeX Live or MiKTeX provide, ready-installed for you. At least initially, be prepared to encounter lots of instances where LuaTeX cannot locate particular packages or other resources, simply because you have not installed them into your minimal environment.
  • When approaching some of the mailing lists for help or advice, you’ll need to be aware that your setup will be different to the ones being used by the wider community. On occasion, this may make it a bit more difficult to get advice or guidance but, to date, in my experience it has not been a serious issue.

However, you’ll soon build up the collection of packages, fonts, etc. that you need for your work or experiments. In addition, you will learn a great deal about how LuaTeX operates and, through that, develop problem-solving techniques which will always be useful.

Where are we and where are we going?

To summarise:

  1. We have introduced the Kpathsea library that LuaTeX uses to locate files.
  2. We have seen that Kpathsea uses a mixture of environment variables and configuration files (texmf.cnf files) to locate files and resources on your computer and so tell LuaTeX where to find things.
  3. We have not yet covered, in any detail, the very important topic of “format files” (.fmt files) which are the last topic we need to address before moving on to the details of installation.

Format files, what are they?

In an earlier part of this tutorial series we discussed the difference between a “TeX engine” (e.g., LuaTeX, as in luatex.exe) and “macro packages” such as LaTeX. For anyone new to the TeX world, these differences can, initially, be quite confusing. I’ll recap here to help clarify the difference.

When you download and run luatex.exe without any setup or installation you are running luatex.exe in its most “primitive form”: it understands hundreds of very “low-level” commands which are built into the executable file itself. These low-level commands (so-called “primitives”) are the fundamental building blocks from which tools and commands to typeset documents are constructed: commands written by “TeX programmers” for everyday use by people who want to typeset beautiful documents. In essence, the raw luatex.exe file understands a very powerful, but low-level, “typesetting programming language”, but one which is not intended to be used directly (entirely) “as is”. Instead, it is the job of “TeX programmers” to use this programming language to write “higher level commands” designed for “users” to write their documents. You will soon get to know these “higher level commands” as macros, for that is how they are referred to in the mainstream TeX literature.

The ability to program with these hundreds of low-level commands is what gives TeX engines such power and flexibility to typeset almost anything you can think of. The “typesetting programming language”, which I’ll refer to as “TeX”, can be used to write anything from a quick one-line, time-saving macro through to complete document preparation systems consisting of tens of thousands of lines of TeX programming code. In reality, the TeX typesetting (programming) language is, for many (including me :-(), an extraordinarily difficult language to program with, and I have the most profound admiration and respect (and some envy ;-)) for those who have mastered it.

So, the TeX language gives you the ability to write (i.e., program) a comprehensive set of commands that you can use to write and prepare your documents: i.e., to format them using your preferred “document preparation system” (i.e., large collection of TeX macros). And finally we get to the point: LaTeX is one such “document preparation system” written in the TeX language to format your document according to the commands provided through LaTeX. So, we can speak of the “LaTeX format”. Of course this is not the end of the story because although LaTeX provides a great many commands, it also provides a framework through which it can be extended even further. TeX programmers can, and do, of course, write “add on packages” for LaTeX which extend the core LaTeX program to provide specialist applications, additional features and so forth: so-called “LaTeX packages”. Think of there being a “set of core LaTeX commands” plus “add-on LaTeX packages”. The “set of core LaTeX commands” is intended to be stable and not change so that the commands and tools it provides can be relied upon by LaTeX package developers without fear that their code suddenly breaks due to changes in the “LaTeX core” (suddenly making their code incompatible). Now, I hope it is clear that if you want to typeset any document prepared with the LaTeX format then each time you run your “TeX engine” to typeset your LaTeX-based document, your TeX engine has to process the thousands of lines of code from which LaTeX is written.

Hmmm, sounds a painfully slow process? Executing all that raw LaTeX code each time I run my TeX engine? Well, no, because TeX offers a way round this using format files.

What TeX can do is process all that raw LaTeX code (i.e., just plain text) just once and dump the processed result into a binary file, which it can later reload and execute far faster than it could process the original plain text source code. This “binary brain dump” that TeX spits out is called a format file and has the extension .fmt.

So, a “format file” is your TeX engine’s “binary brain dump” equivalent of a large collection of plain text macros which collectively provide document formatting capabilities. LaTeX is one example format; others include the venerable plain TeX (written by Donald Knuth) and the amazingly powerful ConTeXt format written by Hans Hagen.

Making TeX engines perform a binary brain dump

OK, so how do I convince TeX engines to do “binary brain dumps”? You just have to ask nicely, speaking gently via the command line. If you have a copy of the raw luatex.exe file and you type luatex --help at the DOS prompt you will see a summary of the numerous command line options for LuaTeX. Two of these are of interest:

--fmt=FORMAT load the format file FORMAT
--ini be iniluatex, for dumping formats

--fmt=FORMAT: a primer

Suppose you have a document (written using the LaTeX format) called thesis.tex and you want to typeset it with the LuaTeX engine. Assuming you have already generated the latex.fmt file, you would type something like this at the command line:

luatex --fmt=latex thesis.tex

All being well, LuaTeX would output thesis.pdf containing your beautifully typeset masterpiece. Clearly, thesis.tex will have to be in a location where LuaTeX (i.e., Kpathsea) can locate it.
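
If you find yourself typing that command a lot, you could wrap it in a small script. Here is a minimal Python sketch of the idea; Python is my choice for illustration only, and the function names build_typeset_command and typeset are hypothetical helpers, not part of LuaTeX. It only assembles the same command line shown above and runs it if luatex can be found on your PATH.

```python
import shutil
import subprocess


def build_typeset_command(document, fmt="latex", engine="luatex"):
    """Return the argument list equivalent to: luatex --fmt=latex thesis.tex"""
    return [engine, "--fmt=" + fmt, document]


def typeset(document, fmt="latex", engine="luatex"):
    """Run the engine if it is on PATH.

    Returns the engine's exit code, or None if the executable was not found.
    """
    if shutil.which(engine) is None:
        return None
    command = build_typeset_command(document, fmt, engine)
    return subprocess.run(command).returncode


# Show the command that would be run for the example document.
print(" ".join(build_typeset_command("thesis.tex")))
```

Note that the script does nothing to help LuaTeX find thesis.tex or latex.fmt: that is still Kpathsea's job.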

What’s this --ini option do?

In short, the --ini command line option tells the TeX engine to go into its special “brain dump mode” to generate a .fmt file. Clearly, the TeX engine needs a bit more than just the --ini option: it also needs the source code that it is expected to process and dump out to a .fmt file.

Typically, to generate a particular format, e.g., the LuaLaTeX format (more on that in future posts), you need to acquire something called an .ini file named, unsurprisingly, “formatname.ini”, as in lualatex.ini for the LuaLaTeX format.
These .ini files contain all the magic and code required to generate the .fmt file and are usually available on CTAN or found by browsing TeX Live. Successfully processing .ini files will require additional files which the .ini file “inputs”. For example, here is the lualatex.ini file sitting on my PC.

% $Id: lualatex.ini 20014 2010-10-08 12:06:28Z mpg $
% Karl Berry, originally written 2008. public domain.
% must be done first (activation of primitives)
\input lualatexiniconfig.tex
\input luatex-unicode-letters.tex
\input pdftexconfig.tex
% pdf output by default.
\scrollmode
\let\dump\relax
\input latex.ltx
\input lualatexquotejobname.tex
\pdfprimitive\dump
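
To make the point about “additional files” concrete, here is a rough Python sketch (my own illustration, not a TeX Live tool) that lists the files named on \input lines in the .ini file above. The helper name list_inputs is hypothetical, and the sketch is deliberately naive: it only looks at this one file, does not follow \input lines inside the files it finds, and treats everything after a % as a comment.

```python
import re

# The lualatex.ini listing from above, pasted in as a raw string.
LUALATEX_INI = r"""
% $Id: lualatex.ini 20014 2010-10-08 12:06:28Z mpg $
% Karl Berry, originally written 2008. public domain.
% must be done first (activation of primitives)
\input lualatexiniconfig.tex
\input luatex-unicode-letters.tex
\input pdftexconfig.tex
% pdf output by default.
\scrollmode
\let\dump\relax
\input latex.ltx
\input lualatexquotejobname.tex
\pdfprimitive\dump
"""


def list_inputs(ini_text):
    """Return the file names that follow \\input, in order of appearance."""
    names = []
    for line in ini_text.splitlines():
        code = line.split("%", 1)[0]  # drop the comment part of the line
        names.extend(re.findall(r"\\input\s+([^\s{}]+)", code))
    return names


for name in list_inputs(LUALATEX_INI):
    print(name)
```

Every file in that list has to be somewhere LuaTeX can find it before the format can be built.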

To process lualatex.ini and generate the lualatex.fmt file you run the command:

luatex --ini lualatex.ini

which puts LuaTeX into “brain dump mode” (called INITEX) and lets LuaTeX process all the code and dump out lualatex.fmt:

luatex --ini lualatex.ini
This is LuaTeX, Version beta-0.65.0-2010122301 (INITEX)
(c:/..../formats/lualatex/lualatex.ini
(c:/..../formats/lualatex/lualatexiniconfig.tex)
(c:/..../formats/lualatex/luatex-unicode-letters.tex
loading Unicode properties)

Lots more output… and finally

Beginning to dump on file lualatex.fmt
(format=lualatex 2011.1.18)
4811 strings using 32200 bytes
145027 memory locations dumped; current usage is 124&44272
3489 multiletter control sequences
\font\nullfont=nullfont
\font\OMX/cmex/m/n/10=cmex10
\font\tenln=line10
\font\tenlnw=linew10
\font\tencirc=lcircle10
\font\tencircw=lcirclew10
\font\OT1/cmr/m/n/5=cmr5
\font\OT1/cmr/m/n/7=cmr7
\font\OT1/cmr/m/n/10=cmr10
\font\OML/cmm/m/it/5=cmmi5
\font\OML/cmm/m/it/7=cmmi7
\font\OML/cmm/m/it/10=cmmi10
\font\OMS/cmsy/m/n/5=cmsy5
\font\OMS/cmsy/m/n/7=cmsy7
\font\OMS/cmsy/m/n/10=cmsy10
14 preloaded fonts
0 words of pdf memory
0 indirect objects
No pages of output.
Transcript written on lualatex.log.

Job done!
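
Since a format only needs to be generated once, a script can skip the step when the .fmt file is already there. The following Python sketch shows one way to do that. It assumes, as in the run above, that the .fmt file is named after the .ini file, and it further assumes that the file ends up in a directory you pass in (the current directory by default). The names format_path and plan_build are hypothetical helpers of my own.

```python
from pathlib import Path


def format_build_command(ini_file, engine="luatex"):
    """Return the argument list equivalent to: luatex --ini lualatex.ini"""
    return [engine, "--ini", ini_file]


def format_path(ini_file, directory="."):
    """Assumed location of the dumped format: <directory>/<name>.fmt"""
    return Path(directory) / Path(ini_file).with_suffix(".fmt").name


def plan_build(ini_file, directory=".", engine="luatex"):
    """Return the build command if the .fmt file is missing, otherwise None."""
    if format_path(ini_file, directory).exists():
        return None
    return format_build_command(ini_file, engine)


command = plan_build("lualatex.ini")
if command is None:
    print("lualatex.fmt already exists, nothing to do")
else:
    print("would run:", " ".join(command))
```

This only decides whether a build is needed; it does not check that the existing .fmt file was produced by the same version of LuaTeX.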

Did you spot the missing link?

This is all very well: before I can use LuaTeX I have to generate .fmt files. But how does LuaTeX find all these .ini and .fmt files in the first place? You guessed it: we need texmf.cnf before we can do anything like this. Stay tuned.