How does \expandafter work: A detailed macro case study
Case study: \expandafter example from The \(\varepsilon\mathrm{\text{-}{\TeX}}\) Manual
The \(\varepsilon\mathrm{\text{-}{\TeX}}\) typesetting engine was derived from Knuth’s TeX software and was originally intended as an “interim” step toward development of the New Typesetting System (NTS), written in the Java programming language. \(\varepsilon\mathrm{\text{-}{\TeX}}\) was first developed in the late 1990s to add a suite of new primitive commands which provide additional functionality not available in Knuth’s original program. Although \(\varepsilon\mathrm{\text{-}{\TeX}}\) has received periodic updates since its initial release, today it is not widely used as a standalone typesetting engine; its innovations have, however, been absorbed into later generations of TeX: pdfTeX, XeTeX and LuaTeX.
The \(\varepsilon\mathrm{\text{-}{\TeX}}\) manual contains an enlightening example of a macro which makes clever use of \expandafter:
\def\foo#1#2{\number#1
  \ifnum#1<#2,
    \expandafter\foo
    \expandafter{\number\numexpr#1+1\expandafter}%
    \expandafter{\number#2\expandafter}%
  \fi}
\foo implements a looping mechanism, such that \foo{7}{13} produces 7, 8, 9, 10, 11, 12, 13; however, \foo does not use any assignments to variables in order to control the looping process, which makes it an interesting macro to explore in some detail.
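To try the macro, a minimal plain TeX file along the following lines can be used. This is a sketch which assumes an engine with the \(\varepsilon\mathrm{\text{-}{\TeX}}\) extensions enabled (for example, pdftex as shipped in current TeX distributions), because \numexpr is not available in Knuth’s original program.

```tex
% Minimal plain TeX sketch; assumes an e-TeX-enabled engine such as pdftex.
\def\foo#1#2{\number#1
  \ifnum#1<#2,
    \expandafter\foo
    \expandafter{\number\numexpr#1+1\expandafter}%
    \expandafter{\number#2\expandafter}%
  \fi}

\foo{7}{13}% typesets: 7, 8, 9, 10, 11, 12, 13
\bye
```

The line breaks in the definition matter: the end of line after the comma becomes the space token typeset after each comma, and each % prevents an unwanted space token from ending up between the arguments.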
Some background: expressions and assignments
An important element of \foo’s code is its use of the command \numexpr, one of a set of four related primitives first introduced by \(\varepsilon\mathrm{\text{-}{\TeX}}\): \numexpr, \dimexpr, \glueexpr and \muexpr. Their purpose is to construct so-called expressions which allow calculation/manipulation of TeX values of type number, dimen, glue, or muglue (respectively). As discussed on pages 8–9 of The \(\varepsilon\mathrm{\text{-}{\TeX}}\) Manual, an important characteristic of expressions is that their evaluation (calculation) does not require TeX to perform any assignments.
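As a small illustration of the four expression primitives, the following plain TeX sketch writes some results to the terminal and log using \message. It assumes an \(\varepsilon\mathrm{\text{-}{\TeX}}\)-enabled engine; the particular values are chosen only for demonstration.

```tex
% Sketch for an e-TeX-enabled plain TeX engine; output goes to the terminal/log.
% Each expression is evaluated by expansion alone: no register is assigned.
\message{\the\numexpr 7+1\relax}%           8
\message{\the\numexpr (7+1)*2\relax}%       16
\message{\the\dimexpr 1pt+2pt\relax}%       3.0pt
\message{\the\glueexpr 1pt plus 2pt\relax}% 1.0pt plus 2.0pt
\message{\the\muexpr 3mu\relax}%            3.0mu
\bye
```

Here each \relax simply marks the end of the expression and is absorbed when the expression is evaluated.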
In programming terms, assignment is the process of setting (assigning) a variable to have a particular value; for example, assigning \count register 99 to contain the value 12345 via \count99=12345. Many other types of assignment take place during TeX processing, such as assigning token registers to contain a series of tokens, assigning box registers to contain box content, and so forth.
To perform an assignment, such as \count99=12345, TeX needs to action (execute) the internal code which implements the behaviour of \count or any other primitive that performs some sort of assignment. However, there are times when TeX is performing pure expansion and, at those times, such assignments are not actioned at that point in TeX’s processing. Examples of this situation include the following commands:
- \edef\command {token list}: the “expanded definition” macro-definition command, which expands the tokens in token list and stores the result as the definition of \command.
- \write number {token list}: expands the tokens in token list and writes them out to a file represented by number.
- \directlua {token list}: this LuaTeX primitive command is used to pass Lua code to the built-in Lua interpreter. All tokens in token list are fully expanded before being passed to the Lua interpreter for execution.
Quick example of \edef
If we write the following basic macros:
\def\mycount{\count99=12345}
\edef\mymacro{\mycount}
\edef will expand \mycount into its constituent tokens but it goes no further: none of the commands contained in the definition of \mymacro will be actioned; i.e., the assignment of 12345 to \count99 does not happen at this point. Only when we call \mymacro will that assignment take place, as TeX executes the code to process the \count primitive. When TeX is performing expansion-only activities, any assignments will be actioned later in TeX’s processing, not during the expansion process itself.
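The behaviour described above can be checked with a short plain TeX sketch such as the following, which reports the contents of \mymacro and the value of \count99 before and after \mymacro is called. The initial \count99=0 and the use of \message are additions made purely for demonstration.

```tex
% Plain TeX sketch; results are written to the terminal/log with \message.
\count99=0
\def\mycount{\count99=12345}
\edef\mymacro{\mycount}

\message{\meaning\mymacro}% macro:->\count 99=12345
\message{\the\count99}%     0 (the assignment has not happened yet)

\mymacro\relax             % now TeX executes \count99=12345
\message{\the\count99}%     12345
\bye
```

The \relax after \mymacro is a precaution: it guarantees that TeX stops scanning for further digits of 12345 at that point.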
Why are assignments of interest here?
When writing code to perform a loop, in any programming language, it is common practice to have a variable designated to act as a “loop counter”, used to control the number of times a loop is executed. Looping is typically controlled by testing whether that designated loop-counter variable has reached a particular value, with the variable being incremented (or decremented) on each iteration of the loop. However, modifying a loop-counter variable means assigning it a new value which, for TeX, usually requires the primitive command \advance to increment (or decrement) a value stored in a \count register. As we’ve seen, during TeX’s pure expansion process such assignments (including incrementing variables) cannot take place: the macro \foo cleverly circumvents this restriction.
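For comparison, here is a sketch of the conventional, assignment-based approach, written with plain TeX’s \loop ... \repeat and \advance. It is not taken from the manual, and the register name \loopcounter is hypothetical.

```tex
% Plain TeX sketch of an assignment-based loop, for contrast with \foo.
\newcount\loopcounter % hypothetical name for the loop-counter register
\loopcounter=7
\loop
  \number\loopcounter
  \ifnum\loopcounter<13 , \advance\loopcounter by 1
\repeat
% typesets: 7, 8, 9, 10, 11, 12, 13
\bye
```

Because this version relies on \advance (and on the definitions performed internally by \loop), it cannot produce its result in an expansion-only context such as the body of an \edef.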
Back to explaining \foo
The macro \foo is able to control the looping process without needing to assign values to any variables: it controls how often the loop takes place using data arising from expansion, namely data values stored in temporary token lists. Using our knowledge of TeX’s usage (creation) of temporary token lists, we can take a closer look to see exactly how \foo achieves its results.
Remember: We are working through the execution of a macro after the original text of its definition, contained in a physical .tex file, has been scanned (read in by TeX) and converted to a token list representing the macro definition. In essence, we are following TeX as it reads and processes the tokens of the macro definition stored in its memory. Any space characters originally present in the TeX code of the macro’s definition (text within the .tex file) will either have been absorbed whilst TeX was scanning that text for commands (spaces as terminators) or have been converted to tokens, such as the space token after the comma (,) in \ifnum#1<#2, which arose from conversion of the end-of-line character (\r) into a space.
Because the TeX code in \foo uses multiple \expandafter commands, we’ll assist our explanation by adding a subscript to each \expandafter, indicating which one we are referring to. In addition, we’ll extend the notation for tokens processed by \expandafter to \(\mathrm{T^i_1}\) and \(\mathrm{T^i_2}\), representing tokens \(\mathrm{T_1}\) (saved, then re-inserted) and \(\mathrm{T_2}\) (expanded) for \expandafterᵢ: \expandafterᵢ \(\mathrm{T^i_1T^i_2}\)
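Before working through \foo, the basic behaviour of a single \expandafter can be seen in a small sketch. The names \x, \firstof and \sentinel are hypothetical and used only for illustration; in the second \message, \(\mathrm{T_1}\) is \firstof and \(\mathrm{T_2}\) is \x.

```tex
% Plain TeX sketch; \x, \firstof and \sentinel are hypothetical names.
\def\x{abc}
\def\firstof#1#2\sentinel{#1}% keeps the first token (or group), drops the rest

\message{\firstof\x\sentinel}%             abc (#1 is the unexpanded token \x)
\message{\expandafter\firstof\x\sentinel}% a   (\x is expanded first, so #1 is a)
\bye
```

In the second case, \expandafter saves \firstof, expands \x into the characters abc, and then re-inserts \firstof in front of them, so \firstof sees a as its first argument.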
Here is the annotated macro code:
\def\foo#1#2{\number#1
  \ifnum#1<#2,
    \expandafter₁\foo
    \expandafter₂{\number\numexpr#1+1\expandafter₃}%
    \expandafter₄{\number#2\expandafter₅}%
  \fi}
\foo starts with \number#1, which uses the expandable command \number to convert the first argument value into its typeset representation. The \number command works by generating a temporary token list containing character tokens which represent the individual digits contained in the numeric value that \number is operating on. That token list becomes TeX’s next input source. Here, that token list is read and the tokens are output to typeset the value of #1.
Next, the macro performs the test \ifnum#1<#2 to check if the argument for #1 is less than the argument passed in for #2. If so, a comma (,) token is output (typeset), followed by some space arising from the <space> token that was generated from the linebreak character after the comma (,). That space token was first generated when TeX read this line from the .tex file.
The macro continues by processing this next section of code, which is the core of its operation:
    \expandafter₁\foo
    \expandafter₂{\number\numexpr#1+1\expandafter₃}%
    \expandafter₄{\number#2\expandafter₅}%
  \fi}
In essence, this code generates a series of temporary token lists which result in multiple calls to the \foo macro, terminating when the if-test \ifnum#1<#2 is no longer true. But how is looping controlled, given that no assignments are taking place: where is the “loop counter”?
Let’s start by looking at the code \expandafter₁\foo\expandafter₂. Note that we will use the notation (token) after an item to remind ourselves that, here, TeX is reading/processing numeric (integer) token values.
Here, we have the following tokens as input for \expandafter₁:
- \(\mathrm{T^1_1} =\ \)\foo (token), which is read in and stored for later re-insertion back into the input
- \(\mathrm{T^1_2} =\ \)\expandafter₂ (token), which is expanded
For \expandafter₂ we have:
- \(\mathrm{T^2_1} =\ \){ (token), which is saved for later re-insertion back into the input
- \(\mathrm{T^2_2} =\ \)\number (token), which is expanded
Note: \number is an expandable command whose purpose is to “convert to tokens”: i.e., convert a numeric quantity into a series of character tokens which represent that quantity. When \number is expanded, the first thing that TeX does is to scan the input looking for integers: a process which triggers further expansion.
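The way in which \number’s scan for an integer triggers further expansion can be seen in a small sketch. The macro \digits is a hypothetical helper, introduced only to supply some digits via expansion.

```tex
% Plain TeX sketch; \digits is a hypothetical helper macro.
\def\digits{23}
\message{\number 1\digits 4}% 1234: \digits is expanded while \number scans
\message{\number 007}%        7: leading zeros are dropped
\bye
```

In the first line, \number reads the digit 1, expands \digits to obtain 23, reads the 4, and stops only when it meets the closing brace, which cannot be part of a number. This same scanning behaviour is what drives the chain of \expandafter commands inside \foo.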
The key to the story: Here, \number is acting on the expression \numexpr#1+1, which calculates the value of #1+1. The result of that calculation is processed by \number to convert it into a temporary token list containing character tokens representing the value of #1+1. That temporary token list, generated by \number, will eventually be read in as the first argument to another call of \foo. Rather than incrementing a loop counter (via \advance and assignment), the use of \numexpr creates a new value without any assignment being necessary. Through this mechanism, the variable controlling the loop (\foo’s parameter #1) is incremented and iteration through the loop is controlled and terminated: quite ingenious!
Next, while \numexpr is still scanning for the end of its expression, \expandafter₃ is processed, yielding:
- \(\mathrm{T^3_1} =\ \)} (token), which is saved for later re-insertion back into the input
- \(\mathrm{T^3_2} =\ \)\expandafter₄ (token), which is expanded
For \expandafter₄ we have:
- \(\mathrm{T^4_1} =\ \){ (token), which is saved for later re-insertion back into the input
- \(\mathrm{T^4_2} =\ \)\number (token), which is expanded and converts #2 into another temporary token list
Finally, \expandafter₅ is expanded:
- \(\mathrm{T^5_1} =\ \)} (token), which is saved for later re-insertion back into the input
- \(\mathrm{T^5_2} =\ \)\fi (token), which is an expandable command
The expansion of \fi effectively terminates the \ifnum and, in effect, closes this iteration of the macro. TeX now completes re-insertion of all the tokens temporarily saved by the multiple \expandafter commands: this generates a series of single-token token lists arising from the tokens saved by each \expandafter. In addition, TeX has also created token lists through the action of \number.
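One way to observe the successive calls of \foo is to switch on TeX’s macro tracing, as in the following sketch for an \(\varepsilon\mathrm{\text{-}{\TeX}}\)-enabled plain TeX engine. The short range 7 to 9 is chosen only to keep the trace brief.

```tex
% Sketch for an e-TeX-enabled plain TeX engine.
\def\foo#1#2{\number#1
  \ifnum#1<#2,
    \expandafter\foo
    \expandafter{\number\numexpr#1+1\expandafter}%
    \expandafter{\number#2\expandafter}%
  \fi}

\tracingonline=1 % show trace output on the terminal as well as in the log
\tracingmacros=1 % report each macro expansion together with its arguments
\foo{7}{9}
\tracingmacros=0
\bye
```

For each call, the trace should list the arguments in the form #1<-7, then #1<-8, then #1<-9, with #2<-9 each time, showing that every new call receives its arguments as fully formed digit tokens rather than as unevaluated expressions.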
Assembling the token lists
In essence, the \foo macro generates a sequence of token lists: you can think of \foo as a token-list “manufacturing facility”. Those token lists are read by TeX to become the next sources of input. The clever part is contained in one of the earlier actions of \foo:
\expandafter₁\foo\expandafter₂
through which \foo arranges to call itself again but with different arguments that are stored in token lists constructed by \number. To make these token lists collectively behave as a macro call, the braces { and } have all been saved and re-inserted into the input (as single-token lists) by the actions of \expandafter commands.
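Because \foo works by expansion alone, it can be used where assignments are not actioned, such as inside \edef. The following sketch, for an \(\varepsilon\mathrm{\text{-}{\TeX}}\)-enabled plain TeX engine, stores the complete result of the loop in a macro; the name \result is hypothetical.

```tex
% Sketch for an e-TeX-enabled plain TeX engine; \result is a hypothetical name.
\def\foo#1#2{\number#1
  \ifnum#1<#2,
    \expandafter\foo
    \expandafter{\number\numexpr#1+1\expandafter}%
    \expandafter{\number#2\expandafter}%
  \fi}

\edef\result{\foo{7}{13}}% the whole loop runs during expansion
\message{\meaning\result}% macro:->7, 8, 9, 10, 11, 12, 13
\bye
```

A loop built on \advance could not be used in this way, because the assignments it depends on would not be carried out while \edef is expanding its token list.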