Well, it happened again. Someone asks a perfectly innocent question, and the next thing I know I'm in touch with a deep vein of hostility and several paragraphs into a jeremiad on the evils of Tcl's expr command. It isn't Tcl's problem, really, that I hate writing code that uses expr. It's my problem. So here's my problem and here's my solution.
Expr Before
The expr command is the Tcl user's window into the world of
computations with numbers. It implements a variety of
unary and binary arithmetic operators, using the same
infix syntax as one finds in C, FORTRAN, and dozens of
other programming languages in common use. It also
implements a collection of math functions, using the same
prefix function call syntax as one finds in C, FORTRAN,
and dozens of other programming languages in common use.
Thus, we can write mathematical expressions like
2*atan2(-1,0) or sqrt(1+1), give
them to expr to evaluate, and we get the result
which our experience as programmers has led us to expect.
But the story doesn't end there, because the expr
command is embedded in a larger programming language
called Tcl which doesn't much resemble other
programming languages at all. So, when it came time to
implement variable reference and procedure call in
expr, Tcl's creator decided to use the Tcl syntax
for variable reference, which is $variable, and to
use the Tcl syntax for procedure call, which is
[command arg1 ...], neither of which has anything
to do with C, FORTRAN, or dozens of other programming
languages in common use. Thus, if we want to write the
recursive factorial function in Tcl, we write:
proc factorial {n} {
    expr {
        $n > 1 ?
            $n * [factorial [expr {$n-1}]] :
            1
    }
}
This is a result that only a Tcl programmer's experience
would lead one to expect.
There were some reasons for this choice of hybrid expression syntax. Originally, the variable reference and function call syntax was handled by the main Tcl evaluation code, and all expr had to do was to sort out the result of the expression with these substitutions already done. However, this didn't work too well with the conditional expression operators, ?:, &&, and ||. In expressions involving these operators, only some parts of the expressions should be evaluated. So expr had to be taught to parse variable references and function calls and recursively call the Tcl evaluation routines, and expressions involving the conditional operators had to be quoted, as in our factorial example above, to prevent premature evaluation.
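The premature evaluation problem is easy to reproduce in stock Tcl. What follows is only a minimal sketch: the helper proc note and the variable names are made up for the illustration, and nothing here depends on the patch.

```tcl
# Hypothetical helper: records that it was called, then returns its value.
set ::calls {}
proc note {tag value} {
    lappend ::calls $tag
    return $value
}

# Braced: expr performs the substitutions itself, so only the
# selected branch of ?: is evaluated.
set braced [expr {1 ? [note yes 10] : [note no 20]}]
set bracedCalls $::calls

# Unbraced: the Tcl evaluator runs both commands before expr
# ever sees the expression, which by then reads "1 ? 10 : 20".
set ::calls {}
set unbraced [expr 1 ? [note yes 10] : [note no 20]]
set unbracedCalls $::calls

puts "braced:   $braced, called: $bracedCalls"
puts "unbraced: $unbraced, called: $unbracedCalls"
```

Both forms return 10, but the unbraced form has also called the command in the branch that should have been skipped.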
I suppose that there might be some argument made that consistency had a part in choosing the hybrid expression syntax, but it's a funny sort of consistency. One would have expected consistency to produce either a fully prefix, Lisp-like syntax for expr, or a fully C-like syntax for expr. Either of those choices would have been fully consistent with some subset of the design decisions made for expr. But the hybrid syntax is neither consistent with C-like expressions, nor is it consistent with Tcl's syntax.
The distinctive function call syntax given to mathematical functions had a different motivation. Originally, all Tcl values were stored as strings. Distinguishing the mathematical functions allowed expr to maintain their operand and result values as numeric values, rather than converting the intermediate results back and forth between string and numeric representations. This meant that the mathematical functions were more efficient than Tcl commands implementing the same operations.
The original reason for using $variable and [command arg1 ...] as expression syntax, to reuse the main Tcl parser and evaluator, actually went out the window as soon as the conditional operators were properly implemented. At that point, the syntax of expressions could have reverted to the syntax used in C, and the expression parser, the expression evaluator, and the main Tcl parser all would have become simpler.
The original reason for distinguishing mathematical functions from general functions became moot when Tcl converted to the Tcl_Obj representation for values. In a Tcl_Obj, a value may be a string, a number, or some other type. These days expr keeps all its intermediate values as Tcl_Objs with a numeric representation, and there is no implementation advantage for a mathematical function over a builtin Tcl command.
Meanwhile, Tcl has also become a compiled-on-the-fly
language, and there's a new reason for expr
operands to be quoted against evaluation by the main Tcl
evaluator. The expression compiler wants its operands
quoted because then it can see the syntactic boundaries in
expressions and compile the expression for most efficient
evaluation. If its operand is unquoted and contains
variable references, like:
expr $a+$b
then evaluation of those variable references
by the Tcl evaluator might produce something like:
expr 6-4+9/2
which would need to be reparsed at runtime in order to get
the right answer. Quoting the operands of expr tells
the expression compiler that it's okay to generate code that
assumes that the results of variable and command
substitution will make sense in the expression.
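To see the reparsing in action, here is a small sketch in stock Tcl. The values given to a and b are my own assumption, chosen so that substitution produces the string 6-4+9/2 from the example; the other variable names are likewise just for illustration.

```tcl
# Assumed values: each variable holds a fragment of an expression,
# not a number.
set a 6-4
set b 9/2

# Unbraced: Tcl substitutes first, so expr receives the string
# "6-4+9/2" and has to parse it from scratch at runtime.
set reparsed [expr $a+$b]

# Braced: expr treats each variable as a single operand, and
# "6-4" is not a number, so the addition is rejected.
set failed [catch {expr {$a+$b}} message]

puts "unbraced result: $reparsed"
puts "braced failed: $failed ($message)"
```

The unbraced call yields 6, since 9/2 is integer division, while the braced call raises an error instead of silently reinterpreting the contents of the variables.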
Tcl's expr implements a hybrid syntax, half of which is taken from programming languages like C, and half of which is taken from Tcl itself. But the half that is taken from Tcl must be quoted, at all costs, to protect it from evaluation by Tcl itself or the semantics of conditional operators will be violated and the compiler will generate suboptimal code.
Expr After
So, let's enhance Tcl's expr syntax so that it's more consistent with the C-like languages that started it on the path to inconsistency with Tcl so long ago. What's it take?
Well, it would be nice to reference Tcl variables without
the dollar signs, so that
expr a
returns the value of the variable a if one exists,
and an unknown variable error otherwise. That expression
currently yields a "syntax error" which comes from
ParsePrimaryExpr in generic/tclParseExpr.c.
Instead of the error, we'll just stuff two tokens into the
parsed token stream that result in a variable reference, and
unread the token that wasn't an open parenthesis.
It would also be nice to call a Tcl procedure with the
same syntax that the mathematical functions use, so that
expr min(a,b)
returns the result
of calling the Tcl command min if one exists, and an
unknown command error otherwise. That expr currently
yields an "undefined math function" error which comes from
CompileMathFuncCall in generic/tclParseExpr.c.
Instead of the error, we'll just compile a generic Tcl
command call. Then the unknown function can catch
any undefined functions. Hmm, I guess that implements
autoloaded math functions, too, or something very like
them.
That, and a few lines to make the expr parser accept namespace qualifiers in identifiers, is all there is to the Tcl Expr Patch. The original expr syntax is fully supported, so code using the original syntax will continue to work just as it does.
Our factorial example,
proc factorial {n} {
    expr {
        $n > 1 ?
            $n * [factorial [expr {$n-1}]] :
            1
    }
}
can now be rewritten as,
proc factorial {n} {
    expr n > 1 ? n * factorial(n-1) : 1
}
What has happened? The quotes went away, because there are
no variable or command substitutions to be protected from
premature evaluation. The dollar signs went away. The
square brackets turned into a conventional function call.
Oh, look, the nested call to expr went away, too.
Since the call to factorial is parsed as a math
function, expressions in its argument get evaluated without
explicitly calling expr. That wasn't part of the
original spec, but it's reason enough to adopt the
enhancement by itself. And the whole definition is now
short enough to write on one line without messing up my html
layout.
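For anyone who wants to try the function call half of this without applying the patch, here is a rough approximation. It assumes Tcl 8.5 or later, where expr resolves a call like f(x) to a command named tcl::mathfunc::f. That mechanism is not part of the patch described here, and under it the variables still need their dollar signs and the expression still wants its braces.

```tcl
# Assumes Tcl 8.5 or later; this does not work on 8.4.
# Defining the proc in tcl::mathfunc makes it callable as
# factorial(...) inside any expression.
proc ::tcl::mathfunc::factorial {n} {
    expr {$n > 1 ? $n * factorial($n-1) : 1}
}

# The argument of a function call is itself an expression,
# so no nested expr is needed for $n-1 above.
set result [expr {factorial(5)}]
puts $result
```

This prints 120. It recovers the conventional call syntax and drops the nested expr, but not the bare variable names.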
So, 294 lines of patch and we have an expr command with expressions that:
- look like C expressions,
- work like C expressions,
- don't need to be quoted to evaluate correctly,
- don't need to be quoted to compile efficiently,
- autoload math functions, and
- evaluate expressions in function call argument lists without explicit calls to expr.
The only gotcha that I've discovered thus far is a new
variation on quoting hell. If you call one of the
Tcl commands which takes a variable name as a parameter
using the new syntax, then you will need to quote the
variable name. So,
set a 1; expr set(a,2)
will set the variable named "1" to the value "2", while
expr set({a},2)
will set the
variable named "a" to the value "2".
The real reasons
I've given all sorts of reasons and half reasons for hacking on expr's syntax, but the real reasons I did it are quite simple. Firstly, the thought of writing anything else in expr's existing syntax makes me cringe. I can try to rationalize that by saying all sorts of nasty things about the existing syntax, but at root it's a personal, visceral reaction. Secondly, the more I thought about what needed to be done, the surer I was that it would be simple. And thirdly, it was simple.
A fellow traveler
John, who apparently wishes to remain anonymous, has contributed a version of tcl-expr-patch for the 8.4.0 release of Tcl. Happy Thanksgiving, 2002.
