Free variables and bound variables

In mathematics, and in other disciplines involving formal languages, including mathematical logic and computer science, a free variable is a notation for a place or places in an expression, into which some definite substitution may take place, or with respect to which some operation (summation or quantification, to give two examples) may take place. The idea is related to, but somewhat deeper and more complex than, that of a placeholder (a symbol that will later be replaced by some literal string), or a wildcard character that stands for an unspecified symbol.
The variable x becomes a bound variable, for example, when we write
 'For all x, (x + 1)^{2} = x^{2} + 2x + 1.'
or
 'There exists x such that x^{2} = 2.'
In either of these propositions it no longer much matters whether we use x or some other letter; but it would be confusing notationally to use the letter again elsewhere in some compound proposition. That is, free variables become bound, and then in a sense retire from further work supporting the formation of formulae.
Examples
Before stating a precise definition of free variable and bound variable (or dummy variable), we present some examples that perhaps make these two concepts clearer than the definition would. (Unfortunately, many statisticians use the term dummy variable to mean an indicator variable or some variant thereof. The name is not apt for that purpose, but it conveys well the intuition behind the concept defined here.)
In the expression
 <math>\sum_{x=1}^{10} f(x,y),</math>
y is a free variable and x is a bound variable (or dummy variable); consequently the value of this expression depends on the value of y, but there is nothing called x on which it could depend.
In the expression
 <math>\sum_{y=1}^{10} f(x,y),</math>
x is a free variable and y is a bound variable; consequently the value of this expression depends on the value of x, but there is nothing called y on which it could depend.
In the expression
 <math>\int_0^\infty x^{y-1} e^{-x}\,dx,</math>
y is a free variable and x is a bound variable; consequently the value of this expression depends on the value of y, but there is nothing called x on which it could depend.
In the expression
 <math>\lim_{h\rightarrow 0}\frac{f(x+h)-f(x)}{h},</math>
x is a free variable and h is a bound variable; consequently the value of this expression depends on the value of x, but there is nothing called h on which it could depend.
In the expression
 <math>\forall x\ \exists y\ \varphi(x,y,z),</math>
z is a free variable and x and y are bound variables; consequently the truth value of this expression depends on the value of z, but there is nothing called x or y on which it could depend.
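The first summation example can be sketched in Python. This is only an illustration: the function f below is a hypothetical choice, since the article leaves f unspecified, and the names total and total_renamed are introduced here for the sketch.

```python
def f(x, y):
    # Hypothetical two-argument function, chosen only for illustration.
    return x * y


def total(y):
    # x is bound by the generator expression; y is free in it.
    return sum(f(x, y) for x in range(1, 11))


def total_renamed(y):
    # Renaming the bound variable from x to k does not change the value.
    return sum(f(k, y) for k in range(1, 11))


# The result depends on y, and there is no x outside the sum to depend on.
assert total(3) == total_renamed(3)
```

Calling total with different arguments gives different results, whereas no caller can supply a value for x: the name exists only inside the summation.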
Variable-binding operators
The expressions
 <math>\sum_{x=1}^{10}\qquad\qquad \int_0^\infty\cdots\,dx\qquad\qquad \lim_{h\to 0}\qquad\qquad \forall x</math>
are variable-binding operators. The variables that they bind are x (in the first, second, and fourth examples) and h (in the third example).
Formal explanation
Variable-binding mechanisms occur in different contexts in mathematics, logic and computer science, but in all cases they are purely syntactic properties of expressions and of the variables in them. For this section we can summarize syntax by identifying expressions with trees whose leaf nodes are variables, function constants or predicate constants and whose non-leaf nodes are logical operators. Variable-binding operators are logical operators that occur in almost every formal language. Indeed, languages that do not have them are either extremely inexpressive or extremely difficult to use. A binding operator Q takes two arguments, a variable v and an expression P, and when applied to its arguments produces a new expression Q(v, P). The meaning of binding operators is supplied by the semantics of the language and does not concern us here.
Variable binding relates three things: a variable v, a location a for that variable in an expression, and a node n of the form Q(v, P). Note: we define a location in an expression as a leaf node in the syntax tree. Variable binding occurs when that location is below the node n.
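The tree view described above can be sketched as a small Python data structure. This is a simplified illustration, not a definitive formalization: the class names Var, Op and Bind and the function free_vars are assumptions introduced here, and constants are folded into the operator symbol rather than stored as separate leaves.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Var:
    name: str  # a leaf node: one location of a variable


@dataclass(frozen=True)
class Op:
    symbol: str  # an operator or constant symbol, e.g. "phi"
    args: tuple  # subexpressions


@dataclass(frozen=True)
class Bind:
    operator: str  # the binding operator Q, e.g. "forall"
    var: str       # the variable v that Q binds
    body: "Expr"   # the expression P


Expr = Union[Var, Op, Bind]


def free_vars(e):
    """Return the set of variable names with a free occurrence in e."""
    if isinstance(e, Var):
        return {e.name}
    if isinstance(e, Op):
        names = set()
        for arg in e.args:
            names |= free_vars(arg)
        return names
    if isinstance(e, Bind):
        # Every location of e.var below this node is bound by it.
        return free_vars(e.body) - {e.var}
    raise TypeError(f"not an expression: {e!r}")


# The formula "for all x, there exists y, phi(x, y, z)" from the examples.
phi = Op("phi", (Var("x"), Var("y"), Var("z")))
formula = Bind("forall", "x", Bind("exists", "y", phi))
assert free_vars(formula) == {"z"}
```

The computation is purely syntactic: it inspects only the shape of the tree and never assigns a meaning to forall, exists or phi.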
To give an example from mathematics, consider an expression which defines a function
 <math> (x_1, \ldots , x_n) \mapsto \operatorname{t}</math>
where t is an expression. t may contain some, all or none of the x_{1}, ..., x_{n} and it may contain other variables. In this case we say that function definition binds the variables x_{1}, ..., x_{n}.
In the lambda calculus, x is a bound variable in the term M = λ x . T, and a free variable of T. We say x is bound in M and free in T. If T contains a subterm λ x . U then x is rebound in this term. This nested, inner binding of x is said to "shadow" the outer binding. Occurrences of x that are free in U are bound by this inner λ x, not by the outer one.
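Shadowing can be observed with Python's own lambda expressions. The sketch below is only an analogy to the lambda calculus, and the name M and the strings passed in are chosen here for illustration.

```python
# The outer lambda binds x. The inner lambda binds x again, so the x in
# its body refers to the inner binding and shadows the outer one.
M = lambda x: (x, (lambda x: x)("inner"))

# The first component uses the outer x, the second the inner x.
assert M("outer") == ("outer", "inner")
```

Renaming the inner variable, as in lambda x: (x, (lambda y: y)("inner")), gives a function with the same behaviour, which is the sense in which the choice of a bound name does not matter.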
Variables bound at the top level of a program are technically free variables within the terms to which they are bound but are often treated specially because they can be compiled as fixed addresses. Similarly, an identifier bound to a recursive function is also technically a free variable within its own body but is treated specially.
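As one concrete illustration of this special treatment, CPython records names captured from an enclosing function in the code object's co_freevars attribute, while names bound at module level are looked up through the global namespace. The sketch below assumes it is run as top-level module code; the names offset, make_adder and add5 are hypothetical.

```python
offset = 10  # bound at the top level of the module


def make_adder(n):
    def add(x):
        # n is free in add and bound by the enclosing make_adder.
        # offset is also free in add, but is resolved as a global name.
        return x + n + offset
    return add


add5 = make_adder(5)

# Names captured from the enclosing function (CPython-specific detail).
captured = add5.__code__.co_freevars
assert "n" in captured
assert add5(1) == 16
```

Other language implementations make the same distinction in different ways, so this should be read as an example of the idea rather than as a general rule.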
A closed term is one containing no free variables.
See also
closure (computer science), closure (mathematics), lambda lifting, scope (programming), combinatory logic
References
A small part of this article was originally based on material from the Free Online Dictionary of Computing and is used with permission under the GFDL. Most of what now appears here is the result of later editing.