Contextfree grammar
From Academic Kids

In linguistics and computer science, a contextfree grammar (CFG) is a formal grammar in which every production rule is of the form
 V → w
where V is a nonterminal symbol and w is a string consisting of terminals and/or nonterminals. The term "contextfree" comes from the fact that the nonterminal V can always be replaced by w, regardless of the context in which it occurs. A formal language is contextfree if there is a contextfree grammar that generates it.
Contextfree grammars are powerful enough to describe the syntax of most programming languages; in fact, the syntax of most programming languages are specified using contextfree grammars. On the other hand, contextfree grammars are simple enough to allow the construction of efficient parsing algorithms which, for a given string, determine whether and how it can be generated from the grammar. An Earley parser is an example of such an algorithm, while LR and LL parsers only deal with more restrictive subsets of contextfree grammars.
BNF (BackusNaur Form) is the most common notation used to express contextfree grammars.
Not all formal languages are contextfree — a wellknown counterexample is <math> \{ a^n b^n c^n : n \ge 0 \} <math>. This particular language can be generated by a parsing expression grammar, which is a relatively new formalism that is particularly wellsuited to programming languages.
Contents 
Formal definition
Just as any formal grammar, a CFG G can be defined as a 4tuple:
<math>G = (V_t, V_n, P, S)<math> where
 <math>V_t<math> is a finite set of terminals
 <math>V_n<math> is a finite set nonterminals
 <math>P<math> is a finite set of productions rules
 <math>S<math> is an element of <math>V_n<math>, the distinguished starting nonterminal.
 elements of <math>P<math> are of the form
 <math>V_n \longrightarrow (V_t \cup V_n)^*<math>
Examples
Example 1
A simple contextfree grammar is
 S → aSb  ε
where  is used to separate different options for the same nonterminal and ε stands for the empty string. This grammar generates the language <math> \{ a^n b^n : n \ge 0 \} <math> which is not regular.
Example 2
Here is a contextfree grammar for syntactically correct infix algebraic expressions in the variables x, y and z:
 S → x  y  z  S + S  S  S  S * S  S/S  (S)
This grammar can, for example, generate the string "( x + y ) * x  z * y / ( x + x )".
Example 3
A contextfree grammar for the language consisting of all strings over {a,b} which contain a different number of a's than b's is
 S → U  V
 U → TaU  TaT
 V → TbV  TbT
 T → aTbT  bTaT  ε
Here, T can generate all strings with the same number of a's as b's, U generates all strings with more a's than b's and V generates all strings with fewer a's than b's.
Example 4
Another example of a context free language is <math> \{ a^n b^m c^{m+n} : n \ge 0, m \ge 0 \} <math>. This is not a regular language, but it is context free as it can be generated by the following CFG (Context Free Grammar):
 S → aSC  B
 B → bBC  ε
 C → c
Other examples
Contextfree grammars are not limited in application to mathematical ("formal") languages. The ancient Indian linguist Panini described Sanskrit using a contextfree grammar. Recently, it has been suggested that a class of Tamil poetry called Venpa is governed by a contextfree grammar.
Derivations and syntax trees
There are basically two ways to describe how in a certain grammar a string can be derived from the start symbol. The simplest way is to list the consecutive strings of symbols, beginning with the start symbol and ending with the string, and the rules that have been applied. If we introduce a strategy such as "always replace the leftmost nonterminal first" then for contextfree grammars the list of applied grammar rules is by itself sufficient. This is called the leftmost derivation of a string. For example, if we take the following grammar:
 (1) S → S + S
 (2) S → 1
and the string "1 + 1 + 1" then the left derivation of this string is the list [ (1), (1), (2), (2), (2) ]. Analogously the rightmost derivation is defined as the list that we get if we always replace the rightmost nonterminal first. In this case this would be the list [ (1), (2), (1), (2), (2)].
The distinction between leftmost derivation and rightmost derivation is important because in most parsers the transformation of the input is defined by giving a piece of code for every grammar rule that is executed whenever the rule is applied. Therefore it is important to know whether the parser determines a leftmost or a rightmost derivation because this determines the order in which the pieces of code will be executed. See for an example LL parsers and LR parsers.
A derivation also imposes in some sense a hierarchical structure on the string that is derived. For example the structure of the string "1 + 1 + 1" would, according to the leftmost derivation, be:
 S→S+S (1)
 S→S+S+S (1)
 S→1+S+S (2)
 S→1+1+S (2)
 S→1+1+1 (2)
 { { { 1 }_{S} + { 1 }_{S} }_{S} + { 1 }_{S} }_{S}
where { ... }_{S} indicates a substring recognized as belonging to S. This hierarchy can also be seen as a tree:
S /\ /  \ /  \ S '+' S /\  /  \  S '+' S '1'   '1' '1'
This tree is called a concrete syntax tree (see also abstract syntax tree) of the string. In this case the presented leftmost and the rightmost derivation define the same syntax tree, however there is another (leftmost) derivation of the same string possible
 S→ S + S (1)
 S→ 1 + S (2)
 S→ 1 + S + S (1)
 S→ 1 + 1 + S (2)
 S→ 1 + 1 + 1 (2)
and this defines the following syntax tree:
S /\ /  \ /  \ S '+' S  /\  /  \ '1' S '+' S   '1' '1'
If for certain strings in the language of the grammar there is more than one parsing tree then the grammar is said to be an ambiguous grammar. Such grammars are usually hard to parse because the parser cannot always decide which grammar rule it has to apply.
Normal forms
Every contextfree grammar which does not generate the empty string can be transformed into an equivalent one in Chomsky normal form or Greibach normal form. "Equivalent" here means that the two grammars generate the same language.
Because of the especially simple form of production rules in Chomsky Normal Form grammars, this normal form has both theoretical and practical implications. For instance, given a contextfree grammar, one can use the Chomsky Normal Form to construct a polynomialtime algorithm which decides whether a given string is in the language represented by that grammar or not (the CYK algorithm).
Properties of contextfree languages
 An alternative and equivalent definition of contextfree languages employs nondeterministic pushdown automata: a language is contextfree if and only if it can be accepted by such an automaton.
 The union and concatenation of two contextfree languages is contextfree; the intersection need not be.
 The reverse of a contextfree language is contextfree, but the complement need not be.
 Every regular language is contextfree because it can be described by a regular grammar.
 The intersection of a contextfree language and a regular language is always contextfree.
 There exist contextsensitive languages which are not contextfree.
 To prove that a given language is not contextfree, one may employ the pumping lemma for contextfree languages.
See also
Template:Formal languages and grammarsde:Kontextfreie Grammatik
fr:Grammaire horscontexte
pl:Język bezkontekstowy
ja:文脈自由言語
zh:上下文无关文法