Bottom-up parsing
Bottom-up parsing is a technique used in the construction of compilers, which translate human-readable computer languages into assembly language or pseudocode. The name refers to a parsing approach that starts from the individual symbols of the input and builds up lexemes (such as identifiers or keywords), then combines those into progressively larger units.
Different computer languages require different parsing techniques, although it is not uncommon to use a parsing technique that is more powerful than what is actually required.
Typically, a bottom-up parser is written as a general parsing engine, with the parsing rules of any specific computer language described in a specialized parser language.
The common classes of bottom-up parsing are:
Hand-coding a parser for any of these classes is very complex, so one typically uses a rule-based parser generator instead.
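Besides accepting the input, the parser performs one of two actions at each step: "Shift" and "Reduce".
- Shift means moving the next symbol from the input onto the stack
- Reduce means replacing a sequence of symbols on top of the stack that matches the right-hand side of a rule with the more general symbol on that rule's left-hand side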
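The two actions can be sketched as operations on a Python list used as the stack. This is only an illustration: the function names `shift` and `reduce`, and the choice to represent a rule's right-hand side as a non-empty tuple of symbols, are assumptions made for this sketch and are not taken from any particular parser generator.

```python
def shift(stack, remaining):
    """Shift: move the next input symbol onto the stack."""
    stack.append(remaining.pop(0))


def reduce(stack, lhs, rhs):
    """Reduce: replace the symbols on top of the stack that match rhs with lhs.

    rhs is assumed to be a non-empty tuple of symbols.
    """
    if tuple(stack[-len(rhs):]) != rhs:
        raise ValueError(f"top of stack does not match {rhs}")
    del stack[-len(rhs):]
    stack.append(lhs)


stack, remaining = [], list("ab")
shift(stack, remaining)        # stack holds a, input holds b
reduce(stack, "A", ("a",))     # stack holds A, input is unchanged
```

Deciding which of the two actions to apply at each step is the job of the parsing engine; the functions above only carry out a decision that has already been made.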
For an example, see figure 1.
Take the language:

S --> AB
A --> a
B --> b

And the input:

ab

Then the bottom up parsing is:

  Stack       Input
----------+ +-+-+------
          | |a|b|
----------+ +-+-+------

Shift a
--------+-+ +-+--------
        |a| |b|
--------+-+ +-+--------

Reduce a (A --> a)
--------+-+ +-+--------
        |A| |b|
--------+-+ +-+--------

Shift b
------+-+-+ +----------
      |A|b| |
------+-+-+ +----------

Reduce b (B --> b)
------+-+-+ +----------
      |A|B| |
------+-+-+ +----------

Reduce AB (S --> AB)
--------+-+ +----------
        |S| |
--------+-+ +----------

Accept
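The trace in this example can be reproduced with a short Python sketch. It assumes a deliberately naive strategy: reduce whenever the top of the stack matches the right-hand side of some rule, and shift otherwise. That is sufficient for this small grammar, but it is not how practical bottom-up parsers choose their actions (they typically consult precomputed tables), and it may loop or fail on other grammars. The names `RULES` and `parse` are hypothetical.

```python
# Grammar from the example: each rule is (left-hand side, right-hand side).
RULES = [("S", ("A", "B")), ("A", ("a",)), ("B", ("b",))]


def parse(tokens, rules=RULES, start="S"):
    """Naive shift-reduce loop; returns (accepted, list of actions taken)."""
    stack, remaining, trace = [], list(tokens), []
    while True:
        for lhs, rhs in rules:
            # Reduce if the top of the stack matches this rule's right-hand side.
            if tuple(stack[-len(rhs):]) == rhs:
                del stack[-len(rhs):]
                stack.append(lhs)
                body = "".join(rhs)
                trace.append(f"Reduce {body} ({lhs} --> {body})")
                break
        else:
            # No rule matched: shift if input remains, otherwise stop.
            if not remaining:
                break
            symbol = remaining.pop(0)
            stack.append(symbol)
            trace.append(f"Shift {symbol}")
    accepted = stack == [start]
    trace.append("Accept" if accepted else "Reject")
    return accepted, trace


accepted, trace = parse("ab")
for step in trace:
    print(step)
```

For the input ab the printed actions follow the same order as the steps shown in the example: two shifts, three reductions, then accept.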