A parse tree (also called a parsing tree, derivation tree, or concrete syntax tree) is an ordered, rooted tree that represents the syntactic structure of a string according to some context-free grammar. The term parse tree itself is used primarily in computational linguistics; in theoretical syntax, the term syntax tree is more common.

Parse trees concretely reflect the syntax of the input language, making them distinct from the abstract syntax trees used in computer programming. Unlike Reed-Kellogg sentence diagrams used for teaching grammar, parse trees do not use distinct symbol shapes for different types of constituents.

Parse trees are usually constructed based on either the constituency relation of constituency grammars (phrase structure grammars) or the dependency relation of dependency grammars. Parse trees may be generated for sentences in natural languages (see natural language processing), as well as during processing of computer languages, such as programming languages.

The constituency-based parse trees of constituency grammars (phrase structure grammars) distinguish between terminal and non-terminal nodes. The interior nodes are labeled by non-terminal categories of the grammar, while the leaf nodes are labeled by terminal categories. The image below represents a constituency-based parse tree; it shows the syntactic structure of the English sentence “John hit the ball”:

The parse tree is the entire structure, starting from S and ending in each of the leaf nodes (John, hit, the, ball). The following abbreviations are used in the tree:

S for sentence, the top-level structure in this example

NP for noun phrase. The first (leftmost) NP, a single noun “John”, serves as the subject of the sentence. The second one is the object of the sentence.

VP for verb phrase, which serves as the predicate

V for verb. In this case, it is the transitive verb “hit”.

D for determiner, in this instance the definite article “the”

N for noun

Each node in the tree is either a root node, a branch node, or a leaf node. A root node is a node that has no branches above it, and a sentence has exactly one root node. A branch node is a mother node that connects to two or more daughter nodes. A leaf node is a terminal node that does not dominate any other node in the tree. In the example, S is the root node, NP and VP are branch nodes, and John (N), hit (V), the (D), and ball (N) are all leaf nodes. The leaves are the lexical tokens of the sentence.

A mother node is one that has at least one other node linked by a branch under it; in the example, S is the mother of both N and VP. A daughter node is one that has at least one node directly above it to which it is linked by a branch; in the example, hit is a daughter of V. The terms parent and child are also sometimes used for this relationship.
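The node terminology above can be illustrated with a small data structure. The following is only a sketch: the `Node` class and its methods are hypothetical names introduced here, and the tree shape (S over N and VP, VP over V and NP, NP over D and N) is an assumption read off the description above. It also stores each word as a leaf beneath its category node, which is one common encoding rather than the only one.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """One node of a parse tree (hypothetical helper for illustration)."""
    label: str                                   # a category (S, NP, V, ...) or a word
    children: List["Node"] = field(default_factory=list)

    def is_leaf(self) -> bool:
        # A leaf dominates no other node.
        return not self.children

    def leaves(self) -> List[str]:
        """Return the leaf labels in left-to-right order."""
        if self.is_leaf():
            return [self.label]
        result: List[str] = []
        for child in self.children:
            result.extend(child.leaves())
        return result


# Assumed structure for "John hit the ball": S is the root (the only node with no mother).
tree = Node("S", [
    Node("N", [Node("John")]),
    Node("VP", [
        Node("V", [Node("hit")]),
        Node("NP", [
            Node("D", [Node("the")]),
            Node("N", [Node("ball")]),
        ]),
    ]),
])

# Reading the leaves from left to right recovers the sentence.
sentence = " ".join(tree.leaves())
```

Because the tree is ordered, a left-to-right traversal of the leaves yields the original string, which is what makes the tree a representation of that particular sentence.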

PUSHDOWN AUTOMATON:

A pushdown automaton (PDA) recognizes the language of a context-free grammar in much the same way that a deterministic finite automaton (DFA) recognizes the language of a regular grammar. A DFA can remember only a finite amount of information, whereas a PDA can store an unbounded amount of information on its stack.

Basically a pushdown automaton is −

“Finite state machine” + “a stack”

A pushdown automaton has three components −

an input tape,
a control unit, and
a stack of unbounded size.

The stack head scans the top symbol of the stack.

A stack does two operations −

Push − a new symbol is added at the top.
Pop − the top symbol is read and removed.
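The two stack operations can be sketched with an ordinary Python list, treating the end of the list as the top of the stack. The symbols `"Z"` and `"A"` are placeholders chosen for illustration only.

```python
# A list used as a stack: the last element is the top.
stack = ["Z"]            # start with a placeholder bottom symbol

stack.append("A")        # push: a new symbol is added at the top
top = stack[-1]          # the stack head scans the top symbol without removing it

popped = stack.pop()     # pop: the top symbol is read and removed
```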

We use standard formal language notation: $\Gamma ^{*}$ denotes the set of strings over alphabet $\Gamma$ and $\varepsilon$ denotes the empty string.

A PDA is formally defined as a 7-tuple:

$M=(Q,\ \Sigma ,\ \Gamma ,\ \delta ,\ q_{0},\ Z,\ F)$ where

$Q$ is a finite set of states
$\Sigma$ is a finite set called the input alphabet
$\Gamma$ is a finite set called the stack alphabet
$\delta$ is a finite subset of $Q\times (\Sigma \cup \{\varepsilon \})\times \Gamma \times Q\times \Gamma ^{*}$, the transition relation
$q_{0}\in Q$ is the start state
$Z\in \Gamma$ is the initial stack symbol
$F\subseteq Q$ is the set of accepting states

An element $(p,a,A,q,\alpha )\in \delta$ is a transition of $M$. It has the intended meaning that $M$, in state $p\in Q$, on the input $a\in \Sigma \cup \{\varepsilon \}$ and with $A\in \Gamma$ as topmost stack symbol, may read $a$, change the state to $q$, and pop $A$, replacing it by pushing $\alpha \in \Gamma ^{*}$. The $(\Sigma \cup \{\varepsilon \})$ component of the transition relation formalizes that the PDA can either read a letter from the input or proceed leaving the input untouched.
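To make the transition relation concrete, here is a minimal sketch of a nondeterministic PDA simulator that accepts by final state. Everything named in it is an assumption for illustration: the function `accepts`, the example relation `DELTA` for the language $\{a^{n}b^{n} : n\geq 0\}$, the state names, the use of the empty string `""` for $\varepsilon$, and the convention that stack strings are written with the top symbol first and that every symbol is a single character.

```python
from collections import deque

# Transitions (p, a, A, q, alpha): in state p, reading a ("" means epsilon) with A
# on top of the stack, move to state q and replace A by alpha (top symbol first).
DELTA = {
    ("q0", "a", "Z", "q0", "AZ"),   # first a: push A above the initial symbol
    ("q0", "a", "A", "q0", "AA"),   # further a's: push another A
    ("q0", "",  "Z", "q1", "Z"),    # guess that the a's are finished
    ("q0", "",  "A", "q1", "A"),
    ("q1", "b", "A", "q1", ""),     # each b pops one A
    ("q1", "",  "Z", "q2", "Z"),    # all A's matched: move to the accepting state
}


def accepts(delta, start, initial_stack, finals, word, max_configs=10_000):
    """Breadth-first search over configurations (state, input position, stack).

    Returns True if some run consumes the whole word and ends in a final state.
    The max_configs cap is a safety limit for PDAs whose epsilon-moves can grow
    the stack forever; hitting the cap returns False, which may be inconclusive.
    """
    first = (start, 0, initial_stack)
    seen = {first}
    queue = deque([first])
    while queue and len(seen) <= max_configs:
        state, pos, stack = queue.popleft()
        if pos == len(word) and state in finals:
            return True
        if not stack:
            continue                 # no top symbol, so no transition applies
        top, rest = stack[0], stack[1:]
        for (p, a, A, q, alpha) in delta:
            if p != state or A != top:
                continue
            if a == "":
                nxt = (q, pos, alpha + rest)          # leave the input untouched
            elif pos < len(word) and word[pos] == a:
                nxt = (q, pos + 1, alpha + rest)      # read one input letter
            else:
                continue
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


def in_language(word):
    return accepts(DELTA, "q0", "Z", {"q2"}, word)
```

The breadth-first search explores every choice the nondeterministic machine could make, so a word is accepted if at least one sequence of transitions leads to an accepting state with the input fully read.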
