"In formal language theory, a context-free grammar (CFG) is a formal grammar whose production rules can be applied to a nonterminal symbol regardless of its context."
Context-free grammars are a notation for describing the structure of strings in a language. They are used to define context-free languages.
Formal language theory: The study of languages that can be described by formal grammars and automata models.
Chomsky hierarchy: A classification of formal grammars and the languages they generate, based on their expressive power.
Context-free grammar (CFG): A type of formal grammar, where every rule has a single symbol as its left-hand side, and a sequence of symbols as its right-hand side.
Derivation: A process of applying the production rules of a grammar to generate a string in the language.
Parse tree: A representation of the derivation process as a tree structure, where each node represents a symbol in the grammar and the edges show the substitution relations.
Ambiguity: A property of a grammar or a language, where a string can have multiple valid parse trees.
Normal forms: Standard forms of CFGs that simplify their analysis and transformation, such as Chomsky normal form and Greibach normal form.
Closure properties: A set of properties that describe how the class of context-free languages is closed under certain operations, such as union, intersection, concatenation, and complement.
Pushdown automata (PDA): A type of automaton that can recognize context-free languages by using a stack to keep track of the derivation.
Parsing algorithms: Methods for constructing parse trees or recognizing strings in a CFG efficiently, such as CYK algorithm, Earley algorithm, and LL(k) parsing.
Applications of CFGs: Natural language processing, programming languages, compilers, and data validation are some examples of areas where CFGs are commonly used.
Regular grammar: Is represented by a regular expression, which uses symbols to indicate a pattern of characters. It can be used to create simple languages such as those used to recognize or generate strings of digits, letters, or symbols.
Context-sensitive grammar: Is a type of grammar where the production rules depend not only on the symbol on the left side of the rule but also on the symbols that appear on the right side of the rule. They can create more complex languages than regular grammars.
Unrestricted grammar: Is a grammar that has no limitations or restrictions placed on its elements. It can create complex languages and can be used to recognize natural languages.
Chomsky hierarchy: Is a classification of formal languages or grammars into four levels based on their generative or expressive power. These levels are regular, context-free, context-sensitive, and recursively enumerable. Context-free grammars belong to the second level of this hierarchy.
Backus-Naur Form (BNF): Is a metalanguage used to describe context-free grammars. It is commonly used to describe programming languages and other formal languages. BNF consists of a set of production rules that define how the language's symbols are combined to form expressions.
"In a context-free grammar, each production rule is of the form A → α, with A a single nonterminal symbol, and α a string of terminals and/or nonterminals (α can be empty)."
"This distinguishes it from a context-sensitive grammar, which can have production rules in the form αAβ → αγβ with A a nonterminal symbol and α, β, and γ strings of terminal and/or nonterminal symbols."
"A formal grammar is essentially a set of production rules that describe all possible strings in a given formal language."
"Production rules are simple replacements."
"There can be multiple replacement rules for a given nonterminal symbol."
"Nonterminal symbols are used during the derivation process but do not appear in its final result string."
"Languages generated by context-free grammars are known as context-free languages (CFL)."
"Different context-free grammars can generate the same context-free language."
"It is important to distinguish the properties of the language (intrinsic properties) from the properties of a particular grammar (extrinsic properties)."
"The language equality question (do two given context-free grammars generate the same language?) is undecidable."
"Context-free grammars arise in linguistics where they are used to describe the structure of sentences and words in a natural language."
"They were invented by the linguist Noam Chomsky for this purpose."
"In computer science, as the use of recursively-defined concepts increased, they were used more and more. In an early application, grammars are used to describe the structure of programming languages."
"In a newer application, they are used in an essential part of the Extensible Markup Language (XML) called the document type definition."
"Some authors use the term phrase structure grammar to refer to context-free grammars, whereby phrase-structure grammars are distinct from dependency grammars."
"A popular notation for context-free grammars is Backus–Naur form, or BNF."
"A formal grammar is essentially a set of production rules that describe all possible strings in a given formal language."
"Regardless of which symbols surround it, the single nonterminal A on the left-hand side can always be replaced by α on the right-hand side."
"The language generated by a grammar is the set of all strings of terminal symbols that can be derived, by repeated rule applications, from some particular nonterminal symbol ('start symbol')."