ECPG's grammar (preproc.y) is built by parse.pl from the backend's grammar (gram.y) plus various add-on rules. Some notes: 1) Most input matching core grammar productions is simply converted to strings and concatenated together to form the SQL string passed to the server. This is handled mostly automatically, as described below. 2) Some grammar rules need special actions that are added to or completely override the default token-concatenation behavior. This is controlled by ecpg.addons as explained below. 3) Additional grammar rules are needed for ECPG's own commands. These are in ecpg.trailer, as is the "epilogue" part of preproc.y. 4) ecpg.header contains the "prologue" part of preproc.y, including support functions, Bison options, etc. 5) Additional terminals added by ECPG must be defined in ecpg.tokens. Additional nonterminals added by ECPG must be defined in ecpg.type, but only if they have non-void result type, which most don't. ecpg.header, ecpg.tokens, ecpg.type, and ecpg.trailer are just copied verbatim into preproc.y at appropriate points. In the pre-v18 implementation of ecpg, the strings constructed by grammar rules were returned as the Bison result of each rule. This led to a large number of effectively-identical rule actions, which caused compilation-time problems with some versions of clang. Now, rules that need to return a string are declared as having void type (which in Bison means leaving out any %type declaration for them). Instead, we abuse Bison's "location tracking" mechanism to carry the string results, which allows a single YYLLOC_DEFAULT call to handle the standard token-concatenation behavior for the vast majority of the rules. Rules that don't need to do anything else can omit a semantic action altogether. Rules that need to construct an output string specially can do so, but they should assign it to "@$" rather than the usual "$$"; also, to reference the string value of the N'th input token, write "@N" not "$N". (But rules that return something other than a simple string continue to use the normal Bison notations.) ecpg.addons contains entries that begin with a line like ECPG: ruletype tokenlist and typically have one or more following lines that are the code for a grammar action. Any line not starting with "ECPG:" is taken to be part of the code block for the preceding "ECPG:" line. "tokenlist" identifies which gram.y production this entry affects. It is simply a list of the target nonterminal and the input tokens from the gram.y rule. For example, to modify the action for a gram.y rule like this: target: tokenA tokenB tokenC {...} "tokenlist" would be "target tokenA tokenB tokenC". If we want to modify a non-first alternative for a nonterminal, we still write the nonterminal. For example, "tokenlist" should be "target tokenD tokenE" to affect the second alternative in: target: tokenA tokenB tokenC {...} | tokenD tokenE {...} "ruletype" is one of: a) "block" - the automatic action that parse.pl would create is completely overridden. Instead the entry's code block is emitted. The code block must include the braces ({}) needed for a Bison action. b) "addon" - the entry's code block is inserted into the generated action, ahead of the automatic token-concatenation code. In this case the code block need not contain braces, since it will be inserted within braces. c) "rule" - the automatic action is emitted, but then the entry's code block is added verbatim afterwards. This typically is used to add new alternatives to a nonterminal of the core grammar. For example, given the entry: ECPG: rule target tokenA tokenB tokenC | tokenD tokenE { custom_action; } what will be emitted is target: tokenA tokenB tokenC { automatic_action; } | tokenD tokenE { custom_action; } Multiple "ECPG:" entries can share the same code block, if the same action is needed for all. When an "ECPG:" line is immediately followed by another one, it is not assigned an empty code block; rather the next nonempty code block is assumed to apply to all immediately preceding "ECPG:" entries. In addition to the modifications specified by ecpg.addons, parse.pl contains some tables that list backend grammar productions to be ignored or modified. Nonterminals that construct strings (as described above) should be given void type, which is parse.pl's default assumption for nonterminals found in gram.y. If the result should be of some other type, make an entry in parse.pl's %replace_types table. %replace_types can also be used to suppress output of a nonterminal's rules altogether (in which case ecpg.trailer had better provide replacement rules, since the nonterminal will still be referred to elsewhere).