The Lexical and RegExp Grammars
5.1.2 The Lexical and RegExp Grammars
What is the lexical grammar?
The lexical grammar is the most basic layer of a language's syntax: it describes how raw characters are grouped into the smallest meaningful units. The spec covers it in full in section 11. We will describe it briefly here and leave the more intricate details for when we discuss section 11.
We will first quote the spec as a point of reference and then describe it in simpler language. The first paragraph states:
A lexical grammar for ECMAScript is given in clause 11. This grammar has as its terminal symbols Unicode code points that conform to the rules for SourceCharacter defined in 10.1. It defines a set of productions, starting from the goal symbol InputElementDiv, InputElementTemplateTail, or InputElementRegExp, or InputElementRegExpOrTemplateTail, that describe how sequences of such code points are translated into a sequence of input elements.
To better understand what the lexical grammar is, we need to cover some of the terminology. The characters the lexical grammar operates on are called SourceCharacters, and a SourceCharacter is defined as any Unicode code point. The lexical productions start from one of four goal symbols: InputElementDiv, InputElementTemplateTail, InputElementRegExp, and InputElementRegExpOrTemplateTail. Each of these produces input elements, which are either tokens (words like while or for, or characters like ( or +) or non-tokens such as line terminators, comments, and white space. We have four distinct goal symbols because some characters are ambiguous at the lexical level: a / can be a division operator or the start of a regular expression literal, and a } can be a closing brace or the continuation of a template literal. Each goal symbol permits a different combination of these alternatives, and the syntactic context determines which of the four is used.
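The ambiguity that motivates the separate goal symbols is easy to see in ordinary code. The snippet below is only an illustrative sketch (the variable names are made up for this example); it uses nearly the same characters in three contexts and gets three different tokenizations.

```javascript
// Illustrative names only; nothing here comes from the spec itself.
const a = 12, g = 2;

// After an identifier or number, `/` is read as the division punctuator.
const quotient = a / 3 / g;    // (12 / 3) / 2 = 2

// Where an expression may begin, `/` starts a regular expression literal.
const pattern = /3/g;          // RegExp with source "3" and flags "g"

// Inside a template literal, `}` ends a substitution and resumes the
// template text instead of acting as a closing-brace punctuator.
const text = `${a}/${g}`;      // "12/2"

console.log(quotient, pattern.source, pattern.flags, text);
```

The characters / 3 / g appear in both the division and the regular expression lines, so the lexer cannot decide what a / means by looking at characters alone. It relies on the surrounding syntactic context.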
We can now better understand the remaining part of 5.1.2:
Input elements other than white space and comments form the terminal symbols for the syntactic grammar for ECMAScript and are called ECMAScript tokens. These tokens are the reserved words, identifiers, literals, and punctuators of the ECMAScript language.
Moreover, line terminators, although not considered to be tokens, also become part of the stream of input elements and guide the process of automatic semicolon insertion (11.9). Simple white space and single-line comments are discarded and do not appear in the stream of input elements for the syntactic grammar. A MultiLineComment (that is, a comment of the form /*…*/ regardless of whether it spans more than one line) is likewise simply discarded if it contains no line terminator; but if a MultiLineComment contains one or more line terminators, then it is replaced by a single line terminator, which becomes part of the stream of input elements for the syntactic grammar.
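The rule about multi-line comments has a visible effect on automatic semicolon insertion. As a small sketch (the function names are hypothetical), compare a comment that stays on one line with one that contains a line terminator, both placed right after return:

```javascript
// The comment contains no line terminator, so it is simply discarded
// and the statement reads as `return 42;`.
function sameLine() {
  return /* comment on one line */ 42;
}

// The comment contains a line terminator, so it counts as a single line
// terminator. A line terminator is not allowed between `return` and its
// expression, so a semicolon is inserted and the function returns undefined.
function splitLine() {
  return /* comment that
            spans two lines */ 42;
}

console.log(sameLine());   // 42
console.log(splitLine());  // undefined
```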
A RegExp grammar for ECMAScript is given in 21.2.1. This grammar also has as its terminal symbols the code points as defined by SourceCharacter. It defines a set of productions, starting from the goal symbol Pattern, that describe how sequences of code points are translated into regular expression patterns.
Productions of the lexical and RegExp grammars are distinguished by having two colons “::” as separating punctuation. The lexical and RegExp grammars share some productions.
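One rough way to get a feel for the RegExp grammar is to hand strings to the RegExp constructor, which parses its argument as a pattern and throws a SyntaxError when it cannot. The helper below, matchesPattern, is a hypothetical name used only for this sketch. Engines also accept some legacy extensions, so this approximates the Pattern goal symbol rather than testing it exactly.

```javascript
// Hypothetical helper: reports whether the engine accepts `source`
// as a regular expression pattern.
function matchesPattern(source) {
  try {
    new RegExp(source);
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) {
      return false;
    }
    throw err; // anything else is unexpected, so re-throw it
  }
}

console.log(matchesPattern("a(b|c)*")); // true
console.log(matchesPattern("a(b"));     // false, the group is never closed
```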