ES6 Language Specification

The Lexical and RegExp Grammars

5.1.2  The Lexical and RegExp Grammars

What is the lexical grammar?

The lexical grammar is the lowest-level grammar in the spec: it describes how raw source text is split into input elements. Clause 11 of the spec covers it in full. Here we will describe it briefly and leave the more intricate details for when we discuss clause 11.

We will first quote the spec as a point of reference and then describe it in simpler language. The first paragraph states:

A lexical grammar for ECMAScript is given in clause 11. This grammar has as its terminal symbols Unicode code points that conform to the rules for SourceCharacter defined in 10.1. It defines a set of productions, starting from the goal symbol InputElementDiv, InputElementTemplateTail, or InputElementRegExp, or InputElementRegExpOrTemplateTail, that describe how sequences of such code points are translated into a sequence of input elements.

To better understand what the lexical grammar is, we need to cover some terminology. The terminal symbols of the lexical grammar are SourceCharacters, defined as any Unicode code point. The grammar's productions describe how sequences of these code points form input elements, starting from one of four goal symbols: InputElementDiv, InputElementTemplateTail, InputElementRegExp, and InputElementRegExpOrTemplateTail. Input elements are either tokens, which are words like while or for and characters like ( or +, or non-tokens such as line terminators, comments, and white space. There are four goal symbols because the same characters can form different input elements depending on the surrounding syntax: a / may be a division punctuator or the start of a regular expression literal, and a } may be a punctuator or the continuation of a template literal. The syntactic context determines which goal symbol applies.
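As a small illustration of why context matters, the sketch below uses the same characters, /g/, in two positions. The variable names are made up for this example and do not come from the spec.

```javascript
// Illustrative values only; the names a, g and i are assumptions for this sketch.
const a = 12, g = 2, i = 3;

// After an identifier, `/` is read as the division punctuator,
// so this is (a / g) / i.
const quotient = a /g/ i;

// At the start of an expression, `/` opens a regular expression literal,
// so this is the pattern "g" with the flag "i".
const pattern = /g/i;

console.log(quotient);                   // 2
console.log(pattern instanceof RegExp);  // true
console.log(pattern.source, pattern.flags);
```

In both cases the characters are identical. Only the position in the surrounding syntax decides how they are split into input elements.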

We can now better understand the remaining part of 5.1.2:

Input elements other than white space and comments form the terminal symbols for the syntactic grammar for ECMAScript and are called ECMAScript tokens. These tokens are the reserved words, identifiers, literals, and punctuators of the ECMAScript language.

Moreover, line terminators, although not considered to be tokens, also become part of the stream of input elements and guide the process of automatic semicolon insertion (11.9). Simple white space and single-line comments are discarded and do not appear in the stream of input elements for the syntactic grammar. A MultiLineComment (that is, a comment of the form /*…*/ regardless of whether it spans more than one line) is likewise simply discarded if it contains no line terminator; but if a MultiLineComment contains one or more line terminators, then it is replaced by a single line terminator, which becomes part of the stream of input elements for the syntactic grammar.
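The effect of a MultiLineComment that contains a line terminator can be observed with a return statement, since automatic semicolon insertion applies when a line terminator follows return. The following is a minimal sketch; the function names are hypothetical.

```javascript
// The comment contains no line terminator, so it is simply discarded.
function sameLine() {
  return /* comment on one line */ 42;
}

// The comment contains a line terminator, so it counts as one.
// A semicolon is inserted after `return`, and 42 is never returned.
function splitLine() {
  return /* this comment
            spans a line break */ 42;
}

console.log(sameLine());   // 42
console.log(splitLine());  // undefined
```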

A RegExp grammar for ECMAScript is given in 21.2.1. This grammar also has as its terminal symbols the code points as defined by SourceCharacter. It defines a set of productions, starting from the goal symbol Pattern, that describe how sequences of code points are translated into regular expression patterns.

Productions of the lexical and RegExp grammars are distinguished by having two colons “::” as separating punctuation. The lexical and RegExp grammars share some productions.
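One way to see the RegExp grammar at work is to hand pattern text to the RegExp constructor, which throws a SyntaxError when the text cannot be parsed as a Pattern. This is only a sketch, and isValidPattern is a hypothetical helper written for this example.

```javascript
// Hypothetical helper: reports whether `source` parses as a regular
// expression pattern.
function isValidPattern(source) {
  try {
    new RegExp(source);
    return true;
  } catch (err) {
    // A SyntaxError means the pattern text was rejected.
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

console.log(isValidPattern("a|b"));  // true
console.log(isValidPattern("("));    // false, the group is never closed
```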


Josh Miller


I’m a full-stack web developer who’s especially enthusiastic about the rapid developments in JavaScript. I’ve created this blog as a medium to share with others a journey of knowledge and discovery.
