Grammar Engine Architecture

Concept

The MRules Grammar Engine was designed with the following objectives:

  • Read a text that conforms to a given syntax and turn it into Java Beans.
  • Make it possible to design an advanced grammar from scratch by writing little or no code.
  • Validate the source text syntactically and offer an auto-completion feature.

Although oriented towards the creation of functional grammars (close to natural language), this tool also makes it possible to quickly and simply create technical grammars.

For example, the demonstration project shows how to create a JSON parser in a few lines of configuration. The parser respects the official standards and returns the parsed result as an Object Map.

How it Works

The goal is to split the input text into patterns. After parsing, each pattern is represented as a Java object. A pattern is read in its entirety by an object called a “Lexer”. A Lexer is composed of:

  • A Begin Matcher, to detect the beginning of a pattern.
  • Zero, one or many Blocks.
  • Optionally, an End Matcher, responsible for completing the reading of a pattern.

A Matcher can be:

  • A fixed text detection
  • A regular expression match
  • A child Lexer detection
  • A combination of several Matchers (And / Or)

A Block is a child pattern: reading a Block means executing a child Lexer.
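The structure described above can be sketched in a few lines of Java. This is only an illustration of the concept, not the MRules API: the names TextMatcher, Lexer, fixed, regex, or and read are hypothetical, and skipping whitespace between parts is an assumption made to keep the example short.

```java
import java.util.List;
import java.util.OptionalInt;
import java.util.regex.Pattern;

/** Hypothetical Matcher: returns the number of characters matched at 'pos', or empty. */
interface TextMatcher {
    OptionalInt match(String text, int pos);

    /** Fixed text detection. */
    static TextMatcher fixed(String literal) {
        return (text, pos) -> text.startsWith(literal, pos)
                ? OptionalInt.of(literal.length())
                : OptionalInt.empty();
    }

    /** Regular expression match, anchored at 'pos'. */
    static TextMatcher regex(String expression) {
        Pattern pattern = Pattern.compile(expression);
        return (text, pos) -> {
            java.util.regex.Matcher m = pattern.matcher(text);
            m.region(pos, text.length());
            return m.lookingAt() ? OptionalInt.of(m.end() - pos) : OptionalInt.empty();
        };
    }

    /** "Or" combination: the first alternative that matches wins. */
    static TextMatcher or(TextMatcher... alternatives) {
        return (text, pos) -> {
            for (TextMatcher alternative : alternatives) {
                OptionalInt length = alternative.match(text, pos);
                if (length.isPresent()) {
                    return length;
                }
            }
            return OptionalInt.empty();
        };
    }
}

/** Hypothetical Lexer: a Begin Matcher, zero or more Blocks (child Lexers), an optional End Matcher. */
final class Lexer {
    private final TextMatcher begin;
    private final List<Lexer> blocks;
    private final TextMatcher end; // may be null: the End Matcher is optional

    Lexer(TextMatcher begin, List<Lexer> blocks, TextMatcher end) {
        this.begin = begin;
        this.blocks = blocks;
        this.end = end;
    }

    /** Returns the index just after the pattern, or empty if the pattern does not match at 'pos'. */
    OptionalInt read(String text, int pos) {
        OptionalInt started = begin.match(text, pos);
        if (!started.isPresent()) {
            return OptionalInt.empty();
        }
        int cursor = skipSpaces(text, pos + started.getAsInt());
        for (Lexer block : blocks) { // each Block is the execution of a child Lexer
            OptionalInt next = block.read(text, cursor);
            if (!next.isPresent()) {
                return OptionalInt.empty();
            }
            cursor = skipSpaces(text, next.getAsInt());
        }
        if (end != null) {
            OptionalInt ended = end.match(text, cursor);
            if (!ended.isPresent()) {
                return OptionalInt.empty();
            }
            cursor += ended.getAsInt();
        }
        return OptionalInt.of(cursor);
    }

    /** Assumption of this sketch: whitespace between parts of a pattern is ignored. */
    private static int skipSpaces(String text, int pos) {
        while (pos < text.length() && Character.isWhitespace(text.charAt(pos))) {
            pos++;
        }
        return pos;
    }
}
```

With these hypothetical types, a Lexer for a temperature value could be built as new Lexer(TextMatcher.regex("\\d+"), List.of(), TextMatcher.fixed("degrees Celsius")). In the real engine, such definitions come from the grammar configuration rather than hand-written code.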

Components

The diagram below shows the main components of the library.

MRules grammar engine global architecture

Documentation

The module also provides the utilities needed to automatically generate grammar documentation. This documentation is composed of:

  • Global information about the grammar
  • The list of Lexers:
    • Their description (if available) and general information
    • Detailed content
    • A syntactical schema

The module does not include the rendering of the documentation itself. An extension that generates HTML output has been created, and other target formats can be supported by writing additional rendering extensions.

Example

Let’s take the following sentence:

"If the current temperature is greater than 18 degrees Celsius then turn off the heat otherwise turn on the heat."

The source text will be divided as follows:

Begin Matcher: "If".
  |--> We enter a conditional action.
  |--> We instantiate the corresponding object.
  |--- First block: the condition
         |--> Composition of 3 matchers (value, operator, value)
         |--> Begin Matcher: "the current temperature".
                |--> Block Matcher, create an object that fetches the temperature
         |--> Begin Matcher: "is greater than".
                |--> Block Matcher, instantiate a comparison operator
         |--> Begin Matcher: "18".
                |--> Block Matcher, instantiate a temperature
                |--> End Matcher: "degrees Celsius". Inject the unit.
  |--> Do the same with action blocks "if true" and "if false".
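To make the outcome of this breakdown more concrete, here is a minimal sketch of the kind of Java objects such a parse could produce. The class names (Temperature, Condition, ConditionalAction), their fields, and the idea of representing actions as plain strings are assumptions made for illustration; they are not classes provided by MRules.

```java
import java.util.function.DoubleSupplier;

/** Hypothetical bean for "18" followed by the End Matcher "degrees Celsius". */
final class Temperature {
    final double value;
    final String unit; // injected when the End Matcher is read

    Temperature(double value, String unit) {
        this.value = value;
        this.unit = unit;
    }
}

/** Hypothetical bean for the condition block: value, operator, value. */
final class Condition {
    final DoubleSupplier left;  // "the current temperature": fetched when evaluated
    final String operator;      // "is greater than" is mapped to ">" in this sketch
    final Temperature right;    // "18 degrees Celsius"

    Condition(DoubleSupplier left, String operator, Temperature right) {
        this.left = left;
        this.operator = operator;
        this.right = right;
    }

    boolean evaluate() {
        double current = left.getAsDouble();
        switch (operator) {
            case ">":
                return current > right.value;
            case "<":
                return current < right.value;
            default:
                throw new IllegalArgumentException("Unsupported operator: " + operator);
        }
    }
}

/** Hypothetical bean instantiated when the Begin Matcher "If" is detected. */
final class ConditionalAction {
    final Condition condition;
    final String ifTrue;   // action block after "then"
    final String ifFalse;  // action block after "otherwise"

    ConditionalAction(Condition condition, String ifTrue, String ifFalse) {
        this.condition = condition;
        this.ifTrue = ifTrue;
        this.ifFalse = ifFalse;
    }

    /** Returns the action selected by the condition. */
    String execute() {
        return condition.evaluate() ? ifTrue : ifFalse;
    }
}
```

Under these assumptions, the example sentence would correspond to new ConditionalAction(new Condition(sensor, ">", new Temperature(18, "degrees Celsius")), "turn off the heat", "turn on the heat"), where sensor is any DoubleSupplier returning the current temperature.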