Monads and parsing

A parser takes a string and produces a list of tokens and the remaining string, so we want a function p:str -> makeProduct([arr, str]).

To get the target type, let’s use a functor. The addState functor takes any contract x and turns it into the pair contract [arr, x], so it expects a tuple containing an array and a value satisfying x.

var addState = function (x) {
  return makeProduct([arr, x]);
};

To chain parsers together, we want to apply the next parser to the second element of the result of the last one, then concatenate the list of tokens. A monad will help with this process.

If we iterate the addState functor twice on the contract x, the resulting contract accepts objects of the form [a1, [a2, xVal]] where a1 and a2 are arrays and xVal satisfies x. The state monad’s multiplication concatenates the two arrays a1 and a2, while its unit provides the empty array for the first parser in the chain.

var state = monadOf(addState)({
  // Input is of the form [a1, [a2, xVal]]
  // where a1, a2 are arrays.
  '*': function (doubleState) {
    var a1 = doubleState[0];
    var a2 = doubleState[1][0];
    var xVal = doubleState[1][1];

    return [a1.concat(a2), xVal];
  },
  1: function (xVal) {
    return [[], xVal];
  }
})(str);

var Parser = function () {
  this.run = state[1];
};

The register method is where it all comes together: since addState is a functor, we can use it to apply a function to the second element of a tuple, and then the multiplication in the monad concatenates the two lists.

Parser.register = function (name, makeParser) {
  Parser.prototype[name] = function (varArgs) {
    var run = this.run;
    this.run = function (s) {
      return state['*'](
          addState( makeParser.apply({}, toArray(arguments)) )
          (run(s)));
    };
    return this;
  };
};

// Given a character, produce a parser
var char = function (c) {
  return hom([str], addState(str))(function (s) {
    if (s[0] !== c) {
      throw new Error('Expected the character "' + c + '"');
    }
    return [[c], s.substr(1)];
  });
};

Parser.register(char);

new Parser()
    .char('a').char('b').char('c')
    .run('abcde');
// === [["a", "b", "c"], "de"]
Advertisements

Leave a Reply

Please log in using one of these methods to post your comment:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: