A Grit URI Parser

The URI format is defined by the IETF RFC 3986 specification.

IETF RFCs use ABNF grammar rules, and it is reasonably easy to transliterate ABNF rules into Grit grammar rules.

Here is the first ABNF rule:

    URI = scheme ":" hier-part [ "?" query ] [ "#" fragment ]

In Grit this can be written as:

    URI = scheme ':' hier-part ('?' query)? ('#' fragment)?

In ABNF, square brackets mark optional components, but in Grit square brackets are used in regular expression components, so an optional component is written as a parenthesized group with a trailing ? instead. The double quotes have been changed to single quotes because in Grit a double-quoted literal allows white-space before and after it, and a URI cannot contain any white-space.

We could go on and translate the full ABNF specification into a Grit grammar, but RFC 3986 also provides this regular expression to match a URI:

    ^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?
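To see what each group in this expression captures, here is a minimal sketch in Python (used only for illustration; it is not Grit). The helper name `split_uri` is hypothetical, and the sample URI is the one used in RFC 3986 Appendix B. Groups 2, 4, 5, 7 and 9 hold the scheme, authority, path, query and fragment.

```python
import re

# The regular expression from RFC 3986, Appendix B.
URI_RE = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")

def split_uri(text):
    """Return the five URI components; a missing component is None."""
    # Every part of the pattern is optional, so a match object is always returned.
    m = URI_RE.match(text)
    return {
        "scheme": m.group(2),
        "authority": m.group(4),
        "path": m.group(5),
        "query": m.group(7),
        "fragment": m.group(9),
    }

parts = split_uri("http://www.ics.uci.edu/pub/ietf/uri/#Related")
print(parts)
```

For this URI the query is absent, so its entry is None, while the path is always a string (possibly empty).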

We can break this regular expression into component parts as named rules in a Grit grammar:


This grammar matches in exactly the same way as the regular expression.
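The same idea of naming the parts can be sketched outside Grit. The following Python sketch is an analogue only, not the Grit grammar: the variable names are borrowed from the RFC 3986 component names, and the helper names and sample list are assumptions made for illustration. It composes the pattern from named pieces and checks that it agrees with the original expression on a few inputs.

```python
import re

# Each piece mirrors one component of the RFC 3986 Appendix B expression.
scheme = r"(?P<scheme>[^:/?#]+)"
authority = r"(?P<authority>[^/?#]*)"
path = r"(?P<path>[^?#]*)"
query = r"(?P<query>[^#]*)"
fragment = r"(?P<fragment>.*)"

# Compose the named pieces, keeping the delimiters in non-capturing groups.
NAMED = re.compile(
    rf"^(?:{scheme}:)?(?://{authority})?{path}(?:\?{query})?(?:#{fragment})?"
)
ORIGINAL = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")

def named_parts(text):
    """Components from the composed pattern, keyed by name."""
    return NAMED.match(text).groupdict()

def original_parts(text):
    """Components from the original pattern, picked out by group number."""
    m = ORIGINAL.match(text)
    return {
        "scheme": m.group(2),
        "authority": m.group(4),
        "path": m.group(5),
        "query": m.group(7),
        "fragment": m.group(9),
    }

samples = [
    "http://www.ics.uci.edu/pub/ietf/uri/#Related",
    "mailto:user@example.com",
    "/a/b?x=1#top",
    "urn:isbn:0451450523",
    "",
]
for s in samples:
    # Both patterns should split every sample identically.
    assert named_parts(s) == original_parts(s)

print(named_parts("/a/b?x=1#top"))
```

Naming the pieces does not change what is matched; it only makes each component addressable by name rather than by group number.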

We can use action functions to generate a more useful result: