Report/Query DSL
Barney Boisvert
bboisvert@gmail.com
-------------------

(From http://www.barneyb.com/barneyblog/2008/01/05/reportquery-dsls/)

I use a pair of stacked DSLs (Domain Specific Languages) for searching,
reporting, and goal management in a couple applications. A discussion at
work (with Joshua and Koen) provided the solution for the final piece I
wanted to implement before releasing them. The first layer is for
parsing query strings, and the latter for using query strings to build
up report structures and such.

For the impatient, code is available at

	https://ssl.barneyb.com/svn/barneyb/reporting_dsl/trunk/

You want to read the docs in querydsl.cfc and reportdsl.cfc.

The Query DSL is quite simple:

    * spaces delimit terms
    * terms are ANDed together
    * terms can be quoted if they contain spaces
    * terms can contain a prefix (predicate) separated from the value by
      a colon
    * a leading dash (or exclamation point) can be used to negate a term
    * the OR keyword (caps mandatory) can be used to create alternatives
    * parentheses can be used to group terms

Here's some examples:

    * cat type:pet
    * cat -manx
    * cat OR dog
    * cat (-manx type:"house pet") OR dog

The last item is not possible with Google Search constructs, and is the
only construct available that Google doesn't provide. Which isn't to
suggest that the DSL supports the gamut of features Google supports.

The result of using the DSL parser is a graph of 'criterion' instances.
For example 'andcriterion', 'orcriterion', or 'simplecriterion'. The
second example above would return an 'andcriterion' instance with it's
left side pointing to a 'simplecriterion' instance for "cat", and the
right side pointing to another (negated) 'simplecriterion' instance for
"manx". More complex expression result in more complex graphs, but they
graph is always singly rooted.

The Report DSL is a bit more complex, and leverages the Query DSL. It is
line oriented, with a single-character command specifier at the
beginning of each line. A document contains one or more "condition"
lines (prefixed with a '+' or an '&') which are comprised of a value and
a query string. A document may also have one or more "globals" lines
(prefixed with a '+') that only contain a query string, and which is
added to each condition's query string (to reduce duplication). Here's a
sample document:

+ tag:dining -tag:home
? McDonalds : tag:mcdonalds
? Jack in the Box : tag:"jack in the box"
? Cheesecake Factory : tag:"cheesecake factory"
& Fast Food : tag:"fast food"
& Total : *

There is no real difference between the '?' and '&' lines (referred to
as conditions and aggregates respectively), the two variants are there
to provide a built-in differentiator. Here I'm using them as their names
imply: the conditions are raw conditions, and the aggregates aggregate
the conditions together. If this report document were used to generate a
chart, you might see the conditions as column series and the aggregates
as line series.

The result of parsing a document is a structure with 'coreCriterion',
'conditions' and 'aggregates' keys, all of which are optional if the
document doesn't specify anything to fill them. The former holds a
criterion graph (since it's just a query string). The latter two are
arrays of structs, where each struct contains 'value', 'query', and
'criterion' keys. The first two are simply the two halves of the
corresponding line of the document, and the 'criterion' key holds the
parsed query string.

So what is this good for? By itself, not much, but with a method for
converting between a criterion graph and a SQL expression, the Query DSL
will give you an easy way to build very granular search functionality on
your site. Once you have that, leveraging the Report DSL allows creation
of fairly complex reports with a minimum of fuss (do convert each
condition's criterion to SQL, run your query, and feed it into a chart
series). Of course, the output needn't be a chart, but that's a good
example.

What's most important here is that both of these DSLs are for users to
use, not coders. I.e. strings/documents are created by users to
customize the behaviour of your application, allowing them to build very
specific reports very easily. And the definitions for those reports are
simple strings, which makes persisting them a breeze.

With a little more creativity, you can get other behaviour. I alluded to
goal management up top. If you were to take the report document above,
remove the aggregates (the '&' lines), and replace the values
(restaurant names) with integers, according to how desirable they are,
you'd have a way to rank given dining adventures. Write yourslef a
method for converting the parsed document into a SQL CASE..END statement
(leveraging your criterion-to-SQL method, of course), you have a really
easy way to assign a "points" value to the items you're querying. Wrap
that with some sort of structure for managing a points-per-time-period
structure, and you have a very flexible and easy to use goal tracking
system.

Like so much else, I don't know how useful this is for others. The Query
DSL probably is, but the Report DSL might not be. But it's made my life
much easier, so worth the effort I've put into it. Code is only
available via Subversion (browser or SVN client) at the URL above. All
paths are relative, so you can svn:externals (use a revision number!) it
into your existing package-space and reference the CFCs natively. There
are docs and more examples in the files as well.