May 2016 - davidvgalbraith

Juttle can analyze data that is in any of several data stores. It has processors like reduce, sort, head, and tail that operate over a sequence of data points. For instance, head n emits the first n points it receives and drops the rest. reduce count() increments a counter for each data point and returns the total. The built-in implementations of these processors are written in Node.js.
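
To make the semantics of head and reduce count() concrete, here is a minimal Python sketch. It is not Juttle's Node.js implementation: the function names head and reduce_count and the dict-based points are assumptions made purely for illustration.

```python
from itertools import islice
from typing import Any, Dict, Iterable, Iterator

# Hypothetical stand-in for a Juttle data point: a plain dict of fields.
Point = Dict[str, Any]


def head(points: Iterable[Point], n: int) -> Iterator[Point]:
    """Emit the first n points received and drop the rest."""
    return islice(points, n)


def reduce_count(points: Iterable[Point]) -> int:
    """Increment a counter for each point and return the total."""
    count = 0
    for _ in points:
        count += 1
    return count


# Illustrative data only.
points = [
    {"host": "a", "value": 1},
    {"host": "b", "value": 2},
    {"host": "c", "value": 3},
]

first_two = list(head(points, 2))
total = reduce_count(points)
```

Both functions walk the sequence one point at a time, which is the property that makes them easy to implement generically and expensive to run when the data lives in a database.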

These Node.js processors are functionally correct, but they are often not the most efficient way to operate on data stored in a given database. For instance, the fastest way to count the records in a SQL table is SELECT COUNT(*) FROM my_table. But to count the records in a table using the Node.js implementation of reduce count(), we’d have to SELECT * FROM my_table, build a Juttle data point from every record, and perform the count in JavaScript. It would be much faster if Juttle knew it could use SELECT COUNT(*) to calculate the result of reduce count(). The Juttle Optimizer is the piece of Juttle’s architecture that turns Juttle programs into efficient queries like this.
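
The difference between the two counting strategies can be sketched with Python's built-in sqlite3 module. This is an illustration under stated assumptions, not the Juttle Optimizer itself: the table schema and the helper names make_table, count_unoptimized, and count_optimized are hypothetical.

```python
import sqlite3


def make_table(rows: int) -> sqlite3.Connection:
    """Create an in-memory database with a my_table of the given size."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema, chosen only for this example.
    conn.execute("CREATE TABLE my_table (id INTEGER PRIMARY KEY, value REAL)")
    conn.executemany(
        "INSERT INTO my_table (value) VALUES (?)",
        [(float(i),) for i in range(rows)],
    )
    conn.commit()
    return conn


def count_unoptimized(conn: sqlite3.Connection) -> int:
    """Fetch every record, build a point from each one, and count in Python."""
    count = 0
    for row in conn.execute("SELECT * FROM my_table"):
        point = {"id": row[0], "value": row[1]}  # built, then discarded
        count += 1
    return count


def count_optimized(conn: sqlite3.Connection) -> int:
    """Let the database do the counting and return a single number."""
    (total,) = conn.execute("SELECT COUNT(*) FROM my_table").fetchone()
    return total


conn = make_table(1000)
slow_total = count_unoptimized(conn)
fast_total = count_optimized(conn)
```

Both functions return the same number, but the unoptimized version transfers and materializes every row, while the optimized version receives one row from the database. The work saved grows with the size of the table.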
