Juttle’s Cross Platform Optimization Strategy

Juttle can analyze data held in any of several data stores. It has processors like reduce, sort, head, and tail that operate over a sequence of data points. For instance, head n emits the first n points it receives and drops the rest, while reduce count() increments a counter for each data point and emits the total once the input ends. The built-in implementations of these processors are written in Node.js.
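To make the two examples concrete, here is a minimal sketch of how processors like head and reduce count() might look in Node.js. This is not Juttle's actual source: the consume/eof interface and the names makeHead, makeCount, and run are assumptions made up for illustration.

```javascript
// Hypothetical stand-ins for Juttle's built-in processors (not the real Juttle code).
// Assumed interface: consume(point, emit) is called per point, eof(emit) at end of input.

// head n: emit the first n points received and drop the rest.
function makeHead(n) {
  let seen = 0;
  return {
    consume(point, emit) {
      if (seen < n) {
        seen += 1;
        emit(point);
      }
    },
    eof(emit) {
      // Nothing to flush: head emits as it goes.
    }
  };
}

// reduce count(): increment a counter per point, emit the total at end of input.
function makeCount() {
  let count = 0;
  return {
    consume(point, emit) {
      count += 1;
    },
    eof(emit) {
      emit({ count });
    }
  };
}

// Drive a processor over an in-memory array of points and collect its output.
function run(proc, points) {
  const out = [];
  const emit = (p) => out.push(p);
  for (const point of points) {
    proc.consume(point, emit);
  }
  proc.eof(emit);
  return out;
}

const points = [{ v: 1 }, { v: 2 }, { v: 3 }];
console.log(run(makeHead(2), points));
console.log(run(makeCount(), points));
```

Note that both processors have to see each point individually, which is exactly the cost the next paragraph is about.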

These Node.js processors are functionally correct, but they are often not the most efficient way to operate on data stored in a given database. For instance, the fastest way to count the records in a SQL table is SELECT COUNT(*) FROM my_table. But to count the records in a table using the Node.js implementation of reduce count(), we’d have to SELECT * FROM my_table, build a Juttle data point from every record, and perform the count in JavaScript. It would be much faster if Juttle knew it could use SELECT COUNT(*) to calculate the result of reduce count(). The Juttle Optimizer is the part of Juttle’s architecture that turns Juttle programs into efficient queries like this.
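As a rough illustration of the idea, the sketch below rewrites a tiny program description into SQL, pushing reduce count() into the database when it directly follows the read. The flowgraph shape, the read_sql processor name, and the toSql function are hypothetical; the real Juttle Optimizer may be structured quite differently.

```javascript
// Hypothetical pushdown rule (an assumption for illustration, not Juttle's optimizer API).
// A program is modeled as an array of steps, e.g.
//   [{ proc: 'read_sql', table: 'my_table' }, { proc: 'reduce', reducer: 'count' }]

function toSql(flowgraph) {
  const [source, ...rest] = flowgraph;
  if (!source || source.proc !== 'read_sql') {
    throw new Error('expected the program to start with a SQL read');
  }
  // Only accept simple identifiers so the table name is safe to interpolate.
  if (!/^[A-Za-z_][A-Za-z0-9_]*$/.test(source.table)) {
    throw new Error('unsupported table name: ' + source.table);
  }

  const next = rest[0];
  if (next && next.proc === 'reduce' && next.reducer === 'count') {
    // The database does the counting; no per-record points are built in JavaScript.
    return {
      sql: `SELECT COUNT(*) AS count FROM ${source.table}`,
      remaining: rest.slice(1)
    };
  }

  // Fallback: fetch every record and let the Node.js processors do the work.
  return {
    sql: `SELECT * FROM ${source.table}`,
    remaining: rest
  };
}

const optimized = toSql([
  { proc: 'read_sql', table: 'my_table' },
  { proc: 'reduce', reducer: 'count' }
]);
console.log(optimized.sql);
```

The remaining array holds whatever steps could not be pushed down, so they can still run through the ordinary Node.js implementations.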


Pushing the performance limits of node.js

Building a data analysis platform in JavaScript

Historical note: This was originally published as a post on Jut’s blog. Nobody wanted to pay for the product it describes, so Jut has gone in a very different direction of late, and Jut’s blog is a 404 at the moment. As a technical piece, though, I think it merits keeping alive.

We love node.js and JavaScript. We love them so much, in fact, that when Jut decided to build a streaming analytics platform from scratch, we put node.js at the center of it all. This decision has brought us several benefits, but along with those came a few unique scaling challenges. With some careful programming, we’ve been able to largely overcome node.js’s limitations, and I’ll share with you some of the tricks we used.
