Skip to main content

Background

MergeStat makes use of SQLite to power its SQL execution environment. However, data is not pre-loaded into a SQLite database file (as you might find in an ETL tool). Instead, we take advantage of a feature called the virtual table mechanism, which allows us to define how to access data sources, which SQLite then uses to fetch data as a query executes.

This allows us to query data sources directly, such as git repositories on disk, without a potentially time consuming pre-load step.

Naturally, our dialect of SQL is the SQLite implementation. MergeStat extends SQLite by defining table-valued functions and scalar functions (similar to what the popular osquery does).

Table-Valued Functions

Table-valued functions, as the name implies, return a table of results. This is how our commits table works for instance:

SELECT * FROM commits('https://github.com/mergestat/mergestat', 'HEAD')
SELECT * FROM commits -- no arguments supplied

Scalar Functions

Scalar functions return a scalar value, and so can only be used where a scalar value would be found:

-- GITHUB_TOKEN env must be set
SELECT github_stargazer_count('mergestat/mergestat')