During this year’s edition of the Paris Big Data conference, amid a seemingly endless row of booths making flashy promises of performance and scalability, one company stood out from the rest. Couchbase, the document-oriented NoSQL database vendor, came to the conference armed with the newly released sixth version of its Couchbase Server database.

After its launch in 2010, Couchbase spent its first years struggling to find an audience. MongoDB was already being adopted by a large number of companies and rapidly became the norm among document-oriented databases, leaving little room for its competitors to showcase what they had under the hood.

But this year, Couchbase was aiming for a seat among the big players, and it came with the tech to warrant one. In this article, we’ll discuss the two key additions to Couchbase’s offering and how this database could be an interesting alternative to MongoDB in a wide range of use cases.

Adding NoETL to NoSQL

The Extract-Transform-Load (ETL) mechanism has been the norm in data-driven analytics since the dawn of the Business Intelligence era two decades ago. Yet it has long been regarded as one of the most flawed concepts of the Business Intelligence/Big Data universe. ETL is time-consuming, it gets complex very quickly, and it requires multiple resources and tools to function properly. The figure below shows a traditional ETL pipeline, and the inefficiency of this pattern is immediately apparent:

ETL Pipeline (source: Couchbase)

With the release of Couchbase Server 6.0, the company introduced Couchbase Analytics. It is a data-analysis service that lets users analyze their data in real time without going through an ETL pipeline, as shown below:


Couchbase Analytics (source: Couchbase)

By running complex ad-hoc queries directly on Couchbase, you can serve both operational and analytical needs within the same application, which is a huge step towards minimizing the complexity of data pipelines. To learn more about this feature, see the extensive presentation of the Analytics service that Couchbase offers on its blog.

Betting on SQL++

SQL++ is a highly expressive and composable query language for semi-structured data that supports both the SQL and the JSON data models, which means that it can run queries with or without a schema. Couchbase built its own implementation, N1QL for Analytics, on top of this language. N1QL for Analytics is tuned for analytical queries over JSON data and offers a large set of built-in functions. This allows Couchbase to bring the power of SQL to the Big Data era and lets users query their JSON data without applying a schema. It is also the basis of the Couchbase Analytics module discussed above.

The following query could easily be mistaken for a traditional SQL statement, but it is in fact a SQL++ query that uses the JSON data model for both input and output:

SELECT  c.custid,
        c.name,
        o.orderno,
        o.order_date,
        o.ship_date
FROM orders o
JOIN customers c ON o.custid = c.custid
WHERE o.orderno = 1004;
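
To make the "JSON in, JSON out" idea concrete, here is a minimal Python sketch of roughly what such a join computes over schema-less documents. It is an illustration only, not Couchbase's implementation: the sample documents and the helper name orders_with_customer are hypothetical, and the sketch assumes that orderno, order_date and ship_date live on the order documents. Fields that are absent from a document are simply left out of the result; a real SQL++ engine has its own rules for missing values.

```python
from collections import defaultdict

# Hypothetical sample documents. Field names follow the query above;
# the values are invented for illustration.
customers = [
    {"custid": "C13", "name": "T. Cody"},
    {"custid": "C25", "name": "M. Sinclair", "rating": 690},  # extra field, no schema required
]
orders = [
    {"orderno": 1004, "custid": "C25", "order_date": "2017-07-10", "ship_date": "2017-07-15"},
    {"orderno": 1005, "custid": "C13", "order_date": "2017-08-30"},  # no ship_date field at all
]


def orders_with_customer(orders, customers, orderno):
    """Emulate the inner join: one JSON object per matching (order, customer) pair."""
    # Index customers by custid so each order can find its matches quickly.
    by_custid = defaultdict(list)
    for c in customers:
        by_custid[c.get("custid")].append(c)

    results = []
    for o in orders:
        if o.get("orderno") != orderno:  # WHERE o.orderno = ...
            continue
        for c in by_custid.get(o.get("custid"), []):  # JOIN ... ON o.custid = c.custid
            row = {}
            # Copy only the projected fields that actually exist in each document.
            for field in ("custid", "name"):
                if field in c:
                    row[field] = c[field]
            for field in ("orderno", "order_date", "ship_date"):
                if field in o:
                    row[field] = o[field]
            results.append(row)
    return results


result = orders_with_customer(orders, customers, 1004)
print(result)
```

Calling the same helper with order number 1005 returns an object without a ship_date key, which is the kind of irregular data a fixed relational schema would not accept as-is.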

More about SQL++ and how it manages to deliver the best of both worlds is available on the website. There is also an interactive tutorial that makes a great first step towards mastering SQL++.

With these two additions to an already impressive and feature-rich ecosystem, Couchbase has taken a big step on its path towards dethroning MongoDB. A company that keeps widening its range of use cases while adding features like these definitely deserves the attention it has been getting.