SQL Select Analytics

Retrieves documents from the database and enables the selection of one or many documents or attributes from one or many collections

Select executes 100% in database and utilizes database indexes when available. Select also utilizes sparse bit map indexes and can outperform commercial analytical SQL Engines in some cases

The best way to learn how select handles documents and joins is to run queries in the mongo shell and analyze the output


This feature is only available from your Nodechef Cloud Search instance. None of the examples provided below will work on a mongodb database. To deploy a cloud search instance, login and navigate to the dashboard. From deployments > deploy cloud search, you can then spin up a cloud search server

Syntax

SELECT [ ALL | DISTINCT ] { select_item [ [AS] output_name ] } [, ...] FROM [ from_collection | ( SUB SELECT.. ) ] WITH [ with_options ] [ INNER | LEFT JOIN condition ] [ WHERE condition ] [ GROUP BY expression [, ...] [ HAVING condition [, ...] ] ] [ ORDER BY { order_list [ ASC | DESC] } [, ...] ] [ LIMIT { count [ OFFSET start ] | start [ , count ] }


Select command

db.runCommand({ select : "SELECT 'foo'" })

Passing parameters

You can stringify the entire query without using parameters. Parameters are however more efficient in some cases such as querying against documents and binary types and so on

# Parameters must always start with the character ? db.runCommand({ select : "SELECT * from collection where _id = ?param1", params : { "?param1" : ObjectId("507f1f77bcf86cd799439011") } })

With - Useful flags

# You can make SELECT behave like any mongodb find query by using the as_document flag # For this use case, both inner and left join are not supported. ie when the as_document flag is used db.runCommand({select : "select id, type, '$' + price as 'price', name, batters.batter.type FROM desserts WITH (as_document) WHERE id = '0001'"}) # The above query can be rewritten as a mongoDB find query, of course without the as price expression db.desserts.find( { id : "0001" }, { type : 1, price : 1, name : 1, "batters.batter.type" : 1 } ) # Select does not unwind arrays automatically. You can however override this # by using the unwind_arrays flags. This is similar to the $unwind aggregate operator db.runCommand({select : "select id, name, topping FROM desserts WITH (unwind_arrays)"})

Array positional operators

The concept of array positional operators come from mongodb. Positional operators allow you to project a specific element in an array based on the query criteria specified. Nodechef supports two positioanl operators which are the single position projector ($) and the multi position projector ($*)

Single position projector ($)

This projector allows you to project a single element from an array using the query predicate specified. The query returns the first element in the given array that matches the predicate. Unlike mongodb, the Nodechef single position projector can project elements nested multiple levels in an array.

{ "_id" : 1 "name" : "Dave Gahan" "medications" : [ { "id" : 23, "name" : "Dilaudid", "type" : "Rx", "prescriptions" : [ { "id" : 77, "quantity" : 45, "started" : 2009-02-01 } ] }, { "id" : 41, "name" : "Oxycodone", "type" : "Rx", "prescriptions" : [ { "id" : 77, "quantity" : 45, "started" : 2009-02-01 } ] } ] } SELECT _id, name, medications.prescriptions.$ as p FROM patients WHERE medications.id = 23 AND medications.prescriptions.id = 77 The above query will return the below document { "_id" : 1, "name" : "Dave Gahan", "p" : [ { "id" : 77, "quantity" : 45, "started" : 2009-02-01 } ] }

Multi position projector ($*)

This operator can be used to project multiple elements that match the query critera

{ "_id" : ObjectId("4d2d8deff4e6c1d71fc29a07"), "user_id" : "714638ba-2e08-2168-2b99-00002f3d43c0", "events" : [ { "handled" : 1, "profile" : 10, "data" : { } } { "handled" : 1, "profile" : 10, "data" : { } } { "handled" : 0, "profile" : 20, "data" : { } } ... ] } Consider the above document. If one wants to query for all events sub documents where the events.handle equals 1, the $* operator can be used to accomplish this. SELECT _id, events.$* as handled_events FROM devices WHERE events.handled = 1 The above query will return the below document. { "_id" : ObjectId("4d2d8deff4e6c1d71fc29a07"), "handled_events" : [ { "handled" : 1, "profile" : 10, "data" : { } } { "handled" : 1, "profile" : 10, "data" : { } } ] } SELECT _id, events.$*.profile as events_profile FROM devices WHERE events.handled = 1 The above query returns the below document. { "_id" : ObjectId("4d2d8deff4e6c1d71fc29a07"), "events_profle" : [ 10, 10 ] }

Array index projection

To project an element at a specific index in an array. You can specify the index of the element using dot notation. It is important to avoid using numbers as attribute names for arrays in your documents as this can possibly cause unexepcted behaviours when querying the array by index.

{ "_id" : ObjectId("4d2d8deff4e6c1d71fc29a07"), "user_id" : "714638ba-2e08-2168-2b99-00002f3d43c0", "events" : [ { "handled" : 1, "profile" : 10, "data" : { } } { "handled" : 1, "profile" : 10, "data" : { } } ... ] } From the above document, project the first document in the events array as in. { "_id" : ObjectId("4d2d8deff4e6c1d71fc29a07"), "first_event" : [ { "handled" : 1, "profile" : 10, "data" : { } } ] } SELECT _id, events.0 as first_event FROM devices WHERE user_id = '714638ba-2e08-2168-2b99-00002f3d43c0'

Hierarchical joins

The arrayfromSelect and documentfromSelect functions shorthand "[]" and "{}" respectively can be used to perform hierarchical joins. These two functions accept a select statement as an input parameter. These functions can then be nested multiple levels to output a rich JSON document with input from multiple tables.

Example

The below query joins two collections to the NewsFeed collection to retrieve Comments associated with an entry in the newsFeed as well as the user details associated with the comment. Note only 20 entries in the NewsFeed collection are returned as well as only the first 10 comments per post are returned.

SELECT summary, post_date, _id, [( SELECT _id, text, {(SELECT _id, display_name FROM Users where _id = c.userid)} as user FROM comments c WHERE postId = n.id LIMIT 10 )] as comments FROM news_feed n WHERE tags = 'tech' and postdate < '2015-10-02T01:11:18.965Z' LIMIT 20 # The above query will output documents with the below structure. { "summary" : "...", "post_date" : "...", "_id" : "..", "comments" : [ { "_id" : "...", "text" : "...", "user" : { "_id" : "...", "display_name" : "..." } } ] }

Select cursors

By default select returns a single batch of results, in some cases a cursor can be useful. To create a cursor, use the enablecursor flag. Note this cursor returned is the same returned from any mongodb aggregate query. The PHP driver supports this as a CommandCursor. However the node.js driver does not directly support this. Since you might not be able to hack the driver directly, For this cursor to be useful, MongoDB 3.2 introduces a new command getMore to make it easy to iterate over a command cursor like aggregate and Nodechef select.

You can repeatedly call this function to exhaust all the results in this cursor. By default SELECT returns 1024 documents when the limit clause is ommitted.

Example

{ getmore : <cursor_id>, batchsize : <int>, collection <string>, maxTimeMS : <int> } db.runCommand({select : "select id, name, topping FROM desserts WITH (enablecursor)"}) # Returns: { "cursor" : { "id" : NumberLong(0), "ns" : "test.desserts", "firstBatch" : [ ] }, "ok" : 1 } # Retrieve the cursor.id value and use it to call getmore until all results are read. db.runCommand( { getmore : 2, batchsize : 2048, collection : "desserts" })

Multi Select command

Executes one or more select statements either sequentially or in parallel.

In some cases, you might want to query several collections in a way which is not possible with a join. You can avoid the cost of multiple round trips to the database by sending multiple select statements in a single commands. Each select statement can be executed in parallel on the database if required.

db.runCommand({ multiSelect : 1, ordered : <1 | 0 | true | false> parallelExec : <1 | 0 | true | false> workers : <int> statements: ["sql_statement", "sql_statement"] })

Behaviour

  • Ordered: Optional. If true, perform an ordered execution of the select statements in the array, and if an error occurs with one of the statements, the database will return without processing the remaining statements in the array. Only applies when parallelExec is false. Default value is false
  • ParallelExec: Optional. Instructs the database to execute the select statements in parallel. Default value is false
  • Workers: Optional. Total number of threads to use for the parallel execution. The database resets this parameter to the number of CPUs * 2 in cases where the input exceeds CPUs * 2.
  • Statements: Required. One or more sql statements to be executed

db.runCommand({ multiSelect : 1, parallelExec : 1, statements : [ "select make, count(*) from listings_collection group by make", "select model, count(*) from listings_collection group by model", "select fuel_type, count(*) from listings_collection group by fuel_type" ] })


Case

# Simple CASE expression: CASE input_expression WHEN when_expression THEN result_expression [ ...n ] [ ELSE else_result_expression ] END # Searched CASE expression: CASE WHEN Boolean_expression THEN result_expression [ ...n ] [ ELSE else_result_expression ] END


Operators

Use arithmetic + (Add) for string concatenation

Arithmetic

  • + (Add)
  • - (Subtract)
  • * (Multiply)
  • / (Divide)
  • % (Modulo)

Bitwise

  • & (AND)
  • | (OR)
  • ^ (XOR)

Comparison

  • = (Equals)
  • > (Greater than)
  • < (Less than)
  • >= (Greater than or Equals)
  • <= (Less than or Equals)
  • <> (Not eqauls)
  • != (Not eqauls)

Logical

  • ALL
  • AND
  • ANY
  • BETWEEN
  • EXISTS
  • IN
  • LIKE
  • NOT
  • OR


Aggregate functions


String functions


Math functions

Parameter number refers to any integer or floating point data type which includes int32, int64, float, double, decimal

input means the function returns a value of data type as that of the input parameter


Date functions


Array functions


Cast functions


Geo search functions

Geo search functions can be used in combination with any other group of functions such as text, math etc. Nodechef uses spherical geometry for all operations except when the geojson object specified is a bounding box


Text functions