OpenContrail Analytics Query API

September 23, 2014 | Analytics, Uncategorized
Overview

The OpenContrail analytics platform provides a rich interface for querying stored analytics data.

Before we venture into the actual API details, let us look at the overall architecture of analytics query processing:

sept23_post_image1

Analytics query APIs are available as REST APIs and are conceptualized as queries on logical tables. Currently the OpenContrail UI uses the analytics query API, but any other non-Contrail client can use these REST APIs too.

All analytics logical tables look like the following:

sept23_post_image2

All rows in logical tables have three different types of columns:

  1. Timestamp column
  2. Index columns: Taken together they are unique per row.
  3. Other columns: The remaining, non-index columns.

For example, the “Flow Series Table” looks like the following:

sept23_post_image3

The shaded columns above are index columns.

Query API Parameters

The OpenContrail analytics query API is modeled on SQL. As a reminder, a typical SQL query on a table looks like:

SELECT col1, col2, … coln FROM  tablename WHERE index1=value1.

We do not yet support the SQL syntax but the parameters of the analytics query API and SQL query are similar.

Each analytics query has the following parameters:

  1. Time range: Timestamp range of rows over which the query has to be performed.
  2. FROM parameter: The name of the logical table that is being queried.
  3. SELECT parameters: It contains the list of columns that need to be returned in the result.
  4. WHERE parameters: It contains the clauses based on index column values to determine the rows from which results have to be shown.
  5. FILTER parameters: It indicates how the result should be filtered after database query and before being returned to the API invoker.
  6. SORT parameters: It indicates if and how the result should be sorted.

In the rest of this post we will use a simple example query that returns all ingress TCP flows that were active in the last 12 hours with source VN default-domain:default-project:ip-fabric. As part of the query result we also want the setup time of the flows.

Below is the UI screen for this query, followed by the equivalent JSON query:

sept23_post_image4

{
  "table": "FlowRecordTable",
  "start_time": "now-43200s",
  "end_time": "now",
  "select_fields": ["vrouter", "sourcevn", "sourceip", "sport", "destvn", "destip", "dport", "protocol", "direction_ing", "setup_time"],
  "limit": 150000,
  "where": [[
    {"name": "sourcevn", "value": "default-domain:default-project:ip-fabric", "op": 1},
    {"name": "protocol", "value": "6", "op": 1}
  ]],
  "dir": 1
}

Table name

The logical table on which the query is performed is indicated by the JSON field “table”. In our example query, we are querying “FlowRecordTable”, in which each row corresponds to one flow record.

Query time range

You always have to provide the time range of the analytics data over which the query applies. The time range is indicated by the JSON fields “start_time” and “end_time”.

The values of these JSON fields can be absolute times or relative times (as in our example query, where “now-43200s” means 43200 seconds, or 12 hours, before the current time). For an absolute time you just specify the date-time string.
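
As a small illustration of the relative form, the sketch below builds the time range used in the example query. The helper name relative_start is hypothetical, and only the “now-<N>s” pattern that appears in the example query is assumed here; other relative formats may exist but are not covered by this post.

```python
def relative_start(hours):
    """Return a relative start_time string such as 'now-43200s'.

    Hypothetical helper: it only produces the 'now-<N>s' form
    seen in the example query.
    """
    seconds = int(hours * 3600)  # 12 hours -> 43200 seconds
    return f"now-{seconds}s"


# Time range for "the last 12 hours", as in the example query.
time_range = {"start_time": relative_start(12), "end_time": "now"}
print(time_range)
```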

Direction

For flow queries you should always specify the direction using the “dir” JSON field. A value of 0 selects egress traffic, and a value of 1 selects ingress traffic.

Select fields

The “select_fields” JSON field indicates the fields (columns) of the logical table that should be returned as part of the query result. Its semantics are the same as those of the SELECT operation in SQL.
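
To make the SELECT analogy concrete, here is a minimal sketch of what projecting a row onto the selected fields means. It runs on the client side purely for illustration; the function name project and the sample row values are made up, and this is not how the analytics engine itself is implemented.

```python
def project(row, select_fields):
    """Keep only the requested columns of a row, like SELECT in SQL."""
    return {name: row[name] for name in select_fields if name in row}


# Illustrative row; the values are invented for this sketch.
sample_row = {
    "sourcevn": "default-domain:default-project:ip-fabric",
    "sourceip": "10.0.0.1",
    "protocol": 6,
    "sport": 34567,
}

selected = project(sample_row, ["sourcevn", "protocol"])
print(selected)
```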

WHERE clause

The WHERE clause has semantics similar to those of the WHERE operator in SQL. It is represented by the “where” JSON field.

The value of the “where” field has the following form:

Where_clause := [AND_Clause1, AND_Clause2, … AND_ClauseN]
AND_Clause := [Match_Clause1, Match_Clause2, … Match_ClauseK]
Match_Clause := {"name": <field_name>, "value": <value_name>, "op": <op>}

In other words, the where clause is an OR expression of AND clauses that picks a subset of rows from the logical table. Each AND clause is a list of match clauses that must all hold, and each match clause names a field to match against.

In our example query, the match clauses are:

{"name": "sourcevn", "value": "default-domain:default-project:ip-fabric", "op": 1}

{"name": "protocol", "value": "6", "op": 1}

“name” indicates the field to match against. “value” is the value used for matching/filtering rows. “op” indicates the matching operation. A value of 1 for “op” means the equality operator.

Because the WHERE clause is an OR of AND clauses, you can represent any Boolean combination of AND and OR over match clauses by rewriting it in this form.
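
The OR-of-ANDs semantics can be sketched in a few lines of Python. This is only an illustration of how a “where” value selects rows, not the analytics engine's implementation. The function names are hypothetical, only the equality operator (op 1) described in this post is handled, and comparing values as strings is an assumption made for the sketch.

```python
OP_EQUAL = 1  # the only "op" value described in this post


def match(row, clause):
    """Evaluate one match clause against a row (equality only)."""
    if clause["op"] != OP_EQUAL:
        raise NotImplementedError("only op 1 (equality) is sketched here")
    # Assumption: compare as strings, since the example passes "6" for protocol.
    return str(row.get(clause["name"])) == str(clause["value"])


def row_matches(row, where):
    """A row is selected if ANY AND clause has ALL of its match clauses true."""
    return any(all(match(row, c) for c in and_clause) for and_clause in where)


# The "where" value from the example query: one AND clause with two matches.
example_where = [[
    {"name": "sourcevn", "value": "default-domain:default-project:ip-fabric", "op": 1},
    {"name": "protocol", "value": "6", "op": 1},
]]

tcp_row = {"sourcevn": "default-domain:default-project:ip-fabric", "protocol": 6}
udp_row = {"sourcevn": "default-domain:default-project:ip-fabric", "protocol": 17}

print(row_matches(tcp_row, example_where))
print(row_matches(udp_row, example_where))
```

Adding a second inner list to example_where would OR another group of conditions with the first one.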

Post processing the query results

You can ask the analytics engine to do some post-processing after the basic query processing, such as sorting and filtering.

In our example query, we use the “limit” JSON field to set the maximum number of records to return. Setting a limit is advisable, since an analytics API client may not be able to handle an unbounded amount of data.
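
Putting the pieces together, a client could prepare the example query for submission roughly as follows. This is a sketch under stated assumptions: the post does not give the endpoint, so the host, port and path in QUERY_URL are assumptions that should be checked against the reference guide, and build_request is a hypothetical helper. The request is only constructed here, not sent.

```python
import json
import urllib.request

# Assumed endpoint; host, port and path are not stated in this post.
QUERY_URL = "http://127.0.0.1:8081/analytics/query"

# The example query from this post, including the "limit" on returned records.
query = {
    "table": "FlowRecordTable",
    "start_time": "now-43200s",
    "end_time": "now",
    "select_fields": ["vrouter", "sourcevn", "sourceip", "sport", "destvn",
                      "destip", "dport", "protocol", "direction_ing", "setup_time"],
    "limit": 150000,
    "where": [[
        {"name": "sourcevn", "value": "default-domain:default-project:ip-fabric", "op": 1},
        {"name": "protocol", "value": "6", "op": 1},
    ]],
    "dir": 1,
}


def build_request(query, url=QUERY_URL):
    """Build (but do not send) an HTTP POST carrying the query as JSON."""
    body = json.dumps(query).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_request(query)
# Sending it would be: urllib.request.urlopen(request)
print(request.get_method(), request.full_url)
```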

Conclusion

The OpenContrail Analytics Query API is a rich API that lets you drill down to exactly the information you need from the analytics engine. The example discussed above is just the tip of the iceberg of what is possible with the API. Please see the reference guide for full details.

Reference Guide: http://www.juniper.net/techpubs/en_US/contrail2.2/topics/task/configuration/analytics-apis-vnc.html