Azure Stream Analytics supports processing events in a variety of data formats (CSV, JSON, Avro).
JSON and Avro can contain complex types such as nested objects (records) or arrays. Many Stream Analytics customers use nested data and we have received various questions related to processing events containing records and/or arrays.
This blog post provides an overview of available record and array functions and includes query examples for typical operations.
Record data type
A Record is a collection of name/value pairs. The Record data type is used to represent JSON objects and Avro records when corresponding formats are used in the input data streams.
Let’s assume we are processing data events with sensor readings in JSON format and walk through few common operations over records.
Here is example of a single event:
{
"DeviceId" : "12345",
"Location" : {"Lat": 47, "Long": 122 }
"SensorReadings" :
{
"Temperature" : 80,
"Humidity" : 70,
"CustomSensor01" : 5,
"CustomSensor02" : 99
}
}
Access nested record fields
You can use dot notation to access nested fields. For example, this query selects lat/long coordinates of the device:
SELECT
DeviceID,
Location.Latitude,
Location.Longitude
FROM input
Access nested record fields when the filed name is dynamic
If property name is unknown at query authoring time, use the GetRecordPropertyValue.
For example, imagine our sample stream is joined with reference data containing thresholds for each device sensor:
SELECT
input.DeviceID,
thresholds.SensorName
FROM input
JOIN thresholds
ON
input.DeviceId = thresholds.DeviceId
WHERE
GetRecordPropertyValue(input.SensorReading, thresholds.SensorName) > thresholds.Value
Extract record fields as separate events
To convert record fields into separate events use CROSS APPLY operator with the GetRecordProperties function. For example, to convert our sample stream into stream of event with individual sensor readings use this query:
SELECT
event.DeviceID,
sensorReading.PropertyName,
sensorReading.PropertyValue
FROM input as event
CROSS APPLY GetRecordProperties(event.SensorReadings) AS sensorReading
Array data type
An Array is an ordered collection of values. The Array data type is used to represent JSON and Avro arrays when corresponding formats are used in the input data streams.
In this section we will explain how to perform common operations over array values. Examples assume input events that have “arrayField” property of array type.
Select array element at specified index
For example, to select the first array element use the GetArrayElement function:
SELECT
GetArrayElement(arrayField, 0) AS firstElement
FROM input
Select array length
To retrieve the length of an array, use the GetArrayLength function:
SELECT
GetArrayLength(arrayField) AS arrayLength
FROM input
Select all array elements as individual events
To extract all array elements as individual events use CROSS APPLY operator with the GetArrayElements function:
SELECT
arrayElement.ArrayIndex,
arrayElement.ArrayValue
FROM input as event
CROSS APPLY GetArrayElements(event.arrayField) AS arrayElement\