Data Streamer Upgrade

2019-10-04 00:00:00 +0000

With this update, the data streamer now sends more comprehensive usage and event data to destination S3 buckets, Kinesis streams and user-provided applications.

Additional S3 Data (CSV)

Note:
Users of the S3 integration should ensure that minimum security standards are enforced. Details of configuring buckets for this purpose are described in Security Guidelines.

Event Data

The following additional properties are now included in event stream CSV files as column headers (where applicable):

  • endpoint_id
  • endpoint_name
  • endpoint_imei
  • endpoint_ip_address
  • endpoint_tags
  • event_source_id
  • event_source_name
  • sim_id
  • sim_iccid
  • msisdn_msisdn
  • sim_production_date
  • imsi_id
  • imsi_imsi
  • user_id
  • user_username
  • user_name

Usage Data

The following additional properties are now included in usage stream CSV files as column headers (where applicable):

  • endpoint_name
  • endpoint_ip_address
  • endpoint_tags
  • endpoint_imei
  • msisdn_msisdn
  • sim_production_date
  • operator_mncs
  • country_mcc

Additional Kinesis Data (JSON)

Kinesis Event Data

The following additional properties are now included in event stream JSON for endpoint, sim and imsi objects:

{
 "endpoint" : {
   "name":  "...",
   "imei":  "...",
   "ip_address": "...",
   "tags": "..."
 },
 "sim" : {
   "msisdn": "...",
   "production_date": "..."
 },
 "imsi" : {
   "imsi": "...",
   "import_date": "..."
 }
}

Kinesis Usage Data

The following additional properties are now included in usage stream JSON for endpoint, sim and operator objects:

{
  "endpoint" : {
    "name": "...",
    "ip_address": "...",
    "tags": "...",
    "imei": "..."
  },
  "sim": {
    "msisdn": "...",
    "production_date": "..."
  },
  "operator": {
    "mnc": "...",
    "country": {
      "mcc" : "..."
    }
  }
}
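
As an illustration of reading these properties, the sketch below shows how a consumer such as an AWS Lambda function subscribed to the Kinesis stream might extract them. It assumes the standard Lambda Kinesis event shape (base64-encoded payloads under Records[].kinesis.data) and that each record carries one usage JSON object; the names extract_usage_fields and handler are hypothetical and not part of the emnify platform.

```python
import base64
import json


def extract_usage_fields(usage: dict) -> dict:
    """Pick out the additional usage properties, tolerating absent objects."""
    endpoint = usage.get("endpoint") or {}
    sim = usage.get("sim") or {}
    operator = usage.get("operator") or {}
    country = operator.get("country") or {}
    return {
        "endpoint_name": endpoint.get("name"),
        "endpoint_ip_address": endpoint.get("ip_address"),
        "endpoint_tags": endpoint.get("tags"),
        "endpoint_imei": endpoint.get("imei"),
        "msisdn": sim.get("msisdn"),
        "sim_production_date": sim.get("production_date"),
        "operator_mnc": operator.get("mnc"),
        "country_mcc": country.get("mcc"),
    }


def handler(event, context=None):
    """Lambda-style entry point: decode each Kinesis record and extract fields."""
    results = []
    for record in event.get("Records", []):
        # Kinesis payloads arrive base64-encoded inside the Lambda event.
        payload = base64.b64decode(record["kinesis"]["data"])
        results.append(extract_usage_fields(json.loads(payload)))
    return results
```

Reading every property with .get means a record that lacks one of the objects yields None for those fields instead of raising an error.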

Additional REST API Data (JSON)

The data streamer can also send usage and event data in JSON format to a URL configured by the user. In this case, users provide an application that consumes the HTTP POST requests sent from the emnify platform. In this deployment, both REST API stream types (Rest and Bulk) include the additional data described under Kinesis Event Data and Kinesis Usage Data.
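
A minimal sketch of such a receiving application, using only the Python standard library, is shown below. It rests on assumptions that are not confirmed here: that a single-record request carries one JSON object, that a Bulk request carries a JSON array of such objects, and that a 200 response is an acceptable acknowledgement. The names normalize_payload, StreamHandler and serve are hypothetical.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def normalize_payload(body: bytes) -> list:
    """Return a list of records whether the body is one object or an array."""
    data = json.loads(body)
    return data if isinstance(data, list) else [data]


class StreamHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            records = normalize_payload(self.rfile.read(length))
        except ValueError:
            # Covers malformed JSON and undecodable bytes.
            self.send_response(400)
            self.end_headers()
            return
        for record in records:
            if not isinstance(record, dict):
                continue
            endpoint = record.get("endpoint") or {}
            print(endpoint.get("name"), endpoint.get("ip_address"))
        self.send_response(200)  # assumed acknowledgement status
        self.end_headers()


def serve(port: int = 8080) -> None:
    """Start the receiver; blocks until interrupted."""
    HTTPServer(("0.0.0.0", port), StreamHandler).serve_forever()
```

Calling serve() starts the listener. A production deployment would additionally need TLS and some form of request authentication, which this sketch leaves out.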

Security Guidelines

Event data that is sent via Data Streams may include usernames, email addresses and other data which can identify users or platform resources. The generated .csv files should therefore be treated as containing sensitive information. Precautions should be taken to ensure that the event and usage data in the destination S3 buckets are adequately secured.

The following three steps should be considered as the minimum security requirements for storing such data in S3:

  • Ensure that the S3 bucket is not publicly accessible. This can be applied in the Permissions tab of the S3 bucket.
  • Enable Server-Side Encryption on the bucket so that S3 encrypts objects before they are saved to disk and decrypts them when they are downloaded. This can be enabled in the Properties tab of the S3 bucket.
  • Restrict the IAM user whose credentials (key ID and key secret) are used to access the S3 bucket to writing to the required bucket only. This can be done in the AWS console in Users -> {Data Stream User} -> Permissions -> Add Permissions -> Attach Policy Directly -> Create Policy. An example JSON Policy would look like:
{
  "Id": "Policy1569854716005",
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Stmt1569854710254",
      "Action": [
        "s3:PutObject"
      ],
      "Effect": "Allow",
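      "Resource": "arn:aws:s3:::my-data-stream/*", // <1>
      "Principal": {
        "AWS": [
          "data-stream-writer" // <2>
        ]
      }
    }
  ]
}

<1> ARN of the destination S3 Bucket, followed by /* so that the statement covers the objects written to it
<2> The IAM user with data stream write access. The Principal element applies only when the policy is used as a bucket policy; omit it when attaching the policy directly to the IAM user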

TIP
AWS provide an online JSON Policy Generator which can be used to create a policy like the example given above.
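
The three steps above could also be applied programmatically. The following is a sketch rather than a definitive implementation: it assumes boto3-style S3 and IAM clients are passed in by the caller, uses SSE-S3 (AES256) as the encryption mode, and attaches the policy inline to the user, so it carries no Principal element. Because s3:PutObject acts on objects, the resource is written as the bucket ARN followed by /*. The function names and the policy name are hypothetical.

```python
import json

# Step 1: block all forms of public access to the bucket.
PUBLIC_ACCESS_BLOCK = {
    "BlockPublicAcls": True,
    "IgnorePublicAcls": True,
    "BlockPublicPolicy": True,
    "RestrictPublicBuckets": True,
}

# Step 2: default server-side encryption with S3-managed keys (assumed mode).
ENCRYPTION_CONFIG = {
    "Rules": [{"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}}]
}


def build_write_only_policy(bucket: str) -> dict:
    """Step 3: identity policy that only allows writing objects to one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DataStreamWriteOnly",
                "Effect": "Allow",
                "Action": ["s3:PutObject"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
            }
        ],
    }


def secure_data_stream_bucket(s3, iam, bucket: str, user_name: str) -> None:
    """Apply the three steps using boto3-style clients supplied by the caller."""
    s3.put_public_access_block(
        Bucket=bucket, PublicAccessBlockConfiguration=PUBLIC_ACCESS_BLOCK
    )
    s3.put_bucket_encryption(
        Bucket=bucket, ServerSideEncryptionConfiguration=ENCRYPTION_CONFIG
    )
    iam.put_user_policy(
        UserName=user_name,
        PolicyName="data-stream-write-only",  # hypothetical policy name
        PolicyDocument=json.dumps(build_write_only_policy(bucket)),
    )
```

With boto3 installed and credentials configured, this might be invoked as secure_data_stream_bucket(boto3.client("s3"), boto3.client("iam"), "my-data-stream", "data-stream-writer").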

Backwards Compatibility

The emnify Data Streamer is under active development and is regularly updated to improve performance and quality, so that users of the platform can gain rich streaming insights into usage and event data.

Versioning
The Data Streamer has no external version number that developers need to track. Instead, updates to the service are always made with the intent of preserving backwards compatibility. This means that existing JSON or CSV entities keep their ordering and are not renamed or replaced when the data streamer is updated.

Parsing S3 or Kinesis Artifacts
Users who have built custom integrations, in AWS or elsewhere, that consume the JSON or CSV generated for S3 or Kinesis streams should expect additional JSON or CSV data to be added in future. Custom integrations (which may run in Lambda functions, for example) should use mature, well-tested libraries for parsing CSV and JSON so that they remain compatible when additional properties or objects are added to data streams.
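
As a sketch of what this can look like for CSV, the example below uses Python's standard csv module and selects columns by header name rather than by position, so columns added later are simply ignored. The comma delimiter, the sample rows and the names WANTED_COLUMNS and read_rows are assumptions for illustration only.

```python
import csv
import io

# Columns this hypothetical integration relies on.
WANTED_COLUMNS = ("endpoint_name", "sim_iccid", "endpoint_ip_address")


def read_rows(csv_text: str) -> list:
    """Parse CSV text and keep only the wanted columns, looked up by header."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Unknown columns are dropped; wanted columns that are absent become None.
    return [{col: row.get(col) for col in WANTED_COLUMNS} for row in reader]


# Made-up sample data with one column the integration does not know about.
sample = (
    "endpoint_name,sim_iccid,endpoint_ip_address,some_future_column\n"
    '"sensor, roof",0000000000000000001,10.0.0.1,ignored\n'
)
rows = read_rows(sample)
```

Using a CSV library rather than splitting lines on commas also handles quoted values that themselves contain commas, as in the sample endpoint name.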