RDPv3 Transmit

An RDPv3 data transmit transfers data to the Sight Machine cloud. It preserves most of the Remote Data Post Transmit functionality; the key difference from rdpv2 is that it produces Kafka messages instead of MongoDB sslogs.

Additionally:

  • The data format keeps its original structure instead of being wrapped inside a nested value such as data.fieldvalues.

This plugin works together with the sight_machine_cloud configuration section, sharing its API_key, API_key_ID, and base_url properties.

Example:

If we want to upload our records to our Sight Machine cloud, https://cloudyday.sightmachine.io, our configuration will look something like this:

{
    "sight_machine_cloud": {
        "base_url": "https://cloudyday.sightmachine.io",
        "api_key": "Castle_in_the_sky",
        "api_key_id": "Pazu"
    },
    "data_receiver": [{}],
    "data_transmit": [
        {
            "transmit_type": "rdpv3",
            "transmit_name": "tiger_moth",
            "topic_prefix": "cloudyday.some_category"
        }
    ]
}

Configuration:

The following required and optional properties can be configured for a Remote Data Post (RDPv3) transmit.

  • transmit_name: ID for the transmit. It must be unique.

  • transmit_type: Method used to transmit records. Use rdpv3 for this plugin.

  • filter_stream: A list of streams that will use the transmit. Each entry is either * (all streams) or asset:stream.

  • base_url: URL of your Service Environment (e.g. https://my-environment.sightmachine.io). If this option is omitted, then the base_url configured in the sight_machine_cloud section will be used.

  • API_key: API key used to authorize transmitting data to the Sight Machine cloud. If this option is omitted, then the value in the sight_machine_cloud section is used instead.

  • API_key_ID: API key ID used to authorize transmitting data to the Sight Machine cloud. If this option is omitted, then the value in the sight_machine_cloud section is used instead.

  • timeout: Number of seconds to wait for a request to complete before timing out. The default is two minutes.

  • poll_interval: Maximum number of seconds to wait between RDPv3 requests.

  • max_request_records: Maximum number of records to send in a single RDPv3 request.

  • max_request_size_bytes: Maximum number of bytes allowed in a single request. Whatever value is set here, the request size is capped at 4 MB.

  • topic_prefix: Base name of the topic(s) to produce Kafka messages to. The prefix root must match the base_url. For example, with a base_url of https://my-environment.sightmachine.io, a valid prefix is my-environment.aaa_123.

  • topic_suffix_field: Name of the field in the dataframe that supplies the topic suffix. If topic_suffix_field is not specified, the default topic suffix is asset:stream. The number of unique values determines how many different topics are produced.
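
A few of the rules in the list above can be written down as small checks. The following is only a sketch, not the plugin's implementation: the function names are hypothetical, "4 MB" is assumed to mean 4 * 1024 * 1024 bytes, the "prefix root" is assumed to be the part of topic_prefix before the first dot, and bathhouse:steam is a made-up stream name used for contrast.

```python
from urllib.parse import urlparse

# Assumption: "4 MB" is read as 4 * 1024 * 1024 bytes; the exact figure is not stated.
HARD_CAP_BYTES = 4 * 1024 * 1024


def stream_selected(filter_stream, asset, stream):
    """True if any filter entry is "*" or exactly "asset:stream"."""
    return any(entry in ("*", f"{asset}:{stream}") for entry in filter_stream)


def prefix_matches_base_url(topic_prefix, base_url):
    """Compare the prefix root with the first label of the base_url host name."""
    host = urlparse(base_url).hostname or ""
    return topic_prefix.split(".")[0] == host.split(".")[0]


def effective_request_bytes(max_request_size_bytes):
    """Apply the 4 MB ceiling to a configured max_request_size_bytes."""
    return min(max_request_size_bytes, HARD_CAP_BYTES)


checks = {
    "wildcard": stream_selected(["*"], "bathhouse", "water"),
    "exact": stream_selected(["bathhouse:water"], "bathhouse", "water"),
    # "bathhouse:steam" is a hypothetical stream that the filter should not select.
    "other": stream_selected(["bathhouse:water"], "bathhouse", "steam"),
    "prefix_ok": prefix_matches_base_url(
        "yubaba.some_category", "https://yubaba.sightmachine.io"
    ),
    "prefix_bad": prefix_matches_base_url(
        "cloudyday.some_category", "https://yubaba.sightmachine.io"
    ),
    "capped": effective_request_bytes(10 * 1024 * 1024),
}
print(checks)
```

Running checks like these against a configuration before deploying it may catch a topic_prefix that was copied from another environment.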

Note:

By supplying the optional properties base_url, API_key, and API_key_ID in the transmit section, you can upload to a different Sight Machine cloud instance than the one specified in the sight_machine_cloud section.

For example, if we wanted to upload records from the bathhouse:water stream to a specific Sight Machine cloud instance, https://yubaba.sightmachine.io, our configuration would look something like this:

{
    "sight_machine_cloud": {
        "base_url": "https://cloudyday.sightmachine.io",
        "api_key": "Castle_in_the_sky",
        "api_key_id": "Pazu"
    },
    "data_receiver": [{}],
    "data_transmit": [
        {
            "transmit_name": "sen",
            "transmit_type": "rdpv3",
            "filter_stream": [
                "bathhouse:water"
            ],
            "base_url": "https://yubaba.sightmachine.io",
            "API_key": "Spirited_Away_some_api_key_content_here",
            "API_key_ID": "Haku",
            "topic_prefix": "yubaba.some_category"
        }
    ]
}
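
The fallback between the transmit entry and the sight_machine_cloud section can be sketched as a simple lookup. This is an illustration under assumptions, not the plugin's code: resolve_connection is a hypothetical helper, and it compares key names case-insensitively only because the examples on this page spell the keys both as api_key and API_key.

```python
def resolve_connection(transmit, cloud):
    """Return base_url, API key, and API key ID for one transmit entry.

    Values set on the transmit entry win; anything missing falls back to
    the sight_machine_cloud section.
    """
    def pick(name):
        for section in (transmit, cloud):
            for key, value in section.items():
                # Assumption: key casing is ignored (api_key vs. API_key).
                if key.lower() == name:
                    return value
        return None

    return {name: pick(name) for name in ("base_url", "api_key", "api_key_id")}


# Values copied from the examples on this page.
cloud = {
    "base_url": "https://cloudyday.sightmachine.io",
    "api_key": "Castle_in_the_sky",
    "api_key_id": "Pazu",
}
sen = {
    "transmit_name": "sen",
    "transmit_type": "rdpv3",
    "base_url": "https://yubaba.sightmachine.io",
    "API_key": "Spirited_Away_some_api_key_content_here",
    "API_key_ID": "Haku",
}
tiger_moth = {"transmit_name": "tiger_moth", "transmit_type": "rdpv3"}

resolved_sen = resolve_connection(sen, cloud)
resolved_tiger_moth = resolve_connection(tiger_moth, cloud)
print(resolved_sen["base_url"], resolved_tiger_moth["base_url"])
```

In this sketch the sen transmit resolves to the yubaba instance, while tiger_moth, which sets none of the three properties, resolves to the values in sight_machine_cloud.
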

Warning:

This plugin is very sensitive to upload bandwidth: if it can’t upload a payload completely within the timeout, it will abort the connection and try again. You may need to increase the timeout from the default of two minutes. For example:

{
    "transmit_name": "SightMachine",
    "transmit_type": "rdpv3",
    "filter_stream": ["*"],
    "topic_prefix": "cloudyday.some_category",
    "timeout": 300
}
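
To judge whether the default timeout is enough, a rough lower bound on upload time is payload size in bits divided by uplink speed. The sketch below works through that arithmetic; the function names, the 250 kbit/s uplink, the safety factor of 2, and the reading of "4 MB" as 4 * 1024 * 1024 bytes are all assumptions chosen for illustration. It ignores latency, protocol overhead, and competing traffic, so real uploads will take longer.

```python
import math

DEFAULT_TIMEOUT_S = 120  # the documented default of two minutes


def min_upload_seconds(payload_bytes, uplink_bits_per_second):
    """Lower bound on transfer time: bits to send divided by link speed."""
    return payload_bytes * 8 / uplink_bits_per_second


def suggested_timeout(payload_bytes, uplink_bits_per_second, safety_factor=2.0):
    """Pad the lower bound by a safety factor, never going below the default."""
    padded = min_upload_seconds(payload_bytes, uplink_bits_per_second) * safety_factor
    return max(DEFAULT_TIMEOUT_S, math.ceil(padded))


payload = 4 * 1024 * 1024   # largest request, assuming 4 MB = 4 * 1024 * 1024 bytes
slow_link = 250_000         # hypothetical 250 kbit/s uplink

seconds = min_upload_seconds(payload, slow_link)
timeout = suggested_timeout(payload, slow_link)
print(f"{seconds:.1f} s needed at best, suggested timeout {timeout} s")
```

On this hypothetical link a full-size request needs about 134 seconds at best, which already exceeds the two-minute default, so either a larger timeout or a smaller max_request_size_bytes would be needed.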