replicate Transform

The replicate transform copies a pandas DataFrame once for each asset in asset_list. Each copy gets an ‘asset’ column that contains the name of that asset, and the copies are combined into a single DataFrame.

Example:

Say we have a DataFrame that looks like this:

some_field,stream_type,timestamp
3.14      ,'a_stream' ,2018-01-01 00:00:00.000000
2.71828   ,'b_stream' ,2018-01-02 00:00:00.000000
0.0       ,'my_stream',2018-01-03 00:00:00.000000

And we want to replicate it for three different assets, so that the result looks like this:

asset   ,some_field,stream_type,timestamp
'asset1',3.14      ,'a_stream' ,2018-01-01 00:00:00.000000
'asset1',2.71828   ,'b_stream' ,2018-01-02 00:00:00.000000
'asset1',0.0       ,'my_stream',2018-01-03 00:00:00.000000
'asset2',3.14      ,'a_stream' ,2018-01-01 00:00:00.000000
'asset2',2.71828   ,'b_stream' ,2018-01-02 00:00:00.000000
'asset2',0.0       ,'my_stream',2018-01-03 00:00:00.000000
'asset3',3.14      ,'a_stream' ,2018-01-01 00:00:00.000000
'asset3',2.71828   ,'b_stream' ,2018-01-02 00:00:00.000000
'asset3',0.0       ,'my_stream',2018-01-03 00:00:00.000000
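
The replication shown above can be sketched in plain pandas. This is only an illustration of the behaviour, not the transform's actual implementation; the function name replicate and the way the example DataFrame is built are assumptions made for this sketch.

```python
from typing import Sequence

import pandas as pd


def replicate(df: pd.DataFrame, asset_list: Sequence[str]) -> pd.DataFrame:
    """Hypothetical helper: one copy of df per asset, with an 'asset' column first."""
    copies = []
    for asset in asset_list:
        copy = df.copy()
        # Insert the asset name as the first column; the scalar is broadcast to every row.
        copy.insert(0, "asset", asset)
        copies.append(copy)
    # Stack the copies in asset_list order and renumber the rows.
    return pd.concat(copies, ignore_index=True)


# The input DataFrame from the example above.
df = pd.DataFrame(
    {
        "some_field": [3.14, 2.71828, 0.0],
        "stream_type": ["a_stream", "b_stream", "my_stream"],
        "timestamp": pd.to_datetime(["2018-01-01", "2018-01-02", "2018-01-03"]),
    }
)

result = replicate(df, ["asset1", "asset2", "asset3"])
print(result)
```

With three input rows and three assets, the result has nine rows, grouped by asset in the order given in asset_list. The sketch does not handle an empty asset_list or an input that already has an 'asset' column.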

Our configuration will look something like this:

{
    "transform_name": "Replicate",
    "transform_type": "replicate",
    "filter_stream": [
        "*"
    ],
    "asset_list": ["asset1", "asset2", "asset3"]
}

Configuration:

The required and optional properties that can be configured for a replicate transform are:

  • asset_list: The list of asset names. The DataFrame is replicated once for each asset in this list.

  • transform_name: Unique name for the transform.

  • transform_type: Type of transform to apply. Should be replicate.

  • filter_stream: List of data streams to transform. Each stream can either be * (all) or asset:stream.