Restructure JSON or hash objects

By default, RDI adds fields to hash or JSON objects in the target database that closely match the columns of the source table. The examples below show how you can create a completely new object structure from existing fields using the map transformation.

Map to a new JSON structure

The first job file example creates a new JSON object structure to write to the target. The source section selects the employee table of the chinook database (the optional db value here corresponds to the sources.<source-name>.connection.database value defined in config.yaml).

In the transform section, the map transformation uses a JMESPath expression to specify the new JSON format. (Note that the vertical bar "|" in the expression line indicates that the following indented lines should be interpreted as a single string.) The expression resembles JSON notation but with data values supplied from table fields and JMESPath functions.

Here, we rename the employeeid field to id and create two nested objects for the address and contact information. The name field is the concatenation of the existing firstname and lastname fields, with lastname converted to uppercase. In the contact subobject, the email address is obfuscated slightly, using the replace() function to hide the '@' sign and dots.

In the output section of the job file, we specify that we want to write to a JSON object with a custom key. Note that in the output section, you must refer to fields defined in the map transformation, so we use the new name id for the key instead of employeeid.

The full example is shown below:

source:
  db: chinook
  table: employee
transform:
  - uses: map
    with:
      expression: |
        {
          "id": employeeid,
          "name": concat([firstname, ' ', upper(lastname)]),
          "address": {
            "street": address,
            "city": city,
            "state": state,
            "postalCode": postalcode,
            "country": country
          },
          "contact": {
            "phone": phone,
            "safeEmail": replace(replace(email, '@', '_at_'), '.', '_dot_')
          }
        }        
      language: jmespath
output:
  - uses: redis.write
    with:
      connection: target
      data_type: json
      key:
        expression: concat(['emp:', id])
        language: jmespath

If you query one of the new JSON objects, you see output like the following:

> JSON.GET emp:1 $
"[{\"id\":1,\"name\":\"Andrew ADAMS\",\"address\":{\"street\":\"11120 Jasper Ave NW\",\"city\":\"Edmonton\",\"state\":\"AB\",\"postalCode\":\"T5K 2N1\",\"country\":\"Canada\"},\"contact\":{\"phone\":\"+1 (780) 428-9482\",\"safeEmail\":\"andrew_at_chinookcorp_dot_com\"}}]"

Formatted in the usual JSON style, the output looks like the sample below:

{
  "id": 1,
  "name": "Andrew ADAMS",
  "address": {
    "street": "11120 Jasper Ave NW",
    "city": "Edmonton",
    "state": "AB",
    "postalCode": "T5K 2N1",
    "country": "Canada"
  },
  "contact": {
    "phone": "+1 (780) 428-9482",
    "safeEmail": "andrew_at_chinookcorp_dot_com"
  }
}

Map to a hash structure

This example creates a new hash object structure for items from the track table. Here, the map transformation uses SQL for the expression because this is often more suitable for hashes or "flat" JSON objects without subobjects or arrays. The expression renames some of the fields. It also calculates more human-friendly representations for the track duration (originally stored in the milliseconds field) and the storage size (originally stored in the bytes field).

The full example is shown below:

source:
  db: chinook
  table: track
transform:
  - uses: map
    with:
      expression:
          id: trackid
          name: name
          duration: concat(floor(milliseconds / 60000), ':', floor(mod(milliseconds / 1000, 60)))
          storagesize: concat(round(bytes / 1048576.0, 2), 'MB')
      language: sql
output:
  - uses: redis.write
    with:
      connection: target
      data_type: hash
      key:
        expression: concat('track:', id)
        language: sql

If you query the data for one of the track hash objects, you see output like the following:

> hgetall track:16
1) "id"
2) "16"
3) "name"
4) "Dog Eat Dog"
5) "duration"
6) "3:35.0"
7) "storagesize"
8) "6.71MB"
RATE THIS PAGE
Back to top ↑