Introduction
You may find that you have nested JSON data Snowflake that you need to parse out so that the data may be used for analytics. See below for an example of how this works to help you get started before manipulating your production data.
Environment
Destination: Snowflake
Recommendation
Fivetran does not automatically unpack nested JSON objects to separate columns or tables in the destination. Fivetran recommends using Snowflake’s native FLATTEN function to parse out the values you are looking for into separate columns. You will need to follow this process to Flatten and Query JSON Data.
First, try out this example data to get familiar with how it works:
- Create a sample of JSON data in Snowflake:
CREATE OR REPLACE SCHEMA TEST;
CREATE OR REPLACE TABLE TEST.JSON_TEST AS
SELECT
parse_json(column1) AS FT
FROM VALUES
('{more:
{test:{
"ID": 2,
"name": "Fivetran",
"function": "Data Pipeline As a Service",
"code": {
"abbr": "5T",
"st":
{
stattrib:"1"
}
}
}
}
}
')
as raw_json;
- Study the raw output:
SELECT FT:more::STRING as moreFROM TEST.JSON_TEST;
- Manipulate the above select SQL code to flatten out the JSON completely:
SELECT
FT:more.test.ID::INTEGER as ID,
FT:more.test.name::STRING as Name,
FT:more.test.function::STRING as "Function",
FT:more.test.code.abbr::STRING as codeAbbr,
FT:more.test.code.st.stattrib::STRING as codeStStattrib
FROM
JSON_TEST;
Considerations
JSON does not always come up in a fixed format and keys change all the time, which can make it hard to know what keys exist. To understand your JSON structure, you can use a flatten function:
select
upper(regexp_replace(f.path, '\\[[0-9]+\\]', '[]')) as path
, typeof(f.value) as type
, count(*) as record_count
from
TEST.JSON_TEST
, lateral flatten(parse_json(FT), recursive=>true) f
group by 1, 2
order by 1, 2;