Connector Improvement: AWS Lambda/GCP CF Connector Schema improvement

According to the documentation, there is a "primary_key" field, and the Fivetran docs list the standard data types in a separate topic.
So the natural expectation is: if there is a "schema" field, it should be possible to define the actual schema for the destination (BigQuery, for example), something like:

...
"schema": {
  "primary_key": ["message_id"],
  "fields": [
    {
      "name": "message_id",
      "type": "STRING"
    },
    {
      "fields": [
        {
          "name": "message_control_id",
          "type": "STRING"
        },
        {
          "name": "action_type",
          "type": "STRING"
        }
      ],
      "name": "header",
      "type": "RECORD"
    }
  ]
},
...

I took this format from the BigQuery schema description as an example; I guess a similar DSL structure could be unified for other destinations.
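
To make the idea a bit more concrete, here is a minimal Python sketch of how a declared schema like the one above could be checked against a record before it is loaded. Everything in it is an assumption for illustration: the "fields" key is the proposed extension (not something the connector supports today), and PROPOSED_SCHEMA, PYTHON_TYPES and find_schema_errors are hypothetical names, not part of any Fivetran or BigQuery API. Only the two types used in the example (STRING and RECORD) are handled.

```python
# Hypothetical sketch: validates a record against the proposed nested schema.
# None of these names come from Fivetran or BigQuery libraries.

PROPOSED_SCHEMA = {
    "primary_key": ["message_id"],
    "fields": [
        {"name": "message_id", "type": "STRING"},
        {
            "name": "header",
            "type": "RECORD",
            "fields": [
                {"name": "message_control_id", "type": "STRING"},
                {"name": "action_type", "type": "STRING"},
            ],
        },
    ],
}

# Assumption: only the two types from the example are mapped to Python types.
PYTHON_TYPES = {"STRING": str, "RECORD": dict}


def find_schema_errors(record, fields, prefix=""):
    """Return a list of mismatches between a record and the declared fields."""
    errors = []
    for field in fields:
        path = prefix + field["name"]
        if field["name"] not in record:
            errors.append(f"{path}: missing")
            continue
        value = record[field["name"]]
        if not isinstance(value, PYTHON_TYPES[field["type"]]):
            errors.append(f"{path}: expected {field['type']}")
        elif field["type"] == "RECORD":
            # Recurse into nested fields, extending the dotted path.
            errors.extend(find_schema_errors(value, field["fields"], path + "."))
    return errors
```

Calling find_schema_errors(record, PROPOSED_SCHEMA["fields"]) on a well-formed record returns an empty list; a record with a wrong type or a missing nested field returns one message per problem, keyed by its dotted path (for example "header.action_type").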

The main reason for this is to avoid unnecessary transformations in later steps. In my case, it was easier to write a custom Cloud Function that inserts directly into BQ, triggered by Cloud Scheduler, and saves the 'cursor' payload externally than to sell the client on the idea that they need dbt or Dataform just to transform the data because of what looks, from the client's POV, like a poor integration process.
As an engineer, I feel that in some cases this limits my ability to bring Fivetran to clients: it is OK to introduce one new tool, but hard to explain why we need yet another one on top. (Views on top of the integration layer are actually OK, but they look like a workaround.)

Official comment

Alison Fivetranner

October 17, 2024 22:26

Hi Rakija,

Thank you for your thoughtful feedback.

We have just released a new Connector SDK service that does allow you to fully control your connector schema. I think that may meet this need - take a look at: https://fivetran.com/docs/connectors/connector-sdk

I'd love to hear how it works out.

Alison
