Skip to main content

Community

Connector Improvement: Automatically delete absent/outdated rows

Completed

Please sign in to leave a comment.

Comments

13 comments

  • Official comment

    I'm pleased to let everyone know that we now support a 'soft delete' mode in our three Function connectors.

    You can now include a table in a "soft delete" response node to cause us to add a "_fivetran_deleted" system column and set all existing rows to TRUE at the start of the sync.

    See our documentation for details

    Alison

     

    As a soft delete would be fine as well, sticking with the _fivetran_deleted convention. 

    Hi Ernie,

    Thank you very much for taking the time to provide a detailed write up of this request - do you have a particular source you are working with right now that you need this for? How often do you think you need this?

    While we will explore our options, we don't have this planned at this time. We will continue to collect customer demand to justify the commitment to support & maintain a high-quality Fivetran connectors. Every upvote on this request increases the case to build it.

    Best regards,

    Alison

    Hi Alison,

    In this case we needed it to bulk import user, group, and membership data from Azure Active Directory (AAD). Since Fivetran doesn't have a connector for AAD, we wrote a custom one using the Microsoft Graph API, but reliably keeping track of deletions was a hurdle.

    We have a few other SaaS providers that also don't have a way to track deletions via API, so those custom connectors also have the same issue. For now, we've gotten around this by comparing the fivetran_synced column to the latest value, but it's not always reliable so it'd be nice to get fivetran_deleted functionality for bulk imports.

    Hi Ernie,

    Thank you so much for the additional details it will really help us scope the opportunity.

    Regards

    Alison

    Hi! We are having the same problem with Azure Blob Storage. 

    We use Fivetran to pull data from blobs that are in a Microsoft Azure container.
    Every time that we create new blobs the data updates well in our destination table, but when we have to delete one of them, the blobs will no longer appear in the container but they will still be visible in the table.

    We are pulling sales forecast data from an API via AWS lambda.  That API returns a complete current set.  Some users have since left the company, and their data is still marked as active (_fivetran_deleted = false), even though they are no longer in the result set.   I would have expected the connector to mark missing records as _fivetran_deleted when no longer present, since I did return a schema (indicating primary fields by which to merge), per https://fivetran.com/docs/functions/aws-lambda. 

    I was told by support that in order to delete these users, I need to figure that out myself and add it to the "delete" key.  That would require me to connect to the destination data within the lambda and essentially do the merge myself.

    This seems like a bug, not an enhancement request. (support ticket 102532)

    Does anyone have any recommendations for how to handle this currently? I am in the same situation where the API is giving me a list of users, but does not give me an endpoint to see deleted users. Therefore they are just falling out of the response list, meanwhile in my warehouse the deleted users appear as if they are still active.  

    Would almost be better if Fivetran offered a kill & fill option where the data gets dropped each time the connector syncs. That would allow my warehouse to accurately reflect the data from the API. 

    Hi Justin,

    I'm sorry you are experiencing difficulties.

    Have you been able to explore the SoftDelete option? I think it might provide the 'kill and fill' experience you were suggesting.

    Best regards

    Alison

    I have tested the softDelete feature but it is not working as I would expect. 
    I took the existing connector and modified the AWS Lambda to include the new softDelete field as follows: 

    After making the changes I synced the connector and then also did a full historical resync. I checked the snowflake table after each one and the _fivetran_deleted field has no been added to my table. Is there anything additional that needs to be done in order for softDelete to work? 

    Please note that in my snowflake table I have records that are "deleted" (meaning they are no longer in the current payload) and their _FIVETRAN_SYNCED field is from 2+ months ago. I would expect that these records would now show up as _fivetran_deleted = True.

    Hi Justin,

    That sound fishy! Can you please create a support ticket so we can troubleshoot to identify and fix any issues that we find.

    I'm looking forward to getting this up and running for you.

    Alison

    For anyone who finds this in the future, this may be useful. My connector was using S3 to transfer the data from the lambda function to fivetran. Initially I added the `softDelete` field only to the S3 file. 
    Turns out that the `softDelete` field needs to be included in the lambda response payload as well. 

    Hi Justin,

    Thank you so much for following up and I'm glad it was a quick fix.
    Alison