Skip to main content

Community

Proper identifier for special characters when creating columns

Answered

Please sign in to leave a comment.

Comments

2 comments

  • Official comment

    Hi Russell,

    Kevin from the Product team here - apologies for the delay! The team has been discussing potential solutions for this problem.

    In order to stay in line with our core product principles, it is crucial that the solution to this situation is simple, predictable, and default for all connectors. For this particular case of special characters, we discussed several potential solutions:

    1. Don't rename symbols, and keep column names the same when writing to destinations: This solution is not possible since many warehouses support different characters. For example, while Snowflake supports characters like '+' and '-', Azure Synapse does not. We have renaming conventions because the destinations we partner with all have different character set requirements.

    2. Rename symbols to a set of readable strings (e.g. '+' resolves to 'plus'): This solution seemed promising at first, but there are many characters where we don't have context on what the symbol should mean. For example, 'dollars lost (-)' might intuitively become 'dollars lost (minus)', but what about a column such as 'long-term data' which should be 'long hyphen term data'. There are some columns where we don't know what the symbol should become, and might cause confusion when analysts see these names in their warehouse.

    3. Whenever we detect a duplicate, append some number to the column name: An example of this would be when you create duplicate files on your computer. You might see a syntax like 'myColumn' and 'myColumn (1)'. This solution does not guarantee preventing duplicates for all custom column types, and we felt does that fix the core of the issue, which is having different columns that are resolving to the same name.

    Each of these proposals creates issues/problems because the solutions do not clearly solve the underlying issue. Would you have any feedback on the discussion points above?

    Also, would it be possible for you to run a script prior to Fivetran syncs to change the names of the columns to prevent the name duplications that occur?

    Hello support team, following up on this please. Thank you