Question
A sync is failing with one of the following error messages:
Error Unexpected character ('X' (code 72)): Expected separator ('"' (code 34)) or end-of-line at line: ?, column: ?.
{
...
"data":{
"status":"FAILURE_WITH_TASK",
"reason":"com.fasterxml.jackson.dataformat.csv.CsvMappingException: Too many entries: expected at most 1 (value #1 (? chars) \"???\")\n at [Source: (BufferedReader); line: ?, column: ???]",
"taskType":"poorly_formatted_file"
},
...
}
Environment
File connectors:
- Amazon S3
- Azure Blob Storage
- Email connector
- FTP
- FTPS
- Google Cloud Storage
- SFTP
File connectors, only with Merge Mode enabled:
- Box
- Dropbox
- Google Drive
Answer
-
Verify in the source document that double quotes are correctly escaped as specified in RFC 4180 standard.
-
In the Setup tab, click Edit connection data.
-
Toggle Enable Advanced Options on.
-
Declare a Delimiter in the connector settings.
-
If needed, specify an Escape Character.
Cause
The file appears correctly formatted and is ingested or parsed correctly by other tools or software. Different tools use different rules and libraries to parse CSV. The tool used at Fivetran ingests CSV files that follow RFC 4180 standard.
A common error, in this case, is having double quotes inside a field that also contains other characters such as a JSON string.
For example:
"{'automation': {'type': 1, 'rule': "HS - MRR - Billing Mail delivery failure on xneelos auto replies "}, 'added_tags': ['delivery failure'], 'ticket_type': 'System', 'status': 'Closed', 'Product': 'None'}"
The RFC states that:
6. Fields containing line breaks (CRLF), double quotes, and commas should be enclosed in double-quotes. For example: "aaa","b CRLF bb","ccc" CRLF zzz,yyy,xxx
7. If double-quotes are used to enclose fields, then a double-quote
appearing inside a field must be escaped by preceding it with
another double quote. For example: "aaa","b""bb","ccc"
This makes the above string invalid