Source Salesforce: decode(chunk) produces garbled text #15950
Labels: autoteam, community, connectors/source/salesforce, python, team/connectors-python, type/bug
Environment
Current Behavior
UTF-8 encoded text appears to be mistakenly decoded as ISO-8859-1, which produces garbled output.
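To illustrate the symptom, here is a minimal sketch in plain Python (not the connector's code) of what happens when UTF-8 bytes are decoded as ISO-8859-1. The sample string is an assumption chosen for illustration.

```python
# Illustrative only: the sample text is hypothetical, not taken from Salesforce data.
original = "é"

# UTF-8 encodes "é" as two bytes: 0xC3 0xA9.
raw = original.encode("utf-8")

# ISO-8859-1 maps every byte to exactly one character, so decoding never
# fails; it silently turns one character into two wrong ones.
garbled = raw.decode("iso-8859-1")

print(repr(original), "->", repr(garbled))
```

Because ISO-8859-1 accepts any byte sequence, this kind of mistake raises no error and only shows up as garbled text downstream.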
Expected Behavior
Multi-byte text should always be decoded as UTF-8.
Logs
Steps to Reproduce
The root cause might be this line of code: https://github.com/airbytehq/airbyte/blob/master/airbyte-integrations/connectors/source-salesforce/source_salesforce/streams.py#L297
Splitting the data into small chunks and decoding each chunk separately is risky, because a chunk boundary can fall in the middle of a multi-byte character. A UTF-8 character takes one to four bytes, so a chunk may end with only part of a character and fail to decode as UTF-8.
--
source: https://stackoverflow.com/a/10229225
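One possible way around the chunk-boundary problem is an incremental decoder, which buffers an incomplete trailing byte sequence until the next chunk arrives. The sketch below uses only the standard library. `decode_chunks`, the sample text, and the 3-byte chunk size are hypothetical and chosen for illustration; this is not the connector's implementation.

```python
import codecs


def decode_chunks(chunks, encoding="utf-8"):
    """Yield decoded text from an iterable of byte chunks.

    Hypothetical helper: the incremental decoder keeps incomplete
    multi-byte sequences in its internal buffer between calls.
    """
    decoder = codecs.getincrementaldecoder(encoding)(errors="strict")
    for chunk in chunks:
        text = decoder.decode(chunk)
        if text:
            yield text
    # Flush the decoder; raises UnicodeDecodeError if the stream ended
    # in the middle of a character.
    tail = decoder.decode(b"", final=True)
    if tail:
        yield tail


# Sample data split into deliberately tiny chunks so that boundaries
# fall inside multi-byte characters.
expected = "naïve café 日本語"
data = expected.encode("utf-8")
chunks = [data[i:i + 3] for i in range(0, len(data), 3)]

result = "".join(decode_chunks(chunks))
print(result == expected)
```

With the same chunks, calling `chunk.decode("utf-8")` on each chunk independently raises `UnicodeDecodeError` on the first chunk, since it ends with only the first byte of `ï`.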
Are you willing to submit a PR?
No