UnicodeDecodeError when installing dbt #1771

Closed
1 of 5 tasks
alejandro-flores-1 opened this issue Sep 18, 2019 · 10 comments
Labels
adapter_plugins Issues relating to third-party adapter plugins
Milestone

Comments

@alejandro-flores-1

Describe the bug

Running pip install dbt raises a UnicodeDecodeError.
Our machine's default encoding is ANSI_X3.4-1968.

We traced the error to a single character in the README: the dash that follows the text "Models frequently build on top of one another". If this character is removed, dbt installs correctly.

UnicodeEncodeError: 'ascii' codec can't encode character '\u2013' in position 0: ordinal not in range(128)
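
For anyone trying to locate such a character in their own checkout, here is a minimal sketch of a scan for non-ASCII characters. The helper name `find_non_ascii` and the sample text are hypothetical; the sample only imitates the README sentence quoted above and is not the actual file.

```python
import unicodedata


def find_non_ascii(text):
    """Return (line, column, character, Unicode name) for each non-ASCII character.

    Line and column numbers are 1-based.
    """
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for column, char in enumerate(line, start=1):
            if ord(char) > 127:
                name = unicodedata.name(char, "UNKNOWN")
                hits.append((line_number, column, char, name))
    return hits


# Hypothetical stand-in for the README, with an en dash (U+2013) after the sentence.
sample = "dbt\nModels frequently build on top of one another \u2013 for example\n"
hits = find_non_ascii(sample)

for line_number, column, char, name in hits:
    print(f"line {line_number}, column {column}: U+{ord(char):04X} {name}")
```

To scan a real file, the text would need to be read with an explicit `encoding="utf-8"` first, since the default encoding is exactly what fails here.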

Steps To Reproduce

pip install dbt

Expected behavior

We expect to properly install dbt.

Screenshots and log output

    Complete output from command python setup.py egg_info:
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/tmp/pip-6skr9ygr-build/setup.py", line 9, in <module>
        long_description = f.read()
      File "/usr/lib/python3.6/encodings/ascii.py", line 26, in decode
        return codecs.ascii_decode(input, self.errors)[0]
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 1860: ordinal not in range(128)
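
The traceback points at a plain `f.read()` in setup.py, which decodes the README with the platform's default encoding. The sketch below reproduces that failure under stated assumptions: the README text is a hypothetical stand-in, and `encoding="ascii"` is passed explicitly to simulate a machine whose default is ASCII. It then shows that reading with an explicit `encoding="utf-8"` succeeds. This is an illustration, not the change that was eventually merged.

```python
import os
import tempfile

# Hypothetical README content containing an en dash (U+2013).
README_TEXT = "Models frequently build on top of one another \u2013 for example\n"

ascii_error = None
failing_byte = None

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "README.md")
    with open(path, "w", encoding="utf-8") as f:
        f.write(README_TEXT)

    # Simulate an ASCII default encoding by requesting the ascii codec explicitly.
    try:
        with open(path, encoding="ascii") as f:
            f.read()
    except UnicodeDecodeError as exc:
        ascii_error = exc
        # exc.object holds the bytes being decoded; exc.start is the failing index.
        failing_byte = exc.object[exc.start]

    # An explicit encoding makes the read independent of the machine's locale.
    with open(path, encoding="utf-8") as f:
        long_description = f.read()

print("ascii read failed:", ascii_error is not None)
print("first undecodable byte:", hex(failing_byte))
```

The first byte of the UTF-8 encoding of U+2013 is 0xe2, which matches the byte named in the traceback.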

System information

Which database are you using dbt with?

  • postgres
  • redshift
  • bigquery
  • snowflake
  • other (specify: N/A)

The output of dbt --version:

N/A

The operating system you're using:
GNU/Linux
The output of python --version:
Python 3.6.8

Additional context

Add any other context about the problem here.

@alejandro-flores-1 alejandro-flores-1 added bug Something isn't working triage labels Sep 18, 2019
@drewbanin drewbanin removed the triage label Sep 18, 2019
@drewbanin
Contributor

Thanks for the report @alejandro-flores-1 - we can definitely update that particular character. Are you interested in sending over a PR for the change?

I'm just curious: Is there a reason you're using the ANSI_X3.4-1968 encoding on your machine? I want to better understand if this is something specific to you, if it's your linux distro, etc.

@drewbanin drewbanin added this to the 0.14.3 milestone Sep 18, 2019
@drewbanin drewbanin added the good_first_issue Straightforward + self-contained changes, good for new contributors! label Sep 18, 2019
@beckjake
Contributor

I think you're going to have far fewer problems if you just use a UTF-8 encoding when running dbt. This might be enough to fix the individual case of installing, but we have Unicode characters in dbt.

@alejandro-flores-1
Author

> Thanks for the report @alejandro-flores-1 - we can definitely update that particular character. Are you interested in sending over a PR for the change?
>
> I'm just curious: Is there a reason you're using the ANSI_X3.4-1968 encoding on your machine? I want to better understand if this is something specific to you, if it's your linux distro, etc.

I can submit a PR for this. This encoding was never set manually. It seems to have defaulted to this encoding in our setup.
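
As background for readers: ANSI_X3.4-1968 is a registered alias for plain ASCII, and it is typically what a Linux system reports when no UTF-8 locale is configured (for example under the default C/POSIX locale). A small sketch, assuming nothing beyond the standard library, that confirms how Python resolves the name and reports what the current process would use by default:

```python
import codecs
import locale

# The encoding name from the report; Python resolves it through its alias table.
reported_name = "ANSI_X3.4-1968"
resolved_codec = codecs.lookup(reported_name).name

# The encoding that open() uses on this machine when none is passed explicitly.
current_default = locale.getpreferredencoding(False)
current_is_utf8 = codecs.lookup(current_default).name == "utf-8"

print(f"{reported_name} resolves to: {resolved_codec}")
print(f"this process defaults to: {current_default} (UTF-8: {current_is_utf8})")
```

If the second line does not report UTF-8, any `open()` call without an `encoding` argument can fail on non-ASCII input in the same way.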

@alejandro-flores-1
Author

> I think you're going to have far fewer problems if you just use a UTF-8 encoding when running dbt. This might be enough to fix the individual case of installing, but we have Unicode characters in dbt.

Using UTF-8 will solve the issue. However, simply removing this character also solves it. I removed the character locally and dbt installed correctly. dbt also ran successfully without any further issues.

@drewbanin drewbanin added wontfix Not a bug or out of scope for dbt-core and removed bug Something isn't working good_first_issue Straightforward + self-contained changes, good for new contributors! labels Sep 26, 2019
@drewbanin
Contributor

I just changed this to a #wontfix. While I'd still be happy to merge a PR that changes this one particular character (I don't feel opinionated on a dash vs. an em-dash at all), there are surely going to be other such characters placed in and around the dbt codebase in the future. I'm not inclined to make a rule that we will never include such characters in the dbt codebase or README, so I'm going to close this on that principle.

@alejandro-flores-1 if you're able to sign the CLA, please do feel free to re-open the PR against dev/louisa-may-alcott.

@markberger

Hi @drewbanin, I'm also encountering this bug. A better fix might be to require setuptools v40.1.0+ because this version supports unicode chars:

https://setuptools.readthedocs.io/en/latest/history.html#v40-1-0

It looks like the core package already requires this version of setuptools, so the enforcement just needs to occur in the overall setup.py:

https://github.com/fishtown-analytics/dbt/blob/dev/0.15.1/core/setup.py#L6

Would you be open to this change? If so I'm happy to open a PR. Thanks!

@drewbanin
Contributor

Hey @markberger - thanks for the heads up! Check out the discussion over here: #1978

I think that the try/catch block present in each of the setup.py files should be sufficient for this purpose, right? Are you saying that the /setup.py in the repo should be updated with the same logic? I think I buy that, but would want to loop in @beckjake who is way more of an expert on python packaging than I am :)

@markberger

> I think that the try/catch block present in each of the setup.py files should be sufficient for this purpose, right? Are you saying that the /setup.py in the repo should be updated with the same logic?

Yes, that is what I was trying to say, sorry it wasn't clear! I don't know much about Python packaging though, so I'll defer to Jake.

FWIW, it looks like it is also possible to check the version of setuptools directly, as in the second solution here: https://stackoverflow.com/a/48049510

Either solution should fix this bug for me.

@beckjake
Contributor

Yeah, adding the same try/catch to setup.py sounds totally reasonable, so does checking the version directly. The former is probably easier.
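
The two options discussed here can be sketched roughly as follows. This is an illustration under assumptions, not the code from the dbt setup.py files, which the thread does not show: the import probe assumes `find_namespace_packages` (added to setuptools in v40.1.0) is a suitable marker for a new enough version, and the function names and the simple version parser are hypothetical.

```python
import re

# Minimum version named in the thread.
MINIMUM_SETUPTOOLS = (40, 1, 0)


def setuptools_has_namespace_finder():
    """Option 1: probe by import, in the style of a try/except in setup.py."""
    try:
        from setuptools import find_namespace_packages  # noqa: F401
    except ImportError:
        # setuptools is missing or predates the probed function.
        return False
    return True


def version_tuple(version, width=3):
    """Parse the leading numeric parts of a version string, padded with zeros."""
    numbers = []
    for piece in version.split(".")[:width]:
        match = re.match(r"\d+", piece)
        if match is None:
            break
        numbers.append(int(match.group()))
    numbers.extend([0] * (width - len(numbers)))
    return tuple(numbers)


def version_at_least(version, minimum=MINIMUM_SETUPTOOLS):
    """Option 2: compare a reported version string against the minimum."""
    return version_tuple(version) >= minimum


print("import probe passed:", setuptools_has_namespace_finder())
print("39.2.0 is new enough:", version_at_least("39.2.0"))
print("40.1.0 is new enough:", version_at_least("40.1.0"))
```

In a real setup.py, a failed check would presumably print an upgrade hint and exit before calling `setup()`. The import probe avoids version parsing entirely, which fits the comment above that it is probably the easier of the two.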

@drewbanin drewbanin reopened this Jan 29, 2020
@drewbanin drewbanin removed the wontfix Not a bug or out of scope for dbt-core label Jan 29, 2020
@drewbanin drewbanin modified the milestones: 0.14.3, 0.15.2 Jan 29, 2020
@drewbanin
Contributor

closed by #2076

@jtcohen6 jtcohen6 added the adapter_plugins Issues relating to third-party adapter plugins label Jul 19, 2022