-
Notifications
You must be signed in to change notification settings - Fork 94
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
make data provider dependencies optional #199
Comments
Original comment by Reece Hart (Bitbucket: reece, GitHub: reece): We have considered this several times, but believe that this is not the right time to remove this dependency. The rationale was that mapping requires a data provider and that we expect mapping to be an oft-used feature. Thus, this decision is a balance between A) the cost of installing psycopg2 for people who don't use mapping or validation and B) the difficulty created for people who discover this dependency at runtime. We are considering providing a REST interface to UTA, which would obviate psycopg2 for public UTA access. Please put alternative views in comments. If the majority of people really are using only the parsing and formatting component -- that is, without mapping or validation -- then that would certainly sway this decision. We will continue to consider this suggestion. |
Original comment by Reece Hart (Bitbucket: reece, GitHub: reece): Update: Requiring postgresql native libs for all clients/users is clearly non-ideal. On the other hand, a significant portion of the utility of the package is lost without having access to gene, transcript, and sequence data. We will push to implement a REST-based data source for UTA soon, thus eliminating the libpq dependency entirely. |
Original comment by Maximilian Haeussler (Bitbucket: maximilianh, GitHub: maximilianh): For me at least, it would be good if you document the psql dependency. The |
Original comment by Maximilian Haeussler (Bitbucket: maximilianh, GitHub: maximilianh): I am on OSX/Homebrew: brew install postgresql |
This closes biocommons#623 Per the authors of the `psycopg2` package, `psycopg2-binary` "is meant for beginners to start playing with Python and PostgreSQL without the need to meet the build requirements." However, it's preferable to depend on `psycopg2` for production software. Issues in flight which should make this change irrelevant --------------------------------------------------------- - [make data provider dependencies optional][hgvs-optional-deps] - [replace psycopg2 with asyncpg][hgvs-asyncpg] - [use a REST interface, eliminating libpq dependency][hgvs-rest] ------------------------------------------------------------------------ Why move away from `psycopg2-binary`? ===================================== The psycopg2 project makes two recommendations in the [psycopg2 vs psycopg2-binary][psycopg2-vs-psycopg2-binary] docs: > If you are the maintainer of a published package depending on `psycopg2` you > shouldn’t use `psycopg2-binary` as a module dependency. Secondarily: > **For production use you are advised to use the source distribution.** This means that any production software depending on `hgvs` will have to install `psycopg2-binary`. History of the two packages =========================== Starting with Psycopg2 2.8, there are now two packages (explained in the [2.7.4 release notes][psycopg2-274]): - `psycopg2`: source distribution, advised for production. - `psycopg2-binary`: "quickest way to install" (a "pre-compiled binary version") There is substantial discussion in [an issue on the `psycopg2` project][gh-thread], which I won't re-summarize here, but one takeaway is that many Python packages now depend on `psycopg2-binary`, and others on `psycopg2`. Ideally, all libraries just depend on `psycopg2`. Alternate solutions to the split exist (like [`Provides-Dist` from PEP 345][provides-dist]). However, the [`Provides-Dist` field is "rarely used"][provides-dist-rarely-used] and thus ignored most of the time. [`hgvs` switched to `psycopg2-binary`][hgvs-switch-to-binary] shortly after the 2.7.4 release. [gh-thread]: psycopg/psycopg2#674 [hgvs-asyncpg]: biocommons#603 [hgvs-optional-deps]: biocommons#199 [hgvs-rest]: biocommons#199 (comment) [hgvs-switch-to-binary]: biocommons@4c75a4e5 [provides-dist-rarely-used]: https://packaging.python.org/specifications/core-metadata/#provides-dist-multiple-use [provides-dist]: https://www.python.org/dev/peps/pep-0345/#provides-dist-multiple-use [psycopg2-274]: https://www.psycopg.org/articles/2018/02/08/psycopg-274-released/ [psycopg2-vs-psycopg2-binary]: https://www.psycopg.org/docs/install.html#psycopg-vs-psycopg-binary
This closes #623 Per the authors of the `psycopg2` package, `psycopg2-binary` "is meant for beginners to start playing with Python and PostgreSQL without the need to meet the build requirements." However, it's preferable to depend on `psycopg2` for production software. Issues in flight which should make this change irrelevant --------------------------------------------------------- - [make data provider dependencies optional][hgvs-optional-deps] - [replace psycopg2 with asyncpg][hgvs-asyncpg] - [use a REST interface, eliminating libpq dependency][hgvs-rest] ------------------------------------------------------------------------ Why move away from `psycopg2-binary`? ===================================== The psycopg2 project makes two recommendations in the [psycopg2 vs psycopg2-binary][psycopg2-vs-psycopg2-binary] docs: > If you are the maintainer of a published package depending on `psycopg2` you > shouldn’t use `psycopg2-binary` as a module dependency. Secondarily: > **For production use you are advised to use the source distribution.** This means that any production software depending on `hgvs` will have to install `psycopg2-binary`. History of the two packages =========================== Starting with Psycopg2 2.8, there are now two packages (explained in the [2.7.4 release notes][psycopg2-274]): - `psycopg2`: source distribution, advised for production. - `psycopg2-binary`: "quickest way to install" (a "pre-compiled binary version") There is substantial discussion in [an issue on the `psycopg2` project][gh-thread], which I won't re-summarize here, but one takeaway is that many Python packages now depend on `psycopg2-binary`, and others on `psycopg2`. Ideally, all libraries just depend on `psycopg2`. Alternate solutions to the split exist (like [`Provides-Dist` from PEP 345][provides-dist]). However, the [`Provides-Dist` field is "rarely used"][provides-dist-rarely-used] and thus ignored most of the time. [`hgvs` switched to `psycopg2-binary`][hgvs-switch-to-binary] shortly after the 2.7.4 release. [gh-thread]: psycopg/psycopg2#674 [hgvs-asyncpg]: #603 [hgvs-optional-deps]: #199 [hgvs-rest]: #199 (comment) [hgvs-switch-to-binary]: 4c75a4e5 [provides-dist-rarely-used]: https://packaging.python.org/specifications/core-metadata/#provides-dist-multiple-use [provides-dist]: https://www.python.org/dev/peps/pep-0345/#provides-dist-multiple-use [psycopg2-274]: https://www.psycopg.org/articles/2018/02/08/psycopg-274-released/ [psycopg2-vs-psycopg2-binary]: https://www.psycopg.org/docs/install.html#psycopg-vs-psycopg-binary
This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 7 days. |
This issue was closed because it has been stalled for 7 days with no activity. |
I realize this is stale, but I wanted to add another "voice" to the discussion. I have been working with the hgvs library for the better part of a year now, mostly for the ability to have a consistent framework and language to describe sequence variants. The use case I am working on is more or less completely distinct from using any datasource (at this time). We are using it to track variations in synthetic sequences and proteins. We've actually released an open source package to assist with this kind of thing. The idea is to give it 2 sequences, and get back the variants between them in HGVS spec, based on a global alignment. The package is here: https://github.com/mammothbio-os/palamedes. It is mostly just a translation layer between a biopython Focusing in on this postgresql issue, we've already had a few folks (and the CI on our web app) mention that the postgres system dependency was a bit "big" for what our library provides directly (it depends on try:
import psycopg2
except ModuleNotFoundError:
print(
"ModuleNotFoundError raised when trying to import psycopg2, if you need dataproviders support please "
"install system dependencies and pip install with [dataproviders] extra!"
)
raise Getting around this would require the user to install the system deps as well as dependencies=[
"attrs >= 17.4.0", # https://github.com/biocommons/hgvs/issues/473
"biocommons.seqrepo >= 0.6.5",
"bioutils >= 0.4.0,<1.0",
"configparser >= 3.3.0",
"ipython",
"parsley",
"six",
]
[project.optional-dependencies]
dataproviders = ['psycopg2'] This would obviously be a breaking change, which is a huge downside. The other option I can see would be a "core" package with the models and parser, which is installed into the tl;dr |
This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 7 days. |
Originally reported by: Reece Hart (Bitbucket: reece, GitHub: reece)
Reviewer 3 suggested:
The text was updated successfully, but these errors were encountered: