prepare as a python package #20

Closed
davanstrien opened this issue Oct 28, 2021 · 5 comments

Comments

@davanstrien
Member

No description provided.

@andrewphilipsmith
Contributor

andrewphilipsmith commented Jan 27, 2022

What format of Python package do we wish to use for publishing alto2txt?

  • The README.md recommends installing in an Anaconda environment. As far as I can tell, there aren't any conda-specific dependencies (I haven't checked any of the Spark-related code). If we recommend conda, should we create a conda package and publish it to conda-forge?
  • If we publish to PyPI, should we remove the conda recommendation in the README?
  • Is there a use case for publishing to both PyPI and conda-forge?

Apologies if I've missed something obvious here.

@mialondon
Contributor

Does it matter which version of ALTO the files were created in? e.g. some parts of the BL are moving to 4.2, but there's a lot of material in v2.

@davanstrien
Member Author

Does it matter which version of ALTO files were created in? e.g. some parts of the BL are moving to 4.2, but there's a lot of material in v2.

You can see the current XSLTs here: https://github.com/Living-with-machines/alto2txt/tree/master/extract_text/xslts. Depending on the changes in 4.2, it might be quite simple to adapt the existing ones, or it might be a lot of work. I won't have time to look into this soon. @GiorgiatolfoBL has a much better grasp of XML than I do, so might be able to give a sense of what changes (if any) are involved.

@andrewphilipsmith
Contributor

Does it matter which version of ALTO files were created in? e.g. some parts of the BL are moving to 4.2, but there's a lot of material in v2.

I don't think it matters for this issue. My question was intended to be about the options for publishing alto2txt itself as a software package. I'll reword it to make it clearer.

However, within the readme/other documentation for alto2txt, it should be explicit which versions of ALTO are and are not supported. Also, within issue #18, we should include tests that cover XML produced by different versions of ALTO.
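As a rough illustration of how such tests could tell ALTO versions apart, the sketch below reads the namespace of the root element and maps it to a major version. The function name, the mapping dictionary and the sample documents are all hypothetical and not part of alto2txt; the namespace only identifies the major version, so it cannot distinguish 4.2 from other 4.x releases.

```python
import xml.etree.ElementTree as ET
from typing import Optional, Union

# Namespace URIs used by ALTO v2, v3 and v4 documents.
# This mapping is an illustrative assumption, not taken from alto2txt.
ALTO_NAMESPACES = {
    "http://www.loc.gov/standards/alto/ns-v2#": "2",
    "http://www.loc.gov/standards/alto/ns-v3#": "3",
    "http://www.loc.gov/standards/alto/ns-v4#": "4",
}


def detect_alto_major_version(xml_data: Union[str, bytes]) -> Optional[str]:
    """Return the ALTO major version implied by the root namespace, or None.

    Hypothetical helper. Pass bytes when the document carries an XML
    declaration with an encoding, since ElementTree rejects that in str input.
    """
    root = ET.fromstring(xml_data)
    # ElementTree writes namespaced tags as "{namespace-uri}localname".
    if not root.tag.startswith("{"):
        return None
    namespace = root.tag[1:].split("}", 1)[0]
    return ALTO_NAMESPACES.get(namespace)


# Minimal made-up documents, just enough to exercise the helper.
SAMPLE_V2 = '<alto xmlns="http://www.loc.gov/standards/alto/ns-v2#"><Layout/></alto>'
SAMPLE_V4 = '<alto xmlns="http://www.loc.gov/standards/alto/ns-v4#"><Layout/></alto>'
SAMPLE_NO_NS = "<alto><Layout/></alto>"
```

A test suite for #18 could parametrise over fixture files per version and assert that unsupported versions are reported rather than silently processed.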

@andrewphilipsmith andrewphilipsmith added this to the v0.3.1 milestone Jun 30, 2022
@andrewphilipsmith andrewphilipsmith self-assigned this Jun 30, 2022
@griff-rees
Collaborator

The package is up on PyPI: https://pypi.org/project/alto2txt/, which may be enough to close this. If there are objections to its current release in ticket #48, that would have implications here.

@griff-rees griff-rees added the enhancement New feature or request label Dec 5, 2022