Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix broken documentation link to mycorpus.txt #2874

Closed
wants to merge 3 commits into from
Closed

Fix broken documentation link to mycorpus.txt #2874

wants to merge 3 commits into from

Conversation

benbroks
Copy link

@benbroks benbroks commented Jul 7, 2020

Fixes #2872

Motivation: Old notebook was dependent on a URL for a text file. Wanted to make sure this was able to run locally. No has to download to run the notebook.

Fixes: Creates mycorpus.txt inline. Fixed an image reference at the end of the same notebook.

@piskvorky
Copy link
Owner

piskvorky commented Jul 7, 2020

@mpenkov this file is auto-generated, right? What was the process for updating the existing docs?

@gojomo
Copy link
Collaborator

gojomo commented Jul 10, 2020

Given that the file is only ~19 lines and ~69 words, I'm surprised it was ever set up to be downloaded from a remote URL!

@piskvorky
Copy link
Owner

piskvorky commented Jul 10, 2020

It was a demonstration of "remote input streaming" IIRC. That's what the surrounding text explains.

But not worth the trouble, probably. I put the file back now, in any case.

@mpenkov
Copy link
Collaborator

mpenkov commented Jul 18, 2020

Yes, the .ipynb is an autogenerated file. We should not be changing it directly. If we want to go ahead with this update, we have to update the .py file in docs/src/gallery.

Personally, I'm not sure of what the benefit of ensuring that notebook runs without a network connection out of the box, but if we do ahead with it, then:

  1. We should update the .py file and generate the .ipynb from it
  2. We don't need to import smart_open anymore.
  3. We don't need to write the corpus to a file. The generator can just yield the strings as necessary.

@mpenkov mpenkov added the documentation Current issue related to documentation label Jul 18, 2020
@benbroks
Copy link
Author

Apologies if this isn't along the lines of what you were looking for. Made the same changes in the .py file.

@mpenkov
Copy link
Collaborator

mpenkov commented Aug 16, 2020

@benbroks Sorry for the delay in reviewing this.

It looks like the documentation build is failing. Is it working for you locally?

If yes, then it's likely you haven't included all autogenerated files in this PR. At a glance, gensim/docs/src/auto_examples/core/run_corpora_and_vector_spaces.rst should also be included.

If no, then please make sure the docs build locally, and then update this PR.

@piskvorky
Copy link
Owner

piskvorky commented Jun 6, 2021

Ping @mpenkov this issue keeps coming back, people following the tutorial still get tripped up.

I placed the file under https://radimrehurek.com/mycorpus.txt, instead of https://radimrehurek.com/gensim/mycorpus.txt, so it doesn't get overwritten / deleted with each release. Do we just update the location in the tutorial(s)?

I'm not a fan of the approach from this PR – not portable (/tmp/…) and adds too much distracting setup (read/write files, irrelevant).

@piskvorky piskvorky changed the title Addressing Issue 2872 Fix broken link to mycorpus.txt Jun 6, 2021
@piskvorky piskvorky changed the title Fix broken link to mycorpus.txt Fix broken documentation link to mycorpus.txt Jun 6, 2021
@piskvorky piskvorky added this to the 4.1.0 milestone Jun 6, 2021
@mpenkov
Copy link
Collaborator

mpenkov commented Jun 6, 2021

We don't need to do anything with this PR, this is already resolved by #3148

@benbroks Thank you for your contribution, but the problem has already been resolved by another PR.

The public-facing documentation will get updated as soon as we do the next release.

@mpenkov mpenkov closed this Jun 6, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Current issue related to documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Broken file link in run_corpora_and_vector_spaces tutorial
4 participants