ensure import routines include glossary #37

bdolor · 2018-07-31T16:25:22Z

With the introduction of a new post type we need to ensure the import routines are capable of bringing over the data.

How do we ensure that a shortcode in the old book's content area that referenced a glossary term with id='99' still references a glossary term that exists in the new book?

bdolor · 2018-08-09T18:51:57Z

XML imports now bring in glossary terms, if they exist, via a641351.

Including glossary terms in the cloning operation is being handed off to the PB team.

All other import types will bring in glossary items if they are represented as HTML in a page, but will not bring in the glossary terms themselves, since those aren't represented anywhere in an epub, docx, or odt file the way they are in XML or via the API. Work on using shortcodes to facilitate import has some momentum (pressbooks@330bee8). I will discuss with the PB team whether including the [pb_glossary] shortcode fits the use case that class-complex was intended to address.

@josieg - let's treat this ticket as done for now. Please verify that XML imports do bring in glossary terms. Feel free to verify that the other export file formats import the HTML display of glossary terms, but know that if it's HTML, the answer is likely yes, it works.

josieg · 2018-08-09T20:48:54Z

This works. I tested importing via XML, epub, and html files. XML brought in glossary terms. The epub and html brought in the html.

A few questions which might not be solvable:

1. What if someone imports a book with glossary terms via an XML file into an existing book that also has glossary terms? The two books have totally different glossary terms, but two of the terms have the same ID.

   - Is there a fair chance of this happening?
   - If so, would that cause a big problem?

   In my testing, I imported the XML file into the same book that produced it. It created a second copy of every term, but didn't change the IDs of the newly added terms. Obviously this isn't a likely scenario, but it may demonstrate the behaviour when a book has two different terms with the same ID.

2. When importing via an XML file, I am given the option of choosing which pages to import. The glossary terms all appear as individual "Glossary" items with no label saying which term is which (see screenshot). This makes it impossible to import just one term; it is all or nothing. This could be a problem if a person only wants to import a small section of the book. Can we add the glossary term titles to these?

3. Also, what if I don't select any of the glossary terms when importing, just the chapters? The shortcode with the glossary ID is still imported within the content, but now there is no corresponding glossary term. Does this have the potential to cause problems, or will it just make the editor view look messy? Here is a book where I did this: https://pressbooksdev.bccampus.ca/testingimpotwglossary/

bdolor · 2018-08-09T21:05:43Z

Thanks Josie - number 2 is the most relevant. Will look to address that and then provide responses to the others once #2 is complete.

bdolor · 2018-08-09T22:53:15Z

Once PR #47 is merged, it will take care of 2. To address 1, re: glossary terms being imported into a book that already has glossary terms and there being an ID collision: for XML import anyway, an ID collision will never happen. Similar to how chapters are created on import, it creates a new post for every imported glossary term. The act of creating a new post creates a new unique ID, so the relationship between the old ID and the new ID is broken. Put another way, everything except the ID of the old glossary term is carried over.
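As a rough illustration of why the IDs cannot collide, here is a minimal Python sketch. It is not the Pressbooks import code: `Book`, `insert_post`, `import_terms`, and the dictionary fields are hypothetical names made up for this example. Each imported term gets a fresh ID from the destination book, and the old ID survives only as a key in the returned map.

```python
from itertools import count


class Book:
    """Hypothetical stand-in for a destination book; not a Pressbooks class."""

    def __init__(self, first_free_id=1):
        self._ids = count(first_free_id)
        self.posts = {}  # new post ID -> post data

    def insert_post(self, title, content):
        # Creating a post always allocates a fresh ID from the destination book.
        new_id = next(self._ids)
        self.posts[new_id] = {"title": title, "content": content}
        return new_id


def import_terms(book, exported_terms):
    """Create one new post per exported term and return {old_id: new_id}."""
    id_map = {}
    for term in exported_terms:
        id_map[term["id"]] = book.insert_post(term["title"], term["content"])
    return id_map


book = Book(first_free_id=100)
book.insert_post("apple", "An existing term")  # allocated ID 100

# The exported term also had ID 100 in its original book.
exported = [{"id": 100, "title": "pear", "content": "A term from another book"}]
id_map = import_terms(book, exported)
print(id_map)  # {100: 101}
```

The existing term keeps ID 100 and the imported one lands on 101, so nothing is overwritten. The cost is that any content still referring to the old ID no longer points at the imported term.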

The above process describes the challenge of number 3. Similar to how images have to be 'scraped and kneaded' during the import process, a similarly aggressive routine might be considered for glossary term references found as [pb_glossary id='32']apple[/pb_glossary] in the content area. For some added complexity, there's no guarantee at the point of the import process that both a glossary term and its corresponding chapter (or vice versa) will be selected. Nevertheless, looking for shortcodes in content and transforming them into meaningful HTML is the point of the new class-complex.php, but AFAIK that hasn't been implemented yet. It would seem that adding [pb_glossary] to class-complex.php would be reasonable, but I have to confirm with them. Plus, they might want to do it. At any rate, it would only work when importing from XML or the API, where glossary IDs refer to something real. In xhtml or any flavour of HTML (besides web), a reference to a glossary ID is meaningless and doesn't go anywhere. Hope that helps.
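To make the idea concrete, here is a minimal Python sketch of one possible shortcode pass. It is not what class-complex.php does: the function name `remap_glossary_shortcodes`, the old-to-new ID map it takes, and the choice to unwrap references whose term was not imported are all assumptions for illustration.

```python
import re

# Matches [pb_glossary id='32']apple[/pb_glossary] with single or double quotes.
SHORTCODE = re.compile(
    r"\[pb_glossary\s+id=(['\"])(\d+)\1\](.*?)\[/pb_glossary\]",
    re.DOTALL,
)


def remap_glossary_shortcodes(content, id_map):
    """Rewrite glossary IDs via id_map; unwrap references with no imported term."""

    def replace(match):
        old_id = int(match.group(2))
        text = match.group(3)
        new_id = id_map.get(old_id)
        if new_id is None:
            # The term was not imported, so keep only the visible text.
            return text
        return f"[pb_glossary id='{new_id}']{text}[/pb_glossary]"

    return SHORTCODE.sub(replace, content)


chapter = (
    "An [pb_glossary id='32']apple[/pb_glossary] and a "
    "[pb_glossary id='7']pear[/pb_glossary]."
)
result = remap_glossary_shortcodes(chapter, {32: 140})
print(result)  # An [pb_glossary id='140']apple[/pb_glossary] and a pear.
```

Unwrapping orphaned references is only one option; leaving them untouched or flagging them for the editor would be equally defensible. The sketch also assumes the chapter and its terms are processed in the same import run, which, as noted, isn't guaranteed.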

bdolor · 2018-08-09T23:09:54Z

@josieg - can be (re)validated

josieg · 2018-08-10T15:14:55Z

Looks great!

bdolor added the BIG - 7 label Jul 31, 2018

bdolor self-assigned this Aug 1, 2018

bdolor removed their assignment Aug 9, 2018

bdolor mentioned this issue Aug 9, 2018

fix xml import, individual glossary term selection #47

Merged
