This repository has been archived by the owner on Apr 30, 2021. It is now read-only.

CSL YAML biblio files don’t work without “references” header #124

Closed
njbart opened this issue Apr 13, 2015 · 13 comments

@njbart
Contributor

njbart commented Apr 13, 2015

CSL YAML biblio files produced from a valid CSL JSON file by json2yaml don’t work because the content is not nested under a references: key and delimited by --- and ... lines.
I don’t think this should be a requirement for YAML files when they are explicitly identified as biblio files via the command line (--bibliography=) or via a bibliography: metadata value.
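
For illustration, a minimal Python sketch of the wrapper this comment refers to: it takes the bare list emitted by a generic json2yaml and adds the references: key plus the --- and ... delimiters. The function name wrap_references and the sample entry are hypothetical, and the sketch only adds the wrapper; it does not touch the field contents.

```python
def wrap_references(bare_yaml):
    """Wrap a bare YAML list of CSL entries in a `references:` document."""
    lines = bare_yaml.strip("\n").splitlines()
    # Drop document markers a converter may already have emitted,
    # so the result contains exactly one `---` and one `...`.
    if lines and lines[0].strip() == "---":
        lines = lines[1:]
    if lines and lines[-1].strip() == "...":
        lines = lines[:-1]
    # A block sequence may sit at the same indentation as its parent key,
    # so the list lines can be reused unchanged.
    return "---\nreferences:\n" + "\n".join(lines) + "\n...\n"


if __name__ == "__main__":
    sample = "- id: doe2015\n  title: A title\n"  # hypothetical entry
    print(wrap_references(sample), end="")
```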

@jgm
Owner

jgm commented May 3, 2015

Can you link to a sample CSL JSON file?

@njbart
Contributor Author

njbart commented May 3, 2015

Sure. Try this short one: https://gist.github.com/nickbart1980/006edc5db336da89541f
Or try this one, https://raw.githubusercontent.com/zotero/styles-repo/master/include/data/previews.json (this has ids that start with a number and aren't wrapped in quotation marks, so it will have to be modified a bit first.)
Or convert any of your bibtex/biblatex files using pandoc-citeproc -j, or export some items from Zotero to "CSL JSON".

@jgm
Owner

jgm commented May 3, 2015

Note that you can convert these CSL JSON files to pandoc YAML using pandoc-citeproc -y. I'm not sure why it's important to support a different path, via json2yaml, but it probably wouldn't be too hard to do.

@jgm
Owner

jgm commented May 3, 2015

Just to amplify this a bit: pandoc-citeproc -y is the only right way to do this, since it will produce proper markup (pandoc markdown) in the yaml file. I don't believe json2yaml will behave correctly for things like emphasis.
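
To make the markup difference concrete, here is a rough Python sketch that rewrites a few of the HTML-like CSL tags into pandoc Markdown. It is not how pandoc-citeproc performs the conversion; the function name and the tag table are assumptions for illustration, and it only handles simple, well-formed, non-attributed tags.

```python
import re

# Assumed mapping from HTML-like CSL tags to pandoc Markdown delimiters.
_TAG_TO_MARKDOWN = {"i": "*", "b": "**", "sup": "^", "sub": "~"}


def csl_html_to_markdown(text):
    """Replace <i>, <b>, <sup> and <sub> pairs with Markdown delimiters."""
    for tag, delim in _TAG_TO_MARKDOWN.items():
        pattern = re.compile(r"<{0}>(.*?)</{0}>".format(tag), re.DOTALL)
        # A function replacement avoids any escape handling in the delimiter.
        text = pattern.sub(lambda m, d=delim: d + m.group(1) + d, text)
    return text


if __name__ == "__main__":
    print(csl_html_to_markdown("The <i>Origin</i> of H<sub>2</sub>O"))
```

A generic json2yaml copies the title string through unchanged, so the tags would stay in the YAML file as literal text rather than becoming Markdown emphasis.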

@njbart
Contributor Author

njbart commented May 4, 2015

I see. What I found attractive about using a generic json2yaml is to sort the arrays in a JSON file for better readability, and then use a json2yaml that preserves this order, such as https://github.com/drbild/json2yaml.
Of course this does not add the references: key or the --- and ... delimiters, but I feel these are unnecessary anyway if the file is explicitly declared to be a biblio file (as opposed to a markdown file with inline metadata).
What I did not realise is that the markup is different, too, and while I can see that using pandoc markdown in inline metadata makes sense, I’m a little worried that such a CSL YAML file does not fully conform to the (admittedly as of yet unofficial) CSL conventions that use a HTML-like syntax (see https://www.zotero.org/support/kb/rich_text_bibliography). EDIT: Actually, it’s a little more official than that: see http://docs.citationstyles.org/en/1.0/release-notes.html#rich-text-markup-within-fields.
I’m not sure about the best way to handle this. Could pandoc-citeproc be made to parse HTML-like syntax, too? Any other thoughts?

@jgm
Owner

jgm commented May 4, 2015

Pandoc's YAML reference format uses pandoc Markdown, rather than the
HTML-ish CSL conventions, for formatted text. This gives us
a lot more expressive power. I don't see a problem here, as
long as you don't expect one-one bidirectional conversion.
pandoc-citeproc can interpret the unofficial HTML-ish syntax when
it is reading CSL JSON, if I'm not mistaken.

The order of fields is another issue. It would be nice if
we could preserve what's in the JSON, but this seems
impossible since this passes through an internal
representation (a hash map) that does not preserve the
order.

@njbart
Contributor Author

njbart commented May 5, 2015

OK, so it seems not much can be done at the moment. I submitted a pull request to update the documentation on this in https://github.com/jgm/pandoc/blob/master/README (BTW, http://pandoc.org/README.html has not been updated in a while …).
Other thoughts:

  1. It might be useful to add a comment to yaml files generated by pandoc-citeproc, maybe:

         # Citation Style Language (CSL) bibliographic database, in YAML format
         # See http://citationstyles.org/ and http://pandoc.org/README.html#citations
         # for details.

     Something similar could be added to CSL JSON output, too, but I’m not sure what comment format can be used in CSL JSON files.
  2. Currently, pandoc-citeproc can convert to CSL JSON from every accepted biblio format, except from CSL YAML. Adding this would be nice (but is by no means a top priority).
  3. As to controlling serialization order, you are probably aware of snoyberg/yaml#37 and https://hackage.haskell.org/package/yaml-0.8.10.1/docs/Data-Yaml-Builder.html.

@jgm
Owner

jgm commented May 6, 2015

Good! Data.Yaml.Builder will allow us to specify the order of fields.
I don't think we have any way of preserving the field order that was in the CSL JSON input, because the JSON parser doesn't preserve order. But maybe we could just establish a canonical order of fields that we always use in YAML output? For example, the order here?
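
As a sketch of what a canonical field order could look like, the Python snippet below sorts the keys of one entry by a fixed priority list and falls back to alphabetical order for everything else. The list CANONICAL_ORDER and the function name are hypothetical; the thread does not say which order pandoc-citeproc ended up using beyond ids coming first.

```python
# Hypothetical priority list; fields not named here are sorted alphabetically
# after the listed ones.
CANONICAL_ORDER = ["id", "type", "author", "editor", "issued", "title",
                   "container-title"]


def order_fields(entry):
    """Return a copy of `entry` whose keys follow CANONICAL_ORDER."""
    rank = {key: i for i, key in enumerate(CANONICAL_ORDER)}
    fallback = len(CANONICAL_ORDER)
    ordered = sorted(entry.items(),
                     key=lambda kv: (rank.get(kv[0], fallback), kv[0]))
    # Python dicts (3.7+) keep insertion order, so this fixes the key order.
    return dict(ordered)


if __name__ == "__main__":
    entry = {"title": "T", "volume": "3", "id": "doe2015", "type": "book"}
    print(list(order_fields(entry)))
```

Because the order depends only on the field names, the output is predictable regardless of the order in which the parser delivered the fields.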

@jgm
Owner

jgm commented May 6, 2015

This API is not so easy to work with (and it's marked as unstable), so this may need to go on the back burner.

@njbart
Contributor Author

njbart commented May 8, 2015

Fair enough.

@jgm
Owner

jgm commented May 8, 2015

Btw, I'm working on better YAML output with a more rational ordering of
fields. Should have this pretty soon. This may also make it possible
to have CSL JSON output.


@jgm
Owner

jgm commented May 8, 2015

I have now pushed the changes for better YAML output (much
better, I think). Ids are always at the top, etc.


jgm added a commit that referenced this issue May 9, 2015
@jgm
Owner

jgm commented May 9, 2015

OK, I'm going to close this.

  1. YAML output has been vastly improved. The fields are printed in a rational and predictable order. Blank lines are printed between entries.
  2. CSL JSON output has been tweaked to use the rich text markup conventions described in their documentation.
  3. You can now convert YAML files to CSL JSON (or to YAML, for pretty-printing) using pandoc-citeproc.
  4. Pandoc README (in the repo) has been updated. (This won't be updated on the website til the next pandoc version is released.)

jgm closed this as completed May 9, 2015