Extended export plugin support #3400

austinmm · 2019-10-11T02:43:08Z

The original export plugin only exported as JSON so I added the ability to export in CSV or XML format. To do this I created a new option -f or --format which specifies the export format, the default is still JSON. For example...
$ beet export -f csv -i "title,album" ACDC
I also updated the plugin documentation to reflect the changes I've made to the plugin. Lastly, I also created a unit test for the export plugin; since one hadn't been created yet.

…so XML and CSV

This update reflects the code changes I made to the export plugin

sampsyo · 2019-10-12T13:41:56Z

Cool; thanks for getting this started!

Here's one question for context: while I can totally see the utility of a CSV export, I'm struggling to think of a use case for an XML export. Can you elaborate a little more on what you want to use CSV and XML data for? I think that would help calibrate our approach to these features.

sampsyo · 2019-10-12T13:43:20Z

beetsplug/export.py

@@ -40,6 +42,9 @@ class ExportPlugin(BeetsPlugin):

    def __init__(self):
        super(ExportPlugin, self).__init__()
+        # Used when testing export plugin
+        self.run_results = None


Hmm; I don't think I see where this is actually used?

I was going to use that class field as a way of checking the output during the testing of the export plugin but I ended up going another route so I will go ahead and remove it.

sampsyo · 2019-10-12T13:44:17Z

beetsplug/export.py

@@ -40,6 +42,9 @@ class ExportPlugin(BeetsPlugin):

    def __init__(self):
        super(ExportPlugin, self).__init__()
+        # Used when testing export plugin
+        self.run_results = None
+        self.export_format = None


I also don't quite see why this is a class field instead of just being a local variable in the run method. Can we please revert to the old way?

I will revert it.

sampsyo · 2019-10-12T13:45:45Z

beetsplug/export.py

+                    'indent': 4,
+                    'separators': ('>'),
+                    'sort_keys': True
+                }


I don't think any of these config options are actually used in the CSV or XML emitters.

I was just trying to follow the same format, but you are right they are not being used. I have updated them to now be format specific to json, csv and xml and updated the csv and xml outputs to use them...

'json': { # json module formatting options 'formatting': { 'ensure_ascii': False, 'indent': 4, 'separators': (',', ': '), 'sort_keys': True } }, 'csv': { # csv module formatting options 'formatting': { 'delimiter': ',', # column seperator 'dialect': 'excel', # the name of the dialect to use 'quotechar':'|' } }, 'xml': { # xml module formatting options 'formatting': { 'encoding': 'UTF-8', # the output encoding 'xml_declaration':'True', # controls if an XML declaration should be added to the file 'default_namespace': 'None', # sets the default XML namespace (for “xmlns”) 'method': 'xml', # either "xml", "html" or "text" (default is "xml") 'short_empty_elements': 'True' # controls the formatting of elements that contain no content. }

sampsyo · 2019-10-12T13:46:23Z

beetsplug/export.py

@@ -78,17 +101,21 @@ def commands(self):
            u'-o', u'--output',
            help=u'path for the output file. If not given, will print the data'
        )
+        cmd.parser.add_option(
+            u'-f', u'--format', default='json',
+            help=u'specify the format of the exported data. Your options are json (deafult), csv, and xml'


This could be shorter: perhaps, "the output format: json (default), csv, or xml".

Sounds good

sampsyo · 2019-10-12T13:46:44Z

beetsplug/export.py

        file_mode = 'a' if opts.append else 'w'
+        file_format = opts.format


Please preserve the effect of default_format.

would you prefer I get rid of the default=json in...

cmd.parser.add_option( u'-f', u'--format', default='json', help=u"the output format: json (default), csv, or xml" )

and instead have something like...

file_format = opts.format if opts.format else self.config['default_format'].get(str)

sampsyo · 2019-10-12T13:50:58Z

beetsplug/export.py

+    """Saves in a json file"""
+    def __init__(self, file_path, file_mode=u'w', encoding=u'utf-8'):
+        super(JsonFormat, self).__init__(file_path, file_mode, encoding)
+        self.export = self.export_to_file if self.path else self.export_to_terminal


This technique where a method is defined conditionally is somewhat surprising. I now see how it works after some puzzling about it, but it could use an explanatory comment.

Sounds good, I'll add some comments explaining its purpose

sampsyo · 2019-10-12T13:53:38Z

beetsplug/export.py


+    def export_to_file(self, data, **kwargs):
+        with codecs.open(self.path, self.mode, self.encoding) as f:


I think it would be really useful to factor out the file-opening part of the logic so it doesn't need to be re-implemented for every export format. In general, the output format and output destination (file or stdout) seem orthogonal.

I'll explore that idea. Sounds like a better way

sampsyo · 2019-10-12T13:55:48Z

beetsplug/export.py


    def export(self, data, **kwargs):
-        json.dump(data, sys.stdout, cls=ExportEncoder, **kwargs)
+        if data and type(data) is list and len(data) > 0:


I'm a little confused by the list type check here—when will the passed object not be a list? And what happens when it's not—so self.header remains empty?

Also, this condition could be simplified to:

if data and isinstance(data, list):

because data is only "true" if it is non-empty.

I was just trying to be extra cautious but I completely agree that it is a little overkill, this should work just fine instead; if data and len(data) > 0:

sampsyo · 2019-10-12T13:56:19Z

docs/plugins/export.rst

+
+      $ beet export -f csv beatles
+      $ beet export -f json beatles
+      $ beet export -f xml beatles


I think we can do without the full command-line examples here. Can we just list the options?

Sounds good

sampsyo · 2019-10-12T13:56:33Z

docs/plugins/export.rst

+                ensure_ascii: False
+                indent: 4
+                separators: ['>']
+                sort_keys: true


As above, I don't think these options are actually used for the new formats.

I updated them to work correctly now with the added formats CSV and XML

austinmm · 2019-10-12T21:08:44Z

I am still working on all the changes but almost done. Thank you for all the feedback as it was very helpful and constructive. Also to address your question about the XML format. The reason I added the ability to export in the XML format was that this is the preferred export format used by Apple in Apple Music and simply figured that if Apple does it then there must exist a good reason. For CSV exporting, I imagined that some people, including myself, might want to use Excel tools to better analyze their music library.

austinmm · 2019-10-12T21:49:54Z

I have just pushed the changes that were discussed above. Please let me know if you recommend any other changes. :)

sampsyo

Looking pretty good so far!

Could you please look into the CI results above? We use a style checker to enforce PEP-8 conventions.

One other thing that the style checker won't catch: can you please try to use complete sentences in comments, complete with a capital letter and a period? It would be nice to be consistent with the way (most) other comments in beets are written.

sampsyo · 2019-10-12T22:04:33Z

beetsplug/export.py

+                'formatting': {
+                    'delimiter': ',', # column seperator
+                    'dialect': 'excel', # the name of the dialect to use
+                    'quotechar': '|'


I think you probably want quotechar to default to "? In fact, it doesn't seem like the most useful thing to want to override… maybe we can get away without this option.

Sounds good, I'll remove it

sampsyo · 2019-10-12T22:05:47Z

beetsplug/export.py

+                # csv module formatting options
+                'formatting': {
+                    'delimiter': ',', # column seperator
+                    'dialect': 'excel', # the name of the dialect to use


This option seems useful enough! But maybe it's worth linking (in the documentation) to the place in the Python docs where dialects are described?

Oh yes I forgot to do that, thank you for reminding me

sampsyo · 2019-10-12T22:06:24Z

beetsplug/export.py


+    def export(self, data, **kwargs):
+        if data and len(data) > 0:


Suggested change

if data and len(data) > 0:

if data:

sampsyo · 2019-10-12T22:08:10Z

docs/plugins/export.rst

+                ensure_ascii: False
+                indent: 4
+                separators: ['>']
+                sort_keys: true


I believe these examples are now out of sync with the implementation. It would also be good to describe the options in English above.

…d up code to better follow PEP-8 conventions and just work more efficiently all around.

austinmm · 2019-10-12T22:41:43Z

I have made the changes you suggested alongside some other just general code cleanup. I am not sure where I might be disobeying the PEP-8 conventions so I apologize if I am anywhere. I am still new to CIs and so would love any recommendations you might have so I can pass the two CI test.

sampsyo · 2019-10-13T12:44:10Z

For the CI results, take a look at GitHub's "checks" interface on this page. It has "Details" links to the logs from Travis and CircleCI. For example, here's the log from the style checker:
https://travis-ci.org/beetbox/beets/jobs/597112966

You can also consider installing flake8 and running it locally.

Also, about the docs: while the text seems good, could you please move these from inline comments in the example YAML to the prose text above? There's already a section there with a bulleted list describing the options, before the page gets to the YAML example. That would be a nice place to add similar descriptions of the new options.

…ustinmm/beets into Extended_Export_Plugin_Support Pulling updated exort.rst doc

austinmm · 2019-10-13T23:04:11Z

Both the test_export.py and export.py pass the flake8 test so I'm not sure why there is still an issue with the CLI. How do I run the test_export.py on my computer because if I just run python test/test_export.py it doesn't compile correctly and neither do any of the other python tests files?

sampsyo · 2019-10-14T18:56:36Z

We have a wiki page about testing that might help! You might want to try using Tox or check out the section about test dependencies.

You can also see the results of the tests directly on Travis:
https://travis-ci.org/beetbox/beets/jobs/597334455

austinmm · 2019-10-15T07:34:29Z

I was able to get all the tests to pass :). Please let me know if you want me to update anything else, but I believe it is production-ready.

sampsyo

Looks good overall! I left a few more comments inline.

I also took a look at the exported XML format. It seems to look like this:

<library>
  <key>tracks</key>
  <dict>
    <key>0</key>
    <dict>
      <Track ID>0</Track ID>
      <artist>...</artist>
      ...
    </dict>
    <key>1</key>
    ...
  </dict>
</library>

But I guess I would have expected there to be a tag called <tracks> that contained individual <track> elements with the attributes of each. That would be more XML-like—I'm not sure I see the utility of the extra level of "dict" indirection here.

Also, "Track ID" is not a valid XML tag name—they can't contain spaces.

sampsyo · 2019-10-15T11:25:38Z

beetsplug/export.py

+                    # Controls if XML declaration should be added to the file.
+                    # 'xml_declaration': True,


Looks like this is no longer used and can be deleted?

sampsyo · 2019-10-15T11:26:19Z

beetsplug/export.py

+        file_format = opts.format if opts.format else \
+            self.config['default_format'].get(str)


This should suffice: opts.format or self.config['default_format'].get(str)

sounds good

sampsyo · 2019-10-15T11:27:33Z

beetsplug/export.py

+        """ self.out_stream =
+                sys.stdout if path doesn't exit
+                codecs.open(..) else
+        """


Perhaps this is out of date? Or it can be rephrased as a natural-language comment?

Here is the new comment...
# creates a file object to write/append or sets to stdout

sampsyo · 2019-10-15T11:29:52Z

beetsplug/export.py

+                    track_details = ET.SubElement(track_dict, key)
+                    track_details.text = value
+
+        # tree = ET.ElementTree(element=library)


Out of date?

sampsyo · 2019-10-15T11:32:14Z

docs/plugins/export.rst


- **separators**: A ``[item_separator, dict_separator]`` tuple.
+    - **dialect**: A dialect, in the context of reading and writing CSVs, is a construct that allows you to create, store, and re-use various formatting parameters for your data.


Instead of duplicating the text from the Python CSV documentation, can we instead link to the explanation there?

I will link to an external description but I would imagine it is still good to have some description in the document itself. I just shortened it a little. Let me know if you want me to get rid of the description altogether.

sampsyo · 2019-10-15T11:32:58Z

docs/plugins/export.rst


- **sort_keys**: Sorts the keys in JSON dictionaries.
+- **XML Formatting**
+    - **encoding**: Use encoding="unicode" to generate a Unicode string (otherwise, a bytestring is generated).


Hmm… how does this affect the usage of the plugin? Because we're writing to a file, it seems like the kind of Python string that gets generated is irrelevant. Perhaps there's an argument to be made that this should not be configurable in that case?

I agree it is not a config option now.

sampsyo · 2019-10-15T11:33:54Z

docs/plugins/export.rst

+- **XML Formatting**
+    - **encoding**: Use encoding="unicode" to generate a Unicode string (otherwise, a bytestring is generated).
+
+    - **xml_declaration**: Controls if an XML declaration should be added to the file. Use False for never, True for always, None for only if not US-ASCII or UTF-8 or Unicode (default is None).


TBH I'm not sure if I know what an "XML declaration" is. Is this something that people will actually want to control?

Its been removed

sampsyo · 2019-10-15T11:34:42Z

docs/plugins/export.rst

+
+    - **xml_declaration**: Controls if an XML declaration should be added to the file. Use False for never, True for always, None for only if not US-ASCII or UTF-8 or Unicode (default is None).
+
+    - **method**: Can be either "xml", "html" or "text" (default is "xml")


Maybe it's worth explaining what these options actually do? I'm not sure how setting method to text will affect the output, for example.

If the user changes the config to use text or HTML then the XML structure built will reformat to be displayed as text or as HTML. It kind of adds two other export formats unintentionally...
XML:

<library><key>tracks</key><dict><key>0</key><dict><Track ID>0</Track ID><title>Catch Me</title><album>BRÅVES</album></dict><key>1</key><dict><Track ID>1</Track ID><title>A Toast</title><album>BRÅVES</album></dict><key>2</key><dict><Track ID>2</Track ID><title>Joan of Arc</title><album>BRÅVES</album></dict></dict></library>

Text:

tracks00Catch MeBRÅVES11A ToastBRÅVES22Joan of ArcBRÅVES

HTML:

<library><key>tracks</key><dict><key>0</key><dict><Track ID>0</Track ID><title>Catch Me</title><album>BRÅVES</album></dict><key>1</key><dict><Track ID>1</Track ID><title>A Toast</title><album>BRÅVES</album></dict><key>2</key><dict><Track ID>2</Track ID><title>Joan of Arc</title><album>BRÅVES</album></dict></dict></library>

Hmm… correct me if I'm wrong, but isn't the HTML example here identical to the XML example? Is there a difference I'm missing?

Also, TBH, the text format doesn't look very useful—it's just all the values strung together, without spacing…

In this case, the HTML is the same but I don't think that is always the case. I agree the text is not very useful. Would you like me to remove this option?

I think so! Unless you can think of a compelling reason why someone would want to use it, let's take it out.

austinmm · 2019-10-15T20:12:33Z

I believe I have successfully implemented all your recommended changes. Let me know if I need to make any other changes. Also, here is the updated XML format...

<library>
    <tracks>
        <track>
            <title>Catch Me</title>
            <album>BRÅVES</album>
            ....
        </track>
        ....
    </tracks>
</library>

sampsyo · 2019-10-15T21:12:15Z

Cool! Thanks for continuing to make progress!

I have only two more things in mind:

We'll need a changelog entry about the new feature.
I'm still uncertain about the utility of XML export formats. It seems to me that "HTML" export is exactly the same as the "XML" format of XML, isn't it? And the text format doesn't seem all that useful to me…

austinmm · 2019-10-15T21:14:15Z

How do we create a changelog entry?

sampsyo · 2019-10-15T21:15:21Z

There's a file in docs/changelog.rst that serves that purpose. 😃

austinmm · 2019-10-15T22:46:16Z

Let me know if I didn't create the changelog entry correctly or if you notice anything else that needs fixing :)

sampsyo

Looks great! Thanks for your continued effort on this. I made some tiny comments which I will apply now before merging.

sampsyo · 2019-10-16T02:45:48Z

docs/plugins/export.rst

+- **CSV Formatting**
+    - **delimiter**: Used as the separating character between fields. The default value is a comma (,).
+
+    - **dialect**: A dialect is a construct that allows you to create, store, and re-use various formatting parameters for your data.


Suggested change

- **dialect**: A dialect is a construct that allows you to create, store, and re-use various formatting parameters for your data.

- **dialect**: The kind of CSV file to produce. The default is `excel`.

Keeping this focused on what matters to users of this plugin, not to Python programmers.

sampsyo · 2019-10-16T02:46:36Z

docs/plugins/export.rst

@@ -62,4 +77,8 @@ The default options look like this::
                ensure_ascii: False
                indent: 4
                separators: [',' , ': ']
-                sort_keys: true
+                sort_keys: True


Lower case is OK here—it’s YAML, not Python.

Suggested change

sort_keys: True

sort_keys: true

sampsyo · 2019-10-16T02:47:03Z

docs/plugins/export.rst

+        csv:
+            formatting:
+                delimiter: ','
+                dialect: 'excel'


Suggested change

dialect: 'excel'

dialect: excel

No quotes are usually necessary in YAML.

sampsyo · 2019-10-16T02:51:49Z

Arg, turns out GitHub won’t let me commit my own suggestions on my phone. :( But I will wrap this up when I’m back at a normal computer.

austinmm · 2019-10-16T02:54:05Z

Your changes look great! Thank you for all the help you provided me with this. This was my first time contributing to an open-source project so all your assistance was very much appreciated.

sampsyo · 2019-10-16T18:30:22Z

Wow! In that case, thank you for your diligent efforts, congratulations, and welcome to the world of open source. 😃 Nicely done!

Austin Marino and others added 6 commits October 9, 2019 00:52

Extended the file type export options to include not only JSON but al…

e5b43d4

…so XML and CSV

Updated export.rst

93e1026

This update reflects the code changes I made to the export plugin

Added examples of new format option

4dfb6b9

Updated config formats

a8a480a

Addec Unit test for export plugin

8ff875b

Updated class fields to allow for easier unit testing

c31b488

sampsyo requested changes Oct 12, 2019

View reviewed changes

Made changes to reflect comments and suggestions made by sampsyo

0e2c1e0

sampsyo requested changes Oct 12, 2019

View reviewed changes

Updated documents and comments to reflcet recent code changes. Cleane…

ec705fa

…d up code to better follow PEP-8 conventions and just work more efficiently all around.

Added Comments to formating configurations

c5ebbe0

Austin Marino and others added 4 commits October 13, 2019 11:36

Aligned export related code with flake8 standards

fa2c9ba

Merge branch 'Extended_Export_Plugin_Support' of https://github.com/a…

160e4db

…ustinmm/beets into Extended_Export_Plugin_Support Pulling updated exort.rst doc

updated format descriptions

294b3cd

Updated Format description layout

4f0a2b7

Austin Marino added 6 commits October 14, 2019 13:57

Updated tests

db5d216

Ran test to ensure it works

0d818ec

Updated export doc

5193f1b

Updated test structure for export

a9440ad

Fixed styling and test to pass cli

07138f8

updated the way in which xml is outputted

4251ff7

sampsyo reviewed Oct 15, 2019

View reviewed changes

Austin Marino added 4 commits October 15, 2019 11:45

Cleaned up comments and code

eb6055e

Updated Test structure

21d8091

Updated Test structure

623f553

Updated to reflect code changes and updated styling/format

d86e31d

Austin Marino and others added 4 commits October 15, 2019 14:16

removed xml configs from doc and code

7f6630c

Updated changelog to reflect export plugin changes

d7b0e93

fixed conflicting files issues with changelog

5d7c937

Merge branch 'master' into Extended_Export_Plugin_Support

c1b646f

sampsyo reviewed Oct 16, 2019

View reviewed changes

sampsyo added 2 commits October 16, 2019 14:27

Docs fixes from my code review

d45b8bb

Docs simplifications for #3400

2291778

sampsyo merged commit a1d1265 into beetbox:master Oct 16, 2019

		file_mode = 'a' if opts.append else 'w'
		file_format = opts.format


		def export_to_file(self, data, **kwargs):
		with codecs.open(self.path, self.mode, self.encoding) as f:

		# Controls if XML declaration should be added to the file.
		# 'xml_declaration': True,

		file_format = opts.format if opts.format else \
		self.config['default_format'].get(str)


		- separators: A ``[item_separator, dict_separator]`` tuple.
		- dialect: A dialect, in the context of reading and writing CSVs, is a construct that allows you to create, store, and re-use various formatting parameters for your data.


		- xml_declaration: Controls if an XML declaration should be added to the file. Use False for never, True for always, None for only if not US-ASCII or UTF-8 or Unicode (default is None).

		- method: Can be either "xml", "html" or "text" (default is "xml")

	- dialect: A dialect is a construct that allows you to create, store, and re-use various formatting parameters for your data.
	- dialect: The kind of CSV file to produce. The default is `excel`.

Extended export plugin support #3400

Extended export plugin support #3400

Conversation

austinmm commented Oct 11, 2019

sampsyo commented Oct 12, 2019

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

austinmm Oct 12, 2019 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

austinmm commented Oct 12, 2019

austinmm commented Oct 12, 2019

sampsyo left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

austinmm commented Oct 12, 2019

sampsyo commented Oct 13, 2019

austinmm commented Oct 13, 2019

sampsyo commented Oct 14, 2019

austinmm commented Oct 15, 2019

sampsyo left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

austinmm Oct 15, 2019 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

sampsyo Oct 15, 2019 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

austinmm commented Oct 15, 2019

sampsyo commented Oct 15, 2019

austinmm commented Oct 15, 2019

sampsyo commented Oct 15, 2019

austinmm commented Oct 15, 2019

sampsyo left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

sampsyo commented Oct 16, 2019

austinmm commented Oct 16, 2019

sampsyo commented Oct 16, 2019

austinmm Oct 12, 2019 •

edited

Loading

austinmm Oct 15, 2019 •

edited

Loading

sampsyo Oct 15, 2019 •

edited

Loading