Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding DCML v2 parsing to tsvConverter.py #1267

Merged
merged 32 commits into from
Aug 12, 2022

Conversation

malcolmsailor
Copy link
Contributor

See discussion at #1214

There is an outstanding issue converting from music21 to the DCML format, as discussed in #1214. I do not believe that this pull request introduces this issue. Rather, I wrote a test that uncovers the issue.

@malcolmsailor
Copy link
Contributor Author

Actually, I just noticed that the commented-out text in the linked code is out-of-date: the test no longer fails in quite the same way---I've fixed the issue where #vii became vii became bvii. But quality and inversion information seems not to propagate correctly, so 'viio7' becomes 'vii'. (This is all for v2; for v1, the test seems to fail more dramatically.)

@coveralls
Copy link

coveralls commented Apr 12, 2022

Coverage Status

Coverage increased (+0.01%) to 93.021% when pulling bf8ff44 on malcolmsailor:master into c538fb8 on cuthbertLab:master.

@malcolmsailor
Copy link
Contributor Author

Whoops I missed pylint and flake in the contributing guidelines. I will address their objections.

Copy link
Member

@mscuthbert mscuthbert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've gone through a bit of the PR -- there are a lot of changes that are not necessary for the issue. Please limit changes to only what has to be changed to make the new version work. I'm reviewing a part of the code base that I am not intimate with and lots of changes of spacing, etc. slow down the reviewing. I stopped at line 200/300 of the tsvConverter. Please contribute documentation and tests if you want this to be accepted. Thank you!

music21/romanText/tsvConverter.py Outdated Show resolved Hide resolved
music21/romanText/tsvConverter.py Show resolved Hide resolved
music21/romanText/tsvConverter.py Outdated Show resolved Hide resolved
music21/romanText/tsvConverter.py Outdated Show resolved Hide resolved
music21/romanText/tsvConverter.py Outdated Show resolved Hide resolved
music21/romanText/tsvConverter.py Outdated Show resolved Hide resolved
music21/romanText/tsvConverter.py Outdated Show resolved Hide resolved
music21/romanText/tsvConverter.py Outdated Show resolved Hide resolved
music21/romanText/tsvConverter.py Show resolved Hide resolved
music21/romanText/tsvConverter.py Outdated Show resolved Hide resolved
@malcolmsailor
Copy link
Contributor Author

Regarding the formatting changes to parts of the code I did not otherwise edit, I believe what happened was that I have black configured to run in my text editor on saving Python code and I neglected to turn that off before saving this file so it changed a great deal of the formatting without my noticing it. Sorry about that. I will try to restore the formatting and otherwise address your helpful comments. Thanks for your time!

@malcolmsailor
Copy link
Contributor Author

I made a new commit in response to your comments:

  1. I restored the original formatting in the otherwise unchanged portions. Again, sorry about the inadvertent auto-format.
  2. TabChord and TabChordV2 both inherit from a new abstract base class TabChordBase; attributes are defined explicitly rather than dynamically in the respective init functions.
  3. I tried to address your other comments.

I also made a few improvements. In particular, I addressed the bug discussed in #1214 by implementing writing to the form and figbass columns of tsv files, so that the test where we convert tsv -> m21 -> tsv -> m21 and then compare the m21 streams now works. (With one small issue concerning augmented sixth chords explained in a comment in the file.)

@MarkGotham
Copy link
Contributor

Hi all. Where are we with this? Ready to go? I'm happy to pitch in with dev. as necessary.

@malcolmsailor
Copy link
Contributor Author

Hi all. Where are we with this? Ready to go? I'm happy to pitch in with dev. as necessary.

As far as I know it is ready to go; I did my best to address Michael's previous comments. If I missed anything or there are other outstanding issues I'm happy to address them too.

Copy link
Member

@mscuthbert mscuthbert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution! Many good things. Be sure that you're familiar with music21 coding style before contributing a large quantity of code to the repo, as opposed to making an external tool. Since I'll be taking responsibility for maintenance of this tool indefinitely, it needs to conform closely to existing style, docs, and testing. Thanks!

music21/romanText/tsvConverter.py Outdated Show resolved Hide resolved
music21/romanText/tsvConverter.py Outdated Show resolved Hide resolved
music21/romanText/tsvConverter.py Outdated Show resolved Hide resolved
music21/romanText/tsvConverter.py Outdated Show resolved Hide resolved
music21/romanText/tsvConverter.py Outdated Show resolved Hide resolved
music21/romanText/tsvConverter.py Outdated Show resolved Hide resolved
music21/romanText/tsvConverter.py Outdated Show resolved Hide resolved
music21/romanText/tsvConverter.py Outdated Show resolved Hide resolved
music21/romanText/tsvConverter.py Outdated Show resolved Hide resolved
music21/romanText/tsvConverter.py Outdated Show resolved Hide resolved
@malcolmsailor
Copy link
Contributor Author

malcolmsailor commented Aug 3, 2022

I did my best to address all your comments, Michael, and added a new PR.

-1. Do not count on users being connected to the net to run tests. Create small, comprehensive test files that do not slow down testing.

Running on an actual ABC corpus file was giving me more confidence in the code, but I absolutely understand why that would be a bad idea. So today (in a separate script) I did tsv -> music21 -> tsv -> music21 conversion on the entire ABC corpus, and then compared the music21 streams to make sure they were the same. There were quite a few harmonies that caused problems (mostly to do with vi in minor keys) so I added these to the test files tsvEg_v2major.tsv and tsvEg_v2minor.tsv as appropriate. Thanks to this I am confident that this PR makes the conversion quite a bit more accurate.

Copy link
Member

@mscuthbert mscuthbert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments and notices. Good work-- wanted to get the t.List etc. in for typing before you go too far.

music21/romanText/tsvConverter.py Outdated Show resolved Hide resolved
music21/romanText/tsvConverter.py Show resolved Hide resolved
music21/romanText/tsvConverter.py Outdated Show resolved Hide resolved
music21/romanText/tsvConverter.py Outdated Show resolved Hide resolved
music21/romanText/tsvConverter.py Outdated Show resolved Hide resolved
music21/romanText/tsvConverter.py Outdated Show resolved Hide resolved
@malcolmsailor
Copy link
Contributor Author

when you're done with all parsing but before running makeMeasures, run Stream.extendDurations(harmony.Harmony, inPlace=True) if you'd like. Or if measures are already made, it'll be a bit harder, but still possible.

The code doesn't call makeMeasures at all. The relevant methods (which I didn't write) seem to be TsvHandler.toM21Stream and TsvHandler.prepStream. The latter creates measures and inserts them into the stream, then the former populates these with the chords. I tried calling extendDuration on the whole stream but that didn't seem to work. I also tried

for m in self.m21stream.recurse().getElementsByClass(stream.Measure):
    m.extendDuration(harmony.Harmony, inPlace=True)

That didn't seem to work either. Any pointers would be appreciated!

@mscuthbert
Copy link
Member

for m in self.m21stream.recurse().getElementsByClass(stream.Measure):
    m.extendDuration(harmony.Harmony, inPlace=True)

That didn't seem to work either. Any pointers would be appreciated!

Try self.m21stream.flatten().extendDuration(harmony.Harmony, inPlace=True)

@johentsch
Copy link

Using the current version @ 2f7db83, I wrote a small notebook to compare the chord tones expressed by the DCML labels with those of the m21 chords. Currently, there is divergence for all DCML labels containing chord tone replacement.

Here a few examples from n01op18-1_01. Left DCML, right the converted music21.roman.RomanNumeral (dots stand for congruence):

V(64) <-> V
('C', 'F', 'A') != ('C', 'E', 'G')
.....
#viio7(6)/vi <-> #viio7/vi
('C#', 'E', 'A', 'Bb') != ('C##', 'E#', 'G#', 'B')

#viio7/vi <-> #viio7/vi
('C#', 'E', 'G', 'Bb') != ('C##', 'E#', 'G#', 'B')

#viio7(4)/ii <-> #viio7/ii
('F#', 'Bb', 'C', 'Eb') != ('F##', 'A#', 'C#', 'E')

#viio7/ii <-> #viio7/ii
('F#', 'A', 'C', 'Eb') != ('F##', 'A#', 'C#', 'E')

ii6(11#7b6) <-> ii6
('Bb', 'Eb', 'F#') != ('Bb', 'D', 'G')

Do the RomanNumeral objects provide a way to express chord tone replacement? If not, could it be an idea to create chord objects from the actual pitch class collections (left-hand) to maintain that information when creating the stream?

The handling of vi and vii in minor contexts is currently not working correctly by default (I know there must be a setting somewhere). Since the DCML labels express scale degrees of the natural minor scale it would probably make sense to fix the behaviour to 'flat' for both 6 and 7.

@malcolmsailor
Copy link
Contributor Author

I'm sorry about all the double-quotes, it is pretty deeply ingrained habit to use them. I've mostly gotten the hang of going through and replacing them after making changes for this project but yesterday I forgot. Have you thought about using something like https://github.com/zheller/flake8-quotes/ for the sake of people like me?

Coverage is very good, but look at https://coveralls.io/builds/51509206/source?filename=music21%2FromanText%2FtsvConverter.py#L269 and see that there are test lines that are not being run, and the regexp for Mm7 isn't being tested. Most of the other uncovered lines are pretty trivial.

The test lines that were not being run turn out to be obsolete, so I removed them. And I believe the revised doctest for _changeRepresentation should now cover the regexp for Mm7.

@malcolmsailor
Copy link
Contributor Author

As you may observe the workflow tests seem to be failing on the following test:

FAIL: testPickleMidi (freezeThaw.Test)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/runner/work/music21/music21/music21/freezeThaw.py", line 1226, in testPickleMidi
    self.assertEqual(d.parts[1].flatten().notes[20].volume._client.__class__.__name__,
AssertionError: 'ReferenceType' != 'weakref'
- ReferenceType
+ weakref

When i run multiprocessTest.py in my local environment I don't get any such error. And I just merged the latest master in this morning. So I'm not sure why the error is happening, or how to reproduce it for further debugging. It doesn't seem to have anything to do with this PR but of course I can't be sure. Any suggestions?

@mscuthbert
Copy link
Member

Have you thought about using something like https://github.com/zheller/flake8-quotes/ for the sake of people like me?

I did not know about this! Adding it! Thanks!

@mscuthbert
Copy link
Member

The error seems to be a change in Python 3.10 version naming for weakreferences. Pushing a fix on another branch.

@mscuthbert
Copy link
Member

Apparently this happened python/cpython#79512

@mscuthbert
Copy link
Member

Closing and reopening because of bugs in Github right now.

@MarkGotham
Copy link
Contributor

Hey @malcolmsailor, all. Picking up a few points here.

  1. Ger6: M21 has a mapping method for getting aug 6ths from the literal roman numeral figures for which they stand as shorthand. So one approach would be to create the Roman numerals and then run chord.isAugmentedSixth().
  2. Write Roman: thanks for spotting that. I'm happy to deal with it.
  3. Single/double quotes. Double does seem to be more standard. Is m21 firmly committed to single or is there any prospect of moving over? A switch would needs careful handling for cases like single within double (or vice versa) but otherwise doesn't affect functionality, so not a breaking change, right?

Please let me know if I've missed anything else – thanks!

@mscuthbert
Copy link
Member

mscuthbert commented Aug 11, 2022

  1. Single/double quotes. Double does seem to be more standard. Is m21 firmly committed to single or is there any prospect of moving over? A switch would needs careful handling for cases like single within double (or vice versa) but otherwise doesn't affect functionality, so not a breaking change, right?

Music21 is firmly committed to single quotes and camelCase.

String Quotes

In Python, single-quoted strings and double-quoted strings are the same. This PEP 
does not make a recommendation for this. Pick a rule and stick to it. When a string 
contains single or double quote characters, however, use the other one to avoid 
backslashes in the string. It improves readability.

Music21 does not follow PEP 257 because (a) I didn't know about it in 2005 when we started, and (b) the first major dev and I both find single quotes more easily counted to make sure there are three of them.

The nice part of starting a project is that you get to make the rules. I don't like following others' unless they have a very good reason for them. Music21 uses CamelCase because the first version of m21 was written in Perl where that was the standard. See https://github.com/cuthbertLab/music21/blob/master/CONTRIBUTING.md

Copy link
Member

@mscuthbert mscuthbert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome -- I believe that everything I wanted except the GNU regex rewrite has been done. I would put the rewrite in myself, except I don't know what two of the capture groups are doing:

re.findall(
    r'''(
          (\+|-)?   # front alterations, or whatever this is.
          (\^|v)?   # what is this?
          (#+|b+)?    # sharps or flats
          (1\d|\d)   # numbers 0-19, in practice, 1-13
    )''', added_tones, re.VERBOSE)

It seems like members of the DCML, TSV, Annotated Beethoven Corpus, are still weighing in with suggestions and potential improvements. My thought, unless there are objections, it to merge this great contribution as soon as that fix is in, and then to continue the discussion in an issue and future pull requests. We're over 150 conversation points on this PR as it is, and it's time to let people try this out with real data and see what we all can do with it.

(But if there are objections let them be heard).

music21/romanText/tsvConverter.py Show resolved Hide resolved
music21/romanText/tsvConverter.py Show resolved Hide resolved
music21/romanText/tsvConverter.py Outdated Show resolved Hide resolved
@johentsch
Copy link

...except I don't know what two of the capture groups are doing

A suggestion how the comments in the verbose regEx could look like:

re.findall(
    r'''(
          (\+|-)?   # added non-chord or removed chord tone
          (\^|v)?   # (up/down) inverting the default direction of chord tone replacement
          (#+|b+)?  # sharps or flats altering the local key's scale degree
          (1\d|\d)  # numbers 0-19, in practice, 1-14
    )''', added_tones, re.VERBOSE)

@malcolmsailor
Copy link
Contributor Author

Awesome -- I believe that everything I wanted except the GNU regex rewrite has been done.

I've done this, but as I did I noticed there also a few edge cases where that function is not matching what DCML expects so I'm fixing that up. I'll have a commit shortly.

@MarkGotham
Copy link
Contributor

Thanks @malcolmsailor , @johentsch , @mscuthbert.

+1 for merge, further testing, and new smaller issue-discussions as required.

@malcolmsailor
Copy link
Contributor Author

malcolmsailor commented Aug 11, 2022

handleAddedTones now gives the same result as @johentsch's ms3 library parser in the vast majority of cases in the ABC corpus. When it doesn't, it is often due to #1369. In any case the remaining disparities are very rare and all concern highly unusual chords. I have a script (based on a notebook from Johannes) to compare the ms3 and music21 output that I can share with anyone who wants to continue working on that.

As far as I can tell, the flake8 and mypy failures are not due to code introduced by me.

@mscuthbert
Copy link
Member

As far as I can tell, the flake8 and mypy failures are not due to code introduced by me.

yeah. I did a wrong branch push yesterday. I've now introduced branch protection to avoid that happening again. Will merge after required tests pass.

@mscuthbert mscuthbert merged commit 60cdb76 into cuthbertLab:master Aug 12, 2022
@mscuthbert
Copy link
Member

Merged! Congrats!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants