Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add gstreamer-less flac encoder and tagging #121

Merged
merged 6 commits into from
Feb 2, 2017

Conversation

MerlijnWajer
Copy link
Collaborator

@MerlijnWajer MerlijnWajer commented Jan 29, 2017

Two commits - the first commit perform the FLAC encoding using the Xiph.org 'flac' program. Because whipper/morituri's gstreamer 'encoding' code also tagged - the second commit adds tagging using mutagen.

Note that these two commits mostly do not remove gstreamer code (the tag one does, however). We need to add an equivalent of the CRC32 task (or simply remove it?). Once we have that in place, as far as I am concerned, we can start removing ALL the gstreamer code.

See #29

This adds a FlacEncodeTask that encodes wave files to flac files.
This commit also replaces morituri's EncodeTask with FlacEncodeTask, however, in
morituri, EncodeTask also does the tagging.

FlacEncodeTask will not perform the tagging.
So we will need an extra task for the tagging - this will be added soon.

Meanwhile, do not merge this commit to master yet.
@MerlijnWajer
Copy link
Collaborator Author

Please do not merge it yet until a few people have tested/verified it.

@MerlijnWajer
Copy link
Collaborator Author

MerlijnWajer commented Jan 29, 2017

It works for me, at least the encoding and the tagging does. I didn't yet test the ARC calculation on these files, mostly because I forgot to put "accuraterip-checksum" in my PATH, and it's taking some time to re-run this task now, and I want to go to bed.

Testers wanted.

@MerlijnWajer
Copy link
Collaborator Author

Apparently accuraterip-checksum can already read flac files by itself, TMYK.

Anyway, it works for me.

Replace the gstreamer tagging code with mutagen tagging code.
getTagList is rewritten to return a dictionary of tags, which are then simply
passed to mutagen.

The way it is set up right now is not the best - I don't think it makes sense
for tagging to take place in program/cdparanoia.py ; but this is how the current
code did it.

I suggest that, when we rip out all the gstreamer code, we also move the tagging
to a more sensible place; and then also make the tagging not be an actual
'task.Task'.
@MerlijnWajer
Copy link
Collaborator Author

MerlijnWajer commented Jan 30, 2017

This code should perform the CRC32 task: master...MerlijnWajer:crc32

Only works on wave files at this point. Should not be a problem, I think.
@MerlijnWajer
Copy link
Collaborator Author

Added CRC32 code for testing purposes. I am able to rip CDs fine with this branch.

"""
try:
check_call(['flac', '--totally-silent', '--verify', '-o', outfile,
'-f', infile])

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hum, could we get compression option configuration? I’d like to compress with --best, and unless we make it the default, this will require a config option. More generally, I suppose people might want to change flac call parameters, so having it in config would be great. ;)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd argue for using --best and not allowing for customization.

And please, let's not add anything because "someone might want to do X". Add features when there's a documented use-case applicable to more than a few users.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

--best is spending a lot of CPU cycles (and electricity) for a very minor gain in size. If there is no configuration choice, I'd argue for using the default FLAC settings.

Copy link

@IvanDSM IvanDSM Jan 31, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FLAC -8 (--best) isn't that slow on current machines. FLAC -5 is not a smart option, it often does produce FLAC files that are bigger than -8 by 1MB or more. "a lot of CPU cycles" is a very big exaggeration in my opinion. We aren't in the Pentium III era anymore, on semi-current machines the time difference between -5 and -8 is negligible.

As for @tobbez's idea of disabling customization, that's a very bad idea. There are people who will want -5, there are other people who will want -8, there are people who want ReplayGain during encode, there are people who don't. There are people who will want "-e", there are people who won't. Those are the two most common use cases that justify enabling option customization. These use cases are why EAC, XLD, CUERipper and dBpowerAmp (the four most reputable rippers) allow customization. Are we really going to convince people to ditch EAC on Wine for whipper when our response to "How do I add the replaygain option to the FLAC encoding process on whipper?" is "Change the source code and recompile it"?

Adding a customization option for this isn't bloat, it's a fundamental feature.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We must be sure that whatever we encode to initially is the same as what we read from the CD, since we use the FLACs for accurate rip checksums. So anything that changes the contents is a bad idea.

I don't understand why people can't run something like this:

for i in $(ls *.flac); do flac -d $i -o foo.wav && flac --best --custom-option-1 foo.wav -o $1; done

Copy link
Contributor

@RecursiveForest RecursiveForest Jan 31, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not believe the goal of whipper is to attract or support users who are never going to learn how to use a proper Unix operating environment. There is no reason to add more lines of code to support a non-essential "feature" when users can do the same thing with a tiny shell script. No information about the disc/CDDA is lost by changing the compression option, so in my opinion it's bloat that flies in the face of the Unix philosophy.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A script would be a good idea, but then the encoding process would be ran twice. Is a customization option (a simple entry in the configuration file with the encoding parameters, for example) that much of bloat really?

Oh, and before I forget, congratulations on the work on removing gstreamer! Great progress!

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for i in $(ls *.flac); do flac -d $i -o foo.wav && flac --best --custom-option-1 foo.wav -o $1; done

^ Not safe. [1]

Rather use:
for i in *.flac; do flac -d "$i" -o foo.wav && flac --best --custom-option-1 foo.wav -o "$1"; done

Because, for example:

$ ls -1
flac file1.flac
flac file2.flac
other file.mp3

$ for i in $(ls *.flac); do echo $i ; done
flac
file1.flac
flac
file2.flac

# but:
$ for i in *.flac; do echo "$i" ; done
flac file1.flac
flac file2.flac

So maybe newcomers should not necessarily be expected to be proficient in something that even experienced users can slip sometimes...

[1] http://mywiki.wooledge.org/ParsingLs

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just use --best? There is no reduction in quality, the internal check sums still work and it only takes 2 seconds to encode a track which is essentially nothing when compared to the fact that it takes 20-30 minutes to actually rip the data from CD. Conversely as pointed out above bash scripting is hackish, error prone and requires everyone to come up with their own solution to the same problem. Including the script in whipper would probably be about the same if not more cognitive load / LOC than adding an option, and certainly more than just hard-coding --best.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Furthermore, wasn't morituri using gstreamer's --best equivalent?

@MerlijnWajer
Copy link
Collaborator Author

I am fine with making it the default. I am not sure if it is a sane default, however. I'm also fine with making it a config option - but preferrably after gst code is out. Shouldn't be too hard for someone to pick this up if they care.

tags['ALBUMARTIST'] = albumArtist.encode('utf-8')
tags['ARTIST'] = trackArtist.encode('utf-8')
tags['TITLE'] = title.encode('utf-8')
tags['DISC'] = disc.encode('utf-8')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.encode('utf-8') should be removed from all of these. mutagen prefers unicode, and will encode the tags when writing them.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean that mutagen prefers UTF-16 instead? UTF-8 is the most common Unicode codepage.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that is what tobbez said. He just said that mutagen itself will encode it to utf-8.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see now! Never mind my comment then, I'm dumb!

Copy link
Collaborator

@JoeLametta JoeLametta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replace:

  • tags['DISC'] -> tags['ALBUM']
  • tags['musicbrainz-trackid'] -> tags['MUSICBRAINZ_TRACKID']
  • ret["musicbrainz-artistid"] -> ret['MUSICBRAINZ_ARTISTID']
  • tags['musicbrainz-artistid'] -> tags['MUSICBRAINZ_ARTISTID']
  • ret["musicbrainz-albumid"] -> ret['MUSICBRAINZ_ALBUMID']
  • tags['musicbrainz-albumid'] -> tags['MUSICBRAINZ_ALBUMID']
  • ret["musicbrainz-albumartistid"] -> ret['MUSICBRAINZ_ALBUMARTISTID']
  • tags['musicbrainz-albumartistid'] -> tags['MUSICBRAINZ_ALBUMARTISTID']
  • ret["musicbrainz-discid"] -> ret['MUSICBRAINZ_DISCID']
  • tags['musicbrainz-discid'] -> tags['MUSICBRAINZ_DISCID']

Remove:

  • FLAC = 'flac' (morituri/program/flac.py, line 6)

"""
try:
check_call(['flac', '--totally-silent', '--verify', '-o', outfile,
'-f', infile])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider using --silent instead of --totally-silent and capturing any errors written (stderr=subprocess.PIPE). Would be much more helpful to know that flac failed with a specific error, rather than just the fact that it failed.

@JoeLametta
Copy link
Collaborator

I've tested the without-gstreamer branch against whipper drive analyze, whipper cd rip, whipper offset find (a CD without AR entries but no crash) and it seems to be working fine. Image subcommands are failing but I don't know if that's caused by this PR...

@MerlijnWajer
Copy link
Collaborator Author

The without-gstreamer definitely breaks some image features: retagging is one of them. Thanks for testing. The without-gstreamer commit was more to prove that we have all the required functionality ready, without gstreamer. It wasn't meant to be merged yet.

@MerlijnWajer
Copy link
Collaborator Author

I think I've dealt with most suggestions. I've not included '--best' right now, and I am not yet parsing the flac stderr - but noted that we should do that soon.

As far as I am concerned, this is ready for merging.

@JoeLametta JoeLametta merged commit 6ddb5d0 into whipper-team:master Feb 2, 2017
@MerlijnWajer
Copy link
Collaborator Author

MerlijnWajer commented Feb 4, 2017 via email

@MerlijnWajer
Copy link
Collaborator Author

MerlijnWajer commented Feb 5, 2017 via email

@Freso
Copy link
Member

Freso commented Feb 5, 2017

FWIW, I've updated my rip-dish.sh script to re‐encode FLAC files after the rip. (In addition to all the other things the script does… I think I might port it to Python one of these days.) (And it's also switched to using whipper rather than rip.)

@JoeLametta JoeLametta mentioned this pull request Aug 29, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants