"EAE timeout! EAE not running, or wrong folder?" Error #125

Closed
RobertHolstein opened this issue Mar 22, 2022 · 43 comments
Labels
bug Something isn't working

Comments

@RobertHolstein

I'm getting the following error while trying to decode any stream:

[eac3_eae @ 0x7fb3ebb36b40] EAE timeout! EAE not running, or wrong folder? Could not read '/tmp/pms-32e985ec-1ab9-498f-b346-30ca02b0356f/EasyAudioEncoder/Convert to WAV (to 8ch or less)/j7b2umlssurwlcz1dzb8ba86_950-0-6.wav'
[eac3_eae @ 0x7fb3ebb36b40] error reading output
Error while decoding stream #0:1: I/O error

Full log: hjyw270k9k3dg3oc1dx9bqddm_logs.txt

I ran a 'chmod -R 777' on my entire plex folder to make sure it wasn't a permissions issue. My transcoder temporary directory that is set in the plex settings is '/tmp/transcode'.

This is the output from /tmp/pms-5180a806-8e56-46e9-ab20-2ac8f0e46c64/EasyAudioEncoder/Convert to WAV (to 8ch or less)/

root@docker1:/mnt/gvf2/plex# ls tmp/pms-5180a806-8e56-46e9-ab20-2ac8f0e46c64/EasyAudioEncoder/Convert\ to\ WAV\ \(to\ 8ch\ or\ less\)/
9hii6u6sfw4vvfowncgfck95_981-0-54.ec3  9hii6u6sfw4vvfowncgfck95_981-0-55.ec3

The WAV file isn't in there.
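
A quick way to repeat this check is to list every .ec3 input that has no .wav output next to it. This is only a sketch: it assumes the output keeps the input's base name (foo.ec3 becomes foo.wav), which matches a later directory listing in this thread but is not documented anywhere. The function name is made up for illustration.

```shell
# Print the name of every .ec3 file in the given directory that has no
# matching .wav file beside it.
# Assumption: EAE names its output after the input (foo.ec3 -> foo.wav).
list_missing_wavs() {
  local dir="$1"
  local ec3
  for ec3 in "$dir"/*.ec3; do
    [ -e "$ec3" ] || continue                    # glob matched nothing
    [ -e "${ec3%.ec3}.wav" ] || basename "$ec3"  # no .wav: report the input
  done
}
```

Call it with the quoted EAE folder path, for example `list_missing_wavs "$EAE_DIR"`, where `EAE_DIR` is a placeholder for the "Convert to WAV (to 8ch or less)" directory.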

My plex stack: plexstack.txt

@RobertHolstein RobertHolstein added the bug Something isn't working label Mar 22, 2022
@pabloromeo
Owner

I believe others have seen similar issues in the past. For example here: #41 (comment)

In that particular case, judging from the content they linked, it might be related to using a network share that doesn't propagate filesystem events.
How is your /tmp shared between PMS and the worker: NFS, Gluster, CIFS?

In my case I use gluster and haven't seen those issues. The person seeing it was using nfs.
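
To answer that question from inside a container, something like the following may help. It is a sketch that assumes `findmnt` from util-linux is available; the function names and the list of type names are illustrative and not exhaustive. It only reports the filesystem type and does not prove whether filesystem events are propagated.

```shell
# Print the filesystem type of the mount that contains the given path.
fs_type_of() {
  findmnt -n -o FSTYPE -T "$1"
}

# Succeed if the type name looks like a network or distributed filesystem.
# The patterns below are examples only.
is_network_fs() {
  case "$1" in
    nfs*|cifs|smb*|fuse.glusterfs|ceph|fuse.ceph) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, `is_network_fs "$(fs_type_of /tmp)" && echo "/tmp is on a network share"`.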

@pabloromeo
Owner

One setting you might want to look into is TRANSCODE_EAE_LOCALLY. It's in the Readme.
That was a contribution by someone else; when enabled, it forces EAE transcodes to run on PMS instead of on the workers.
It was probably added because of these sorts of issues.

@RobertHolstein
Author

I use glusterfs for the tmp folder. I used the default settings when I set up gluster. Is there something on my gluster volume that I should enable to make this play nice?

@pabloromeo
Owner

Not that I know of; I used the default settings as well. Maybe the content I tested with doesn't require the EAE audio encoder. EAE is part of Plex, not something ClusterPlex interacts with directly.

@pabloromeo
Owner

I've been researching this a bit more. It might be necessary for us to manually start EAE on each worker.
I'll try to include that, probably on the experimental branch, and see how it goes.
I might ask you to test it with your content once that is available.

@RobertHolstein
Author

I'll be more than happy to test it for you. Let me know when 👍

@github-actions

This issue is stale because it has been open for 30 days with no activity.

@github-actions github-actions bot added the stale Issue has been inactive for more than 30 days label Apr 25, 2022
@pabloromeo
Owner

I do plan on looking into this further. I was able to download the appropriate EAE binary from Plex's servers. However, there's an associated (and mandatory) license file, and I still haven't determined where Plex gets it from.

@pabloromeo pabloromeo removed the stale Issue has been inactive for more than 30 days label Apr 25, 2022
@brandan-schmitz
Contributor

Do you have an update on this? I have a few thousand media files that are impacted by this. Using the ENV variable does fix it; however, when it transcodes locally, the transcoding quality is limited to 720p at best, even with a 4K video file. This issue is what is holding me back from moving from the default Plex server to using this in my k8s cluster with CEPH.

@pabloromeo
Owner

I did look into it for a while, but was never able to get a hold of the license file required to run EAE.
Meaning, we can get the Workers to download EAE from Plex's servers, but not the license file, which varies per architecture, it seems. And without it EAE won't start.
Unfortunately I've been very limited on free time to keep looking, but it's the next item in line as soon as I have more time.

@Seji64
Contributor

Seji64 commented Aug 11, 2022

I would like to help. Can you explain in a few sentences what the issue behind EAE is?
In my setup /tmp is shared via CIFS. Files which do not need EAE play flawlessly with HW encoding (VAAPI). The thing I don't get is why those WAV files are not getting created. Who should create them in our case, and which binary is called to create them?

I know, many questions, but maybe it would be helpful for other people to understand the workflow, so we can maybe help :-)

@pabloromeo
Owner

Oh, that'd be amazing!
Now, from what I can remember, and just from trying to work out what Plex is doing behind the scenes, it appears that for content identified as requiring the EAE encoder, Plex downloads the encoder from their servers and starts the executable while transcoding.
This encoder depends on the Plex version and architecture, for example:
https://downloads.plex.tv/codecs/eae-69c1de6-42/linux-x86_64-standard/EasyAudioEncoder-linux-x86_64-standard.zip
is for Linux x86_64 and corresponds to version eae-69c1de6-42.
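
Going by that one example, the download URL appears to be built from the EAE version and the build name. The helper below is a sketch of that pattern; the function name is hypothetical, and the pattern is inferred from the URLs quoted in this thread rather than from any Plex documentation, so other versions or architectures may not follow it.

```shell
# Build the EAE download URL from a version string and a build name.
# Assumption: URLs follow the shape of the example quoted in this thread.
eae_download_url() {
  local version="$1"
  local build="$2"
  printf 'https://downloads.plex.tv/codecs/%s/%s/EasyAudioEncoder-%s.zip\n' \
    "$version" "$build" "$build"
}
```

For example, `eae_download_url eae-69c1de6-42 linux-x86_64-standard` prints the URL shown above.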

Now, the main issue I came across when attempting to download the corresponding EAE and run it on a Worker before transcoding is that it won't start due to a missing "license" file which should live alongside it.
I believe if you look into any standalone Plex server, you'd end up finding EasyAudioEncoder and a License file next to it. But it's not provided within the ZIP file, so it must come from somewhere else. Haven't found the source yet.
Also it appears that the license file is different per architecture and/or Plex version. At least that's what I encountered when running PMS and the Worker on ARM architecture.

And given that technically with ClusterPlex your PMS and your workers don't necessarily need to match Architecture, it's not like we can just try to copy it off of the main PMS or something like that.

I'm not sure if Unicorn Transcoder ever got that part to work, maybe they did and we can look into how it was done.

If you want to give it a shot initially you can try copying EAE and the license file into a worker, starting it in the background and attempting a transcode in that Worker.

I believe that's how it works, but it's all based on a great deal of speculation on my part since none of it is open-source nor is there public info about it AFAIK.

@Seji64
Contributor

Seji64 commented Aug 16, 2022

I found a way to grab an EAE license!

I sniffed the Plex traffic and it is pretty simple:

Generate any GUID ( https://www.guidgen.com/ ) and make the following request:

//assume the generated GUID is dfea5cdb-893e-4595-809b-6d9793f4590a

root@a662fab564a1:~# curl -s "https://plex.tv/api/codecs/easyaudioencoder?build=linux-x86_64-standard&deviceId=dfea5cdb-893e-4595-809b-6d9793f4590a&oldestPreviousVersion=1%2E28%2E0%2E5999-97678ded3&version=1785" | xq

<?xml version="1.0" encoding="UTF-8"?>
<MediaContainer friendlyName="myPlex" identifier="com.plexapp.plugins.myplex" title="Codec Downloads" codec="easyaudioencoder" version="1785" size="1">
  <Codec url="https://downloads.plex.tv/codecs/1785/linux-x86_64-standard/EasyAudioEncoder-linux-x86_64-standard.zip" fileSha="9cb8b25a5532cb93867d9746630c432ae432d779" fileSha256="cd28898d0924572a1279489e3e2868b061b71214cfc8dc885354a5f011caf3e1" fileName="EasyAudioEncoder-linux-x86_64-standard.zip" build="linux-x86_64-standard" license="1662821240 0f41b7be6560ce71596ce4d5e41075a329262915f7c269ab35d8d59a81b1 00fb99f35cd8c26dd8e80a647e4d7492a97e6682a4ea80eed90a9d6978f465aa"/>
</MediaContainer>

Copy the license string 1662821240 0f41b7be6560ce71596ce4d5e41075a329262915f7c269ab35d8d59a81b1 00fb99f35cd8c26dd8e80a647e4d7492a97e6682a4ea80eed90a9d6978f465aa from the license attribute and put it in the license file :-)

//EDIT:

Quick and dirty bash which you could probably just copy and paste into the worker launch script:

EAEVERSION=1785 # I think this is static - I didn't find any method to get the latest version
UUID=$(cat /proc/sys/kernel/random/uuid) # random device ID for the request
# Ask plex.tv for the codec metadata and extract the value of the license="..." attribute
LICENSEKEY=$(curl -s "https://plex.tv/api/codecs/easyaudioencoder?build=${CLUSTERPLEX_PLEX_CODEC_ARCH}&deviceId=${UUID}&oldestPreviousVersion=${CLUSTERPLEX_PLEX_VERSION}&version=${EAEVERSION}" | grep -Po 'license="\K[A-Za-z0-9]{10}\s[A-Za-z0-9]{60}\s[A-Za-z0-9]{64}(?=")')
LICENSEKEYPATH="/codecs/EasyAudioEncoder-${EAEVERSION}-${CLUSTERPLEX_PLEX_CODEC_ARCH}/EasyAudioEncoder/eae-license.txt"
# Overwrite rather than append, so the file holds exactly one license line
echo "$LICENSEKEY" > "$LICENSEKEYPATH"
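
If the request fails or the response changes shape, a script like this would write an empty or malformed license file. The sketch below separates extraction from validation so the result can be checked before writing. The function names are hypothetical, and the expected shape (10, 60 and 64 alphanumeric characters separated by single spaces) is taken only from the single response shown in this thread, so it may not hold for other versions.

```shell
# Read the codec XML on stdin and print the value of the license="..." attribute.
extract_license() {
  sed -n 's/.*license="\([^"]*\)".*/\1/p' | head -n 1
}

# Succeed only if the string has the shape seen in this thread:
# 10, 60 and 64 alphanumeric characters separated by single spaces.
is_valid_license() {
  printf '%s\n' "$1" |
    grep -Eq '^[A-Za-z0-9]{10} [A-Za-z0-9]{60} [A-Za-z0-9]{64}$'
}

# Possible use in a launch script (URL and path variables are placeholders):
#   LICENSEKEY=$(curl -s "$CODEC_URL" | extract_license)
#   is_valid_license "$LICENSEKEY" || { echo "no usable license" >&2; exit 1; }
#   printf '%s\n' "$LICENSEKEY" > "$LICENSEKEYPATH"
```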

@pabloromeo
Owner

That's very cool!! I never got that far :)

Now, regarding the EAE Version, I believe it's actually not static, but rather bound to the plex version, and it has different values depending on the architecture as well if I remember correctly. The problem is that it used to be "extractable" (similar to the plex version) from the Plex Executable, but it no longer seems to be the case.
The worker launch script does in fact attempt the extraction, but if I remember correctly it would work for x86 but not for ARM, for example. But we'd have to retest on different architectures to see what works and what doesn't.
The extracted value is logged during the launch of the worker, using the name CLUSTERPLEX_PLEX_EAE_VERSION.

I believe Unicorn Transcoder has the static value fixed per version in the code, so they haven't been able to extract it either.

Maybe using an older version of EAE with a newer Plex version works, and we can fix it to a specific version. Don't know if that would work.

@Seji64
Contributor

Seji64 commented Aug 28, 2022

I found a way to grab a eae license!

Or you can just launch plex and open any truehd video without HDR. e.g., snowboard.m2ts. But thanks for the script, just changing linux to windows worked great and is latest compared to what plex downloads.

think this is static - i didn't find any method to get the latest version

Well, my windows version is also 1795. 64 bit.

Well, @pabloromeo, could you just implement it with a static version as a workaround, at least for x86_64? Should I send a pull request?

@pabloromeo
Owner

So...I think I might have good news :)
It seems I've finally gotten this damn EasyAudioEncoder to work. Anyone up for testing this experimental implementation?
It's currently available in the experimental branch and releases of the worker dockermod, so basically just use: ghcr.io/pabloromeo/clusterplex_worker_dockermod:experimental

Then to actually turn on the experimental feature you'd need to add the following environment variable to the worker nodes:
EXP_EAE_SUPPORT = "true"

You might also have to remove TRANSCODE_EAE_LOCALLY on PMS, if you have it set, or else all EAE will continue to transcode locally.

One last thing: in order to get it to work I also had to change the transcoding location. Initially I was using two separate persistent volumes, one for /tmp and a separate one for /transcode. But unfortunately that didn't seem to work. I ended up going with a single ReadWriteMany volume at /tmp and just set the transcode path to /tmp/transcode.
I get the feeling EasyAudioEncoder isn't getting the inotify events when using separate volumes (at least when using longhorn).

@brandan-schmitz
Contributor

brandan-schmitz commented Sep 2, 2022

Thank you for all your work on this! I have created a new deployment to test out the experimental functions. So far it appears to work on the worker side initially, but after a while it starts encountering the "EAE not running" error.

I have verified that the transcoder is working, content is being produced and updated in the /tmp/transcode/Transcode/Sessions folder. I can also see the generated WAV files being produced by EAE in the folder specified by the transcoding job.

root@plex-worker-v1-67d4bbb8f6-8b2xr:/tmp/transcode/Transcode/Sessions/plex-transcode-D12E6AD4-82D5-40C6-BC36-CC2925EE6665-ca54fafa-5c11-4aec-9c76-797352414a66# ls
header  media-00000.ts  media-00001.ts  media-00002.ts  media-00003.ts  media-00004.ts  media-00005.ts

root@plex-worker-v1-67d4bbb8f6-8b2xr:/tmp/pms-a8c4caf2-e862-4b58-8f95-44b777ecb8a0/EasyAudioEncoder/Convert to WAV (to 8ch or less)# ls
D12E6AD4-82D5-40C6-BC36-CC2925EE6665_2202-0-122.wav  D12E6AD4-82D5-40C6-BC36-CC2925EE6665_2202-0-142.ec3  D12E6AD4-82D5-40C6-BC36-CC2925EE6665_2202-0-142.wav  D12E6AD4-82D5-40C6-BC36-CC2925EE6665_2202-0-84.wav

I also collected the logs from the PMS instance: pms.log

I have configured the following settings in the Plex settings as well:

  • Transcoder -> Transcoder temporary directory: /tmp/transcode
  • Network -> List of IP addresses and networks that are allowed without auth: 10.0.0.0/8

For reference my kubernetes cluster is configured to use CEPH as the storage provider for my persistent volumes using the ceph-csi driver and provides the following volumes for this project:

  • clusterplex-config
    • Mounted at /config on the pms pod
    • ReadWriteOnce backed by CEPH RBD
  • clusterplex-backups
    • Mounted at /backups on the pms pod
    • ReadWriteOnce backed by CEPH RBD
  • clusterplex-transcode
    • Mounted at /tmp on both the pms and worker pods
    • ReadWriteMany backed by CEPH Filesystem
  • clusterplex-codecs
    • Mounted at /codecs on the worker pods
    • ReadWriteMany backed by CEPH Filesystem
  • clusterplex-media
    • Mounted at /data on both the pms and worker pods
    • ReadWriteMany backed by CEPH Filesystem
    • Mounted with readOnly: true
    • Contains a movies and tv folder containing all media.

@pabloromeo
Owner

I see, yeah, my setup is similar, just using Longhorn for storage instead.
So I take it your content never actually starts playing right?
Because one thing I have noticed is that playback begins and works without issues in my case, but when I seek during playback I do get one entry in the logs regarding an EAE timeout due to a segment of audio not being present. That's probably because I'm killing EAE along with the transcoder whenever it finishes (which occurs during a seek). However, in spite of those single entries for each seek, playback seems fine.

@brandan-schmitz
Contributor

brandan-schmitz commented Sep 2, 2022

So I take it your content never actually starts playing right?

That is correct, no matter what device I utilize it will just sit there and spin. Usually it takes about 3 minutes of it doing that before I get a generic error that an error occurred during playback and it closes. And unfortunately I get nothing further in logs either in Plex or in the worker than what is shown in the pms log file I attached to my previous comment.

@pabloromeo
Owner

And other content unrelated to EAE does transcode and play back correctly? Just to rule out any network config issues, or forced https issues around the progress callback to plex and all that. Plex is quite sensitive to network configs and tends to reject calls related to transcoding made to plex from anywhere except 127.0.0.1.

@pabloromeo
Owner

I've currently tested 2 parallel transcodes using EAE on two different nodes and I'm currently testing the Plex Optimization/Conversion functionality, which seems to also work correctly with EAE.

@pabloromeo
Owner

It seems I've spoken too soon. It is working as expected in the sense that EAE is starting, it is transcoding audio segments, and they are being streamed. Unfortunately, further tests show that sections of audio are being lost, so audio cuts out every once in a while. In the logs I'm seeing the transcoder add segments of silence to fill in for errors reading or writing segments.
So it doesn't appear to be stable at the moment. I can't say if it's related to my specific infrastructure running it, the I/O speed of my distributed storage, or if it's something general related to the overall approach.

@pabloromeo
Owner

pabloromeo commented Sep 2, 2022

Last update for today (and sorry for spamming the thread):
My bad, I was completely wrong about assuming /tmp needed to be shared across workers or even with PMS. I should have questioned that earlier on. There's no need.
Leaving /tmp alone and not using a distributed volume for it has stabilized audio, and all seems to work well, both in on-demand transcoding as well as through Optimize :)

@brandan-schmitz
Contributor

So to clarify, the /tmp does not need to be shared at all between any of the components? Does this also include the transcode directory as that has been set in the tmp directory?

@pabloromeo
Owner

/transcode does need to be shared since Plex PMS picks up the segments from there during streaming. So you can mount a RWX volume in that path (or a different one if you prefer) and set that as the transcoding path. No need to include it within /tmp. You can just let that be internal to the worker and not worry about it.

pabloromeo added a commit that referenced this issue Sep 3, 2022
Adding support for EAE.
The new default is for it to be enabled but if issues occur it can be turned off by setting the environment variable EAE_SUPPORT in the Workers to "false"

Solves #125 and #155
@pabloromeo
Owner

Version v1.3.17 now has EAE support turned on by default :)

https://github.com/pabloromeo/clusterplex/releases/tag/v1.3.17

@brandan-schmitz
Contributor

I have updated my deployment. I can see that the transcode jobs do indeed start, and I am not seeing any timeout issues anymore. However, I believe I must indeed have some networking issue between my containers, as I still am not getting any playback to start.

I have created a GIST containing the manifest file I use to deploy this, maybe you want to compare what I have to yours for obvious issues? Please see my comment in there with the notes. If you want to move this to another issue or somewhere else let me know.

https://gist.github.com/brandan-schmitz/7f775e30c17022caf62e3cccb5c07606

Note: I formatted it as a template kinda thing so others could potentially use it as a guide once it is working.

@pabloromeo
Owner

What might be the cause in your case is that in your configMap you are referencing PMS using its service name:

PMS_IP: clusterplex-pms.plex-system.svc

Even though that works from a networking perspective, stupid Plex rejects the traffic, I believe. The same thing happened to me a few days ago when I migrated to k8s.
I got it to work by referencing the fixed IP managed by MetalLB instead.
If I'm not mistaken, in Plex's Logs you'd see it rejecting traffic saying it didn't recognize the service name included in the Host header, or something along those lines.
If you want, open a new issue, and share some Plex logs, just to see if it's related to networking or if it's something else.
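
As a rough pre-flight check for this, a launch script could warn when PMS_IP holds a hostname instead of an IP address. This is only a sketch: the function names are made up, it handles IPv4 only and does not range-check the octets, and whether Plex really rejects hostnames in every setup is the speculation above, not a confirmed rule.

```shell
# Succeed if the argument looks like a dotted-quad IPv4 address
# (shape check only, octets are not range-checked).
is_ipv4_literal() {
  printf '%s\n' "$1" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$'
}

# Print "ok" for an IPv4 literal, otherwise warn on stderr and fail.
check_pms_ip() {
  if is_ipv4_literal "$1"; then
    echo "ok: $1"
  else
    echo "warning: PMS_IP '$1' is not an IPv4 literal" >&2
    return 1
  fi
}
```

For example, `check_pms_ip "$PMS_IP"` would warn for clusterplex-pms.plex-system.svc and pass for an address such as 10.0.0.5 (a made-up example value).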

@flopon

flopon commented Oct 4, 2022

For some time now, my setup had been broken.
I decided to quickly fix it by locking the clusterplex version to one I was sure worked, and indeed, it fixed my problem.

Today I took some time to analyze what the problem is with newer versions, and it seems to be related to the EAE update: when EAE support is activated, the following command in worker.js fails:
fs.mkdirSync(processedEnvironmentVariables.EAE_ROOT, { recursive: true });
It seems that EAE_ROOT is undefined and, as a consequence, the worker crashes.

PS: Congratulations on the birth of your daughter!

@pabloromeo
Owner

Thanks @flopon! :)
It's odd that you mention that EAE_ROOT is undefined. That environment variable is managed and sent by Plex itself.
If in your PMS you set TRANSCODER_VERBOSE to "1", in the logs you should see all the parameters being sent to the orchestrator, which should include EAE_ROOT.
Is it possible that EAE_ROOT does have a value but the worker doesn't have write permissions to create that directory?
EAE_ROOT will most likely be within /tmp, and as of the latest versions of ClusterPlex /tmp doesn't need to be shared across workers or with PMS, just the Transcode path specified in the PMS config needs to be shared (and the Media content, obviously). Maybe there's some misconfiguration there or a permissions issue on the paths the Worker is attempting to write to.
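
To tell those two cases apart (EAE_ROOT missing versus EAE_ROOT set but not writable), a check along these lines could be run in the worker's environment. It is a shell sketch with a hypothetical function name, not the logic in worker.js, and it only helps if EAE_ROOT is exported into the shell where it runs.

```shell
# Report why EAE_ROOT might be unusable: unset, not creatable, or not writable.
check_eae_root() {
  if [ -z "${EAE_ROOT:-}" ]; then
    echo "EAE_ROOT is not set"
    return 1
  fi
  if ! mkdir -p "$EAE_ROOT" 2>/dev/null; then
    echo "cannot create $EAE_ROOT"
    return 2
  fi
  if [ ! -w "$EAE_ROOT" ]; then
    echo "$EAE_ROOT is not writable"
    return 3
  fi
  echo "EAE_ROOT ok: $EAE_ROOT"
}
```

The distinct return codes (1, 2, 3) make it possible to log which case was hit.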

@pabloromeo
Owner

In a few hours I'll be releasing a new version of the workers that includes the EAE_ROOT path in the logs, which could help troubleshoot the issue.

@flopon

flopon commented Oct 13, 2022

Thank you :)

I've tried your experimental version, and got the following message in the log:

EAE Support - Creating EAE_ROOT destination => undefined

@flopon

flopon commented Oct 13, 2022

And it's perfectly working if I add EAE_ROOT: "/transcode/EasyAudioEncoder" in my PMS compose file env var.

Strange !

@pabloromeo
Owner

I believe I've been able to reproduce the issue. Let me try to get a fix out in the experimental branch.

@pabloromeo
Owner

Just tested the fix and it appears to be working. Will be merging to dev and later to master and creating a new release.
Thanks for reporting it @flopon!

@pabloromeo
Owner

Check out release 1.4.1, which includes the fix: https://github.com/pabloromeo/clusterplex/releases/tag/v1.4.1

@flopon

flopon commented Oct 13, 2022

1.4.1 tested, perfectly working.

Thanks !

6 participants
@brandan-schmitz @flopon @Seji64 @RobertHolstein @pabloromeo and others