Issue transcoding at the worker #155

Closed
teachjing opened this issue Jul 13, 2022 · 11 comments
Labels
bug Something isn't working stale Issue has been inactive for more than 30 days

Comments

@teachjing

teachjing commented Jul 13, 2022

Describe the bug

After some troubleshooting, I finally managed to get calls through to the worker, but the worker has trouble running the transcoder. I logged into the worker and could see that the transcoder is there. This seems to be the last step, so if you have any ideas, that would be great.

To Reproduce
Steps to reproduce the behavior:
Unsure how to reproduce, as it may be specific to my Docker Swarm stack. The Plex files for each worker are stored locally on that worker's machine.

Expected behavior
A clear and concise description of what you expected to happen.

Screenshots

Orchestrator screenshot
image

Worker specific error I located
image

Desktop (please complete the following information):
Docker nodes are running on a Proxmox cluster.

Additional context

I have a feeling it has to do with how the URLs were parsed, because I also had issues mounting the transcoder with the recommended swarm template. I had to revert to your worker template, and that seemed to download all the codecs properly to the shared folder.

Again, I am unsure where the problem lies. The worker gets the call, but maybe there is trouble loading the file or providing the right URL?

I don't know if it's relevant, but the worker nodes don't seem to have the Plex config fully installed?
image

@teachjing teachjing added the bug Something isn't working label Jul 13, 2022
@pabloromeo
Owner

Any chance you could describe in more detail how you are running each component, as well as the architecture used? For example: whether you have a compose file for these services, whether you are using the official linuxserver images and adding the dockermods to them, whether the media content is shared and available at the same location on the worker nodes, etc.
It's usually a bit difficult to troubleshoot since each deployment is different and Plex offers little to no information when its transcoder fails.
Something that might help is enabling Debug logging within Plex and looking at its logs in the web Console while attempting to transcode.
In your case Plex seems to be correctly communicating with the Orchestrator, and the Orchestrator with the worker. What's failing seems to be the actual transcode call, but that could be due to a bunch of different things.

You mention having problems with the default swarm template. What issues were those?

Regarding your question, it's normal for Plex to not be configured on the workers, that's expected. In fact Plex itself isn't used at all, just its transcoder, which is called directly. But there's no need for workers to have anything set up in regards to Libraries, the database or Application Support.

A couple of things to consider: review the networking configuration between the workers and PMS, and check whether you've followed the recommendations around sharing the Temp and Transcode locations across all of them, making the Media content shared as well, setting the IP ranges allowed without auth, and setting PMS_IP to the appropriate value (you can use a service name as well if you prefer).

@pabloromeo
Owner

BTW, codecs themselves don't necessarily need to be shared across workers. It's just convenient in order to avoid each worker downloading independent copies of them, but it should work fine without that too.

@brandan-schmitz
Contributor

brandan-schmitz commented Jul 21, 2022

I encountered this issue as well. I am running my cluster in Kubernetes with Ceph as a storage backend. I used the docker-swarm setup as a guide for creating 3 deployment specs under an application specification. What I found was that the permissions in the Docker container on the workers did not allow it to save the codecs it was trying to download if no volume was mounted at the /codecs directory. The resulting error was caused by it being unable to find the codec file.

@teachjing you might want to check your container startup logs to see if you are also getting permission denied errors when it tries to download the codecs. If you are, I was able to fix the issue by mounting a shared volume between my workers at the codecs directory. However, I would imagine any supported volume type, such as a local volume, would also work depending on your setup.
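If it helps to confirm the permission problem directly, here is a minimal sketch that can be run inside a worker container. It assumes Python is available there, and `can_write` is a hypothetical helper, not part of ClusterPlex. It tries to create a real temporary file, since `os.access()` can be misleading on some network or container-mounted volumes.

```python
import os
import tempfile


def can_write(directory):
    """Return True if the current user can create a file inside `directory`."""
    if not os.path.isdir(directory):
        return False
    try:
        # The temporary file is deleted again when the context manager exits.
        with tempfile.NamedTemporaryFile(dir=directory):
            pass
        return True
    except OSError:
        # Covers PermissionError, read-only filesystems, and similar failures.
        return False


if __name__ == "__main__":
    # "/codecs" is the path from this thread; adjust it if your worker differs.
    print("/codecs writable:", can_write("/codecs"))
```

A result of `False` for `/codecs` would be consistent with the permission denied behavior described here, though it does not say which user or mount is at fault.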

@pabloromeo
Owner

pabloromeo commented Jul 23, 2022

@brandan-schmitz do you happen to have logs around that permission issue on /codecs ? Were you running the clusterplex docker image or the official linuxserver one with the clusterplex dockermod on top?
If you have more info maybe we can find a quick fix to implement within the process downloading codecs.

@AngellusMortis

AngellusMortis commented Jul 25, 2022

Based on @brandan-schmitz's comment, I was able to confirm on my side that it was definitely the codecs as well. This is what the log entries look like:

2022-07-25 08:46:13	Starting Plex Media Server.
2022-07-25 08:46:14	CLUSTERPLEX_PLEX_CODECS_VERSION => 'd53cb63-4323'
2022-07-25 08:46:14	CLUSTERPLEX_PLEX_VERSION => '1.27.2.5929-a806c5905'
2022-07-25 08:46:14	PLEX_ARCH => 'amd64'
2022-07-25 08:46:14	CLUSTERPLEX_PLEX_EAE_VERSION => 'eae-69c1de6-42'
2022-07-25 08:46:14	CLUSTERPLEX_PLEX_CODEC_ARCH => linux-x86_64-standard
2022-07-25 08:46:14	Codec location => /codecs/d53cb63-4323-linux-x86_64-standard
2022-07-25 08:46:14	mkdir: cannot create directory ‘/codecs’
2022-07-25 08:46:14	: Permission denied
2022-07-25 08:46:14	/usr/lib/plexmediaserver/Plex Media Server: line 38: cd: /codecs/d53cb63-4323-linux-x86_64-standard: No such file or directory

Followed by this, repeated for each codec:

2022-07-25 01:37:44	Codec libsonic_decoder.so does not exist. Downloading...
2022-07-25 01:37:44	--2022-07-25 01:37:44--  https://downloads.plex.tv/codecs/d53cb63-4323/linux-x86_64-standard/libsonic_decoder.so
2022-07-25 01:37:44	Resolving downloads.plex.tv (downloads.plex.tv)... 
2022-07-25 01:37:44	172.64.153.236, 104.18.34.20, 2606:4700:4400::6812:2214, ...
2022-07-25 01:37:44	Connecting to downloads.plex.tv (downloads.plex.tv)|172.64.153.236|:443... 
2022-07-25 01:37:44	connected.
2022-07-25 01:37:44	HTTP request sent, awaiting response... 
2022-07-25 01:37:44	Length: 68736 (67K) [application/octet-stream]
2022-07-25 01:37:44	200 OK
2022-07-25 01:37:44	libsonic_decoder.so: Permission denied
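For anyone checking their own startup logs for the same symptom, a small sketch like the following could pull out the relevant lines. The function name is hypothetical, and the sample text is a shortened copy of the excerpt above.

```python
import re
from typing import List

# Matches the "Permission denied" messages seen in the log excerpts above.
PERMISSION_DENIED = re.compile(r"permission denied", re.IGNORECASE)


def find_permission_errors(log_text: str) -> List[str]:
    """Return every log line that mentions a permission failure."""
    return [
        line.strip()
        for line in log_text.splitlines()
        if PERMISSION_DENIED.search(line)
    ]


# Shortened sample taken from the worker log in this thread.
sample = """\
2022-07-25 01:37:44\tCodec libsonic_decoder.so does not exist. Downloading...
2022-07-25 01:37:44\t200 OK
2022-07-25 01:37:44\tlibsonic_decoder.so: Permission denied
"""

for hit in find_permission_errors(sample):
    print(hit)
```

On a real worker, the text would come from the container's startup log rather than an inline string.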

It also looks like the process that downloads the codecs for the worker is not using the PUID/PGID set in the env. I have both set to 1000, but it is using 1004. I had to chmod 777 the directory to get it to work.
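A rough way to see this kind of mismatch from inside the container is to compare the PUID/PGID environment variables with the ids the current process actually runs under. This is only a sketch: `identity_mismatches` is a hypothetical helper, and it reports on whatever process runs it, which may not be the same process that downloads the codecs.

```python
import os


def identity_mismatches(env, euid, egid):
    """Compare PUID/PGID from the environment with the ids a process really has."""
    problems = []
    for var, actual in (("PUID", euid), ("PGID", egid)):
        wanted = env.get(var)
        # Only report variables that are set and differ from the real id.
        if wanted is not None and int(wanted) != actual:
            problems.append(f"{var}={wanted} but process runs as {actual}")
    return problems


if __name__ == "__main__":
    # os.geteuid() and os.getegid() are Unix-only, which fits a Linux container.
    for problem in identity_mismatches(os.environ, os.geteuid(), os.getegid()):
        print(problem)
```

Passing the environment and ids in as arguments keeps the comparison easy to try with made-up values, for example `identity_mismatches({"PUID": "1000"}, 1004, 1000)`.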

@pabloromeo
Owner

@AngellusMortis are you running the linuxserver image with the worker dockermod on top or the clusterplex worker docker image?

Regarding the 1004, that's a bit odd. Basically, what we do here is copy our .sh script and overwrite Plex's own executable. It is then called directly by the linuxserver base image, not by any call we make ourselves. I'll read up on how the base image makes that call to see if that could be causing issues.

@AngellusMortis

The dockermod, since that is the recommended one. I never got transcoding working. Even after fixing the permissions issues, it still gives the same error in the worker.

@pabloromeo
Owner

pabloromeo commented Jul 25, 2022

10.60.0.174 is your PMS correct?
Have you configured the appropriate subnets of the workers to allow connecting to plex without Auth? Also if you have "Secure Connections" set to "Required" then you should set the environment variable "FORCE_HTTPS" to "1" or else Plex rejects connections coming from the workers. (FYI "FORCE_HTTPS" was added on v1.3.15, so make sure to be up to date to use it).
There are also considerations around the transcoding path, and having it shared across all workers and PMS.
Could it also be a permissions issue around /tmp/transcode ?

@AngellusMortis

> 10.60.0.174 is your PMS correct?

Probably. It is Kubernetes, so it changes every time I destroy the pod. PMS_IP is set to just plex (DNS name)

> Have you configured the appropriate subnets of the workers to allow connecting to plex without Auth?

Yes. 10.0.0.0/8, 172.16.0.0/12, 192.168.2.0/24 are all added (192.168.x is my actual LAN/VLAN networks outside of k8s, so all internal container IPs and my "server" VLAN are whitelisted)
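To double-check that a given address really falls inside those ranges, a short sketch using Python's standard `ipaddress` module could look like this. The `is_allowed` helper is hypothetical, and the second address is made up purely to show a non-match.

```python
import ipaddress

# Ranges quoted in this thread as allowed without auth.
ALLOWED = ["10.0.0.0/8", "172.16.0.0/12", "192.168.2.0/24"]


def is_allowed(ip, networks=ALLOWED):
    """Return True if `ip` falls inside any of the given CIDR ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(net) for net in networks)


print(is_allowed("10.60.0.174"))   # address mentioned earlier in the thread
print(is_allowed("192.168.1.50"))  # hypothetical address outside the listed /24
```

The first call prints `True` and the second prints `False`, since 192.168.1.50 is not covered by 192.168.2.0/24 or either of the larger ranges.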

> Also if you have "Secure Connections" set to "Required" then you should set the environment variable "FORCE_HTTPS" to "1" or else Plex rejects connections coming from the workers.

Yes. It is using HTTPS.

> There are also considerations around the transcoding path, and having it shared across all workers and PMS.
> Could it also be a permissions issue around /tmp/transcode ?

I have the transcoder path shared between all of the nodes and they can write to it (it is creating folders).

The issue does not seem to be on Plex's side. There are no errors or really useful output in the Plex Console. The transcoder command just fails. Having the output of the transcoder command would probably be useful.

@pabloromeo
Owner

Ah, I was asking about the Secure Connections because in your screenshot the url used for progress was http and not https, and that would fail if Secure Connections was set to Required.
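As a rough illustration of that check, the sketch below flags a plain-http URL when secure connections are required. The `scheme_ok` helper and the example URLs are assumptions for illustration only. The host comes from this thread, 32400 is Plex's usual default port, and the real progress URL will look different.

```python
from urllib.parse import urlsplit


def scheme_ok(url, secure_required):
    """Return False when HTTPS is required but the URL uses another scheme."""
    scheme = urlsplit(url).scheme.lower()
    return scheme == "https" or not secure_required


# Hypothetical URLs: the host is from this thread, the port is assumed.
print(scheme_ok("http://10.60.0.174:32400/", secure_required=True))
print(scheme_ok("https://10.60.0.174:32400/", secure_required=True))
```

With Secure Connections set to Required, the first URL is the kind that would be rejected, which is the situation FORCE_HTTPS is meant to avoid.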
Have you enabled TRANSCODER_VERBOSE: "1"?
Also, Plex's Debug logging is sometimes useful in identifying where the process is failing.
image
It helps to watch that output in real time in Plex's Console while attempting to launch a transcode job.

@github-actions

This issue is stale because it has been open for 30 days with no activity.

@github-actions github-actions bot added the stale Issue has been inactive for more than 30 days label Aug 25, 2022
pabloromeo added a commit that referenced this issue Sep 3, 2022
Adding support for EAE.
The new default is for it to be enabled but if issues occur it can be turned off by setting the environment variable EAE_SUPPORT in the Workers to "false"

Solves #125 and #155