Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add invidious companion support #4985

Draft
wants to merge 16 commits into
base: master
Choose a base branch
from

Conversation

unixfox
Copy link
Member

@unixfox unixfox commented Oct 7, 2024

Description

Invidious companion is the new tool created for the retrieval of the YouTube streams: https://github.com/iv-org/invidious-companion

Invidious will not handle the videos streams retrieval anymore, as it has become as burner for the Invidious team to adapt. Instead, invidious companion will be based on https://github.com/LuanRT/YouTube.js which is the most up to date when it comes to video streams retrieval.

This allows us to spend more time on actually improving the Invidious frontend.

What does this PR do?

Invidious send the /player request to Invidious companion in HTTP(S). Invidious companion does all of its magic then Invidious does the usual parsing of the player endpoint.

Invidious also delegate the work of latest_version, /api/manifest/dash/id and /videoplayback to Invidious companion.

What does invidious companion do

  • You can set multiple invidious_companion. Allowing you to utilize multiple external servers
  • It supports SOCKS proxy - related to Allow to use proxies (HTTP(s) & socks) #301
  • There is a plan to support multiple proxies: handle the ability to use multiple proxies invidious-companion#5
  • There is no need to use inv_sig_helper as the program automatically handle the deciphering.
  • There is no need for youtube-trusted-session-generator as the program automatically generate a po_token at a configurable frequency.
  • You can logging with a yt account

Incompatibilities

  • You can't use inv_sig_helper with invidious companion

How to try?

  1. Run Invidious companion with a secret key: https://github.com/iv-org/invidious-companion?tab=readme-ov-file#run-locally
  2. Configure Invidious companion in the config.yaml and with the same secret key:
invidious_companion:
  - http://127.0.0.1:8282
invidious_companion_key: myBeautifulKey
  1. Run Invidious

Fixes

Related to

@unixfox unixfox force-pushed the invidious-companion branch 2 times, most recently from 7efa8f7 to 194fb72 Compare October 20, 2024 00:11
@unixfox unixfox changed the title add invidious_companion option Add invidious companion support Nov 1, 2024
@unixfox unixfox requested a review from syeopite November 1, 2024 21:19
@unixfox unixfox marked this pull request as ready for review November 1, 2024 21:19
@unixfox unixfox requested review from SamantazFox and a team as code owners November 1, 2024 21:19
src/invidious/yt_backend/youtube_api.cr Outdated Show resolved Hide resolved
src/invidious/yt_backend/youtube_api.cr Outdated Show resolved Hide resolved
src/invidious/yt_backend/youtube_api.cr Outdated Show resolved Hide resolved
src/invidious/yt_backend/youtube_api.cr Outdated Show resolved Hide resolved
src/invidious/yt_backend/youtube_api.cr Outdated Show resolved Hide resolved
src/invidious/yt_backend/youtube_api.cr Outdated Show resolved Hide resolved
src/invidious/videos/parser.cr Outdated Show resolved Hide resolved
src/invidious/videos.cr Outdated Show resolved Hide resolved
src/invidious/routes/watch.cr Outdated Show resolved Hide resolved
Comment on lines 23 to 25
if local && CONFIG.invidious_companion
return env.redirect "#{video.invidious_companion["baseUrl"].as_s}#{env.request.path}?#{env.request.query}"
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As the get_video() call above already talks to the companion and gets playable URLs, why still redirect to the companion manifest handler? Why not let invidious generate the manifest?

PS: I realized that there is an important oversight here: what if the video object (e.g returned from cache) doesn't have a companion field, or the companion base URL is not present anymore in the config?

PPS: Why handling that case at all when the URL is already replaced in components/player.ecr?

Copy link
Member Author

@unixfox unixfox Nov 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are multiple reasons:

In the design plan, I said that everything related to the video retrieval should be handled now in invidious companion. It doesn't make a lot of sense to handle all the logic explained above in invidious.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One may configure a proxy in invidious_companion. And invidious companion is going to handle multiple proxies soon, so this can't be handled inside Invidious

Unless this is done transparently to the end user (e.g with NGINX rules), the CSP header will need to reflect the use of these external proxies, and so Invidious will need to take care of it.

DASH manifest generation from youtube.js include way more things than invidious. which includes multi-language audio support

But is this compatible with videoJS? Our DASH manifest is missing multiple formats because of that.

If an external program relies on the dash api, it makes more sense to redirect this request to Invidious companion because it avoids making unnecessary process in invidious as Invidious companion is the one requesting the video streams from YouTube.

Invidious already requests and parses the whole /player response returned by the companion (if it wasn't already in cache) so I don't think this will represent much more processing.

One may configure multiple invidious companion, we need to make sure that the video stream used is from the same public IP address that generated the video stream.

For an API user, all URLs should already have that companion as the host, and for a web user, all the relevant URLs should already be handled in components/player.ecr. Am I right?

If so, the only way to access invidiou's own /latest_version and manifest endpoints would be willingly by a malicious or careless API user, so broken streams are a non-problem here imo.

Copy link
Member Author

@unixfox unixfox Nov 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First of all, I would like to say that I have been in your thinking situation. I had long thoughts about the advantages and the disadvantages of each implementation. This current implementation outweighs the one where Invidious handle almost all the logic and Invidious companion is just "dumb proxy".

Also, invidious companion is not entirely new, it's actually based on the ideas of invidious-stripped-down. A project that I have used for 2 years on yewtu.be in order to replace some functions of Invidious. For really improving the performance and user experience of my instance.

Unless this is done transparently to the end user (e.g with NGINX rules), the CSP header will need to reflect the use of these external proxies, and so Invidious will need to take care of it.

Already does in https://github.com/iv-org/invidious/pull/4985/files#diff-a86ced8aa99e3403588538decc008851d3d33dfadbb00d392d5ec8b4696f7df4

But is this compatible with videoJS? Our DASH manifest is missing multiple formats because of that.

Yes. I have used it for 2 years on yewtu.be. I'm stripping out the codec that do not work: https://github.com/iv-org/invidious-companion/blob/master/src/routes/invidious_routes/dashManifest.ts#L41-L43

If an external program relies on the dash api, it makes more sense to redirect this request to Invidious companion because it avoids making unnecessary process in invidious as Invidious companion is the one requesting the video streams from YouTube.

Invidious already requests and parses the whole /player response returned by the companion (if it wasn't already in cache) so I don't think this will represent much more processing.

The benefit of handling latest_version and /api/manifest/dash/id/ is that you can more easily handle all the edge cases in invidious companion itself, no need to add more logic in Invidious. Especially in a time when YouTube is evolving rapidly, we want to keep the pace thanks to youtube.js.

For an API user, all URLs should already have that companion as the host, and for a web user, all the relevant URLs should already be handled in components/player.ecr. Am I right?
If so, the only way to access invidiou's own /latest_version and manifest endpoints would be willingly by a malicious or careless API user, so broken streams are a non-problem here imo.

My number one priority is users using Invidious directly. We can think about API users later. Our main user base uses our frontend.

It doesn't mean that developers using the API are forever forgotten. I have ideas but for the moment, if invidious companion is configured then the video streams given through the API may not always work, especially when "proxying" through the same public IP address is needed.

I'm sorry, but a day is only 24 hours and at the moment I'm a single developer trying to improve the overall usability of Invidious through youtube.js. I can't deal with all the cases at the same time. Happy to receive more help for contribution!

Copy link
Member

@SamantazFox SamantazFox Nov 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Already does in https://github.com/iv-org/invidious/pull/4985/files#diff-a86ced8aa99e3403588538decc008851d3d33dfadbb00d392d5ec8b4696f7df4

The problem is that it add the companion's domains to the CSP, but not the companion's proxies themselves.

The benefit of handling latest_version and /api/manifest/dash/id/ is that you can more easily handle all the edge cases in invidious companion itself [...]
[...]
My number one priority is users using Invidious directly. We can think about API users later. Our main user base uses our frontend.

My question was about the necessity of adding this redirect logic on these endpoints, where they should never be called in the first place, as none of the URLs in the /watch page should lead to invidious in the first place!

Copy link
Member Author

@unixfox unixfox Nov 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Already does in #4985 (files)

The problem is that it add the companion's domains to the CSP, but not the companion's proxies themselves.

This adds in the CSP the baseUrl: https://github.com/iv-org/invidious-companion/blob/master/config/default.toml#L5 which is given by invidious companion in the JSON reply. This domain is the same domain that which videojs will use for either latest_version or dash manifest api requests or videoplayback request.

We are not adding the URL passing in here: https://github.com/iv-org/invidious/pull/4985/files#diff-b68cf7cb2cb2dbd2275444b03cdc238962e05b5ccff0b89ba34b1f9126f39187R71

 invidious_companion:
 - http://127.0.0.1:8282

My question was about the necessity of adding this redirect logic on these endpoints, where they should never be called in the first place, as none of the URLs in the /watch page should lead to invidious in the first place!

You always have stuff that use these endpoints directly without going through the frontend. It's to make sure that we are using invidious companion all the time.

  • download function/button in invidious
  • bots
  • apps that use latest_version or dash manifest api instead of the official api of invidious

It's to make sure that we avoid sending wrong requests to youtube servers and the IP get banned for that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This adds in the CSP the baseUrl [...] which is given by invidious companion in the JSON reply

I got that ^^ But if the companion has say proxy1.example.com and proxy2.example.com set, those won't be added to the CSP. Unless this is completely transparent to invidious (= the companion forwards the request to the proxies itself, and invidious only ever connects to http://127.0.0.1:8282)

It's to make sure that we avoid sending wrong requests to youtube servers and the IP get banned for that.

Fair enough!

Copy link
Member Author

@unixfox unixfox Nov 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have rewritten some part. Let me explain.

If you ENABLE invidious_companion then by default in Invidious:

  • /api/manifest/dash/id/ is blocked
  • /latest_version is blocked

This makes sense, as those endpoints are not used in Invidious if Invidious companion is enabled. This is to avoid any tries to load some video stream through the incorrect IP and making the requests suspicious to YouTube servers.

But for the "downloads" feature, all requests will now be straight sent to latest_version endpoint of invidious companion.

I know this does not yet download the video file directly but the user can do it by clicking on the download feature in their browser and this works. It's being tracked in iv-org/invidious-companion#13

In my opinion this is not a dealbreaker as the Invidious download function is somewhat not useful anymore since the removal of hd720. Maybe in the future we could look into adding ffmpeg combining video and audio on the fly, but right now I'm prioritizing giving back more stability to Invidious.

src/invidious/routes/video_playback.cr Outdated Show resolved Hide resolved
src/invidious/yt_backend/youtube_api.cr Outdated Show resolved Hide resolved
src/invidious/views/components/player.ecr Outdated Show resolved Hide resolved
@unixfox
Copy link
Member Author

unixfox commented Nov 5, 2024

Feedback from @Fijxu:
Try to handle busted or IP blocked invidious companions.

src/invidious/videos/parser.cr Outdated Show resolved Hide resolved
Comment on lines 23 to 25
if local && CONFIG.invidious_companion
return env.redirect "#{video.invidious_companion["baseUrl"].as_s}#{env.request.path}?#{env.request.query}"
end
Copy link
Member

@SamantazFox SamantazFox Nov 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Already does in https://github.com/iv-org/invidious/pull/4985/files#diff-a86ced8aa99e3403588538decc008851d3d33dfadbb00d392d5ec8b4696f7df4

The problem is that it add the companion's domains to the CSP, but not the companion's proxies themselves.

The benefit of handling latest_version and /api/manifest/dash/id/ is that you can more easily handle all the edge cases in invidious companion itself [...]
[...]
My number one priority is users using Invidious directly. We can think about API users later. Our main user base uses our frontend.

My question was about the necessity of adding this redirect logic on these endpoints, where they should never be called in the first place, as none of the URLs in the /watch page should lead to invidious in the first place!

@unixfox unixfox marked this pull request as draft November 8, 2024 22:44
Comment on lines 330 to +331
elsif itag = download_widget["itag"]?.try &.as_i
itag = itag.to_s
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
elsif itag = download_widget["itag"]?.try &.as_i
itag = itag.to_s
elsif itag = download_widget["itag"]?.try &.as_i.to_s

Comment on lines +204 to +209
if (!CONFIG.invidious_companion.empty?)
env.response.headers["Content-Security-Policy"] =
env.response.headers["Content-Security-Policy"]
.gsub("media-src", "media-src " + video.invidious_companion.not_nil!["baseUrl"].as_s)
.gsub("connect-src", "connect-src " + video.invidious_companion.not_nil!["baseUrl"].as_s)
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Proposed improvements:

  1. Base the if condition on the presence of the companion base URL, rather than the current config (reduces the risk of breaking videoplayback on cached videos)
  2. Use string interpolation (faster)
  3. Keep usage of .not_nil! to a minimum, as recommended by Ameba
Suggested change
if (!CONFIG.invidious_companion.empty?)
env.response.headers["Content-Security-Policy"] =
env.response.headers["Content-Security-Policy"]
.gsub("media-src", "media-src " + video.invidious_companion.not_nil!["baseUrl"].as_s)
.gsub("connect-src", "connect-src " + video.invidious_companion.not_nil!["baseUrl"].as_s)
end
if companion_base_url = video.invidious_companion.try &.["baseUrl"].as_s
env.response.headers["Content-Security-Policy"] =
env.response.headers["Content-Security-Policy"]
.gsub("media-src", "media-src #{companion_base_url}")
.gsub("connect-src", "connect-src #{companion_base_url}"
end

Copy link
Member Author

@unixfox unixfox Nov 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do you do without .not_nil! though? I had a hard time understand how. Crystal makes my brain hurt.

the solution in #4985 (comment) will probably help

Comment on lines +193 to +198
if (!CONFIG.invidious_companion.empty?)
env.response.headers["Content-Security-Policy"] =
env.response.headers["Content-Security-Policy"]
.gsub("media-src", "media-src " + video.invidious_companion.not_nil!["baseUrl"].as_s)
.gsub("connect-src", "connect-src " + video.invidious_companion.not_nil!["baseUrl"].as_s)
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same suggestion as above:

Suggested change
if (!CONFIG.invidious_companion.empty?)
env.response.headers["Content-Security-Policy"] =
env.response.headers["Content-Security-Policy"]
.gsub("media-src", "media-src " + video.invidious_companion.not_nil!["baseUrl"].as_s)
.gsub("connect-src", "connect-src " + video.invidious_companion.not_nil!["baseUrl"].as_s)
end
if companion_base_url = video.invidious_companion.try &.["baseUrl"].as_s
env.response.headers["Content-Security-Policy"] =
env.response.headers["Content-Security-Policy"]
.gsub("media-src", "media-src #{companion_base_url}")
.gsub("connect-src", "connect-src #{companion_base_url}"
end

return Invidious::Routes::VideoPlayback.latest_version(env)
if (!CONFIG.invidious_companion.empty?)
video = get_video(video_id)
return env.redirect "#{video.invidious_companion.not_nil!["baseUrl"].as_s}/latest_version?id=#{video_id}&itag=#{itag}&local=true"
Copy link
Member

@SamantazFox SamantazFox Nov 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

env.params.query is a regular HTTP::Params, so by doing a string interpolation, its .to_s method will be called, allowing to pass all parameters in one shot, with the proper encoding:

Suggested change
return env.redirect "#{video.invidious_companion.not_nil!["baseUrl"].as_s}/latest_version?id=#{video_id}&itag=#{itag}&local=true"
return env.redirect "#{video.invidious_companion.not_nil!["baseUrl"].as_s}/latest_version?#{env.params.query}"

With this, the suggestion right above can be ignored, and itag line above can be reverted.

On a side note, this implies that the invidious compacnion will take care of file downloads. Is that intended?
Nevermind, I saw #4985 (comment)

Comment on lines +195 to +197
def invidious_companion : Hash(String, JSON::Any)?
info["invidiousCompanion"]?.try &.as_h
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realized that the following might be better, to avoid using .not_nil! everywhere, and instead rely on the Hash's []? method:

Suggested change
def invidious_companion : Hash(String, JSON::Any)?
info["invidiousCompanion"]?.try &.as_h
end
def invidious_companion : Hash(String, JSON::Any)
info["invidiousCompanion"]?.try &.as_h || {} of String => JSON::Any
end

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ha yes crystal will be happier :).

Comment on lines +4 to +6
if !CONFIG.invidious_companion.empty?
return error_template(403, "This endpoint is not permitted because it is handled by Invidious companion.")
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think here to redirect to a random invidious companion, rather than display an error message? Sure, if there are multiple companions, we might not hit the same that initially loaded the watch page, but it might make playback smoother.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue is that you have to do a request to a random invidious companion for knowing its "public" URL. I had ideas to include the public URL inside the config.yml. But my poor Crystal knowledge limited me due to the new URIArrayConverter.

Something like:

invidious_companion:
 - http://127.0.0.1:8282:
      public_url: https://companion1.invidious.com

I want to avoid doing unnecessary requests to invidious companion.

http://127.0.0.1:8282 is the internal address that invidious uses to communicate with invidious companion. but the companion could very well be on another server, and so having a config like this can exist:

invidious_companion:
 - http://10.0.0.2:8282:
      public_url: https://companion1.invidious.com

10.0.0.2 is another server and 10.0.0.0/24 is an internal network faster than the internet network. example: https://www.ovhcloud.com/en/public-cloud/private-network/

) : Hash(String, JSON::Any)
headers = HTTP::Headers{
"Content-Type" => "application/json; charset=UTF-8",
"Authorization" => "Bearer " + CONFIG.invidious_companion_key,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"Authorization" => "Bearer " + CONFIG.invidious_companion_key,
"Authorization" => "Bearer #{CONFIG.invidious_companion_key}",

@@ -500,7 +500,11 @@ module YoutubeAPI
data["params"] = params
end

return self._post_json("/youtubei/v1/player", data, client_config)
if !CONFIG.invidious_companion.empty?
return self._post_invidious_companion("/youtubei/v1/player", data)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just as a side note here: because we've a dedicated function to call the companion, we're not forced to use the same URL path as youtube! The path can even be hardcoded (= one less argument required), as we're sure it will never change.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but it's possible that we could send more path than /youtubei/v1/player. we do not know the future.

/youtubei/v1/player is because I want invidious companion to be compatible with other projects: iv-org/invidious-companion#4

Comment on lines +701 to +704
if (response.status_code != 200)
raise Exception.new("Error while communicating with Invidious companion: \
status code: " + response.status_code.to_s + " and body: " + body)
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using string interpolation + .dump on the body, as it 1. quotes the data and 2. prevents control and non-ASCII characters from breaking the terminal/log file:

Suggested change
if (response.status_code != 200)
raise Exception.new("Error while communicating with Invidious companion: \
status code: " + response.status_code.to_s + " and body: " + body)
end
if (response.status_code != 200)
raise Exception.new(
"Error while communicating with Invidious companion: \
status code: #{response.status_code} and body: #{body.dump}"
)
end

Comment on lines +709 to +711
if body.nil?
raise InfoException.new("Error while communicating with Invidious companion: no response data.")
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not required, as body is always String, because

  1. You're using response.body and not response.body?
  2. You're already catching any potential errors above!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment