Tutorial (Indexers, newznab, API, *arr, etc.)
The following is meant as an introduction to some of the concepts you need to understand to properly use NZBHydra. Some of the chapters are kept intentionally short as the focus lies somewhere else.
Indexers scrape the usenet for nice stuff (mostly Movies, TV shows and porn, but also games, apps and music to a lesser degree). These are uploaded split in many parts and are often uploaded under names that a) do not make them easily discoverable and b) make it hard to find out what exactly they contain. So an indexer might find a couple of ZIP files named "ABCD123", which doesn't tell it whether they contain a movie or something else. It may actually download the ZIPs and look inside. But many uploads are actually named more or less like their content. It's up to the indexers to find these things and index them, hence the name. For each release they create an NZB file, which is a file containing all the information you need to download this release. You don't need to know anything about what this file looks like or how it's created. In the simplest form the indexer just saves this file and some basic information about the release (size, upload date, etc). This is what raw search engines like Binsearch do. You can search by the names of these releases and filter by size and age, but if the release is named weirdly you probably won't find it. And if you search for "lost" you may be, well, lost, because there's a lot of stuff out there with that in its name.
"Proper" indexers will try to find out for each release what it actually contains and try to assign it proper metadata, e.g. find the movie and save that information along in the database. If it finds an episode of "Lost" it will add it to its internal list of "Lost" episodes, along with season and episode number. That way you can go to their website and search for "Lost, Season 1, Episode 12" and find releases for exactly that. If you tried to do that with a raw search engine you'd need to enter "Lost s01e12" and would miss any releases that are named "Lost 1x12", for example.
Where do the indexers get all the information about movies and TV shows? From metadata providers like TheTVDB or https://www.imdb.com/. The indexers will not only save the metadata but also the ID for the movie or TV show. The TVDB ID for Lost is 73739.
Every indexer provides an API (application programming interface) endpoint, a certain URL that can be called to programmatically retrieve the indexer's releases and search them. This API is described by the [newznab spec](https://newznab.readthedocs.io/en/latest/misc/api/) but it should be noted that no indexer actually implements all the functions described there. The API is only meant to be called by programs like NZBHydra or Sonarr. Indexers usually limit access to the API to a certain number of hits per day (because each API request takes a little bit of processing power). Many indexers allow a couple of hits for free users and thousands for paying VIP users.
An API search URL might look like this: https://www.indexer.com/api?apikey=someapikey&t=search&q=whatever
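As a rough illustration, such a URL can be assembled from its parts with Python's standard library. The function name `build_search_url` is made up for this sketch, and the host and API key are the same placeholders as in the example URL above.

```python
from urllib.parse import urlencode


def build_search_url(base_url: str, api_key: str, **params: str) -> str:
    """Return a newznab-style API URL for the given search parameters."""
    # The API key goes first, followed by the parameters in the order given.
    query = {"apikey": api_key, **params}
    return f"{base_url.rstrip('/')}/api?{urlencode(query)}"


url = build_search_url("https://www.indexer.com", "someapikey", t="search", q="whatever")
print(url)  # https://www.indexer.com/api?apikey=someapikey&t=search&q=whatever
```

`urlencode` also takes care of escaping spaces and special characters in the query text.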
API searches (the API also allows downloading and some other stuff, but we'll ignore that) can be made using several search parameters which determine which results are returned.
Each release is assigned a category. These categories are predefined to a certain degree (some indexers invent new ones). Each category has a fixed number which identifies it in searches. The categories are split into main categories and subcategories. Main categories are, for example, "Movies" (2000), "TV" (5000) and "Audio" (3000). Each main category has several subcategories: "Movies" has "HD" (2040), "SD" (2030) and others; "TV" has "HD" (5040), "SD" (5030) and others. As you can see, a subcategory always starts with the same digit as its main category. If an API search is made with the parameter cat=2000 that means that only Movie results should be returned (so any results with a category that starts with "2"). In the same way, cat=2040 means that only HD Movie results should be returned. It's also possible to combine multiple categories: cat=2000,5000 will only return Movie and TV results, and cat=2010,2020 will only return foreign and "other" movies (whatever that is), i.e. movies that have either category assigned. Because cat=2000 already returns any result with a category that starts with "2", it doesn't make any sense to search for cat=2000,2010,2020,2030: that's the same as searching for cat=2000.
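The matching rule described above can be modelled in a few lines. This is only a sketch of the behaviour as described here, not how any particular indexer implements it; the function names are invented for the example.

```python
from __future__ import annotations


def matches_category(requested: int, release_category: int) -> bool:
    """A main category (e.g. 2000) matches all its subcategories;
    a subcategory (e.g. 2040) matches only itself."""
    if requested % 1000 == 0:
        return release_category // 1000 == requested // 1000
    return requested == release_category


def filter_by_cat(cat_param: str, release_categories: list[int]) -> list[int]:
    """Keep the release categories that match any category in a cat=... value."""
    requested = [int(c) for c in cat_param.split(",")]
    return [
        rc for rc in release_categories
        if any(matches_category(r, rc) for r in requested)
    ]


releases = [2030, 2040, 5040]
print(filter_by_cat("2000", releases))       # [2030, 2040]
print(filter_by_cat("2040", releases))       # [2040]
print(filter_by_cat("2000,5000", releases))  # [2030, 2040, 5040]
```

With this model, `cat=2000,2010,2020,2030` and `cat=2000` select exactly the same releases.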
API searches can be made using several functions. These determine how results are searched and what parameters can be added to the search. The search type is defined by the t parameter in the URL.
This is a search in its most basic form (t=search). You can provide a simple text based query using q=whatever which would limit returned results to those with "whatever" in their name. But even that parameter is optional. It's possible just to search t=search&cat=2000 to get a list of the latest movies. This is called an "update query" in NZBHydra because it doesn't search for anything in particular. That's the kind of query periodically made by Sonarr just to keep up-to-date.
It's important to understand that the search function with a query parameter (q=whatever) doesn't use any special logic. It just searches in the release title, nothing else. This involves all the downsides described above, i.e. you might miss releases if you use the wrong words or have too many false positives if your query is too generic.
So the indexer has already indexed all its releases and assigned metadata and knows exactly which TV show a release contains. Wouldn't it be nice to search for that exact TV show (or even episode)? That can be achieved by using the specific search types t=tvsearch (or t=movie). This allows providing a media ID (as described above) that specifies what you're looking for. t=tvsearch&tvdbid=73739 will search for all episodes of Lost, t=tvsearch&tvdbid=73739&cat=5040&season=1&ep=12 will search for all HD releases of Season 1, Episode 12 of Lost.
The same goes for movies, search for t=movie&imdbid=tt0076759 to only find Star Wars releases.
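To make the ID-based parameters concrete, here is a small sketch that builds the two kinds of query shown above. The helper names and the `BASE` URL are placeholders for this example; only the parameter names (t, tvdbid, imdbid, cat, season, ep) come from the text.

```python
from urllib.parse import urlencode

BASE = "https://www.indexer.com/api"  # placeholder indexer


def tv_search_url(api_key, tvdb_id, season=None, episode=None, categories=()):
    """Build a t=tvsearch URL; season, episode and categories are optional."""
    params = {"apikey": api_key, "t": "tvsearch", "tvdbid": tvdb_id}
    if categories:
        params["cat"] = ",".join(str(c) for c in categories)
    if season is not None:
        params["season"] = season
    if episode is not None:
        params["ep"] = episode
    # safe="," keeps the comma in cat=2000,5000 readable instead of %2C.
    return f"{BASE}?{urlencode(params, safe=',')}"


def movie_search_url(api_key, imdb_id):
    """Build a t=movie URL for a given IMDB ID."""
    return f"{BASE}?{urlencode({'apikey': api_key, 't': 'movie', 'imdbid': imdb_id})}"


print(tv_search_url("someapikey", 73739, season=1, episode=12, categories=[5040]))
print(movie_search_url("someapikey", "tt0076759"))
```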
There are several media ID types and not all of them are supported by all indexers:
- IMDB ID (for movies). Nearly every indexer supports this.
- TheTVDB ID (for TV shows). Nearly every indexer supports this.
- TVmaze (for TV shows). Many indexers support this.
- The Movie Database. Many indexers support this.
- IMDB ID (for TV shows). Few indexers support this.
- TVRage. This was a TV show metadata provider that's been offline for a while. Still supported by many indexers for older TV shows.
It's also possible that a search type is supported but no ID. That means that you can search specifically for movies or TV shows but only using plain text queries.
Same as for TV shows and movies the spec also defines searches for music and books using specific search parameters (like author & title or artist & album). Some indexers support these. I can't say how good the results are.
NZBHydra can be used two ways:
- As an "artificial" indexer that you plug into Sonarr.
- As a GUI to manually search all your indexers in one place.
Either way you have to add all your indexers to NZBHydra. An automatic "caps check" will determine which of the described search types (SEARCH, TVSHOW, MOVIE, AUDIO, BOOK) and which of the search IDs (TVDB, IMDB, etc.) are supported by the indexer. This is done via "brute force". NZBHydra will execute a search for each of the types and IDs and check if at least 90% of the returned results match the search. That way we can be sure that the type/ID is actually supported. If the caps check does not find a certain search type to be supported, that does not mean that the indexer won't return any results in this area. So an indexer that doesn't support AUDIO will certainly return audio releases, you just can't search the indexer by artist and album or such.
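The 90% rule can be sketched like this. How NZBHydra actually decides whether a single result "matches" is not described here, so the simple title check below is purely an assumption for illustration, as is the function name.

```python
from __future__ import annotations


def looks_supported(returned_titles: list[str], expected_title: str,
                    threshold: float = 0.9) -> bool:
    """Treat a search type/ID as supported if enough results match.

    Assumption: a result "matches" when the expected title appears
    in the release title (case-insensitive).
    """
    if not returned_titles:
        return False
    hits = sum(1 for title in returned_titles
               if expected_title.lower() in title.lower())
    return hits / len(returned_titles) >= threshold


good = ["Lost.S01E12.720p"] * 9 + ["Something.Else.2019"]
bad = ["Lost.S01E12.720p"] * 5 + ["Something.Else.2019"] * 5
print(looks_supported(good, "lost"))  # True  (9 of 10 match)
print(looks_supported(bad, "lost"))   # False (5 of 10 match)
```

An indexer that ignores an unsupported ID typically returns unrelated results, which is why a high match ratio is a reasonable signal.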
So let's say you added NZBHydra to Sonarr. Sonarr makes an update query every 30 minutes or so. NZBHydra queries all its configured indexers, aggregates the results, removes duplicates, filters out some results (if you configured any filters in the config) and returns the list to Sonarr, which may ask for another batch of results. That's the "update query" described above. Now let's say you're missing a certain episode. You can trigger this search manually but Sonarr will also execute a backlog search now and then. It will call NZBHydra searching for this particular show and episode, e.g. t=tvsearch&tvdbid=73739&season=1&ep=12. NZBHydra will search all indexers which support this search type and ID and will skip any which don't.
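The selection step for such an ID-based search might look roughly like the following. The `Indexer` class and the indexer names are invented for this sketch; the type and ID labels are the ones used by the caps check.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Indexer:
    name: str
    search_types: set[str] = field(default_factory=set)  # e.g. {"SEARCH", "TVSHOW"}
    search_ids: set[str] = field(default_factory=set)    # e.g. {"TVDB", "IMDB"}


def pick_indexers(indexers: list[Indexer], search_type: str, id_type: str) -> list[Indexer]:
    """Keep only indexers that support both the search type and the ID type."""
    return [
        i for i in indexers
        if search_type in i.search_types and id_type in i.search_ids
    ]


indexers = [
    Indexer("full", {"SEARCH", "TVSHOW", "MOVIE"}, {"TVDB", "IMDB"}),
    Indexer("movies-only", {"SEARCH", "MOVIE"}, {"IMDB"}),
    Indexer("raw", {"SEARCH"}),
]
print([i.name for i in pick_indexers(indexers, "TVSHOW", "TVDB")])  # ['full']
```

In this example two of the three indexers would be left out of the episode search entirely.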
To "fix" this you can enable query generation. That means NZBHydra will convert the ID into a title and, if needed, add season and episode to a query and make a text based query using the SEARCH function. In the example above the indexer will be queried using Lost s01e12. It's also possible to enable this only as a fallback which means that an indexer supporting the search type and ID will be searched using these and, if it doesn't return any results, NZBHydra will then execute a search using the generated query.
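A minimal sketch of query generation and the fallback mode, under stated assumptions: the title lookup is a hard-coded dictionary standing in for a real metadata lookup, and the two search callables are hypothetical hooks, not NZBHydra functions.

```python
from __future__ import annotations

from typing import Callable

# Hypothetical stand-in for a metadata lookup (TVDB ID -> show title).
TITLES_BY_TVDB_ID = {73739: "Lost"}


def generate_query(tvdb_id: int, season: int | None = None,
                   episode: int | None = None) -> str:
    """Turn an ID (plus optional season/episode) into a plain text query."""
    query = TITLES_BY_TVDB_ID[tvdb_id]
    if season is not None:
        query += f" s{season:02d}"
        if episode is not None:
            query += f"e{episode:02d}"
    return query


def search_with_fallback(id_search: Callable[[], list[str]],
                         text_search: Callable[[str], list[str]],
                         query: str) -> list[str]:
    """Try the ID-based search first; use the generated query only if it finds nothing."""
    results = id_search()
    return results if results else text_search(query)


print(generate_query(73739, season=1, episode=12))  # Lost s01e12
```

The generated query has the same weaknesses as any text search: a release named "Lost 1x12" would still be missed.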
It's almost never a good idea to manually change the search types and IDs that NZBHydra determined to be supported by an indexer. If you remove any you will get fewer results and if you add any that aren't actually supported you will get errors.
Most torrent trackers work completely differently from indexers. They don't index stuff; every torrent is usually manually uploaded by somebody, but it may also be scraped from other indexers. They rarely have any of the metadata the indexers have, so they don't know what TV show a certain torrent is for. There are private trackers that do stuff usenet users can only dream of, but that's another story.
Torrent trackers also usually don't have an API to be searched. To fix that torznab was invented, which is basically a slightly modified newznab format to translate tracker searches to a format that can be programmatically read. The most popular program to provide this API access is Jackett. You can configure trackers there and they can then be called by NZBHydra or Sonarr. Jackett will execute the search against the trackers, translate the results and return a torznab result. Jackett is basically for torrent trackers what NZBHydra is for usenet indexers.
Due to their nature these trackers often don't support any search types, or perhaps only one, and rarely any IDs. NZBHydra can read the Jackett config and automatically add all its configured trackers. In this case the supported search types and IDs are pulled from the config and not determined by brute force.
By now you hopefully understand what NZBHydra does and how it works but perhaps not why you should (or perhaps shouldn't) use it. This part does not touch NZBHydra as a manual search tool. There's no downside there. This part rather discusses the pros and cons of adding NZBHydra as an indexer to your programs.
Pros:
- You refer all your programs (Sonarr, Radarr, Lidarr, LL, Mylar, etc.) to NZBHydra. You enter your indexers once in NZBHydra. Whenever you have a new indexer you only need to configure it once.
- You can configure all your Jackett trackers automatically instead of having to add each one manually.
- You have vastly more options to (pre)filter the results by filtering out results with or without certain words or regexes, usenet poster, usenet group, etc.
- You get fancy stats and a unified download and search history.
- You get finer control over the access of the indexers: You can do load balancing, account for the API limit, use indexers only for update queries or only specific search queries or only certain categories.
- You get query generation and conversion between IDs (i.e. if a program provides a TMDB ID and the indexer only supports IMDB then NZBHydra will convert it).
Cons:
- You add NZBHydra as a single point of failure. If it crashes or has a bug all the other programs don't work properly.
- NZBHydra aggregates all your search results and returns the newest 100. *arr will use paging to get up to 1000 results and then stop. All your results must fit into this limit of 1000 results whereas, when you add each indexer individually to *arr, the limit applies only to that indexer's results. That means with NZBHydra you may miss results because they're outside this limit of 1000 results. I personally don't think that's ever a problem but it's possible and gets more probable with every indexer (and especially tracker) you add.
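The effect of the 1000-result limit can be shown with a toy simulation. The numbers (three indexers with 500 results each) are made up, and `page_through` only models the paging behaviour as described above, not the real *arr code.

```python
def page_through(results, page_size=100, max_results=1000):
    """Fetch results page by page and stop once max_results have been requested."""
    collected = []
    for offset in range(0, max_results, page_size):
        page = results[offset:offset + page_size]
        if not page:
            break
        collected.extend(page)
    return collected


# Three made-up indexers with 500 results each; a result is (indexer, age_in_hours).
indexers = {name: [(name, age) for age in range(500)] for name in ("a", "b", "c")}

# Via NZBHydra: one merged list, newest first, and one shared limit.
merged = sorted((r for rs in indexers.values() for r in rs), key=lambda r: r[1])
via_hydra = page_through(merged)

# Added individually: the limit applies to each indexer separately.
direct = [r for rs in indexers.values() for r in page_through(rs)]

print(len(via_hydra), len(direct))  # 1000 1500
```

In this setup the oldest third of the results never reaches the calling program when everything goes through one aggregated list.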