Adding NFOSceneParser
Gykes committed Nov 10, 2024
1 parent 1345bde commit 23c78d5
Showing 9 changed files with 1,570 additions and 0 deletions.
200 changes: 200 additions & 0 deletions plugins/nfoSceneParser/README.md
@@ -0,0 +1,200 @@
# nfoSceneParser
Automatically and transparently populates your scene data (during scan) based on either:
- NFO files
- patterns in your file names, configured through regex.

Ideal for the initial load of a large set of new files (or even a whole library), so you do not have to start from scratch in stash! *...provided you have nfo files and/or consistent patterns in your file names, of course...*

# Installation

- If you have not done it yet, install the required python module: `pip install requests` (or `pip3 install requests` depending on your python setup). Note: if you are running stash as a Docker container, this is not needed as it is already installed.
- Download the whole folder `nfoSceneParser`
- Place it in your `plugins` folder (where the `config.yml` is)
- Reload plugins (`Settings > Plugins > Reload`)
- `nfoSceneParser` appears in the plugin list
- Scan some new files...

The plug-in is automatically triggered on each new scene creation (typically during scan)

# Usage

Imports scene details from nfo files or from regex patterns.

Every time a new scene is created, it will:
- look for a matching NFO file and parse it into the scene data (studio, performers, date, name,...)
- if no NFO is found, it uses a regular expression (regex) to parse your file name for patterns. This fallback works only if you have consistent and identifiable patterns in (some of) your file names. Read carefully below how to configure the regex to match your file name pattern(s).
- If none of the above is found: it will do nothing ;-)
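
The fallback order above can be pictured as a chain of parsers tried in turn. This is only a minimal sketch: `parse_scene`, `nfo_parser` and `regex_parser` are hypothetical names used for illustration, not the plugin's actual functions.

```python
from typing import Callable, List, Optional

# A parser takes a video path and returns scene data, or None if nothing matched.
Parser = Callable[[str], Optional[dict]]


def parse_scene(path: str, parsers: List[Parser]) -> Optional[dict]:
    """Return the data of the first parser that finds something, else None."""
    for parse in parsers:
        data = parse(path)
        if data:
            return data
    return None  # nothing found: the scene is left untouched


# Hypothetical stand-ins: pretend no NFO exists, but the file name matches a regex.
def nfo_parser(path: str) -> Optional[dict]:
    return None


def regex_parser(path: str) -> Optional[dict]:
    return {"title": "Scene title"}


print(parse_scene("BestSceneEver.mp4", [nfo_parser, regex_parser]))
```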

NFO files comply with KODI's 'Movie Template' specification (https://kodi.wiki/view/NFO_files/Movies). Note: although initially created by KODI, this NFO structure has become a de-facto standard among video management software and is used today far beyond its KODI roots to store video files' metadata.

Regex patterns follow Python's regular expression syntax. A good tool to write/test regex is: https://regex101.com/

## config.py

nfoSceneParser works without any config edits. If you want more control, have a look at `config.py`, where you can change some default behavior.

## Reload task

nfoSceneParser typically processes everything during scan. If you want to reload the nfo/regex at a later time, you can execute a "reload" task.

It works in three steps: configure, select & run:
- Configure: edit `reload_tag` in the plugin's `config.py` file. Set the name to an existing tag in your stash. It is used as the "marker" tag by the plugin to identify which scenes to reload.
- Select: add the configured tag to your scenes to "mark" them.
- Run: execute the "reload" task: stash's settings -> "Tasks" -> Scroll down to "plugin tasks" / nfoSceneParser (at the bottom) -> "Reload tagged scenes" button

A reload essentially merges the new file data with the existing scene data, giving priority to the nfo/regex content. More specifically:
- For single-value fields, overrides what is already set if new content is found in the nfo/regex
- For single-value fields, keeps what is already set if nothing is found
- For multi-value fields, adds to existing values.
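
The three merge rules above can be sketched as follows. This is an illustration under assumptions: `merge_scene` and the choice of `performers` and `tags` as the multi-value fields are hypothetical and not taken from the plugin's code.

```python
def merge_scene(existing: dict, parsed: dict, multi_value=("performers", "tags")) -> dict:
    """Merge parsed nfo/regex data into existing scene data."""
    merged = dict(existing)
    for key, value in parsed.items():
        if key in multi_value:
            # Multi-value field: add new values to the existing ones, no duplicates
            current = list(merged.get(key, []))
            current += [v for v in (value or []) if v not in current]
            merged[key] = current
        elif value:
            # Single-value field: override only when new content was found
            merged[key] = value
    return merged


existing = {"title": "Old title", "studio": "Some studio", "tags": ["a"]}
parsed = {"title": "New title", "studio": None, "tags": ["a", "b"]}
result = merge_scene(existing, parsed)
print(result)
```

Here the title is overridden, the studio is kept because nothing new was found, and the tags are extended.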

Note: The marker tag is removed from the reloaded scenes (unless it is present in the nfo or regex) => no need to remove it manually...

# NFO files organization

## Scene NFO

The plugin automatically looks for .nfo files (and optionally thumbnail images) in the same directory and with the same filename as your video file (for instance for a `BestSceneEver.mp4` video, it will look for a corresponding `BestSceneEver.nfo` file). Through config, you can specify an alternate location for your NFO files.

## Folder NFO

If a "folder.nfo" file is present, it will be loaded and applied as default for all scene files within the same folder. A scene specific nfo will override the default from the folder.nfo.

So if you have a folder.nfo with a studio or a performer, they will automatically be applied to all scenes in the folder, even if there is no specific nfo for each scene file.
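
One way to picture this layering is a lookup that checks the scene nfo first and the folder.nfo second. The sketch below uses Python's `collections.ChainMap` with made-up dictionaries; it illustrates the idea only and is not how the plugin stores its data.

```python
from collections import ChainMap

# Hypothetical parsed content of folder.nfo and of a scene-specific nfo
folder_nfo = {"studio": "Best studio ever", "performers": ["first1 last1"]}
scene_nfo = {"title": "BestSceneEver", "studio": "Another studio"}

# Lookups check scene_nfo first, then fall back to folder_nfo
scene = ChainMap(scene_nfo, folder_nfo)

print(scene["studio"])      # taken from the scene nfo (overrides the folder default)
print(scene["performers"])  # taken from folder.nfo (no scene-specific value)
```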

folder.nfo files are also used to create movies. See below for details on movie support.

## Image support

Thumbnail images are supported either from `<thumb>` tags in the NFO itself (link to an image URL) or from the local disk (following KODI's naming convention for movie artwork). The plug-in will use the first image it finds among:
- A local image with the `-landscape` or `-poster` or no suffix (example: `BestSceneEver-landscape.jpg` or `BestSceneEver.jpg`). If you have movie info in your nfo, two images will be loaded for front & back posters (example: `folder-poster.jpg` and `folder-poster1.jpg`)
- A download from the `<thumb>` tag's URL (if there are multiple `<thumb>` tags in the nfo, the one with the "landscape" aspect has priority over "poster").
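
As a rough illustration of the local naming convention, the sketch below lists the file names that would be looked up next to a video. The helper `local_image_candidates`, the order among the three suffixes and the `.jpg`-only extension list are assumptions made for this example.

```python
from pathlib import Path


def local_image_candidates(video_path, suffixes=("-landscape", "-poster", ""),
                           extensions=(".jpg",)):
    """List candidate image paths next to the video, in lookup order (assumed)."""
    video = Path(video_path)
    return [
        video.with_name(video.stem + suffix + ext)
        for suffix in suffixes
        for ext in extensions
    ]


names = [p.name for p in local_image_candidates("library/BestSceneEver.mp4")]
print(names)
```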

## Movie support

Movies are automatically found and created in stash from the nfo files. The plugin supports two different alternatives:
- folder.nfo, if present, contains data valid for all scene files in the same directory. That is the very definition of a movie. The `<title>` tag designates the movie name, with all other relevant tags used to create the movie with all its details (`<date>`, `<studio>`, `<director>`, front/back image from `<thumb>`)
- Inside the scene nfo, through the `<set>` tag that designates the group/set to which multiple scenes belong.

example for `folder.nfo`:
```xml
<movie>
<title>My Movie Title</title>
<plot>You have to see it to believe it...</plot>
<thumb aspect="poster">https://front_cover.jpg</thumb>
<thumb aspect="poster">https://back_cover.jpg</thumb>
<studio>Best studio ever</studio>
<director>Georges Lucas</director>
</movie>
```

example for `BestSceneEver.nfo`:

```xml
<movie>
<title>BestSceneEver</title>
<plot>Scene of the century</plot>
<thumb aspect="landscape">https://scene_cover.jpg</thumb>
<studio>Best studio ever</studio>
<set>
<name>My Movie Title</name>
<index>2</index>
</set>
</movie>
```
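
To show how such a scene nfo can be read, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The resulting dictionary keys (`details`, `movie`, `scene_index`, `cover_url`) are chosen for illustration and the plugin's own parser may differ.

```python
import xml.etree.ElementTree as ET

NFO = """<movie>
  <title>BestSceneEver</title>
  <plot>Scene of the century</plot>
  <thumb aspect="landscape">https://scene_cover.jpg</thumb>
  <studio>Best studio ever</studio>
  <set>
    <name>My Movie Title</name>
    <index>2</index>
  </set>
</movie>"""

root = ET.fromstring(NFO)

scene = {
    "title": root.findtext("title"),
    "details": root.findtext("plot"),
    "studio": root.findtext("studio"),
    # <set> links the scene to its movie and gives its position in it
    "movie": root.findtext("set/name"),
    "scene_index": int(root.findtext("set/index", default="0")),
    # Pick the first <thumb> whose aspect is "landscape", if any
    "cover_url": next(
        (t.text for t in root.findall("thumb") if t.get("aspect") == "landscape"),
        None,
    ),
}
print(scene)
```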

## url support

The nfo spec does not officially support `<url>` tags, but given the importance for stash, it is supported by nfoSceneParser as an nfo extension and will be correctly recognized and updated to your scenes and movies.

## Mapping between stash data and nfo fields

stash scene fields | nfo movie fields
------------------------ | ---------------------
`title` | `title` or `originaltitle` or `sorttitle`
`details` | `plot` or `outline` or `tagline`
`studio` | `studio`
`performers` | `actor.name` (sorted by `actor.order`)
`movie` | `set.name` (sorted by `set.index`) or `title` from folder.nfo
`rating` | `userrating` or `ratings.rating`
`tags` | `tag` or `genre`
`date` | `premiered` or `year`
`url` | `url`
`director` (for movies) | `director` (only for folder.nfo)
`cover image` (or `front`/`back` for movies) | `thumb` (or local file)
`id` | `uniqueid`

Note: `uniqueid` support is only for existing stash scenes that were exported before (so they are updated "in place" with their existing id).




# Regex pattern matching

Regular expressions work by recognizing patterns in your files. It is a fallback if no NFO can be found.

You need to configure a custom pattern (like studio, actors or movie) that is specific to your naming convention. So a little bit of configuration is needed to "tell the plugin" how to recognize the right patterns.

Patterns are written as regular expressions (regex).

## Regex configuration - not your typical plugin

A consistent and uniform naming convention across a whole media library is extremely unlikely. Therefore, nfoSceneParser supports not one, but multiple `nfoSceneParser.json` regex config files. They are placed alongside your media files, directly into the library.

A configuration file applies to all files and subdirectories below it.

Config files can be nested inside the library's directories tree. In this case, the deepest and most specific config is always used.

`nfoSceneParser.json` configs are searched and loaded when the plug-in is executed. They can be added, modified or removed while stash is running, without the need to "reload" the plugins.
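
The "deepest config wins" rule amounts to walking up from the scene's directory until a config file is found. The sketch below shows that idea with a hypothetical `find_nearest_config` helper; it is similar in spirit to, but not a copy of, the plugin's own lookup.

```python
import os
from typing import Optional


def find_nearest_config(start_dir: str, name: str = "nfoSceneParser.json") -> Optional[str]:
    """Walk up from start_dir and return the first config file found, else None."""
    current = os.path.abspath(start_dir)
    while True:
        candidate = os.path.join(current, name)
        if os.path.isfile(candidate):
            return candidate  # the deepest (most specific) config is found first
        parent = os.path.dirname(current)
        if parent == current:
            return None  # reached the filesystem root without finding a config
        current = parent
```

For a scene in `/movies/movie series/Movie Name 17`, a config in `/movies/movie series` would be returned before one in `/movies`.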

## File structure `nfoSceneParser.json`

Configuration files consist of one regex and some attributes.

| Name | Required | Description |
| ------------- | -------- | -------------------------------------- |
| regex | true | A regular expression (regex). Regex can be easily learned, previewed and tested via [https://regex101.com/](https://regex101.com/)|
| splitter | false | Used to further split the matched "performers" or "tags" text into an array of strings (the most frequent use case being a list of actors or tags). For instance, if performers matches to `"Megan Rain, London Keyes"`, a splitter of `", "` will separate the two performers from the matched string |
| scope | false | possible values are "path" or "filename". Whether the regex is applied to the scene's whole path or just the filename. Defaults to "path" |

## Example `nfoSceneParser.json`

Let's assume the following directory and file structure:

`/movies/movie series/Movie Name 17/Studio name - first1 last1, first2 last2 - Scene title - 2017-12-31.mp4`

A common naming convention is used for all files under "movie series" directory => the `nfoSceneParser.json` file is placed in `/movies/movie series`.

We want to identify the following patterns:
- The deepest folder is the `movie`
- The file name has different sections, all separated by the same `' - '` delimiter. We can therefore use this to delimit and match the `studio`, the `performers` and the scene's `title`.
- The `date` is matched automatically. There is nothing to configure for that.

`nfoSceneParser.json` (remember: to be placed in your library)
```json
{
  "regex": "^.*[/\\\\](?P<movie>.*?)[/\\\\](?P<studio>.*?) - (?P<performers>.*?) - (?P<title>.*?)[-]+.*\\.mp4$",
  "splitter": ", ",
  "scope": "path"
}
```

A quick look at the regex:
- `[/\\]` matches both slash and backslash, making it work on Windows and Unix paths alike (macOS, Linux, ...)
- Capturing groups like `(?P<movie>.*?)` have names that must match the supported nfoSceneParser attributes (see below)

Note: in json, every `\` is escaped to `\\` => `\\` in json is actually `\` in the regex. If you are unfamiliar, look for a json regex formatter online and paste your regex there to get the properly "escaped" string you need to use in the config file.
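
To check a config before scanning, the regex can be tried against a sample path in plain Python. The sketch below loads the JSON text (with both path separators written as `[/\\\\]`), applies it to the example path and splits the performers. Stripping whitespace around the captured values is an assumption made here for readability, not documented plugin behavior.

```python
import json
import re

# JSON text as it would appear in nfoSceneParser.json (the raw string keeps the
# backslashes untouched, so json.loads sees exactly what the file contains).
CONFIG_TEXT = r'''
{
  "regex": "^.*[/\\\\](?P<movie>.*?)[/\\\\](?P<studio>.*?) - (?P<performers>.*?) - (?P<title>.*?)[-]+.*\\.mp4$",
  "splitter": ", ",
  "scope": "path"
}
'''

config = json.loads(CONFIG_TEXT)
pattern = re.compile(config["regex"])

path = ("/movies/movie series/Movie Name 17/"
        "Studio name - first1 last1, first2 last2 - Scene title - 2017-12-31.mp4")

match = pattern.match(path)
# Strip surrounding whitespace from each captured group (assumption, see above)
fields = {name: value.strip() for name, value in match.groupdict().items()}
fields["performers"] = fields["performers"].split(config["splitter"])
print(fields)
```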

## Supported regex capturing group names

The following can be used in your regex capturing group names:
- title
- date
- performers
- tags
- studio
- rating
- movie
- director
- index (mapped to stash scene_index - only relevant for movies)

Note: if `date` is not specified, the plug-in attempts to detect the date anywhere in the file name.
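
Automatic date detection can be pictured as a search for a date-like pattern in the file name. The sketch below assumes an ISO-style `YYYY-MM-DD` date, as in the example file name above. The formats the plugin actually recognizes are not listed here, and `find_date` is a hypothetical helper.

```python
import re
from datetime import date
from typing import Optional

# Assumed format: four-digit year, two-digit month and day, separated by hyphens
DATE_PATTERN = re.compile(r"(\d{4})-(\d{2})-(\d{2})")


def find_date(file_name: str) -> Optional[date]:
    """Return the first valid YYYY-MM-DD date found in file_name, else None."""
    for match in DATE_PATTERN.finditer(file_name):
        year, month, day = map(int, match.groups())
        try:
            return date(year, month, day)
        except ValueError:
            continue  # digits matched but do not form a real date
    return None


found = find_date("Studio name - first1 last1, first2 last2 - Scene title - 2017-12-31.mp4")
print(found)
```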
32 changes: 32 additions & 0 deletions plugins/nfoSceneParser/abstractParser.py
@@ -0,0 +1,32 @@
import os


class AbstractParser:

    empty_default = {"actors": [], "tags": []}

    # Max number of images to process (2 for front/back cover in movies).
    _image_Max = 2

    def __init__(self):
        self._defaults = [self.empty_default]

    def _find_in_parents(self, start_path, searched_file):
        parent_dir = os.path.dirname(start_path)
        file = os.path.join(start_path, searched_file)
        if os.path.exists(file):
            return file
        elif start_path != parent_dir:
            # Not found => recurse via parent
            return self._find_in_parents(parent_dir, searched_file)
        # Reached the filesystem root without a match: implicitly returns None

    def _get_default(self, key, source=None):
        for default in self._defaults:
            # Source filter: skip default if it is not of the specified source
            if source and default.get("source") != source:
                continue
            if default.get(key) is not None:
                return default.get(key)

    def parse(self):
        pass
70 changes: 70 additions & 0 deletions plugins/nfoSceneParser/config.py
@@ -0,0 +1,70 @@
# If dry is True, will do a trial run with no permanent changes.
# Look in the log file for what would have been updated...
dry_mode = False

# nfo file location & naming.
# Possible options:
# - "with files": with the video files: Follows NFO standard naming: https://kodi.wiki/view/NFO_files/Movies
# - "...": a specific directory you mention. In this case, the nfo names will match your stash scene ids.
# if you set the above to "with files", it'll force filename anyway, to match the filename.
# ! Not yet implemented. Currently, only "with files" is supported
nfo_location = "with files"

# If True, will never update already "organized" scenes.
skip_organized = True

# If True, will set the scene to "organized" on update from nfo file.
set_organized_nfo = True

# Set of fields that must be set from the nfo (i.e. "not be empty") for the scene to be marked organized.
# Possible values: "performers", "studio", "tags", "movie", "title", "details", "date",
# "rating", "urls" and "cover_image"
set_organized_only_if = ["title", "performers", "details", "date", "studio", "tags", "cover_image"]

# Blacklist: array of nfo fields that will not be loaded into the scene.
# Possible values: "performers", "studio", "tags", "movie", "title", "details", "date",
# "rating", "urls" and "cover_image", "director"
# Note: "tags" is a special case: if blacklisted, new tags will not be created, but existing tags will be mapped.
blacklist = ["rating"]

# List of tags that will never be created or set to the scene.
# Example: blacklisted_tags = ["HD", "Now in HD"]
blacklisted_tags = ["HD", "4K", "Now in HD", "1080p Video", "4k Video"]

# Name of the tag used as "marker" by the plugin to identify which scenes to reload.
# Empty string or None disables the reload feature
reload_tag = "_NFO_RELOAD"

# Creates missing entities in stash's database (or not)
create_missing_performers = True
create_missing_studios = True
create_missing_tags = True
create_missing_movies = True

###############################################################################
# Do not change config below unless you are absolutely sure of what you do...
###############################################################################

# Whether to also look for existing entries in aliases
search_performer_aliases = True
search_studio_aliases = True

levenshtein_distance_tolerance = 2

# "Single names" means performers with only one word as name like "Anna" or "Siri".
# If true, single names aliases will be ignored:
# => only the "main" performer name determines if a performer exists or is created.
# Only relevant if search_performer_aliases is True.
ignore_single_name_performer_aliases = True

# If the above is set to true, it can be overruled for some allowed (whitelisted) names
single_name_whitelist = ["MJFresh", "JMac", "Mazee"]

###############################################################################
# Reminder: if no matching NFO file can be found for the scene, a fallback
# "regular expressions" parsing is supported.
#
# ! regex patterns are defined in their own config files.
#
# See README.md for details
###############################################################################
52 changes: 52 additions & 0 deletions plugins/nfoSceneParser/log.py
@@ -0,0 +1,52 @@
import sys


# Log messages sent from a plugin instance are transmitted via stderr and are
# encoded with a prefix consisting of special character SOH, then the log
# level (one of t, d, i, w, e, or p - corresponding to trace, debug, info,
# warning, error and progress levels respectively), then special character
# STX.
#
# The LogTrace, LogDebug, LogInfo, LogWarning, and LogError methods, and their equivalent
# formatted methods are intended for use by plugin instances to transmit log
# messages. The LogProgress method is also intended for sending progress data.
#

def __prefix(level_char):
    start_level_char = b'\x01'
    end_level_char = b'\x02'

    ret = start_level_char + level_char + end_level_char
    return ret.decode()


def __log(level_char, s):
    # level_char is a bytes value such as b'i'; skip empty levels
    if not level_char:
        return

    print(__prefix(level_char) + s + "\n", file=sys.stderr, flush=True)


def LogTrace(s):
    __log(b't', s)


def LogDebug(s):
    __log(b'd', s)


def LogInfo(s):
    __log(b'i', s)


def LogWarning(s):
    __log(b'w', s)


def LogError(s):
    __log(b'e', s)


def LogProgress(p):
    # Clamp progress to the [0, 1] range before sending it
    progress = min(max(0, p), 1)
    __log(b'p', str(progress))