System.AccessViolationException on avcodec_open2 #38

Open
1 of 3 tasks
forlayo opened this issue Jun 5, 2023 · 7 comments
Comments

@forlayo

forlayo commented Jun 5, 2023

Note: for support questions, please use Stack Overflow or the special repository on github.com. This repository's issues are reserved for feature requests and bug reports.

  • **I'm submitting a ... **

    • bug report
    • feature request
    • support request or question => Please do not submit support request or questions here, see note at the top of this template.
  • Do you want to request a feature or report a bug?

I want to report a bug.

  • What is the current behavior?

I sometimes get a System.AccessViolationException when calling avcodec_open2. Because the application closes, it is impossible to catch the failure in order to fall back or retry.

  • If the current behavior is a bug, please provide the steps to reproduce and, if possible, a minimal demo of the problem:

It happens from time to time, depending on the codec options I choose. For example, it is 100% reproducible on a device with an NVIDIA GeForce MX350 graphics card when I open h264_nvenc in AV_CODEC_HW_CONFIG_METHOD_HW_FRAMES_CTX mode using AV_PIX_FMT_CUDA and AV_HWDEVICE_TYPE_CUDA.

But as I said, I think any problem that makes avcodec_open2 fail in a specific way leaves me with a crash of this type.

  • What is the expected behavior?

The codec opens, or I get a controlled error.

  • What is the motivation / use case for changing the behavior?

As it stands, it is not usable.

  • Please tell us about your environment:
    I am using it on a Windows 11 device with NVIDIA GeForce MX350 graphics card.
  • version:
    Latest version v6.0.0.2
  • Other information (e.g. detailed explanation, stacktraces, related issues, suggestions how to fix, links for us to have context, eg. stackoverflow, gitter, etc)

In the FFmpeg logs I can see something like:

[h264_nvenc @ 000001f473fcc480] OpenEncodeSessionEx failed: unsupported device (2): (no details)

This is the crash I am getting:

Fatal error. System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
   at FFmpeg.AutoGen.DynamicallyLoadedBindings+<>c.<Initialize>b__2_494(FFmpeg.AutoGen.AVCodecContext*, FFmpeg.AutoGen.AVCodec*, FFmpeg.AutoGen.AVDictionary**)
   at FFmpeg.AutoGen.ffmpeg.avcodec_open2(FFmpeg.AutoGen.AVCodecContext*, FFmpeg.AutoGen.AVCodec*, FFmpeg.AutoGen.AVDictionary**)
   at GraphicsCheck.FfmpegChecker.StartEncoder(Omni.Platforms.Windows.Services.AV.CodecInfo, Int32, Int32, Int32)
   at GraphicsCheck.FfmpegChecker.CheckConfig(CodecConfig)
   at Program.<Main>$(System.String[])

I am reproducing it with this demo project:
GraphicsCheck.zip

@forlayo
Author

forlayo commented Jun 6, 2023

Taking the nvenc error

[h264_nvenc @ 000001f473fcc480] OpenEncodeSessionEx failed: unsupported device (2): (no details)

as a clue, I can see this:

  1. That happens in nvenc_open_session, which is called here
  2. That would jump to the fail2 routine, which is this one:
fail2:
    CHECK_CU(dl_fn->cuda_dl->cuCtxDestroy(ctx->cu_context_internal));
    ctx->cu_context_internal = NULL;
  3. ctx->cu_context_internal comes from the following call (in the same method), so I suppose it was created properly, as otherwise execution wouldn't continue:
    ret = CHECK_CU(dl_fn->cuda_dl->cuCtxCreate(&ctx->cu_context_internal, 0, cu_device));
    if (ret < 0)
        goto fail;
  4. ctx in that method is the priv_data of the AVCodecContext, as we can see here:
static av_cold int nvenc_check_device(AVCodecContext *avctx, int idx)
{
    NvencContext *ctx = avctx->priv_data;
  5. Printing the error message in nvenc_print_error also interacts with the context:
static int nvenc_print_error(AVCodecContext *avctx, NVENCSTATUS err,
                             const char *error_string)
{
    const char *desc;
    const char *details = "(no details)";
    int ret = nvenc_map_error(err, &desc);

#ifdef NVENC_HAVE_GETLASTERRORSTRING
    NvencContext *ctx = avctx->priv_data;
    NV_ENCODE_API_FUNCTION_LIST *p_nvenc = &ctx->nvenc_dload_funcs.nvenc_funcs;

    if (p_nvenc && ctx->nvencoder)
        details = p_nvenc->nvEncGetLastErrorString(ctx->nvencoder);
#endif

    av_log(avctx, AV_LOG_ERROR, "%s: %s (%d): %s\n", error_string, desc, err, details);

    return ret;
}
  6. av_log is executed, since we can see the error message in the logs, so the other lines shouldn't be a problem.

  7. Looking at the other calls, when nvenc_check_device fails the error propagates up through nvenc_check_device -> nvenc_setup_device, but we see no further logs, so the issue may be in the fail2 routine of nvenc_check_device or in nvenc_print_error.

Any thoughts?
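To make the fail2 path above easier to reason about, here is a minimal, self-contained C sketch of the same goto-cleanup shape. It is not FFmpeg code: MockContext, mock_ctx_create, mock_open_session, mock_ctx_destroy and mock_check_device are hypothetical stand-ins, and the guard before the destroy call is an assumption about what a safe cleanup would look like, not a claim about what nvenc.c does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for NvencContext; not an FFmpeg type. */
typedef struct MockContext {
    int *cu_context_internal; /* models ctx->cu_context_internal */
    int  destroy_calls;       /* counts cleanup calls so they can be inspected */
} MockContext;

static int mock_storage;

/* Models cuCtxCreate succeeding. */
static int mock_ctx_create(MockContext *ctx)
{
    ctx->cu_context_internal = &mock_storage;
    return 0;
}

/* Models OpenEncodeSessionEx, which fails on an unsupported device. */
static int mock_open_session(MockContext *ctx, int device_supported)
{
    (void)ctx;
    return device_supported ? 0 : -1;
}

/* Models cuCtxDestroy. */
static void mock_ctx_destroy(MockContext *ctx)
{
    ctx->destroy_calls++;
}

/* Returns 0 on success and a negative value on failure. */
int mock_check_device(MockContext *ctx, int device_supported)
{
    int ret;

    assert(ctx != NULL);

    ret = mock_ctx_create(ctx);
    if (ret < 0)
        goto fail;

    ret = mock_open_session(ctx, device_supported);
    if (ret < 0) {
        fprintf(stderr, "OpenEncodeSessionEx failed (mock)\n");
        goto fail2;
    }

    return 0;

fail2:
    /* Destroy the context at most once, then clear the pointer. */
    if (ctx->cu_context_internal) {
        mock_ctx_destroy(ctx);
        ctx->cu_context_internal = NULL;
    }
fail:
    return ret;
}
```

Calling mock_check_device with device_supported set to 0 takes the fail2 branch and returns a negative value with the pointer cleared. If the real cleanup behaves like this, the caller should receive an ordinary error code, which would point the investigation at whatever runs between the log line and the return.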

@hglee

hglee commented Jun 13, 2023

Your demo project works with an RTX3000. The MX350 does not support nvenc, only nvdec: https://en.wikipedia.org/wiki/Nvidia_NVENC

Could you try another card, or check only the decoder?

@forlayo
Author

forlayo commented Jun 13, 2023

@hglee hey!

My real problem is that avcodec_open2 crashes and closes my app when the call itself produces this type of error.

So I can't simply detect that avcodec_open2 has failed and act accordingly.

This crash happens in some situations, and I suppose it's something memory-related in how this wrapper and libav interact.

@Ruslan-B
Owner

I'll move this to the questions repository, as it has attracted some attention. Unless you find a bug in the bindings, this is just usage of a complex product.

Ruslan-B transferred this issue from Ruslan-B/FFmpeg.AutoGen Jun 13, 2023
@forlayo
Author

forlayo commented Jun 13, 2023

Thanks @Ruslan-B

This type of crash has happened to me in other situations where libav hit an internal error in avcodec_open2. Instead of returning an error result, it just crashes, apparently because of some memory interaction between libav and this wrapper. It may be an issue of mismatched data type sizes, but I haven't found the exact explanation.

I want to highlight that this does not only happen with this type of card; it is general to calling avcodec_open2 and getting an error in some situations. The MX350 is just a reliable way to reproduce it.

The main issue is that, because it crashes, you have no way to react to it.
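One generic way to survive a call that may take the whole process down is to run it in a throwaway child process and inspect how that child ended. The sketch below shows the idea in C with POSIX fork and waitpid, used only for brevity; on Windows the equivalent would be spawning a helper executable and checking its exit code. The names run_isolated, probe_fn and the three probe_* functions are hypothetical, and in a real application the probe would wrap the avcodec_open2 attempt.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

typedef int (*probe_fn)(void);

enum probe_result {
    PROBE_OK = 0,      /* probe returned 0 */
    PROBE_ERROR = 1,   /* probe returned an error, or the child could not be run */
    PROBE_CRASHED = 2  /* child was terminated by a signal */
};

/* Runs probe() in a child process so a crash cannot kill the caller. */
int run_isolated(probe_fn probe)
{
    pid_t pid;
    int status = 0;

    assert(probe != NULL);

    pid = fork();
    if (pid < 0)
        return PROBE_ERROR;

    if (pid == 0) {
        /* Child: report the probe outcome through the exit code. */
        _exit(probe() == 0 ? 0 : 1);
    }

    if (waitpid(pid, &status, 0) < 0)
        return PROBE_ERROR;
    if (WIFSIGNALED(status))
        return PROBE_CRASHED;
    if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
        return PROBE_OK;
    return PROBE_ERROR;
}

/* Hypothetical probes used to exercise the three outcomes. */
int probe_ok(void)    { return 0; }
int probe_error(void) { return -1; }
int probe_crash(void) { abort(); return 0; } /* simulates a native crash */
```

With this shape, a PROBE_CRASHED result can be handled like any other failure, for example by marking that encoder configuration as unusable before the main process ever tries it.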

@hglee

hglee commented Jun 13, 2023

Could you try FFmpeg DLLs built from master? Some nvenc fixes landed after v6.0: https://trac.ffmpeg.org/ticket/10221

FYI, you can try OBS style device check: https://github.com/obsproject/obs-studio/blob/cb391a595d45aea0d710680c143eb90efe22998b/plugins/obs-ffmpeg/obs-ffmpeg.c#L61
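As a rough illustration of what a device check of this kind can look like, the sketch below compares adapter device IDs against a list of GPUs assumed not to support NVENC before any encoder is opened. It is a sketch under stated assumptions, not the OBS implementation: the function names are hypothetical, the IDs are placeholders rather than real PCI device IDs, and enumerating the adapters is left to the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Placeholder IDs for illustration only; these are not real PCI device IDs. */
static const unsigned int unsupported_device_ids[] = { 0x0001, 0x0002, 0x0003 };

/* True if device_id appears in the list of devices assumed to lack NVENC. */
bool device_is_blocklisted(unsigned int device_id)
{
    size_t n = sizeof(unsupported_device_ids) / sizeof(unsupported_device_ids[0]);

    for (size_t i = 0; i < n; i++) {
        if (unsupported_device_ids[i] == device_id)
            return true;
    }
    return false;
}

/* True if at least one enumerated adapter is not on the blocklist. */
bool any_device_may_support_nvenc(const unsigned int *device_ids, size_t count)
{
    assert(device_ids != NULL || count == 0);

    for (size_t i = 0; i < count; i++) {
        if (!device_is_blocklisted(device_ids[i]))
            return true;
    }
    return false;
}
```

A check like this only filters out devices already known to be unsupported; it does not replace handling a failed open for devices that pass the filter.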

@forlayo
Author

forlayo commented Jun 13, 2023

@hglee

I am using v6.0.0.2 with the FFmpeg libraries from https://github.com/Ruslan-B/FFmpeg.AutoGen/tree/v6.0.0.2, which I suppose are v6.0 (as per the version). So I think your point is perfectly valid, and I'll try the latest FFmpeg libraries when I have a minute. Thanks!

The OBS-style check may help here as well, thanks. However, it is not enough: there are many combinations of cards (Intel, AMD and NVIDIA, plus Intel + NVIDIA and other hybrids) and codec configurations that may be unavailable or may fail, so in the end I would still need to detect the failure in order to fall back to software or try something different.
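The fallback described here can be sketched as a loop over candidate encoders that stops at the first one that opens. This only works when a failed open returns an error code instead of crashing. In the sketch, open_first_working, open_fn and mock_open_software_only are hypothetical names; in a real application the callback would wrap the avcodec_open2 attempt and return its result.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Callback that tries to open one encoder; returns >= 0 on success. */
typedef int (*open_fn)(const char *encoder_name);

/* Returns the first candidate that opens, or NULL if none does. */
const char *open_first_working(const char *const *candidates, size_t count,
                               open_fn try_open)
{
    assert(candidates != NULL && try_open != NULL);

    for (size_t i = 0; i < count; i++) {
        if (try_open(candidates[i]) >= 0)
            return candidates[i];
    }
    return NULL;
}

/* Mock opener for illustration: pretends only the software encoder works. */
int mock_open_software_only(const char *encoder_name)
{
    return strcmp(encoder_name, "libx264") == 0 ? 0 : -1;
}
```

Ordering the list as h264_nvenc, h264_qsv, h264_amf, libx264 would try the hardware encoders first and end on software, and the mock opener above would select libx264.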


3 participants