CUDA_ERROR_UNKNOWN: unknown error #122

vincentvic · 2024-12-19T15:39:58Z

Hi!

When I run this part of the code:

pyDec = vali.PyDecoder(
    url,
    CONFIG_FFMPEG,
    gpu_id=0)
pkt_data = vali.PacketData()
frame_idx = 0
while True:
    #NV12 surface
    success, details = pyDec.DecodeSingleSurface(surf_nv12, pkt_data)

I get the following error. What does it mean?

[h264_cuvid @ 0x558a8aac76c0] ctx->cvdl->cuvidParseVideoData(ctx->cuparser, &cupkt) failed -> CUDA_ERROR_UNKNOWN: unknown error
Error while sending a packet to the decoder. Error description: Generic error in an external library


RomanArzumanyan · 2024-12-19T15:54:10Z

Hi @vincentvic

That’s FFmpeg struggling to communicate with the GPU driver.

Please check if the driver is available by running nvidia-smi. If using docker make sure that you run the image with video driver capabilities.
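
Both checks can be scripted. This is only a sketch: it assumes the container is started through the NVIDIA Container Toolkit, which passes the requested capabilities in the NVIDIA_DRIVER_CAPABILITIES environment variable. The function names are made up for illustration and are not part of VALI.

```python
import os
import shutil
import subprocess


def driver_visible() -> bool:
    """Return True if nvidia-smi is on PATH and exits successfully."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return False
    try:
        result = subprocess.run([exe], capture_output=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


def video_capability_enabled(env=os.environ) -> bool:
    """Return True if the container was started with the video (or all) capability."""
    caps = env.get("NVIDIA_DRIVER_CAPABILITIES", "")
    return any(cap.strip() in ("video", "all") for cap in caps.split(","))


if __name__ == "__main__":
    print("driver visible:", driver_visible())
    print("video capability:", video_capability_enabled())
```

Outside a container the variable is usually unset, so only the first check is meaningful there.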

vincentvic · 2024-12-19T15:56:58Z

No, I do not think it comes from the driver, because the exact same script works with a different video.
If I re-encode the video with the following command, it also works:
ffmpeg -i <input.mp4> -c:v h264_nvenc -preset slow -crf 22 'output.mp4'

Thanks

RomanArzumanyan · 2024-12-19T16:02:37Z

Thanks for the update @vincentvic

If the error is specific to a particular video, I assume it’s not fully conformant with the H.264 standard or there’s a bug somewhere within ffmpeg / Video Codec SDK.

As a workaround, I recommend catching exceptions from the decoder and re-creating it in SW mode. For that you simply need to use gpu_id=-1. The SW decoder is often more resilient to problematic videos.

vincentvic · 2024-12-19T16:52:06Z

Thanks, that gives me a clearer picture now. pyDec.Format is YUV420 rather than NV12.

RomanArzumanyan · 2024-12-19T16:54:52Z

@vincentvic

Nvdec native format is nv12, sw decoder outputs in yuv420.
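
To illustrate the difference between the two layouts: both hold the same 4:2:0 samples in width * height * 3 / 2 bytes, but NV12 stores U and V interleaved in one plane, while planar YUV420 stores them as two separate planes. The sketch below assumes tightly packed buffers (no row padding) and uses a made-up helper name; it is not how VALI converts surfaces.

```python
def nv12_to_yuv420(nv12: bytes, width: int, height: int) -> bytes:
    """Repack a tightly packed NV12 buffer into planar order: Y, then U, then V."""
    if width % 2 or height % 2:
        raise ValueError("4:2:0 layouts need even width and height")
    y_size = width * height
    if len(nv12) != y_size * 3 // 2:
        raise ValueError("buffer size does not match width x height")
    y_plane = nv12[:y_size]
    uv_plane = nv12[y_size:]   # NV12 chroma: U0 V0 U1 V1 ...
    u_plane = uv_plane[0::2]   # every even byte is a U sample
    v_plane = uv_plane[1::2]   # every odd byte is a V sample
    return y_plane + u_plane + v_plane


# Tiny 4x2 frame with made-up sample values.
frame_nv12 = bytes([1, 2, 3, 4, 5, 6, 7, 8,   # Y plane, 4x2
                    100, 200, 101, 201])     # interleaved UV for 2x1 chroma samples
frame_yuv420 = nv12_to_yuv420(frame_nv12, 4, 2)
print(list(frame_yuv420))
```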

vincentvic · 2024-12-19T18:05:52Z

Indeed. I still have not found the reason why it does not work for this video.

Do you have any idea which H.264 rules or conformance issues can lead to this error?

Thanks a lot

RomanArzumanyan · 2024-12-20T08:28:50Z

@vincentvic

Do you have any idea which H.264 rules or conformance issues can lead to this error?

If you're interested in finding out what's possibly wrong with the video, I invite you to move this topic to discussions.

However, if you need to process a multitude of files in a production environment, that won't be very helpful, and the simplest approach would be to decode problematic videos with the SW decoder, e.g. like this:

# Please don't just copy-paste this code.
# It was never properly debugged and only serves as a sample.

import numpy as np
import python_vali as vali

def decode_impl(py_dec, dec_frame, dec_surf, seek_once = -1):
    seek_ctx = None
    if seek_once != -1:
        seek_ctx = vali.SeekContext(seek_once)

    if py_dec.IsAccelerated:        
        return py_dec.DecodeSingleSurface(dec_surf, seek_ctx)
    else:
        return py_dec.DecodeSingleFrame(dec_frame, seek_ctx)

def decode(py_dec, dec_frame, dec_surf, seek_once = -1):
    frame_idx = 0
    success = True

    if seek_once != -1:
        success, details = decode_impl(py_dec, dec_frame, dec_surf, seek_once)
        if success:
            frame_idx += 1

    while success:
        success, details = decode_impl(py_dec, dec_frame, dec_surf)
        if success:
            frame_idx += 1

    return frame_idx

py_dec = vali.PyDecoder(url, {}, gpu_id=0)
surf = vali.Surface.Make(py_dec.Format, py_dec.Width, py_dec.Height, gpu_id=0)
frame = np.ndarray(dtype=np.uint8, shape=(surf.HostSize))
curr_frame = 0

try:
    # Try to decode file as normal
    curr_frame = decode(py_dec, frame, surf)

except Exception as e:
    # Re-create decoder in SW mode, seek to last decoded frame, continue
    py_dec = vali.PyDecoder(url, {}, gpu_id=-1)
    decode(py_dec, frame, surf, curr_frame)
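
One thing to watch in the sample above: curr_frame is assigned only when decode() returns, so if the exception fires it is still 0 and the SW decoder starts again from the first frame. A possible way around this is to keep the counter outside the call that may fail. The sketch below shows only that control flow; FakeDecoder is a hypothetical stand-in and none of these names are VALI API.

```python
class FakeDecoder:
    """Hypothetical stand-in for a decoder; not part of VALI."""

    def __init__(self, num_frames, fail_at=None, start=0):
        self.num_frames = num_frames
        self.fail_at = fail_at
        self.pos = start

    def decode_one(self):
        if self.fail_at is not None and self.pos == self.fail_at:
            raise RuntimeError("simulated hardware decoder failure")
        if self.pos >= self.num_frames:
            return False  # end of stream
        self.pos += 1
        return True


def decode_all(decoder, progress):
    """Decode until end of stream, updating progress['frames'] after each frame."""
    while decoder.decode_one():
        progress["frames"] += 1


def decode_with_fallback(make_hw, make_sw):
    progress = {"frames": 0}
    try:
        decode_all(make_hw(), progress)
    except Exception:
        # The counter lives outside the failed call, so it still holds the
        # number of frames decoded before the error.
        decode_all(make_sw(start=progress["frames"]), progress)
    return progress["frames"]


total = decode_with_fallback(
    make_hw=lambda: FakeDecoder(num_frames=10, fail_at=4),
    make_sw=lambda start: FakeDecoder(num_frames=10, start=start),
)
print(total)
```

With a real decoder, make_sw would create the decoder with gpu_id=-1 and seek to the given frame, as in the sample above.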

vincentvic · 2024-12-20T11:44:15Z

Hi!

I still have the same error message with this video:

[h264_cuvid @ 0x558a8aac76c0] ctx->cvdl->cuvidParseVideoData(ctx->cuparser, &cupkt) failed -> CUDA_ERROR_UNKNOWN: unknown error
Error while sending a packet to the decoder. Error description: Generic error in an external library

vincentvic · 2024-12-20T11:45:31Z

Is it possible that it comes from the opts parameter?

CONFIG_FFMPEG = {
'codec': 'h264',
'hwaccel_output_format': 'cuda',
'hwaccel': 'cuda',
'ignore_editlist': 'true',
'preset': 'hq',
}
pyDec = vali.PyDecoder(url, opts=CONFIG_FFMPEG, gpu_id=0)

RomanArzumanyan · 2024-12-20T11:45:55Z

Hi @vincentvic

According to the log message, you’re still using the GPU decoder. Pass gpu_id=-1 to use the SW decoder instead.

vincentvic · 2024-12-20T12:04:37Z

pyDec = vali.PyDecoder(url, opts=CONFIG_FFMPEG, gpu_id=-1)
print(pyDec.Format)
surf_yuv = vali.Surface.Make(format=vali.PixelFormat.YUV420, width=pyDec.Width, height=pyDec.Height, gpu_id=0)
pkt_data = vali.PacketData()
while True:
    success, details = pyDec.DecodeSingleSurface(surf_yuv, pkt_data)
    if not success:
        print(success)
        print(details)
    break

It does not give me any information. I do not really understand the purpose of setting gpu_id to -1.

PixelFormat.YUV420
False
TaskExecInfo.SUCCESS

vincentvic · 2024-12-20T12:19:39Z

Can we deactivate the cuvid decoder?

RomanArzumanyan · 2024-12-20T12:24:12Z

Is it possible that it comes from the opts parameter?

CONFIG_FFMPEG = {
    'codec': 'h264',
    'hwaccel_output_format': 'cuda',
    'hwaccel': 'cuda',
    'ignore_editlist': 'true',
    'preset': 'hq',
}
pyDec = vali.PyDecoder(url, opts=CONFIG_FFMPEG, gpu_id=0)

@vincentvic

You don't need most of those options.
VALI will automatically choose HW decoding options, you just need to pass gpu_id.
Take a look at the decoding sample: https://github.com/RomanArzumanyan/VALI/blob/main/samples/sample_decode_show.ipynb

Just pass the actual GPU id for HW decoding or -1 for SW decoding, that's it.

vincentvic · 2024-12-20T12:43:58Z

I tested two ffmpeg commands on the video. The first one works well, but the second one fails with the same error message.
ffmpeg -hwaccel cuda -c:v h264 -i input.mp4 output.mp4

ffmpeg -loglevel verbose -hwaccel cuda -c:v h264_cuvid -i input.mp4 output.mp4
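
Both commands above also encode to output.mp4. To exercise only the decoder, the output can be discarded with ffmpeg's null muxer (-f null -). The sketch below just builds the two command lines for comparison; the helper name is made up for illustration.

```python
import shlex


def decode_only_cmd(path, decoder):
    """Build an ffmpeg command that decodes the input and throws the frames away."""
    return [
        "ffmpeg", "-hide_banner", "-loglevel", "verbose",
        "-hwaccel", "cuda",
        "-c:v", decoder,   # decoder choice, placed before -i so it applies to the input
        "-i", path,
        "-f", "null", "-",  # null muxer: nothing is encoded or written
    ]


for decoder in ("h264", "h264_cuvid"):
    print(shlex.join(decode_only_cmd("input.mp4", decoder)))
```

Each list can be passed to subprocess.run to compare the two decoders on the same file.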

RomanArzumanyan · 2024-12-20T12:50:12Z

@vincentvic

That's an interesting observation, but I'm not sure if it's relevant to the PyDecoder behavior.

-hwaccel cuda means "keep decoded frames in vRAM".
output.mp4 without specifying the encoder means "guess the most relevant encoder". Most probably, ffmpeg will choose libx264 or whatever else it has available that is compatible with the MP4 container. So decoded frames will be kept in vRAM, then downloaded to RAM, then given to the encoder.

VALI works basically like this: ffmpeg -hwaccel cuda -c:v h264 -i input.mp4

If gpu_id is meaningful, VALI will automatically select the proper decoder accelerated by Nvdec and will keep decoded frames in vRAM as Surfaces.

Decoder selection happens here:

VALI/src/TC/src/TaskDecodeFrame.cpp

Lines 81 to 91 in c053554

    
    static const std::map<AVCodecID, std::string>
        hwaccel_codecs({std::make_pair(AV_CODEC_ID_AV1, "av1_cuvid"),
                        std::make_pair(AV_CODEC_ID_HEVC, "hevc_cuvid"),
                        std::make_pair(AV_CODEC_ID_H264, "h264_cuvid"),
                        std::make_pair(AV_CODEC_ID_MJPEG, "mjpeg_cuvid"),
                        std::make_pair(AV_CODEC_ID_MPEG1VIDEO, "mpeg1_cuvid"),
                        std::make_pair(AV_CODEC_ID_MPEG2VIDEO, "mpeg2_cuvid"),
                        std::make_pair(AV_CODEC_ID_MPEG4, "mpeg4_cuvid"),
                        std::make_pair(AV_CODEC_ID_VP8, "vp8_cuvid"),
                        std::make_pair(AV_CODEC_ID_VP9, "vp9_cuvid"),
                        std::make_pair(AV_CODEC_ID_VC1, "vc1_cuvid")});
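
In Python terms, the lookup amounts to roughly the following. This is an illustration only: the table mirrors the map above using ffmpeg codec names, the gpu_id < 0 rule follows the "-1 for SW decoding" convention from this thread, and returning None for a codec missing from the table is an assumption, not VALI's documented behavior.

```python
from typing import Optional

# Mirror of the C++ map above, keyed by ffmpeg codec name.
HWACCEL_CODECS = {
    "av1": "av1_cuvid",
    "hevc": "hevc_cuvid",
    "h264": "h264_cuvid",
    "mjpeg": "mjpeg_cuvid",
    "mpeg1video": "mpeg1_cuvid",
    "mpeg2video": "mpeg2_cuvid",
    "mpeg4": "mpeg4_cuvid",
    "vp8": "vp8_cuvid",
    "vp9": "vp9_cuvid",
    "vc1": "vc1_cuvid",
}


def pick_decoder(codec_name: str, gpu_id: int) -> Optional[str]:
    """Return the cuvid decoder name for HW decoding, or None for the default SW decoder."""
    if gpu_id < 0:
        return None  # SW decoding requested
    return HWACCEL_CODECS.get(codec_name)  # None if there is no cuvid decoder in the table


print(pick_decoder("h264", gpu_id=0))
print(pick_decoder("h264", gpu_id=-1))
```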

vincentvic · 2024-12-20T12:59:33Z

Yes, so it's logical that the error appears, because VALI uses the h264_cuvid decoder, if I refer to your code snippet?
But it does not give the reason.

RomanArzumanyan · 2024-12-20T14:50:07Z

@vincentvic

Yes, you see the same errors produced by both ffmpeg and VALI because VALI relies on the ffmpeg decoder.

vincentvic · 2024-12-20T15:15:57Z

Sorry, one last point: I noticed that all the videos that have the problem have metadata on the first frame with a pkt_dts equal to 1536 and a pkt_dts equal to 512. Is it a coincidence?

I use this command to get the information

command = [ 'ffprobe', '-i', url, '-show_entries', 'frames', '-print_format', 'json', '-select_streams', 'v:0', '-read_intervals', '%+1' ]

RomanArzumanyan · 2024-12-20T15:21:41Z

@vincentvic

DTS is decode time stamp. It's the moment of time in stream time base units when the packet is to be decoded.
PTS is presentation time stamp. Similar thing but it describes the time decoded frame shall be presented to user (shown in video player etc.).

If there's frame reordering (e.g. B-frames are present), the PTS and DTS of the same packet may differ.
DTS shall increase monotonically and it doesn't have to start from zero.

So values of 512 and 1536 don't tell much by themselves.
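
What can be checked is the monotonicity itself. A minimal sketch, assuming the packet list was dumped with something like ffprobe -v error -select_streams v:0 -show_entries packet=pts,dts -of json input.mp4; the helper name and the sample values are made up.

```python
import json


def first_non_monotonic_dts(ffprobe_json: str):
    """Return the index of the first packet whose DTS does not increase, or None."""
    packets = json.loads(ffprobe_json).get("packets", [])
    prev = None
    for idx, pkt in enumerate(packets):
        raw = pkt.get("dts")
        if raw in (None, "N/A"):
            continue  # skip packets without a usable DTS value
        dts = int(raw)
        if prev is not None and dts <= prev:
            return idx
        prev = dts
    return None


# Made-up samples: DTS starts at 512 (not zero) and PTS differs from DTS.
sample_ok = '{"packets": [{"pts": 1536, "dts": 512}, {"pts": 3072, "dts": 1024}, {"pts": 2048, "dts": 1536}]}'
sample_bad = '{"packets": [{"pts": 1536, "dts": 512}, {"pts": 3072, "dts": 1024}, {"pts": 2048, "dts": 1024}]}'
print(first_non_monotonic_dts(sample_ok))
print(first_non_monotonic_dts(sample_bad))
```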

vincentvic · 2025-01-15T16:16:38Z

Hello,

First, Happy New Year!
I think I have probably found why some videos do not work. I get this message:

[h264 @ 0x56293b426c80] decoder->cvdl->cuvidCreateDecoder(&decoder->decoder, params) failed -> CUDA_ERROR_INVALID_VALUE: invalid argument
[h264 @ 0x56293b426c80] Using more than 32 (33) decode surfaces might cause nvdec to fail.
[h264 @ 0x56293b426c80] Try lowering the amount of threads. Using 5 right now.
[h264 @ 0x56293b426c80] Failed setup for format cuda: hwaccel initialisation returned error.

Do you know what it means exactly and whether we can fix it?
Thanks a lot

RomanArzumanyan · 2025-01-15T18:56:47Z

Hi @vincentvic

I think I have probably found why some videos do not work. I get this message:

Did you get this exact message from the VALI error logs? If so, under what conditions?
I'm a bit surprised, let me explain below.

Do you know what it means exactly and whether we can fix it?

VALI uses the cuvid decoder path within libavcodec, which is different from the nvdec hwaccel path.
Some time ago I actually submitted a patch to ffmpeg that sets the minimal possible number of surfaces to be allocated for the decoder's internal pool:

https://github.com/FFmpeg/FFmpeg/blob/4f3c9f2f03378a08692a26532bc3146414717f8c/libavcodec/cuviddec.c#L320

    fifo_size_inc = ctx->nb_surfaces;
    ctx->nb_surfaces = FFMAX(ctx->nb_surfaces, format->min_num_decode_surfaces + 3);

    if (avctx->extra_hw_frames > 0)
        ctx->nb_surfaces += avctx->extra_hw_frames;

What happens here is that cuvid takes the minimal number of surfaces required for the DPB and adds 3 extra surfaces to deal with async stuff (doing otherwise would harm performance).

To the best of my knowledge, high H.264 / H.265 levels and tiers require up to 16 decoded frames in the internal buffer, so the overall number should not go higher than 19.
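
The arithmetic can be checked with a direct Python transcription of the C snippet above. The function and parameter names are illustrative, and the value 0 for the requested surface count is just an example input, not ffmpeg's default.

```python
def cuvid_surface_count(requested: int, min_num_decode_surfaces: int,
                        extra_hw_frames: int = 0) -> int:
    """Mirror of the cuviddec.c lines quoted above."""
    nb_surfaces = max(requested, min_num_decode_surfaces + 3)
    if extra_hw_frames > 0:
        nb_surfaces += extra_hw_frames
    return nb_surfaces


# A stream that needs 16 frames in the DPB, with no larger count requested.
print(cuvid_surface_count(requested=0, min_num_decode_surfaces=16))  # 19
```

By this formula the count exceeds 19 only if a larger value is requested explicitly or extra_hw_frames is set.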

vincentvic · 2025-01-16T09:57:14Z

Hi!

I get this specific message when I try to re-encode with this command:
ffmpeg -i <input.mp4> -c:v h264_nvenc -preset slow -crf 22 'output.mp4'

because VALI crashes at the first frame with this error message:
[h264_cuvid @ 0x558a8aac76c0] ctx->cvdl->cuvidParseVideoData(ctx->cuparser, &cupkt) failed -> CUDA_ERROR_UNKNOWN: unknown error

Re-encoding the video works, but the ffmpeg message about the number of decode surfaces seems to be closely linked to the error in VALI.

RomanArzumanyan · 2025-01-16T19:59:36Z

@vincentvic

I'm afraid the discussion is a bit derailed.
Can you provide me with a minimal reproducible example that illustrates the erroneous VALI behavior?
I'll try to repro on my machine.

vincentvic · 2025-01-17T14:08:57Z

Hi @RomanArzumanyan

This is the script that I'm trying to run. I do not know if I can share a short cut of the video "input.mp4" with you?

import python_vali as vali

class StopExecution(Exception):
    def _render_traceback_(self):
        return []

# Raised below when decoding fails.
class VideoError(Exception):
    pass

CONFIG_FFMPEG = {
    'codec': 'h264',
    'hwaccel_output_format': 'cuda',
    'hwaccel': 'cuda',
    'ignore_editlist': 'true',
    'preset': 'hq',
}
pyDec = vali.PyDecoder('./input.mp4', CONFIG_FFMPEG, gpu_id=0)
surf_nv12 = vali.Surface.Make(format=pyDec.Format, width=pyDec.Width, height=pyDec.Height, gpu_id=0)
surf_yuv = vali.Surface.Make(format=vali.PixelFormat.YUV420, width=pyDec.Width, height=pyDec.Height, gpu_id=0)
surf_rgb = vali.Surface.Make(format=vali.PixelFormat.RGB, width=pyDec.Width, height=pyDec.Height, gpu_id=0)
surf_pln = vali.Surface.Make(format=vali.PixelFormat.RGB_PLANAR, width=pyDec.Width, height=pyDec.Height, gpu_id=0)
to_yuv = vali.PySurfaceConverter(vali.PixelFormat.NV12, vali.PixelFormat.YUV420, gpu_id=0)
to_rgb = vali.PySurfaceConverter(vali.PixelFormat.YUV420, vali.PixelFormat.RGB, gpu_id=0)
to_pln = vali.PySurfaceConverter(vali.PixelFormat.RGB, vali.PixelFormat.RGB_PLANAR, gpu_id=0)
cc_ctx = vali.ColorspaceConversionContext(vali.ColorSpace.BT_601, vali.ColorRange.MPEG)

pkt_data = vali.PacketData()
frame_idx = 0
while True:
    # NV12 surface
    success, details = pyDec.DecodeSingleSurface(surf_nv12, pkt_data)
    if not success:
        raise VideoError(f'At frame {frame_idx}: {details} => need to analyse/re-encode')

    # NV12 -> YUV420
    success, details = to_yuv.Run(surf_nv12, surf_yuv, cc_ctx)
    if not success:
        raise StopExecution
    # YUV420 -> RGB
    success, details = to_rgb.Run(surf_yuv, surf_rgb, cc_ctx)
    if not success:
        raise StopExecution
    # RGB -> RGB Planar
    success, details = to_pln.Run(surf_rgb, surf_pln, cc_ctx)
    if not success:
        raise StopExecution

This Python script returns this error:
[h264_cuvid @ 0x55a737be82c0] ctx->cvdl->cuvidParseVideoData(ctx->cuparser, &cupkt) failed -> CUDA_ERROR_UNKNOWN: unknown error
Error while sending a packet to the decoder. Error description: Generic error in an external library

RomanArzumanyan · 2025-01-17T14:33:33Z

Hi @vincentvic

To start with, your CONFIG_FFMPEG parameters are really unusual.
Let me explain:

CONFIG_FFMPEG = {
    # You don't need these 3 lines. They are ffmpeg-specific. VALI will do that under the hood for you.
    'codec': 'h264',
    'hwaccel_output_format': 'cuda',
    'hwaccel': 'cuda',
    # No comments on this, don't know the meaning.
    'ignore_editlist': 'true',
    # This is encoder preset. No need to pass it to decoder.
    'preset': 'hq',
}

Please clean them up and re-check
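
For example, the cleanup could be done with a small filter like this. The set of keys to drop is taken from the comments above; the helper name is made up for illustration.

```python
# Keys that the comments above describe as unnecessary for the decoder.
DECODER_OPTS_TO_DROP = {"codec", "hwaccel", "hwaccel_output_format", "preset"}


def clean_decoder_opts(opts):
    """Return a copy of opts without the ffmpeg CLI and encoder-only keys."""
    return {key: value for key, value in opts.items()
            if key not in DECODER_OPTS_TO_DROP}


CONFIG_FFMPEG = {
    'codec': 'h264',
    'hwaccel_output_format': 'cuda',
    'hwaccel': 'cuda',
    'ignore_editlist': 'true',
    'preset': 'hq',
}
cleaned = clean_decoder_opts(CONFIG_FFMPEG)
print(cleaned)
```

The cleaned dictionary is what would then be passed as opts to the decoder.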

vincentvic · 2025-01-17T14:44:21Z

We need the ignore_editlist parameter in our case. We can indeed comment out the others, but that does not change anything in the error message.
