This repository has been archived by the owner on Jul 24, 2024. It is now read-only.

Big memory issue on retrieving job properties for utility size circuits #775

Open
zlatko-minev opened this issue Dec 13, 2023 · 0 comments
Labels
type: bug Something isn't working


Information

  • qiskit-ibm-provider version: Any
  • Python version: 3.11
  • Operating system: macOS

What is the current behavior?

Calling job.header() or a similar accessor fetches the full result of _get_params(), which can be hundreds of MB, even when the object of interest (the header, for instance) is only a few kilobytes. This makes batch job processing impractical with utility-size circuits. Part of the problem is that the call also pulls in the quantum circuits, which can have attached pulse schedules and can be 100x100 with 5,000+ gates.

This is important when porting over to the new runtime provider.

Steps to reproduce the problem

Simple version

from qiskit_ibm_provider import IBMProvider

provider = IBMProvider()
job = provider.retrieve_job("cns135cheefg008c343g")
header = job.header()  # triggers the full _get_params() download

With memory tracing



import os

import psutil
from pympler import asizeof
from qiskit_ibm_provider import IBMProvider

def human_readable_size(num: float):
    exp_str = [
        (0, "B"),
        (10, "KB"),
        (20, "MB"),
        (30, "GB"),
        (40, "TB"),
        (50, "PB"),
    ]
    i = 0
    while i + 1 < len(exp_str) and num >= (2 ** exp_str[i + 1][0]):
        i += 1
    # Scale to the selected unit and format once, avoiding double rounding.
    scaled_val = float(num) / 2 ** exp_str[i][0]
    return "%.1f %s" % (scaled_val, exp_str[i][1])

def get_kernel_memory_usage(pid=None, do_print=False):
    pid = pid or os.getpid()
    process = psutil.Process(pid)
    memory_usage_bytes = process.memory_info().rss  # Memory usage in bytes
    memory_usage_gb = memory_usage_bytes / (1024**3)  # Convert bytes to GB
    if do_print:
        print(
            f"Current Python kernel memory usage: {memory_usage_gb:.3f} GB   [pid={pid}]"
        )
    return memory_usage_bytes, pid

provider = IBMProvider()

get_kernel_memory_usage(do_print=True);
job = provider.retrieve_job("cns135cheefg008c343g")
get_kernel_memory_usage(do_print=True);
print('job      size = ', human_readable_size(asizeof.asizeof(job)))
print('provider size = ', human_readable_size(asizeof.asizeof(provider)))

get_kernel_memory_usage(do_print=True);
header = job.header()
get_kernel_memory_usage(do_print=True);
print('header   size = ', human_readable_size(asizeof.asizeof(header)))
print('job      size = ', human_readable_size(asizeof.asizeof(job)))
print('provider size = ', human_readable_size(asizeof.asizeof(provider)))
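
As a complement to the RSS readings above, Python-level allocations can be tracked with the standard-library tracemalloc module, which does not need psutil. The helper below is only a sketch: the name measure_python_allocations is made up for illustration, and it counts only memory allocated through Python's allocators, so the numbers will not match process RSS exactly.

```python
import tracemalloc
from typing import Any, Callable, Tuple


def measure_python_allocations(fn: Callable[[], Any]) -> Tuple[Any, int, int]:
    """Run fn() and return (result, net_bytes, peak_bytes).

    net_bytes is the memory still held after the call (including the result);
    peak_bytes is the highest usage reached during the call. Both are measured
    relative to the usage just before the call. Hypothetical helper, not part
    of qiskit-ibm-provider.
    """
    tracemalloc.start()
    try:
        before, _ = tracemalloc.get_traced_memory()
        tracemalloc.reset_peak()  # requires Python 3.9+
        result = fn()
        after, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, after - before, peak - before
```

With the job object from the script above, this could be called as `header, net, peak = measure_python_allocations(job.header)`. A peak far above the size of the returned header would point at transient allocations made while the full parameters are downloaded and parsed.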

What is the expected behavior?

Retrieving the header should be fast and use memory on the order of the header's own size (kilobytes), not 100+ MB.

Suggested solutions

Do not retrieve the full params in self._get_params() when only a small field, such as the header, is requested.
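
One possible shape for this, sketched with stand-in classes only: fetch just the requested field and cache it, instead of downloading every parameter. FakeJobApi, EagerJob, LazyJob, and the get_param call are all hypothetical names invented for this illustration. Whether the real service can return a single field without the circuits is an assumption that would need to be checked against the actual API.

```python
from typing import Any, Dict, Optional


class FakeJobApi:
    """Hypothetical stand-in for the remote service; not the real client."""

    def __init__(self) -> None:
        self.bytes_transferred = 0
        self._params: Dict[str, Any] = {
            "header": {"backend_name": "fake_backend", "shots": 1024},
            "circuits": "x" * 5_000_000,  # placeholder for large circuit data
        }

    def get_all_params(self, job_id: str) -> Dict[str, Any]:
        # Simulates downloading every parameter, circuits included.
        self.bytes_transferred += sum(len(repr(v)) for v in self._params.values())
        return dict(self._params)

    def get_param(self, job_id: str, name: str) -> Any:
        # Assumed field-level endpoint: transfers only the requested entry.
        value = self._params[name]
        self.bytes_transferred += len(repr(value))
        return value


class EagerJob:
    """Mirrors the behavior reported in this issue: header() pulls everything."""

    def __init__(self, job_id: str, api: FakeJobApi) -> None:
        self._job_id = job_id
        self._api = api

    def header(self) -> Dict[str, Any]:
        return self._api.get_all_params(self._job_id)["header"]


class LazyJob:
    """Fetches only the header and caches it for later calls."""

    def __init__(self, job_id: str, api: FakeJobApi) -> None:
        self._job_id = job_id
        self._api = api
        self._header: Optional[Dict[str, Any]] = None

    def header(self) -> Dict[str, Any]:
        if self._header is None:
            self._header = self._api.get_param(self._job_id, "header")
        return self._header
```

In this toy setup, EagerJob.header() moves several megabytes through the fake API, while LazyJob.header() moves only the header and nothing at all on repeated calls.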

zlatko-minev added the "type: bug" label on Dec 13, 2023