[Error:38]Function not implemented. multiprocessing in graphene #2689

Yujindawang · 2021-09-24T02:03:59Z

I am running a PyTorch program in Graphene, but it throws an error when it reaches multi-process communication.

Traceback (most recent call last):
  File "/workplace/app/predict.py", line 137, in <module>
    main(input_dir)
  File "/workplace/app/predict.py", line 113, in main
    for i, (images) in enumerate(tqdm(val_loader)):
  File "/usr/local/lib/python3.6/dist-packages/tqdm/std.py", line 1185, in __iter__
    for obj in iterable:
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 355, in __iter__
    return self._get_iterator()
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 301, in _get_iterator
    return _MultiProcessingDataLoaderIter(self)
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 887, in __init__
    self._worker_result_queue = multiprocessing_context.Queue()  # type: ignore
  File "/usr/lib/python3.6/multiprocessing/context.py", line 102, in Queue
    return Queue(maxsize, ctx=self.get_context())
  File "/usr/lib/python3.6/multiprocessing/queues.py", line 42, in __init__
    self._rlock = ctx.Lock()
  File "/usr/lib/python3.6/multiprocessing/context.py", line 67, in Lock
    return Lock(ctx=self.get_context())
  File "/usr/lib/python3.6/multiprocessing/synchronize.py", line 162, in __init__
    SemLock.__init__(self, SEMAPHORE, 1, 1, ctx=ctx)
  File "/usr/lib/python3.6/multiprocessing/synchronize.py", line 59, in __init__
    unlink_now)

Sorry, I can't share the relevant code, so I wrote a simple example that reproduces the situation.

import os
import threading
import multiprocessing

# Main
print('Main:', os.getpid())

# worker function
def worker(sign, lock):
    lock.acquire()
    print(sign, os.getpid())
    lock.release()


# Multi-thread
record = []
lock = threading.Lock()

# Multi-process
record = []
lock = multiprocessing.Lock()

if __name__ == '__main__':
    for i in range(5):
        thread = threading.Thread(target=worker, args=('thread', lock))
        thread.start()
        record.append(thread)

    for thread in record:
        thread.join()
    
    for i in range(5):
        process = multiprocessing.Process(target=worker, args=('process', lock))
        process.start()
        record.append(process)
    
    for process in record:
        process.join()

When I run this script in Graphene, the same error is thrown.

Traceback (most recent call last):
  File "/workplace/app/test.py", line 21, in <module>
    lock = multiprocessing.Lock()
  File "/usr/lib/python3.6/multiprocessing/context.py", line 67, in Lock
    return Lock(ctx=self.get_context())
  File "/usr/lib/python3.6/multiprocessing/synchronize.py", line 162, in __init__
    SemLock.__init__(self, SEMAPHORE, 1, 1, ctx=ctx)
  File "/usr/lib/python3.6/multiprocessing/synchronize.py", line 59, in __init__
    unlink_now)

This seems to be related to SemLock.

So I would like to confirm whether multiprocessing is a known problem in Graphene, and how this can be solved.

Thank you for your attention and look forward to your reply. T_T

dimakuv · 2021-09-24T09:07:47Z

From what I understand, Python's multiprocessing package uses POSIX semaphores (or maybe even older Sys-V semaphores). Check https://linux.die.net/man/7/sem_overview.

Gramine currently doesn't support semaphores at all. So this example unfortunately cannot run in Gramine.
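A quick way to check this without the full application is to try creating the lock and inspect the resulting errno. The following is a minimal sketch, not a Gramine tool: the helper name `lock_creation_errno` is hypothetical, and it assumes the failure surfaces in Python as an `OSError` whose `errno` is `ENOSYS` (38 on Linux), matching the "Error 38" in the issue title.

```python
import errno
import multiprocessing


def lock_creation_errno():
    """Try to create a multiprocessing.Lock.

    Returns None on success, or the errno of the OSError raised.
    """
    try:
        lock = multiprocessing.Lock()
    except OSError as exc:
        return exc.errno
    # Exercise the lock once so a half-working semaphore is also caught.
    lock.acquire()
    lock.release()
    return None


if __name__ == "__main__":
    err = lock_creation_errno()
    if err is None:
        print("multiprocessing.Lock() works in this environment")
    elif err == errno.ENOSYS:
        print("semaphores are not implemented here (ENOSYS)")
    else:
        print("lock creation failed with errno", err)
```

Running this both natively and inside the enclave should show whether the difference comes from the semaphore call itself rather than from PyTorch.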

dimakuv · 2021-09-24T09:10:04Z

You can actually add loader.log_level = "trace" in your manifest file, and re-run your simple example. Gramine will output a lot of additional info.

If you see calls like sem_wait(), sem_open() or semget() that return -38 ("not implemented"), then that is definitely the problem: Gramine doesn't support these calls (Gramine doesn't support semaphores).
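Trace output can be long, so a small filter may help find the failing calls. This is only a sketch under a stated assumption: it presumes each failing call appears on a single line in the form `name(args) = -38`. The actual Gramine trace format is not shown in this thread, so the pattern and the helper name `unimplemented_calls` are hypothetical and may need adjusting to the real log.

```python
import re
import sys

# Assumption: a failing call is logged on one line as "name(args...) = -38".
# Adjust this pattern if the real trace output looks different.
FAILED_CALL = re.compile(r"(\w+)\([^\n]*\)[ \t]*=[ \t]*-38\b")


def unimplemented_calls(log_text):
    """Return the sorted, de-duplicated names of calls that returned -38."""
    return sorted({match.group(1) for match in FAILED_CALL.finditer(log_text)})


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python find_enosys.py <saved-trace-log>
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        for name in unimplemented_calls(handle.read()):
            print(name)
```

Save the trace output to a file first, then pass that file as the only argument.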

Yujindawang · 2021-09-24T09:46:00Z

I got it. Is there any workaround for this problem?

dimakuv · 2021-09-24T09:49:28Z

The only workaround I can think of is: don't use the multiprocessing package in Python.
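For the small reproducer posted earlier, one way to follow this advice is to keep only the threading half. The sketch below is an illustration, not a confirmed fix: it assumes the work can run in threads of a single process (CPU-bound work will be limited by the GIL), and it assumes `threading.Lock` does not hit the same unsupported path in Gramine, which this thread does not confirm. The function `run_threads` is a hypothetical name.

```python
import os
import threading


def worker(sign, lock, seen):
    # threading.Lock replaces multiprocessing.Lock, so no SemLock is created.
    with lock:
        seen.append((sign, os.getpid()))


def run_threads(count=5):
    """Start `count` threads, wait for them, and return what they recorded."""
    lock = threading.Lock()
    seen = []
    threads = [
        threading.Thread(target=worker, args=("thread", lock, seen))
        for _ in range(count)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return seen


if __name__ == "__main__":
    for sign, pid in run_threads():
        print(sign, pid)
```

All workers report the same PID, since they share one process.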

Yujindawang · 2021-09-28T01:28:41Z

I tried setting num_workers=0 to disable multiprocessing in the data loader:

data.DataLoader(val_dst, batch_size=opts.val_batch_size, shuffle=False, num_workers=0)

The original error 38 disappeared, but another bug appeared: my program is killed by signal 8, which I checked is SIGFPE.
It runs successfully without Graphene. I then tried increasing enclave_size, stack.size and pal_internal_mem_size, but none of these helped. I have no idea what is happening.

for i, (images) in enumerate(tqdm(val_loader)):

dimakuv · 2021-09-28T06:14:06Z

An arithmetic fault is interesting!

Can you debug your program using gdb? You simply build all Gramine in debug mode and then run GDB=1 gramine-sgx <your app>. For more info, check https://gramine.readthedocs.io/en/latest/devel/debugging.html

The debugger should point you to the place in code / assembly instruction where SIGFPE (arithmetic fault) happens. This may immediately give you an idea of what goes wrong. Or at least you can copy-paste the assembly snippet where SIGFPE happens -- maybe we'll be able to help just by looking at the failing assembly.
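Alongside GDB, Python's standard `faulthandler` module can print the Python-level traceback when the interpreter receives a fatal signal such as SIGFPE, which may narrow down the failing Python call before looking at assembly. This is a minimal sketch: the wrapper name `enable_crash_traceback` is hypothetical, and whether the dump is actually emitted inside Gramine depends on how the signal is delivered in the enclave, which is an assumption here.

```python
import faulthandler
import sys


def enable_crash_traceback(stream=None):
    """Ask CPython to dump Python tracebacks on fatal signals (SIGFPE included).

    `stream` must be a real file with a file descriptor; defaults to stderr.
    Returns True once the handler is installed.
    """
    if stream is None:
        stream = sys.stderr
    # all_threads=True prints the stack of every thread, not only the faulting one.
    faulthandler.enable(file=stream, all_threads=True)
    return faulthandler.is_enabled()


if __name__ == "__main__":
    print("faulthandler enabled:", enable_crash_traceback())
```

Calling this at the top of the application script, before the data loader loop runs, is enough. Setting the standard CPython environment variable `PYTHONFAULTHANDLER=1` has the same effect without code changes.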

Yujindawang · 2021-09-30T01:36:47Z

Sorry, I wasn't able to check messages over the past two days. I will try GDB right away.

dimakuv mentioned this issue Jun 27, 2022

error： _multiprocessing.SemLock( FileNotFoundError: [Errno 2] No such file or directory gramineproject/examples#33

Open

dimakuv mentioned this issue Mar 24, 2023

[LibOS] RFC: support for System-V semaphores gramineproject/gramine#1248

Open
