From d8aab3f72c9a269ae90567dc91574f7d7671fdfc Mon Sep 17 00:00:00 2001 From: David Ormsbee Date: Mon, 29 Apr 2024 15:46:14 -0400 Subject: [PATCH] fix: remove modulestore init to avoid pymongo deadlocks Prior to this commit, the LMS would log the following error in tutor production: pymongo/topology.py:175: UserWarning: MongoClient opened before fork. Create MongoClient only after forking. See PyMongo's documentation for details: https://pymongo.readthedocs.io/en/stable/faq.html#is-pymongo-fork-safe Quoting from that page: > PyMongo is not fork-safe. Care must be taken when using instances of > MongoClient with fork(). Specifically, instances of MongoClient must > not be copied from a parent process to a child process. Instead, the > parent process and each child process must create their own instances > of MongoClient. Instances of MongoClient copied from the parent > process have a high probability of deadlock in the child process due > to the inherent incompatibilities between fork(), threads, and locks > described below. PyMongo will attempt to issue a warning if there is a > chance of this deadlock occurring. For edx-platform, the MongoClient connection is initalized with the modulestore() invocation. That call creates and caches a global variable that Studio or the LMS will reuse across the life of the worker process. That initialization was put into lms/wsgi.py in 7c758ec9, but originated in lms/startup.py with 51d0dd1. The original reason for it is because at that time (2013), we still supported the XML Modulestore, which stored courses on disk as directories of OLX files and static assets. The XML Modulestore would then read the entirety of those courses into memory at startup. This meant that the startup process was *extremely* expensive, so we needed to have it happen before the workers started serving requests to users, instead of having the system lazily read them in when the first user request arrived. Loading course content in this form hasn't been supported since 2016, meaning that modulestore initialization is no longer the performance time bomb that it once was. The fact that this code remained here is likely an oversight, which was considered harmless until @ztraboo reported these pymongo log messages during the course of investigating performance issues: https://discuss.openedx.org/t/atlas-mongodb-performance-issues-un-indexed-queries/12803/16 Two potential followups that should be explored after this: 1. Tutor should probably be forking earlier than this, before we load Django settings and initialize database and cache connections. 2. It's possible that the caching mechanism for modulestore should be revisited to operate at the request cache level. The performance benefit of keeping it around may not be worth the potential memory leaks. Anything we do here would have to be very carefully monitored though, since connection costs may add up. --- lms/wsgi.py | 7 ------- 1 file changed, 7 deletions(-) diff --git a/lms/wsgi.py b/lms/wsgi.py index bb0d0f7ede8..169b6929fbf 100644 --- a/lms/wsgi.py +++ b/lms/wsgi.py @@ -18,13 +18,6 @@ import lms.startup as startup # lint-amnesty, pylint: disable=wrong-import-position startup.run() -from xmodule.modulestore.django import modulestore # lint-amnesty, pylint: disable=wrong-import-position - -# Trigger a forced initialization of our modulestores since this can take a -# while to complete and we want this done before HTTP requests are accepted. -modulestore() - - # This application object is used by the development server # as well as any WSGI server configured to use this file. from django.core.wsgi import get_wsgi_application # lint-amnesty, pylint: disable=wrong-import-order, wrong-import-position