-
Notifications
You must be signed in to change notification settings - Fork 397
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Provide APIs for enabling cgroup subsystems #1424
Provide APIs for enabling cgroup subsystems #1424
Conversation
@charliegracie @DanHeidinga can you please review. |
@charliegracie @DanHeidinga any estimate when this could be reviewed? |
I am not sure why the mutex is requried? Should the mutex be a pthread mutex or a omrthread mutex? Should the mutex be a global variable or a field in the port lirbary struct? Waiting for a thumbs up from @DanHeidinga as well as the answers to my questions above. Thanks |
Since the mutex is only required for protecting |
Re: the mutex. Is that what other places in the port library do? |
I see omrsignal.c using a |
e946c21
to
52052b1
Compare
I have updated the change set to use omrthread_monitor_t instead of pthread_mutex_t. |
include_core/omrport.h
Outdated
/** see @ref omrsysinfo.c::omrsysinfo_cgroup_enable_limits "omrsysinfo_cgroup_enable_limits"*/ | ||
int32_t ( *sysinfo_cgroup_enable_limits)(struct OMRPortLibrary *portLibrary); | ||
/** see @ref omrsysinfo.c::omrsysinfo_cgroup_system_is_available "omrsysinfo_cgroup_system_is_available"*/ | ||
int32_t ( *sysinfo_cgroup_system_is_available)(struct OMRPortLibrary *portLibrary) ; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be sysinfo_cgroup_is_system_available()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, this sounds better.
port/common/omrsysinfo.c
Outdated
{ | ||
return OMRPORT_ERROR_SYSINFO_CGROUP_UNSUPPORTED_PLATFORM; | ||
return OMRPORT_ERROR_SYSINFO_CGROUP_UNSUPPORTED_PLATFORM;; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick: double ;
at end of line
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at the name and the comment, I'm not confident I know what this method would do. Is it testing if the runtime is running in a cgroup?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
*/ | ||
uint64_t | ||
omrsysinfo_cgroup_enable_subsystems(struct OMRPortLibrary *portLibrary, uint64_t requestedSubsystems) | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How are errors handled / communicated to the caller by this function?
What if the user tries to enable a subsystem that isn't enabled in the system?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Error can occur during omrsysinfo_cgroup_is_system_available
in which case no subsystem is enabled, but the error is not communicated back to the user.
I am expecting the APIs would be used in following manner:
if (0 == omrsysinfo_cgroup_is_system_available()) {
availableSubsystems = omrsysinfo_cgroup_get_available_subsystems();
if (requiredSubsystem & availableSubsystems) {
omrsysinfo_cgroup_enable_subsystems(requiredSubsystems);
}
}
What if the user tries to enable a subsystem that isn't enabled in the system?
No error is returned. If the APIs are used as above, this situation shouldn't happen.
And the API actually returns the subsystems that are enabled. So the user can check which subsystems got enabled on this call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So a user that wanted to validate that the subsystems were actually enabled would need to do the following:
if (0 == omrsysinfo_cgroup_is_system_available()) {
availableSubsystems = omrsysinfo_cgroup_get_available_subsystems();
if (requiredSubsystem & availableSubsystems) {
uint64_t enabledSubsystems = omrsysinfo_cgroup_enable_subsystems(requiredSubsystems);
if ((enabledSubsystems & requiredSybsystems) != enabledSubsystems) {
/* do error handling */
}
}
}
Isn't exactly user friendly but is something I can live with.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually there is no need to call omrsysinfo_cgroup_get_available_subsystems() here:
if (0 == omrsysinfo_cgroup_is_system_available()) {
uint64_t enabledSubsystems = omrsysinfo_cgroup_enable_subsystems(requiredSubsystems);
if ((enabledSubsystems & requiredSybsystems) != enabledSubsystems) {
/* do error handling */
}
}
port/unix/omrsysinfo.c
Outdated
freeCgroupEntries(portLibrary, PPG_cgroupEntryList); | ||
omrthread_monitor_exit(cgroupEntryListMonitor); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The monitor needs to protect the nulling of the PPG_cgroupEntryList = NULL;
as well as the call to freeCgroupEntries()
port/unix/omrsysinfo.c
Outdated
Trc_PRT_Assert_ShouldNeverHappen(); | ||
} | ||
/* unreachable code */ | ||
Trc_PRT_Assert_ShouldNeverHappen(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should issue a compiler warning as technically trace asserts can be disabled, resulting in falling off the end of a non-void function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need a return here and then just the assert in the default case should be sufficient,
port/unix/omrsysinfo.c
Outdated
* subsystems that are available | ||
*/ | ||
uint64_t | ||
getAvailableSubsystems(struct OMRPortLibrary *portLibrary, OMRCgroupEntry *cgEntryList) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be static so it's not called elsewhere?
port/unix_include/omrportpg.h
Outdated
@@ -120,7 +121,8 @@ typedef struct OMRPortPlatformGlobals { | |||
#endif | |||
|
|||
#if defined(LINUX) | |||
#define PPG_cgroupLimitsEnabled (portLibrary->portGlobals->platformGlobals.cgroupLimitsEnabled) | |||
#define PPG_cgroupSubsystemsAvailable (portLibrary->portGlobals->platformGlobals.cgroupSubsystemsAvailable) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are three port library globals in play here and it looks like the two new ones are only valid when PPG_cgroupEntryList
is non-null?
Can you add a comment here (and in the commit) that describes the relationship between the three port globals?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The condition PPG_cgroupEntryList
being non-null is used only to check if PPG_cgroupSubsystemsAvailable
and PPG_cgroupSubsystemsAvailable
have already been set. As such there is no relation between the three.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't that state a relationship? Neither PPG_cgroupSubsystemsAvailable or PPG_cgroupSubsystemsAvailable are valid if PPG_cgroupEntryList is null.
The code should state that invariant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add the above invariant as a comment in the code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
I haven't reviewed the test changes yet. I'll look at those once we're in agreement on the rest of the code changes. |
52052b1
to
36753bb
Compare
Updated the change set as per the comments. |
@DanHeidinga any more review comments? |
fvtest/porttest/si.cpp
Outdated
outputErrorMessage(PORTTEST_ERROR_ARGS, "omrsysinfo_cgroup_get_memlimit returned %d, expected %d\n", rc, OMRPORT_ERROR_SYSINFO_CGROUP_LIMITS_DISABLED); | ||
} | ||
#else | ||
#if !defined(LINUX) | ||
if (OMRPORT_ERROR_SYSINFO_CGROUP_UNSUPPORTED_PLATFORM != rc) { | ||
outputErrorMessage(PORTTEST_ERROR_ARGS, "omrsysinfo_cgroup_get_memlimit returned %d, exptected %d on platform that does not support cgroups\n", rc, OMRPORT_ERROR_SYSINFO_CGROUP_UNSUPPORTED_PLATFORM); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
exptected
-> expected
fvtest/porttest/si.cpp
Outdated
@@ -2144,16 +2136,8 @@ TEST(PortSysinfoTest, sysinfo_cgroup_get_memlimit) | |||
outputErrorMessage(PORTTEST_ERROR_ARGS, "Expected omrsysinfo_cgroup_get_memlimit and omrsysinfo_get_physical_memory to return same value, but omrsysinfo_cgroup_get_memlimit returned %ld and omrsysinfo_get_physical_memory returned %ld\n"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are two %ld
format specifiers in the outputErrorMessage call but no data is passed to the call.
include_core/omrport.h
Outdated
/** see @ref omrsysinfo.c::omrsysinfo_cgroup_get_available_subsystems "omrsysinfo_cgroup_get_available_subsystems"*/ | ||
uint64_t ( *sysinfo_cgroup_get_available_subsystems)(struct OMRPortLibrary *portLibrary); | ||
/** see @ref omrsysinfo.c::omrsysinfo_cgroup_is_subsystem_available "omrsysinfo_cgroup_is_subsystem_available"*/ | ||
BOOLEAN ( *sysinfo_cgroup_is_subsystem_available)(struct OMRPortLibrary *portLibrary, uint64_t subsystem); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
int32_t ( *sysinfo_cgroup_is_system_available)(struct OMRPortLibrary *portLibrary) ;
BOOLEAN ( *sysinfo_cgroup_is_subsystem_available)(struct OMRPortLibrary *portLibrary, uint64_t subsystem);
Why does one return an int32_t and the other a BOOLEAN?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sysinfo_cgroup_is_system_available
can return negative error code to indicate the reason for unavailability of cgroup system. sysinfo_cgroup_is_subsystem_available
just checks the subsystem flags which are set.
include_core/omrport.h
Outdated
/** see @ref omrsysinfo.c::omrsysinfo_cgroup_enable_limits "omrsysinfo_cgroup_enable_limits"*/ | ||
int32_t ( *sysinfo_cgroup_enable_limits)(struct OMRPortLibrary *portLibrary); | ||
/** see @ref omrsysinfo.c::omrsysinfo_cgroup_is_system_available "omrsysinfo_cgroup_is_system_available"*/ | ||
int32_t ( *sysinfo_cgroup_is_system_available)(struct OMRPortLibrary *portLibrary) ; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick: extra space between )
& ;
.
port/common/omrsysinfo.c
Outdated
@@ -808,42 +808,74 @@ omrsysinfo_os_kernel_info(struct OMRPortLibrary *portLibrary, struct OMROSKernel | |||
} | |||
|
|||
/** | |||
* Checks if the port library can use cgroup limits | |||
* Returns TRUE if the platform supports cgroup system and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This says Returns TRUE
and then the @return 0 if the the cgroup system is available
which is the opposite. Which is correct?
port/unix/omrsysinfo.c
Outdated
@@ -3459,38 +3542,68 @@ omrsysinfo_cgroup_is_limits_supported(struct OMRPortLibrary *portLibrary) | |||
goto _end; | |||
} | |||
|
|||
omrthread_monitor_enter(cgroupEntryListMonitor); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is this monitor actually protecting? Two threads that got past the if (NULL == PPG_cgroupEntryList) {
check would BOTH execute readCgroupFile()
, just one after the other.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They would end up creating two lists of OMRCgroupEntry nodes and only one would be assigned to PPG_cgroupEntryList. Other one would be leaked.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should check PPG_cgroupEntryList
under the mutex to determine whether it needs to call readCgroupFile()
so that it doesn't leak.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
port/unix/omrsysinfo.c
Outdated
if (0 != rc) { | ||
goto _end; | ||
} | ||
PPG_cgroupSubsystemsAvailable = getAvailableSubsystems(portLibrary, PPG_cgroupEntryList); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once readCgroupFile() has set PPG_cgroupEntryList to non-null, the PPG_cgroupSubsystemsAvailable global isn't read and this function will always return 0.
It seems strange that this function calls getAvailableSubsystems
. I would have (maybe naively?) expected that to be called by omrsysinfo_cgroup_get_available_subsystems
if the PPG_cgroupSubsystemsAvailable
global was null...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PPG_cgroupSubsystemsAvailable
cannot be checked for null; its not a pointer.
omrsysinfo_cgroup_is_system_available
calls getAvailableSubsystems
to ensure PPG_cgroupSubsystemsAvailable
is set properly at the time when PPG_cgroupEntryList
is created.
If getAvailableSubsystems
is to be called by omrsysinfo_cgroup_get_available_subsystems
, then it cannot rely on PPG_cgroupEntryList
being null to call it, as PPG_cgroupEntryList
may have been set previously by a call to omrsysinfo_cgroup_is_system_available
.
port/unix/omrsysinfo.c
Outdated
if (NULL == PPG_cgroupEntryList) { | ||
int32_t rc = portLibrary->sysinfo_cgroup_is_system_available(portLibrary); | ||
if (0 != rc) { | ||
PPG_cgroupSubsystemsAvailable = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like the way this global is sometimes set here, sometimes set in sysinfo_cgroup_is_system_available. It needs to be set in a single place (single point of initialization)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved it to omrsysinfo_cgroup_is_system_available
.
port/unix_include/omrportpg.h
Outdated
@@ -120,7 +121,8 @@ typedef struct OMRPortPlatformGlobals { | |||
#endif | |||
|
|||
#if defined(LINUX) | |||
#define PPG_cgroupLimitsEnabled (portLibrary->portGlobals->platformGlobals.cgroupLimitsEnabled) | |||
#define PPG_cgroupSubsystemsAvailable (portLibrary->portGlobals->platformGlobals.cgroupSubsystemsAvailable) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't that state a relationship? Neither PPG_cgroupSubsystemsAvailable or PPG_cgroupSubsystemsAvailable are valid if PPG_cgroupEntryList is null.
The code should state that invariant.
port/win32/omrsysinfo.c
Outdated
{ | ||
return OMRPORT_ERROR_SYSINFO_CGROUP_UNSUPPORTED_PLATFORM; | ||
return OMRPORT_ERROR_SYSINFO_CGROUP_UNSUPPORTED_PLATFORM;; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick: double ;
on this line
@DanHeidinga, there are few things pointed out by @dinogun on need for this PR and design of cgroup APIs. Main point is -if we update existing portlib APIs to use cgroup limits transparently, then it would change the meaning of the API. @dinogun, can you please add your thoughts as well. |
The rationale behind having existing apis return the cgroup limits is that it makes it way way easier for users of the port library to be docker aware. If the existing apis aren't updated to use the cgroup limits, then every caller will need to be updated to check if cgroups are enabled, get the cgroup limit otherwise get the original limit. That's a lot of code for every user to write at every callsite... and its easy to get wrong. The design here is to have the runtime opt-into the cgroup limits deliberately which mitigates the concerns about "change the meaning of the API". Can you point to runtimes using this api that would be broken by this change? |
@DanHeidinga I understand your point. One example where modifying existing API may have undesired effect is J9's |
The "saving grace" for APIs like the OSMXBean is to have the user opt into enabling cgroup limits system wide through some mechanism - in code, command line options, etc. |
@DanHeidinga I don't think I got this, can you please explain. |
I don't have a problem with
|
Just to be clear - if something is OSMXBean and wants to return the physical memory even though the runtime has enabled cgroup limits, then I'm ok with addressing that particular use case through a new bug / issue when it occurs. |
36753bb
to
0f0adcc
Compare
@DanHeidinga I have updated the change set to address the review comments. |
@DanHeidinga can you please review updated change set. |
I'm hoping to get through this review this week. Its been somewhat crazy due to internal deadlines recently and I haven't had a lot of cycles for this. Sorry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still have reservations about the api, but don't have concrete comments about it now. Will comment further this week.
fvtest/porttest/si.cpp
Outdated
outputErrorMessage(PORTTEST_ERROR_ARGS, "omrsysinfo_cgroup_get_memlimit returned %d when cgroup limits was enabled successfully\n", rc); | ||
/* Compare sysinfo_get_physical_memory and sysinfo_cgroup_get_memlimit after enabling memory subsystem */ | ||
enabledSubsystems = omrsysinfo_cgroup_enable_subsystems(OMR_CGROUP_SUBSYSTEM_MEMORY); | ||
if (OMR_CGROUP_SUBSYSTEM_MEMORY == enabledSubsystems) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be
if (OMR_CGROUP_SUBSYSTEM_MEMORY == (enabledSubsystems & OMR_CGROUP_SUBSYSTEM_MEMORY))
to make the test resilient to other subsystems being enabled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thats correct.
port/common/omrsysinfo.c
Outdated
* | ||
* @param[in] portLibrary pointer to OMRPortLibrary | ||
* | ||
* @return 0 if the port library can use cgroup limits, otherwise negative error code | ||
* @return 0 if the the cgroup system is available, otherwise negative error code |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this really a boolean then? Indicating if the system is available or not?
I'm struggling to see what the user code would do with the failure case that that a boolean wouldn't give us.
freeCgroupEntries(portLibrary, PPG_cgroupEntryList); | ||
PPG_cgroupEntryList = NULL; | ||
omrthread_monitor_exit(cgroupEntryListMonitor); | ||
attachedPortLibraries -= 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great info to put in the code - a comment that describes the problem due to multiple port libraries and indicating the tests that run into this problem.
That way no one has to rediscover the rationale for this code when they investigating it in the future.
port/unix/omrsysinfo.c
Outdated
availableSubsystems |= OMR_CGROUP_SUBSYSTEM_MEMORY; | ||
} | ||
cgEntry = cgEntry->next; | ||
} while (cgEntry!= cgEntryList); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How long is this list on a typical system? Vanilla hw vs inside a container? If the list is typically capped at 2-3 entries, this code is fine. If we expect longer lists, we'll want to write this differently, especially if it gets called at startup
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently cgEntryList
has all cgroup subsystems configured in the system. In cgroup v1, that comes to 12 subsystems.
We can probably restrict it to only the ones we need - memory and cpu for now.
Let me add another change set on top of existing one which would highlight these changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@DanHeidinga please take a look at commit 175daad that avoids adding every subsystem to PPG_cgroupEntryList
and also sets PPG_cgroupSubsystemsAvailable
more efficiently than before.
6203283
to
129fed2
Compare
@DanHeidinga can you please review commit 175daad. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The APIs seem reasonable as a starting point. I'm still concerned about the use of globals and locks in this code. Once the globals and lock issues are addressed (and documented), this should be ready to merge.
port/unix/omrsysinfo.c
Outdated
@@ -3459,38 +3542,68 @@ omrsysinfo_cgroup_is_limits_supported(struct OMRPortLibrary *portLibrary) | |||
goto _end; | |||
} | |||
|
|||
omrthread_monitor_enter(cgroupEntryListMonitor); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should check PPG_cgroupEntryList
under the mutex to determine whether it needs to call readCgroupFile()
so that it doesn't leak.
port/unix/omrsysinfo.c
Outdated
cgEntry->cgroup = cgEntry->subsystem + strlen(subsystem) + 1; | ||
strcpy(cgEntry->cgroup, cgroupName); | ||
for (i = 0; i < sizeof(supportedSubsystems) / sizeof(supportedSubsystems[0]); i++) { | ||
if (J9_ARE_NO_BITS_SET(PPG_cgroupSubsystemsAvailable, supportedSubsystems[i].flag) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible for two threads to be executing this method at once? If it's not, then the function should document what lock needs to be held when calling it. Otherwise, it shouldn't be reading / modifying global state (PPG_cgroupSubsystemsAvailable
) without a lock.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have updated addCgroupEntry()
to avoid using PPG_cgroupSubsystemsAvailable.
Available subsystems are now determined by the readCgroupFile()
and returned back via one of its parameters.
port/unix_include/omrportpg.h
Outdated
@@ -120,7 +121,8 @@ typedef struct OMRPortPlatformGlobals { | |||
#endif | |||
|
|||
#if defined(LINUX) | |||
#define PPG_cgroupLimitsEnabled (portLibrary->portGlobals->platformGlobals.cgroupLimitsEnabled) | |||
#define PPG_cgroupSubsystemsAvailable (portLibrary->portGlobals->platformGlobals.cgroupSubsystemsAvailable) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add the above invariant as a comment in the code?
@DanHeidinga added new commit 5a9e49f to address the comments. |
port/unix/omrsysinfo.c
Outdated
if (0 != rc) { | ||
goto _end; | ||
for (i = 0; i < sizeof(supportedSubsystems) / sizeof(supportedSubsystems[0]); i++) { | ||
if (J9_ARE_NO_BITS_SET(available, supportedSubsystems[i].flag) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should use OMR_ARE_NO_BITS_SET
instead of the J9
version. This will solve a compile break in this change.
@ashu-mehra Thanks for addressing the locks, globals and comments questions. The Travis and AppVeyor builds are failing now. Can you fix up the issues and then I'll give this another look over? |
5717b12
to
b8f34de
Compare
@DanHeidinga Added a commit to replace J9_ARE_NO_BITS_SET with OMR_ARE_NO_BITS_SET. |
Checking travis-ci build failure... |
Pushed a commit to fix compile failure reported by travis-ci build. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM - thanks @ashu-mehra for working with us to get this ready to merge.
@charliegracie Can you or one of the other committers give this a second look and merge? The commits should probably be squashed prior to the merge. |
Hi @ashu-mehra would it be possible to refactor this commits to resolve the compilation issues into the appropriate original commits. It would clean up the commit history and make it easier to test the codebase at different points in history. If this was merged there would be at least 3 known points in history that the codebase would not compile. Thanks |
Instead of providing APIs that enable cgroup limits for all subsystems, a new set of APIs is added which allows to query which subsystems are available and enable subsystems individually. This would allow runtimes to enable cgroup subsystems based on their requirements. Note that there is another subtle change in the design of the port library APIs related to cgroup system. Enabling a cgroup subsystem in portlibrary (by calling omrsysinfo_cgroup_enable_subsystems) would make the port library APIs to use cgroup limits internally. Any port library API that directly returns cgroup limits, eg omrsysinfo_cgroup_get_memlimit(), does not check whether the corresponding subsystem (eg memory) is enabled or not. Therefore, when using omrsysinfo_cgroup_* APIs, the caller doesn't need to explicitly enable the corresponding subsystem in the port library. Consider the case of omrsysinfo_get_physical_memory() which returns physical memory of the system. If the system is running in a cgroup, and the cgroup's "memory" subsystem is enabled, then omrsysinfo_get_physical_memory() would internally call omrsysinfo_cgroup_get_memlimit(), and return memory limit imposed by the cgroup. However, if the user wants to just get the memory limit imposed by the cgroup, it can directly call omrsysinfo_cgroup_get_memlimit(), without enabling the "memory" subsystem. Issue: eclipse-omr#1281 Signed-off-by: Ashutosh Mehra <asmehra1@in.ibm.com>
d3f0de3
to
beb1470
Compare
@charliegracie I squashed the commits and brought it down from 5 to 2 commits. |
genie-omr build all |
It looks like I inserted a bug in
|
List of changes done: 1. Updated addCgroupEntry() to avoid using PPG_cgroupSubsystemsAvailable - Updated readCgroupFile() to compute available subsystems - Add check for PPG_cgroupEntryList before calling readCgroupFile() - Added comment to explain reason for using global variable attachedPortLibraries - Macros of type J9_ARE_*_BITS_SET have been replaced by OMR_ARE_*_BITS_SET - Moved declaration of variable 'rc' such that it is declared only for LINUX platform. Signed-off-by: Ashutosh Mehra <asmehra1@in.ibm.com>
beb1470
to
d94ef6f
Compare
@genie-omr build all |
if (0 != rc) { | ||
goto _end; | ||
for (i = 0; i < sizeof(supportedSubsystems) / sizeof(supportedSubsystems[0]); i++) { | ||
if (OMR_ARE_NO_BITS_SET(available, supportedSubsystems[i].flag) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would have preferred that this if statement be broken into 2 so it is easier to debug but that is a stylistic issue that is not worth changing the code for. I will accept this change as is.
This is a great start to the Cgroup support. Thank you |
Echoing Charlie's comments - thanks for working through the reviews on this with us @ashu-mehra! Great to see it merged! |
😌 |
Instead of providing APIs that enable cgroup limits for all
subsystems, a new set of APIs is added which allows to query
which subsystems are available and enable subsystems individually.
This would allow runtimes to enable cgroup subsystems based on their
requirements.
Note that there is another subtle change in the design of the
port library APIs related to cgroup system.
Enabling a cgroup subsystem in portlibrary (by calling
omrsysinfo_cgroup_enable_subsystems) would make the port library APIs
to use cgroup limits internally.
Any port library API that directly returns cgroup limits,
eg omrsysinfo_cgroup_get_memlimit(), does not check whether the
corresponding subsystem (eg memory) is enabled or not.
Therefore, when using omrsysinfo_cgroup_* APIs,
the caller doesn't need to explicitly enable the corresponding
subsystem in the port library.
Consider the case of omrsysinfo_get_physical_memory() which returns
physical memory of the system.
If the system is running in a cgroup, and the cgroup's "memory"
subsystem is enabled, then omrsysinfo_get_physical_memory()
would internally call omrsysinfo_cgroup_get_memlimit(), and
return memory limit imposed by the cgroup.
However, if the user wants to just get the memory limit imposed by the
cgroup, it can directly call omrsysinfo_cgroup_get_memlimit(),
without enabling the "memory" subsystem.
Issue: #1281
Signed-off-by: Ashutosh Mehra asmehra1@in.ibm.com