Consistently reports 0 W for DRAM consumption #108
Comments
From reading other issues and how they're debugged, maybe this is useful? Sorry, I know very little about Powercap 😅 Just ask if there's anything I can do to help in the debugging here :)
root@Ubuntu-2004-focal-64-minimal ~ # ls /sys/class/powercap/
intel-rapl intel-rapl:0 intel-rapl:0:0 intel-rapl:0:1 intel-rapl:0:2
root@Ubuntu-2004-focal-64-minimal ~ # ls /sys/class/powercap/intel-rapl:0:2
constraint_0_max_power_uw device name
constraint_0_name enabled power
constraint_0_power_limit_uw energy_uj subsystem
constraint_0_time_window_us max_energy_range_uj uevent
root@Ubuntu-2004-focal-64-minimal /sys/class/powercap/intel-rapl:0:2 # cat energy_uj
185204869380 |
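For readers unfamiliar with powercap: the `energy_uj` value above is a cumulative energy counter in microjoules, so a power figure has to be derived from two readings taken some time apart. The following is a minimal sketch of that arithmetic, not scaphandre's actual code. The function names are hypothetical, and the wrap-around handling assumes the counter wrapped at most once at `max_energy_range_uj` between the two readings.

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::time::Duration;

/// Read a cumulative energy counter (microjoules) from a powercap file,
/// e.g. /sys/class/powercap/intel-rapl:0:2/energy_uj.
fn read_energy_uj(path: &Path) -> io::Result<u64> {
    let text = fs::read_to_string(path)?;
    text.trim()
        .parse::<u64>()
        .map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))
}

/// Energy consumed between two readings, in microjoules.
/// Assumption: if the counter went backwards, it wrapped exactly once
/// at `max_range_uj` (the value of max_energy_range_uj).
fn energy_delta_uj(prev_uj: u64, curr_uj: u64, max_range_uj: u64) -> u64 {
    if curr_uj >= prev_uj {
        curr_uj - prev_uj
    } else {
        max_range_uj.saturating_sub(prev_uj).saturating_add(curr_uj)
    }
}

/// Average power in watts over `elapsed`; None if no time has passed.
fn average_power_watts(
    prev_uj: u64,
    curr_uj: u64,
    max_range_uj: u64,
    elapsed: Duration,
) -> Option<f64> {
    let secs = elapsed.as_secs_f64();
    if secs <= 0.0 {
        return None;
    }
    let delta_uj = energy_delta_uj(prev_uj, curr_uj, max_range_uj) as f64;
    // microjoules -> joules, then joules per second = watts
    Some(delta_uj / 1_000_000.0 / secs)
}
```

To use it, call `read_energy_uj` twice about one second apart and pass both values to `average_power_watts`. A domain that is read correctly but labelled incorrectly would still produce plausible numbers here, which is why the ordering of domains matters later in this thread.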
Hi! Thanks for reporting that. Could you confirm that /sys/class/powercap/intel-rapl:0:2/name contains "dram"? What's also strange is that the sum of the "core" and "uncore" power consumption is above the total power consumption of the host. I couldn't reproduce it on the machines I have access to. Behaviour may differ depending on the hardware and the kernel version, so we will probably have to ask you to run some tests to work on it. I've added this issue to the global board. I'd prefer someone else to look at it, but I'll do it if nothing has moved in a while. |
Hi @bpetit, thank you for your quick and kind reply! Yeah, indeed, the sums not adding up is strange, now that you mention it... I can confirm that the name is dram:
root@Ubuntu-2004-focal-64-minimal ~ # cat /sys/class/powercap/intel-rapl:0:2/name
dram |
Update: I discovered powerapi-ng/energy-scripts today, which seems to be a shell script reading from the powercap sensor. I was able to have it report DRAM data back to me, suggesting that at least my sensor is able to report data on the DRAM usage (unless the energy-scripts file is lying to me, haha!). On the other hand, it reports Uncore as 0.
As far as I can tell from the source code, however, measureit.sh reads from another path than scaphandre, i.e. The same applies when I run rapl-read:
root@Ubuntu-2004-focal-64-minimal ~ # gcc rapl-read.c -lm -o rapl-read.out
root@Ubuntu-2004-focal-64-minimal ~ # ./rapl-read.out -s
RAPL read -- use -s for sysfs, -p for perf_event, -m for msr
Found Kaby Lake Processor type
0 (0), 1 (0), 2 (0), 3 (0), 4 (0), 5 (0), 6 (0), 7 (0)
Detected 8 cores in 1 packages
Trying sysfs powercap interface to gather results
Sleeping 1 second
Package 0
package-0 : 1.356625J
core : 0.942808J
uncore : 0.000000J
dram : 1.069944J
root@Ubuntu-2004-focal-64-minimal ~ # ./rapl-read.out -p
RAPL read -- use -s for sysfs, -p for perf_event, -m for msr
Found Kaby Lake Processor type
0 (0), 1 (0), 2 (0), 3 (0), 4 (0), 5 (0), 6 (0), 7 (0)
Detected 8 cores in 1 packages
Trying perf_event interface to gather results
Event=energy-cores Config=1 scale=2.32831e-10 units=Joules
Event=energy-gpu Config=4 scale=2.32831e-10 units=Joules
Event=energy-pkg Config=2 scale=2.32831e-10 units=Joules
Event=energy-ram Config=3 scale=2.32831e-10 units=Joules
Sleeping 1 second
Package 0:
energy-cores Energy Consumed: 0.878906 Joules
energy-gpu Energy Consumed: 0.000000 Joules
energy-pkg Energy Consumed: 1.285156 Joules
energy-ram Energy Consumed: 1.052246 Joules |
I'm continuing my futile attempts to debug this. TL;DR: Could it be that an assumption is made in the exporter(s) that the domains are in the order [core, uncore, dram], while they (depending on what
When trying to log the counter_uj_path of each socket, I find that only one socket is created (and iterated over), and the path to its energy counter uj file is
I have no familiarity with Rust - can I somehow build with a debug flag to view the output of the
diff --git a/src/exporters/stdout.rs b/src/exporters/stdout.rs
index e22efb3..52265f1 100644
--- a/src/exporters/stdout.rs
+++ b/src/exporters/stdout.rs
@@ -99,6 +99,10 @@ impl StdoutExporter {
fn iterate(&mut self) {
self.topology.refresh();
+ println!("We have {} sockets", &self.topology.sockets.len());
+ for s in &self.topology.sockets {
+ println!("{}: path was {}", s.id.to_string(), s.counter_uj_path);
+ }
self.show_metrics();
}
diff --git a/src/sensors/mod.rs b/src/sensors/mod.rs
index 32fbcdc..69cca6c 100644
--- a/src/sensors/mod.rs
+++ b/src/sensors/mod.rs
@@ -609,6 +609,7 @@ impl CPUSocket {
counter_uj_path: String,
buffer_max_kbytes: u16,
) -> CPUSocket {
+ println!("Created socket with path: {}", counter_uj_path);
CPUSocket {
id,
domains,
On the other hand, when logging the path(s) used for each domain in the StdoutExporter, I am a bit confused about which path should be read for each domain. With my limited understanding, it almost looks like they are in the wrong order: instead of [core, uncore, dram], the domains are ordered [dram, core, uncore]...
diff --git a/src/exporters/stdout.rs b/src/exporters/stdout.rs
index e22efb3..62c9096 100644
--- a/src/exporters/stdout.rs
+++ b/src/exporters/stdout.rs
@@ -89,7 +89,13 @@ impl StdoutExporter {
if let Some(socket) = socket_present {
let mut domains_power: Vec<Option<Record>> = vec![];
for d in socket.get_domains_passive() {
- domains_power.push(d.get_records_diff_power_microwatts());
+ let power = d.get_records_diff_power_microwatts();
+ match power {
+ Some(ref val) => println!("{}", val),
+ None => println!("No power!"),
+ }
+ println!("Domain path was: {}\n", d.counter_uj_path);
+ domains_power.push(power);
}
domains_power
} else {
# cat /sys/class/powercap/intel-rapl:0:0/name
core
# cat /sys/class/powercap/intel-rapl:0:1/name
uncore
# cat /sys/class/powercap/intel-rapl:0:2/name
dram
Indeed, this is the order in which they are added to the socket:
diff --git a/src/sensors/mod.rs b/src/sensors/mod.rs
index 32fbcdc..6a85856 100644
--- a/src/sensors/mod.rs
+++ b/src/sensors/mod.rs
@@ -207,6 +207,7 @@ impl Topology {
buffer_max_kbytes: u16,
) {
let iterator = self.sockets.iter_mut();
+ println!("Adding domain to socket {} ({})", name, uj_counter);
for socket in iterator {
if socket.id == socket_id {
socket.safe_add_domain(Domain::new( |
I think the above is even clearer in the JSON exporter, where on the one hand we have a hard-coded names array (
diff --git a/src/exporters/json.rs b/src/exporters/json.rs
index a6a3ca3..4a3a68e 100644
--- a/src/exporters/json.rs
+++ b/src/exporters/json.rs
@@ -177,12 +177,17 @@ impl JSONExporter {
let domains = socket
.get_domains_passive()
.iter()
- .map(|d| d.get_records_diff_power_microwatts())
+ .enumerate()
+ .map(|(index, d)| {
+ println!("Domain.name is {}, but names[index] is {}", d.name, names[index]);
+ return d.get_records_diff_power_microwatts();
+ })
.map(|record| record.map(|d| d.value))
.enumerate()
.map(|(index, d)| {
let domain_power =
d.map(|value| value.parse::<u64>().unwrap()).unwrap_or(0);
+
Domain {
name: names[index].to_string(),
consumption: domain_power as f32,
|
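The suspected mismatch can be illustrated with a small, self-contained sketch. It is not scaphandre's code: `DomainReading`, `label_by_position` and `label_by_own_name` are hypothetical names, and the power values are illustrative figures borrowed from the rapl-read sysfs output earlier in the thread (joules over a one-second window, written as microwatts). It shows how pairing readings with a fixed names array by index goes wrong when the domains arrive as [dram, core, uncore].

```rust
/// One power reading together with the name the domain reports for itself
/// (the content of its powercap `name` file).
#[derive(Debug, Clone, PartialEq)]
struct DomainReading {
    name: String,
    microwatts: u64,
}

/// Label each reading with the name at the same index of a fixed array.
/// This is only correct if the readings arrive in exactly that order.
fn label_by_position(readings: &[DomainReading], names: &[&str]) -> Vec<(String, u64)> {
    readings
        .iter()
        .zip(names.iter())
        .map(|(r, n)| (n.to_string(), r.microwatts))
        .collect()
}

/// Label each reading with the name it carries, independent of order.
fn label_by_own_name(readings: &[DomainReading]) -> Vec<(String, u64)> {
    readings
        .iter()
        .map(|r| (r.name.clone(), r.microwatts))
        .collect()
}

/// Illustrative readings in the order observed in this thread: dram first.
fn sample_readings() -> Vec<DomainReading> {
    vec![
        DomainReading { name: "dram".to_string(), microwatts: 1_069_944 },
        DomainReading { name: "core".to_string(), microwatts: 942_808 },
        DomainReading { name: "uncore".to_string(), microwatts: 0 },
    ]
}
```

With these sample readings and the names array `["core", "uncore", "dram"]`, positional labelling assigns the uncore value of 0 to "dram" and the DRAM value to "core". That would match both symptoms reported here: DRAM at 0 W, and core plus uncore exceeding the package total.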
Hi, I'll try to push a PR. |
@PierreRust That's fantastic, many thanks! Yeah, I tried a naive solution, sorting the array inside powercap_rapl.rs (see below). This resulted in me finally getting DRAM results with both the JSON and stdout exporters 🎉 Forgive the probably horrendously unidiomatic Rust:
diff --git a/src/sensors/powercap_rapl.rs b/src/sensors/powercap_rapl.rs
index 208b420..e57eb35 100644
--- a/src/sensors/powercap_rapl.rs
+++ b/src/sensors/powercap_rapl.rs
@@ -73,8 +73,17 @@ impl Sensor for PowercapRAPLSensor {
}
let mut topo = Topology::new();
let re_domain = Regex::new(r"^.*/intel-rapl:\d+:\d+$").unwrap();
- for folder in fs::read_dir(&self.base_path).unwrap() {
- let folder_name = String::from(folder.unwrap().path().to_str().unwrap());
+
+ let mut folder_names = fs::read_dir(&self.base_path)
+ .unwrap()
+ .map(|folder|
+ String::from(folder.unwrap().path().to_str().unwrap())
+ )
+ .collect::<Vec<_>>();
+
+ folder_names.sort();
+
+ for folder_name in folder_names {
// let's catch domain folders
if re_domain.is_match(&folder_name) {
// let's get the second number of the intel-rapl:X:X string
root@Ubuntu-2004-focal-64-minimal ~/scaphandre/src # |
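The patch above works because `fs::read_dir` makes no promise about the order of the entries it returns, so sorting restores a predictable order. One caveat with sorting the raw path strings is that `intel-rapl:0:10` would sort before `intel-rapl:0:2`. Below is a sketch of a numeric variant using only the standard library. The helper names are hypothetical and this is not necessarily how the upstream fix was written.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Parse a directory name of the form "intel-rapl:<socket>:<domain>".
/// Returns None for anything else, e.g. "intel-rapl" or "intel-rapl:0".
fn parse_domain_dir(name: &str) -> Option<(u32, u32)> {
    let rest = name.strip_prefix("intel-rapl:")?;
    let (socket, domain) = rest.split_once(':')?;
    Some((socket.parse().ok()?, domain.parse().ok()?))
}

/// Keep only domain directory names and sort them by (socket, domain)
/// numerically rather than as strings.
fn sort_domain_names(names: Vec<String>) -> Vec<String> {
    let mut keyed: Vec<((u32, u32), String)> = names
        .into_iter()
        .filter_map(|n| parse_domain_dir(&n).map(|key| (key, n)))
        .collect();
    keyed.sort();
    keyed.into_iter().map(|(_, n)| n).collect()
}

/// List the domain directories under `base` (e.g. /sys/class/powercap)
/// in a deterministic order.
fn list_domain_dirs(base: &Path) -> io::Result<Vec<PathBuf>> {
    let mut names = Vec::new();
    for entry in fs::read_dir(base)? {
        let entry = entry?;
        // Skip names that are not valid UTF-8.
        if let Some(name) = entry.file_name().to_str() {
            names.push(name.to_string());
        }
    }
    Ok(sort_domain_names(names)
        .into_iter()
        .map(|n| base.join(n))
        .collect())
}
```

Sorting only fixes the iteration order. Labelling each domain from its own `name` file would avoid depending on the order at all.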
Well done! Looks like the bug is solved! :) |
Bug description
Scaphandre reports 0 W as the DRAM consumption.
Expected behavior
I hoped scaphandre would include DRAM measurements.
Screenshots
Environment
a7b0159ce5135e2b25fe4c97a86434d30f6e0982 (self-built). Also tried the pre-built Docker image.
Additional context
I realize that this may very well be the intended behavior, and not a bug (e.g. because the Intel RAPL sensor or powercap doesn't output RAM data for this CPU model). I was mainly hoping to get clarification on this: whether it's something I will be able to fix (maybe I'm just using scaphandre wrong or am missing something silly?), or whether it's just the way it is. Thank you for having written this really useful tool!
Also, could it simply be because the memory usage is low, i.e. around 1GB used with 11GB buff/cached?