High memory usage in large pipelines #1349

Closed
wlandau opened this issue Oct 22, 2024 · 3 comments

wlandau commented Oct 22, 2024

cf. #1347 and #1329. I tried the following pipeline on a RHEL9 node:

library(autometric)
library(crew)
library(targets)

controller <- crew_controller_local(
  workers = 1L,
  garbage_collection = TRUE,
  options_metrics = crew_options_metrics(
    path = "logs",
    seconds_interval = 1
  )
)

if (tar_active()) {
  controller$start()
  log_start(
    path = "logs/main.txt",
    seconds = 1,
    pids = controller$pids()
  )
}

tar_option_set(
  memory = "transient",
  garbage_collection = TRUE,
  controller = controller
)

write_file <- function(x) {
  fs::dir_create("files")
  path <- file.path("files", paste0(x, ".rds"))
  saveRDS(x, path)
  path
}

list(
  tar_target(x, seq_len(2e4)),
  tar_target(y, write_file(x), pattern = map(x), format = "file"),
  tar_target(z, readRDS(y), pattern = map(y))
)
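
To reproduce, assuming the script above is saved as _targets.R, the pipeline runs with the usual entry point:

library(targets)
tar_make()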

Then I read and visualized the autometric logs:

library(autometric)
log <- log_read("logs", units_memory = "megabytes")
names <- unique(log$name)
log_plot(log, name = names[1], metric = "resident")
log_plot(log, name = names[2], metric = "resident")
log_plot(log, name = names[3], metric = "resident")

[Plots: resident memory over time for the crew worker, the mirai dispatcher, and the local targets process.]

The crew worker and mirai dispatcher are efficient with memory, consuming no more than a few megabytes. But the memory consumption of the local targets process kept increasing without apparent bound. 3 GB isn't necessarily alarming by itself, but I will need to look into what is responsible for most of this memory.
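
One possible way to attribute this memory (a sketch, not something run here; utils::Rprofmem() requires R compiled with --enable-memory-profiling, and the output file name is arbitrary) is to log R-level allocations in the main process while the pipeline runs:

utils::Rprofmem("allocations.out", threshold = 1e6) # log allocations over 1 MB
targets::tar_make(callr_function = NULL)            # run in this process so Rprofmem sees it
utils::Rprofmem(NULL)                               # stop logging
head(readLines("allocations.out"))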

I wonder if this could explain #1347 or #1329, and I wonder what would happen without crew.

@wlandau wlandau self-assigned this Oct 22, 2024

wlandau commented Oct 22, 2024

I tried a similar pipeline without crew:

library(autometric)
library(targets)

if (tar_active()) {
  log_start(
    path = "logs/main.txt",
    seconds = 1
  )
}

tar_option_set(
  memory = "transient",
  garbage_collection = TRUE
)

write_file <- function(x) {
  fs::dir_create("files")
  path <- file.path("files", paste0(x, ".rds"))
  saveRDS(x, path)
  path
}

list(
  tar_target(x, seq_len(2e4)),
  tar_target(y, write_file(x), pattern = map(x), format = "file"),
  tar_target(z, readRDS(y), pattern = map(y))
)

The pipeline took a lot longer to run (~7 hr), but memory usage looked more reasonable:

[Plot: resident memory of the main targets process over the ~7-hour run without crew.]

There is a mild surge at the beginning, another around 10,000 seconds (presumably when all the dynamic branches of z are defined), and another at the end. A maximum of 800 MB is pretty good.

Takeaways:

  1. Something about crew + targets guzzles memory.
  2. Something about targets alone is slow for this type of pipeline, and the slowness does not appear to have anything to do with crew or (1).

So we actually have two different, unrelated performance problems.

wlandau commented Oct 22, 2024

For (2), the slowness just comes from garbage collection 😆. I should have known.

library(targets)

tar_option_set(
  memory = "transient",
  garbage_collection = TRUE
)

write_file <- function(x) {
  fs::dir_create("files")
  path <- file.path("files", paste0(x, ".rds"))
  saveRDS(x, path)
  path
}

list(
  tar_target(x, seq_len(1000)),
  tar_target(y, write_file(x), pattern = map(x), format = "file"),
  tar_target(z, readRDS(y), pattern = map(y))
)

# Profile the pipeline from a separate interactive session:
library(proffer)
library(targets)
tar_destroy()
pprof(tar_make(callr_function = NULL, reporter = "summary"))

[Profiling flame graph: the bulk of the runtime is spent in garbage collection.]
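
As a rough sanity check on the scale of that overhead (a sketch; the target count is approximate), timing gc() directly and extrapolating:

# Time 100 explicit garbage collections, then extrapolate to the
# ~2000 targets in the profiled pipeline above.
elapsed <- system.time(for (i in seq_len(100)) gc())[["elapsed"]]
(elapsed / 100) * 2000 # rough seconds of pure gc() overhead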

wlandau commented Oct 22, 2024

As best I can tell for now, most of the memory is consumed by the internal data structures targets needs for bookkeeping. targets has an internal object-oriented programming system that uses environments with S3 classes. With 32k targets, there are 32k+ nested environments, and those take up a lot of memory in aggregate. Unless I am missing something in scaled-up examples, improving memory efficiency here would be a huge undertaking and may involve converting many of the internal data structures into compact C structs. Converting this thread to a discussion.
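
To illustrate the scale (a minimal sketch, assuming the lobstr package; "target_stub" is a hypothetical class, not a real targets class):

library(lobstr)
# 32k small classed environments, mimicking per-target bookkeeping.
envs <- lapply(seq_len(32000L), function(i) {
  e <- new.env(parent = emptyenv())
  e$name <- paste0("target_", i)
  class(e) <- "target_stub"
  e
})
obj_size(envs) # aggregate size: each environment carries fixed overhead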

@ropensci ropensci locked and limited conversation to collaborators Oct 22, 2024
@wlandau wlandau converted this issue into discussion #1352 Oct 22, 2024
