Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clear tmp dir #344

Closed
wants to merge 18 commits into from
Closed
Show file tree
Hide file tree
Changes from 11 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
43 changes: 43 additions & 0 deletions src/main/java/org/commonwl/view/Scheduler.java
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
package org.commonwl.view;


import org.commonwl.view.util.StreamGobbler;
import org.commonwl.view.workflow.QueuedWorkflowRepository;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
Expand All @@ -9,6 +10,9 @@
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;


import java.io.IOException;

import java.util.Calendar;
import java.util.Date;

Expand All @@ -24,6 +28,9 @@ public class Scheduler {
@Value("${queuedWorkflowAgeLimitHours}")
private Integer QUEUED_WORKFLOW_AGE_LIMIT_HOURS;

@Value("${tmpDirAgeLimitDays}")
private Integer TMP_DIR_AGE_LIMIT_DAYS;

@Autowired
public Scheduler(QueuedWorkflowRepository queuedWorkflowRepository) {
this.queuedWorkflowRepository = queuedWorkflowRepository;
Expand Down Expand Up @@ -55,4 +62,40 @@ public void removeOldQueuedWorkflowEntries() {
logger.info(queuedWorkflowRepository.deleteByTempRepresentation_RetrievedOnLessThanEqual(removeTime)
+ " Old queued workflows removed");
}


@Scheduled(cron = "${cron.clearTmpDir}")
public void clearTmpDir() {
// TODO: find source of tmp dir creation and delete from there
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 we can change the PR to draft until the TODOs are resolved.


// wipe tmp dir and log info
try {
// Run command
String[] command = {"find", "/tmp", "-ctime", "+" + TMP_DIR_AGE_LIMIT_DAYS, "-exec", "rm", "-rf", "{}", "+"};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed, I suggest investigating a java-based solution instead of using a subprocess.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

ProcessBuilder clearProcess = new ProcessBuilder(command);
logger.info("Clearing /tmp directory for content older than " + TMP_DIR_AGE_LIMIT_DAYS + " day" + (TMP_DIR_AGE_LIMIT_DAYS > 1 ? "s" : "") + "...");
Process process = clearProcess.start();

// Read output from the process using threads
StreamGobbler inputGobbler = new StreamGobbler(process.getInputStream());
StreamGobbler errorGobbler = new StreamGobbler(process.getErrorStream());
errorGobbler.start();
inputGobbler.start();

// Wait for process to complete
int exitCode = process.waitFor();
if (exitCode == 0) {
inputGobbler.join();
logger.info(inputGobbler.getContent());
logger.info("Successfully Cleared /tmp directory");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/Cleared/cleared for consistency with other messages? 😬

Also, once the TODO above about locating the tmp dir is solved, we should probably include the location in the logs to help in case a sysadmin needs to check the directory permissions, modes, etc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/Cleared/cleared for consistency with other messages? 😬

Also, once the TODO above about locating the tmp dir is solved, we should probably include the location in the logs to help in case a sysadmin needs to check the directory permissions, modes, etc.

Hi @kinow ,
The idea was to delete the file within the process that actually creates these files.
But that would be a mistake since the file may still be in use.

Another idea was to delete the file the moment it is no longer in use, but an exact time would be difficult to determine

Hence a solution that works would be to delete all files above a certain age.

so I would be removing the TODO remark.
But before I do that, do you have any comments on a much better approach?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me. The other comment about using a Java native solution is still valid I think.

So instead of forking a new process we could try first using Java to iterate/walk the temp dir, filter for files/dirs older than the threshold, and finally delete or mark them for deletion (I assume we will try to delete it and in case it fails, call deleteOnExit or just log it... maybe Commons IO has something that we can use... just food for thought 👍 )

} else {
errorGobbler.join();
logger.warn("Could not clear /tmp directory");
logger.warn(errorGobbler.getContent());

}
} catch (IOException|InterruptedException e) {
logger.error("Error running clear /tmp dir process", e);
}
}
}
2 changes: 1 addition & 1 deletion src/main/java/org/commonwl/view/util/StreamGobbler.java
Original file line number Diff line number Diff line change
Expand Up @@ -32,7 +32,7 @@ public class StreamGobbler extends Thread {
private final String lineSeparator = System.getProperty("line.separator");

private InputStream is;
private String content = "";
private String content = ""; // TODO: update to string builder

public StreamGobbler(InputStream is) {
this.is = is;
Expand Down
10 changes: 9 additions & 1 deletion src/main/resources/application.properties
Original file line number Diff line number Diff line change
Expand Up @@ -48,6 +48,7 @@ spring.data.mongodb.port = 27017
sparql.endpoint = http://localhost:3030/cwlviewer/



Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit-picking, but unnecessary blank line 😬

#=======================
# Scheduler settings
#=======================
Expand All @@ -57,5 +58,12 @@ sparql.endpoint = http://localhost:3030/cwlviewer/
# The expression below implies every hour at the 0th second and 0th minute i.e (01:00:00, 02:00::00,... etc)
cron.deleteOldQueuedWorkflows = 0 0 * * * ?

# The expression below implies every day at the 0th second, 0th minute and 24th(0th) hour i.e ( time 00:00:00, every day)
cron.clearTmpDir = 0 0 0 * * ?

# Age limit for queued workflows in hours.
queuedWorkflowAgeLimitHours = 24
queuedWorkflowAgeLimitHours = 24

# Age limit for tmp directories in days.
tmpDirAgeLimitDays = 1