Skip to content

Commit

Permalink
Merge pull request #41 from 45Drives/arg_max_headroom
Browse files Browse the repository at this point in the history
Arg max headroom
  • Loading branch information
joshuaboud authored May 19, 2021
2 parents 4ae5a12 + 89728fa commit 97dded1
Show file tree
Hide file tree
Showing 12 changed files with 311 additions and 131 deletions.
14 changes: 7 additions & 7 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -17,21 +17,21 @@ You must have a Ceph file system. `rsync`, `scp`, or similar must be installed o
## Installation
### Current Release
#### Centos 7
* `yum install https://github.com/45Drives/cephgeorep/releases/download/v1.2.8/cephgeorep-1.2.8-1.el7.x86_64.rpm`
* `yum install https://github.com/45Drives/cephgeorep/releases/download/v1.2.9/cephgeorep-1.2.9-1.el7.x86_64.rpm`
#### Centos 8
* `yum install https://github.com/45Drives/cephgeorep/releases/download/v1.2.8/cephgeorep-1.2.8-1.el8.x86_64.rpm`
* `yum install https://github.com/45Drives/cephgeorep/releases/download/v1.2.9/cephgeorep-1.2.9-1.el8.x86_64.rpm`
#### Ubuntu 20.04
* `wget https://github.com/45Drives/cephgeorep/releases/download/v1.2.8/cephgeorep_1.2.8-1focal_amd64.deb`
* `apt install ./cephgeorep_1.2.8-1focal_amd64.deb`
* `wget https://github.com/45Drives/cephgeorep/releases/download/v1.2.9/cephgeorep_1.2.9-1focal_amd64.deb`
* `apt install ./cephgeorep_1.2.9-1focal_amd64.deb`
#### Ubuntu 18.04
* `wget https://github.com/45Drives/cephgeorep/releases/download/v1.2.8/cephgeorep_1.2.8-1bionic_amd64.deb`
* `apt install ./cephgeorep_1.2.8-1bionic_amd64.deb`
* `wget https://github.com/45Drives/cephgeorep/releases/download/v1.2.9/cephgeorep_1.2.9-1bionic_amd64.deb`
* `apt install ./cephgeorep_1.2.9-1bionic_amd64.deb`

### Installing from Source
* Install Boost (libboost-dev) and Thread Building Blocks (libtbb-dev) development libraries
* `git clone https://github.com/45drives/cephgeorep`
* `cd cephgeorep`
* `git checkout tags/v1.2.8`
* `git checkout tags/v1.2.9`
* `make -j8` or `make -j8 static` to statically link libraries
* `sudo make install`
#### Uninstalling from Source
Expand Down
2 changes: 1 addition & 1 deletion doc/man/cephgeorep.8
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,7 @@
.\" First parameter, NAME, should be all caps
.\" Second parameter, SECTION, should be 1-8, maybe w/ subsection
.\" other parameters are allowed: see man(7), man(1)
.TH CEPHGEOREP 8 "March 23 2021" "cephgeorep 1.2.5"
.TH CEPHGEOREP 8 "May 18 2021" "cephgeorep 1.2.9"
.\" Please adjust this date whenever revising the manpage.

.SH NAME
Expand Down
7 changes: 6 additions & 1 deletion el7/cephgeorep.spec
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,7 @@
%define __os_install_post %{_dbpath}/brp-compress

Name: cephgeorep
Version: 1.2.8
Version: 1.2.9
Release: 1%{?dist}
Summary: Ceph File System Remote Sync Daemon

Expand Down Expand Up @@ -56,6 +56,11 @@ systemctl stop cephfssyncd.service
systemctl daemon-reload

%changelog
* Wed May 19 2021 Josh Boudreau <jboudreau@45drives.com> 1.2.9-1
- When execution fails from too many arguments, the argv headroom
is increased and execution is tried again.
- STDOUT and STDERR of sync process are logged to a file on failure.

* Mon Apr 26 2021 Josh Boudreau <jboudreau@45drives.com> 1.2.8-1
- Signigicant optimizations for modifying file paths and for comparing
file size and type to make crawl time up to 6 times faster.
Expand Down
7 changes: 6 additions & 1 deletion el8/cephgeorep.spec
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,7 @@
%define __os_install_post %{_dbpath}/brp-compress

Name: cephgeorep
Version: 1.2.8
Version: 1.2.9
Release: 1%{?dist}
Summary: Ceph File System Remote Sync Daemon

Expand Down Expand Up @@ -52,6 +52,11 @@ systemctl stop cephfssyncd.service
systemctl daemon-reload

%changelog
* Wed May 19 2021 Josh Boudreau <jboudreau@45drives.com> 1.2.9-1
- When execution fails from too many arguments, the argv headroom
is increased and execution is tried again.
- STDOUT and STDERR of sync process are logged to a file on failure.

* Mon Apr 26 2021 Josh Boudreau <jboudreau@45drives.com> 1.2.8-1
- Signigicant optimizations for modifying file paths and for comparing
file size and type to make crawl time up to 6 times faster.
Expand Down
8 changes: 3 additions & 5 deletions src/impl/main.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -36,11 +36,9 @@ namespace fs = boost::filesystem;

inline size_t get_env_size(char *envp[]){
size_t size = 0;
while(*envp){
size += strlen(*envp++) + 1;
size += sizeof(char *);
}
size += 1; // null terminator
while(*envp)
size += strlen(*envp++) + 1 + sizeof(char *);
size += sizeof(NULL); // null terminator
return size;
}

Expand Down
130 changes: 113 additions & 17 deletions src/impl/syncProcess.cpp
Original file line number Diff line number Diff line change
@@ -1,7 +1,30 @@
/*
* Copyright (C) 2019-2021 Joshua Boudreau <jboudreau@45drives.com>
*
* This file is part of cephgeorep.
*
* cephgeorep is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation, either version 2 of the License, or
* (at your option) any later version.
*
* cephgeorep is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with cephgeorep. If not, see <https://www.gnu.org/licenses/>.
*/

#include "syncProcess.hpp"
#include "syncer.hpp"
#include "file.hpp"
#include <unistd.h>
#include <sstream>
#include <fstream>
#include <iomanip>
#include <ctime>
#include <csignal>
#include <boost/tokenizer.hpp>

extern "C" {
Expand All @@ -12,10 +35,13 @@ extern "C" {
SyncProcess::SyncProcess(Syncer *parent, int id, int nproc, std::vector<File> &queue)
: id_(id),
inc_(nproc),
pid_(0),
max_mem_usage_(parent->max_mem_usage_),
start_mem_usage_(parent->start_mem_usage_),
curr_payload_bytes_(0),
pipefd_{-1,-1},
destination_(parent->destination_),
sending_to_(*destination_),
file_itr_(queue.begin()),
payload_(parent->start_payload_){

Expand All @@ -26,6 +52,13 @@ SyncProcess::SyncProcess(Syncer *parent, int id, int nproc, std::vector<File> &q
std::advance(file_itr_, id_);
}

SyncProcess::~SyncProcess(){
if(pipefd_[0] != -1)
close(pipefd_[0]);
if(pipefd_[1] != -1)
close(pipefd_[1]);
}

int SyncProcess::id() const{
return id_;
}
Expand All @@ -48,7 +81,6 @@ uintmax_t SyncProcess::payload_count(void) const{

void SyncProcess::add(const std::vector<File>::iterator &itr){
payload_.push_back(itr->path());
itr->disown_path();
curr_mem_usage_ += itr->path_len() + 1 + sizeof(char *);
curr_payload_bytes_ += itr->size();
}
Expand All @@ -62,14 +94,16 @@ void SyncProcess::consume(std::vector<File> &queue){
add(file_itr_);
std::advance(file_itr_, inc_);
}
sending_to_ = *destination_;
if(!destination_->empty())
if(!destination_->empty()){
sending_to_ = *destination_;
payload_.push_back((char *)destination_->c_str());
}
payload_.push_back(NULL);
payload_.shrink_to_fit();
}

void SyncProcess::sync_batch(){
pipe(pipefd_);

pid_ = fork(); // create child process
int error;
switch(pid_){
Expand All @@ -79,32 +113,36 @@ void SyncProcess::sync_batch(){
l::exit(EXIT_FAILURE);
case 0: // child process
{
int null_fd = open("/dev/null", O_WRONLY);
dup2(null_fd, 1);
dup2(null_fd, 2);
close(null_fd);
close(pipefd_[0]);
pipefd_[0] = -1;
signal(SIGINT, SIG_DFL);
dup2(pipefd_[1], 1);
dup2(pipefd_[1], 2);
close(pipefd_[1]);
pipefd_[1] = -1;
execvp(payload_[0], payload_.data());
int execvp_errno = -errno;
error = errno;
dump_argv(error);
int execvp_errno = -error;
exit(execvp_errno);
}
break;
default: // parent process
close(pipefd_[1]);
pipefd_[1] = -1;
Logging::log.message(std::to_string(pid_) + " started.", 2);
break;
}
}

void SyncProcess::reset(void){
for(
std::vector<char *>::iterator itr = std::next(payload_.begin(), start_payload_sz_);
itr != std::prev(payload_.end(), 2); // 2 before end to avoid deleting dest and NULL
++itr
){
delete[] *itr; // free memory taken by path
}
payload_.resize(start_payload_sz_);
curr_mem_usage_ = start_mem_usage_;
curr_payload_bytes_ = 0;
if(pipefd_[0] != -1)
close(pipefd_[0]);
if(pipefd_[1] != -1)
close(pipefd_[1]);
}

bool SyncProcess::done(const std::vector<File> &queue) const{
Expand All @@ -114,3 +152,61 @@ bool SyncProcess::done(const std::vector<File> &queue) const{
const std::string &SyncProcess::destination(void) const{
return sending_to_;
}

inline std::string get_unique_log_path(const std::string type){
std::string log_location = "/var/log/cephgeorep";
if(!fs::exists(log_location))
fs::create_directories(log_location);
std::stringstream log_path_ss;
std::time_t now = std::time(nullptr);
log_path_ss << log_location << "/exec_" << type << "_" << std::put_time(std::localtime(&now), "%F_%T_%z") << ".log";
std::string log_path = log_path_ss.str();
if(fs::exists(log_path)){
int i = 1;
while(fs::exists(log_path + "." + std::to_string(i)))
i++;
log_path += "." + std::to_string(i);
}
return log_path;
}

void SyncProcess::dump_argv(int error) const{
std::ofstream f;
f.open(get_unique_log_path("fail"), std::ios::trunc);
if(!f.is_open()){
Logging::log.error("Could not dump argv to log file.");
return;
}
std::string msg = (error > 0)? strerror(error) : Logging::log.rsync_error(-error);
f << error << " : " << msg << std::endl;
for(const char *arg : payload_){
if(arg)
f << arg << std::endl;
}
f.close();
}

void SyncProcess::log_errors() const{
std::ofstream f;
std::string log_path = get_unique_log_path("error");
f.open(log_path, std::ios::trunc);
if(!f.is_open()){
Logging::log.error("Could not dump stdout/stderr to log file.");
return;
}
char buffer[4*1024];
int bytes_read = 0;
while((bytes_read = read(pipefd_[0], buffer, sizeof(buffer) - 1)) > 0){
buffer[bytes_read] = '\0';
f << buffer;
}
if(bytes_read == -1){
int err = errno;
Logging::log.error(std::string("Error reading pipe for logging error: ") + strerror(err));
}else{
Logging::log.message(payload_[0] + std::string(" error details logged in ") + log_path, 0);
}

f.close();
}

Loading

0 comments on commit 97dded1

Please sign in to comment.