Add ood-control based restore logic #228

Closed
Tracked by #227
lurenpluto opened this issue Apr 20, 2023 · 6 comments
Labels: Backup & Restore (the OOD backup and restore related), feature (new feature), OOD-daemon (the OOD-daemon basic service)

Comments

@lurenpluto
Member

Backup and restore based on the cyfs-backup tool is already supported, but it requires command-line operation, which is an advanced usage and not very user friendly. We should therefore consider adding OOD backup and restore functions to the OOD management page of cyfs-browser. Similar to the OOD activation process, a new OOD (gateway) could then either be activated on the LAN or restored from an existing backup. The related functions and protocols need to be supported in the OOD-control component.

@lurenpluto lurenpluto added feature New feature OOD-daemon The OOD-daemon basic service labels Apr 20, 2023
@lurenpluto lurenpluto moved this to 💬To Discuss in CYFS-Stack & Services Apr 20, 2023
@lurenpluto
Member Author

Some of the key design considerations for the implementation of this feature are as follows.

Control protocol

The current activation in ood-control is based on HTTP over the LAN, so the restore protocol can be added on top of it. Some key points are as follows.

1. The type of data source supported by restore

Since a backup is generated as a data directory, restore in principle takes a data directory as input. In many cases, however, the directory is packaged (tar/zip) into an archive, so in the long run the restore logic needs to support the following formats:

  • A local data directory
  • A local packaged archive
  • A remote archive or directory specified by URL
  • Data flow model (long-term planning; requires more advanced customization support from the backup & restore module)
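
The source types above could be modeled roughly as follows. This is only a sketch: `RestoreSource` and `classify_source` are hypothetical names, and the classification rules (an http/https prefix means remote, a .zip or .tar suffix means archive) are assumptions for illustration, not the behavior of cyfs-backup. The data flow model is left out.

```rust
use std::path::PathBuf;

/// Hypothetical model of the restore data sources listed above.
#[derive(Debug, PartialEq)]
pub enum RestoreSource {
    LocalDir(PathBuf),
    LocalArchive(PathBuf),
    RemoteArchive(String),
    RemoteDir(String),
}

/// Classify a location string using assumed rules (not taken from cyfs-backup).
pub fn classify_source(location: &str) -> RestoreSource {
    let is_remote = location.starts_with("http://") || location.starts_with("https://");
    // Ignore any query string when looking at the file extension.
    let path_part = location.split('?').next().unwrap_or(location);
    let is_archive = path_part.ends_with(".zip") || path_part.ends_with(".tar");
    match (is_remote, is_archive) {
        (true, true) => RestoreSource::RemoteArchive(location.to_string()),
        (true, false) => RestoreSource::RemoteDir(location.to_string()),
        (false, true) => RestoreSource::LocalArchive(PathBuf::from(location)),
        (false, false) => RestoreSource::LocalDir(PathBuf::from(location)),
    }
}
```

A conversion layer could then match on the variant to decide whether a download and/or decompression step is needed before handing a local directory to the restore logic.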

2. Control protocol via http

As with the existing restore, control stays at task granularity: the client can start, query, and cancel a restore task.

  • StartTask(data, config) -> TaskId
  • GetTaskStatus(task_id) -> TaskStatInfo
  • CancelTask(task_id)
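
To make the shape of these three calls concrete, here is a minimal in-memory sketch. `TaskManager`, `TaskState`, and the numeric task id are hypothetical stand-ins; the real TaskId and TaskStatInfo types are not specified at this point in the discussion.

```rust
use std::collections::HashMap;

/// Hypothetical task states standing in for the unspecified TaskStatInfo.
#[derive(Clone, Debug, PartialEq)]
pub enum TaskState {
    Running,
    Cancelled,
}

/// In-memory stand-in for the task-granularity control surface.
#[derive(Default)]
pub struct TaskManager {
    next_id: u64,
    tasks: HashMap<u64, TaskState>,
}

impl TaskManager {
    /// StartTask(data, config) -> TaskId
    pub fn start_task(&mut self, _data: &str, _config: &str) -> u64 {
        self.next_id += 1;
        self.tasks.insert(self.next_id, TaskState::Running);
        self.next_id
    }

    /// GetTaskStatus(task_id) -> TaskStatInfo (None for an unknown id)
    pub fn get_task_status(&self, task_id: u64) -> Option<TaskState> {
        self.tasks.get(&task_id).cloned()
    }

    /// CancelTask(task_id); returns false for an unknown id
    pub fn cancel_task(&mut self, task_id: u64) -> bool {
        match self.tasks.get_mut(&task_id) {
            Some(state) => {
                *state = TaskState::Cancelled;
                true
            }
            None => false,
        }
    }
}
```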

3. Security issues

The current ood-control is based on HTTP over the LAN and has no permission-related security checks of its own, so certain security issues remain. We need to consider how to restrict request sources, mainly requests initiated by unknown web pages.

The following three kinds of clients currently use the ood-control protocol:

  • Static pages of cyfs-browser
  • cyfs-browser plugins
  • Some tools (native applications)

So we can consider restricting requests from unknown sources by checking the Origin of HTTP requests.
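
An origin check of this kind could look roughly like the sketch below. The function name and the allowlist are hypothetical, and accepting requests that carry no Origin header (as native tools typically send none) is an assumption for illustration, not a decision recorded in this issue.

```rust
/// Returns true when the request's Origin header value is acceptable.
/// `allowed` is a caller-supplied allowlist (hypothetical values in practice).
pub fn is_origin_allowed(origin: Option<&str>, allowed: &[&str]) -> bool {
    match origin {
        // Assumption: requests without an Origin header come from native tools.
        None => true,
        // Browser requests must match an allowlisted origin exactly (case-insensitive).
        Some(value) => allowed
            .iter()
            .any(|candidate| candidate.eq_ignore_ascii_case(value.trim())),
    }
}
```

Note that a missing Origin header can also be produced by non-browser clients on the LAN, so this check only addresses the "unknown web page" case called out above.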

Internal implementation

Consider integrating the cyfs-backup and cyfs-backup-tool logic directly in the form of a lib (without depending on the cyfs-backup tool itself), with the following new features:

  • Add conversion processing of data sources
    Multiple types of data sources can be received and converted to the types supported by cyfs-backup.
    • For a URL, the corresponding data directory or data file needs to be downloaded to local storage
    • For packaged data files, they need to be decompressed first (later, can we consider reading them directly without decompressing?)
  • Integrate task progress management
    Currently, cyfs-backup provides progress management for backup and restore, so we need to provide more granular progress management on top of that, mainly adding progress for download and decompression.

@lurenpluto lurenpluto self-assigned this Apr 20, 2023
@lurenpluto lurenpluto moved this from 💬To Discuss to 📝Todo in CYFS-Stack & Services Apr 24, 2023
@lurenpluto lurenpluto moved this from 📝Todo to 🚧In Progress in CYFS-Stack & Services Apr 24, 2023
lurenpluto added a commit that referenced this issue Apr 26, 2023
@lurenpluto
Member Author

lurenpluto commented Apr 26, 2023

The first version of ood-control based remote restore has been completed, which mainly includes the following core logic

1. Core implementation

The core implementation is in cyfs-backup, including the following two parts

  • archive_download

source code

It downloads and unpacks backup data from the remote end over HTTP, and supports the following two modes:
o Download a packaged backup file and unpack it locally
o Download a remote folder directly as a backup directory (requires the remote HTTP server to support directory mode)

  • remote_store

source code

Task-based management of remote restore, which integrates archive download, unpack, and the existing restore-from-local logic, and provides progress management

2. Integration with ood-control

In addition to the existing bind/check logic of ood-control, new restore-related logic is added, including the following:

  • create_restore_task
  • get_restore_task_status
  • get_restore_task_list
  • cancel_restore_task

@lurenpluto
Member Author

The usage of ood-control based remote restore

1. Supported remote archive url format

The archive is downloaded in http mode and supports arbitrary custom query string parameters, which can be used for permission control, etc.

Currently the url supports two modes

- Zip package format

The data directory generated by the backup is packed into a single zip file. An example URL:

http://127.0.0.1:8887/test/data.zip?token=123456

- Directory format

The data directory generated by the backup is downloaded directly through the URL, file by file. An example URL:

http://127.0.0.1:8887/test/${filename}?token=123456

Here ${filename} is a keyword marking the replaceable part of the URL. When downloading a directory, the remote restore logic replaces ${filename} with each file's path to produce the full URL to request. For the URL above, it first requests the index file:

http://127.0.0.1:8887/test/index?token=123456

Then, according to the contents of the index, the corresponding data files are downloaded in turn, for example:

http://127.0.0.1:8887/test/data/object.0.data?token=123456
http://127.0.0.1:8887/test/data/chunk.0.data?token=123456
......
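
The ${filename} substitution described above amounts to a plain string replacement. A minimal sketch, where `expand_archive_url` is a hypothetical helper name and not the function used in archive_download:

```rust
/// Keyword that marks the replaceable part of a directory-format URL.
const FILENAME_KEYWORD: &str = "${filename}";

/// Build the full URL for one file of a directory-format archive.
/// Returns None when the template has no ${filename} keyword
/// (for example a zip-format URL).
pub fn expand_archive_url(template: &str, relative_path: &str) -> Option<String> {
    if !template.contains(FILENAME_KEYWORD) {
        return None;
    }
    Some(template.replace(FILENAME_KEYWORD, relative_path))
}
```

Because only the keyword is replaced, any query string such as the token parameter is carried over unchanged to every file request.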

2. Create restore task

Create a restore task

POST http://{ood-control-addr}/restore   {params in json format}

The params are defined as follows:

#[derive(Clone, Debug, Serialize, Deserialize)]
pub struct RemoteRestoreParams {
    // TaskId, should be a valid path segment string
    pub id: String,

    // Restore related params
    pub cyfs_root: Option<String>,
    pub isolate: Option<String>,
    pub password: Option<ProtectedPassword>,

    // Remote archive info
    pub remote_archive: String,
}

The params currently support the following fields:

- id

Required parameter, the task id, which must satisfy the following conditions:
a. It must be a valid path segment and must not contain special characters, such as '/', that the operating system does not support in paths.
b. It must be unique; one task id identifies one independent restore task.
c. At present, an ood-control (ood-daemon/ood-installer) service can have at most one restore task at a time, so a new restore task can only be created after the previous one has completed or been cancelled.
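
A client could pre-check condition (a) before sending the request. The sketch below is an assumption about what "valid path segment" means: the forbidden character set is a conservative cross-platform guess, and `is_valid_task_id` is a hypothetical helper, not the check performed by ood-control.

```rust
/// Conservative client-side check that a task id is usable as a path segment.
/// The forbidden set is an assumption covering common Windows and Unix limits.
pub fn is_valid_task_id(id: &str) -> bool {
    const FORBIDDEN: &[char] = &['/', '\\', ':', '*', '?', '"', '<', '>', '|'];
    !id.is_empty()
        && id != "."
        && id != ".."
        && !id.chars().any(|c| c.is_control() || FORBIDDEN.contains(&c))
}
```

Conditions (b) and (c) depend on server state, so they can only be confirmed by the response to the create request.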

- remote_archive

Required parameter, the URL of the remote backup archive used for restore; see section 1 for the format

- cyfs_root

Optional parameter that specifies a different cyfs_root for the restore; if omitted, the default value is used

- isolate

Optional parameter that specifies the isolate of the restored data, which is used to isolate data when multiple protocol stacks are present; normally you do not need to specify it

- password

Optional parameter; if the remote archive is stored encrypted, specify the corresponding password here

After the restore task is created successfully, it can be managed by its task id, including querying progress and cancelling

3. Get restore task status

After the restore task is created successfully, query the status of the task

GET http://{ood-control-addr}/restore/{task-id}

Returns a status description in JSON format, defined as follows:

#[derive(Clone, Debug, Serialize, Deserialize)]
pub struct RemoteRestoreStatus {
    pub phase: RemoteRestoreTaskPhase,
    pub result: Option<BuckyResult<()>>,
    pub download_progress: Option<ArchiveProgress>,
    pub unpack_progress: Option<ArchiveProgress>,
    pub restore_status: Option<RestoreStatus>,
}

Where

- phase

The phase of the current restore task. Note that tasks whose remote_archive is a folder have no Unpack phase.

For example, if phase=Download, the download_progress field is not empty and carries the detailed download progress.

The phases are defined as follows:

#[derive(Clone, Copy, PartialEq, Eq, Debug, Serialize, Deserialize)]
pub enum RemoteRestoreTaskPhase {
    Init,
    Download,
    Unpack,
    Restore,
    Complete,
}
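
As a reading aid, the sketch below encodes the phase order and which status field a client would look at in each phase. It redefines the enum locally without serde so it stands alone. Only the Download/download_progress pairing and the missing Unpack phase for folders are stated above; the other pairings and the strictly linear ordering are assumptions inferred from the field names and the enum order.

```rust
/// Local copy of the phase enum (serde derives omitted to keep the sketch std-only).
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub enum RemoteRestoreTaskPhase {
    Init,
    Download,
    Unpack,
    Restore,
    Complete,
}

/// Status field assumed to carry the detailed progress for a phase.
pub fn progress_field(phase: RemoteRestoreTaskPhase) -> Option<&'static str> {
    match phase {
        RemoteRestoreTaskPhase::Download => Some("download_progress"),
        RemoteRestoreTaskPhase::Unpack => Some("unpack_progress"),
        RemoteRestoreTaskPhase::Restore => Some("restore_status"),
        RemoteRestoreTaskPhase::Init | RemoteRestoreTaskPhase::Complete => None,
    }
}

/// Phase expected to follow `phase`; folder archives skip Unpack.
pub fn next_phase(
    phase: RemoteRestoreTaskPhase,
    archive_is_folder: bool,
) -> Option<RemoteRestoreTaskPhase> {
    use RemoteRestoreTaskPhase::*;
    match phase {
        Init => Some(Download),
        Download if archive_is_folder => Some(Restore),
        Download => Some(Unpack),
        Unpack => Some(Restore),
        Restore => Some(Complete),
        Complete => None,
    }
}
```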

- result

The result of the task, which is set when the task ends. The possible values of result are:

  • The task is still in progress
"result": null
  • The task succeeded
"result": {
	"Ok": null
}
  • The task failed
"result": {
	"Err": {
	  "code": "4",
	  "msg": "not found error"
	}
}

- download_progress

The progress of the download from the remote archive URL. An example output:

"download_progress": {
	"total": 1000000,
	"completed": 100,
	"result": null,
	"current": {
	  "file": "object.0.data",
	  "total": 100000,
	  "completed": 100,
	  "result": null
	}
}

Where

  • total
    Total size, in bytes
  • completed
    Completed size, in bytes
  • result
    The result of the download; see result above
  • current
    Details of the file currently being downloaded
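
A client can turn total and completed into a percentage for display. The sketch below uses a simplified local struct with only those two fields; it is not the real ArchiveProgress type, and `percent` is a hypothetical helper.

```rust
/// Simplified stand-in holding only the two byte counters described above.
pub struct ByteProgress {
    pub total: u64,
    pub completed: u64,
}

/// Completion percentage in the range 0.0..=100.0.
/// A total of 0 is reported as 0% to avoid dividing by zero.
pub fn percent(progress: &ByteProgress) -> f64 {
    if progress.total == 0 {
        return 0.0;
    }
    (progress.completed as f64 / progress.total as f64 * 100.0).min(100.0)
}
```

The same calculation can be applied to the nested current entry to show per-file progress, and to unpack_progress, which shares the definition.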

- unpack_progress

The same definition as download_progress, indicating the current unpacking progress, which is used when the remote archive is in zip format

- restore_status

The restore status of the local OOD, which is the same as in the standard local archive restore process. It is defined at

https://github.com/buckyos/CYFS/blob/1b7b35b09a81cb6119fe844058bba8297ae1e3ce/src/component/cyfs-backup-lib/src/backup/restore_status.rs#L32-L41

The sample status output in json format is as follows

"restore_status": {
	"phase": "Init",
	"phase_last_update_time": 0,
	"stat": {
	  "objects": {
	    "count": 10000,
	    "bytes": 0
	  },
	  "chunks": {
	    "count": 1000,
	    "bytes": 0
	  },
	  "files": {
	    "count": 100,
	    "bytes": 0
	  }
	},
	"complete": {
	  "objects": {
	    "count": 100,
	    "bytes": 0
	  },
	  "chunks": {
	    "count": 0,
	    "bytes": 0
	  },
	  "files": {
	    "count": 0,
	    "bytes": 0
	  }
	},
	"result": null
}

Where

  • phase
    The phase of the restore, defined as follows

#[derive(Clone, Copy, Debug, Serialize, Deserialize)]
pub enum RestoreTaskPhase {
    Init,
    LoadAndVerify,
    RestoreKeyData,
    RestoreObject,
    RestoreChunk,
    Complete,
}

The variants are listed in the order in which the OOD restore process progresses

  • stat
    The statistics of the archive, covering three main kinds of data
    § object
    § chunk
    § file
    Each kind includes a count and a size in bytes (the size is not used at the moment, so only count can be used to judge progress)

  • complete
    The statistics of what has been completed so far, with the same layout as stat; comparing the two gives the completion progress of each stage.
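
Since only count is meaningful at the moment, an overall restore progress can be estimated from the counts alone. The structs below are simplified local stand-ins that mirror the JSON layout shown earlier, and summing the three kinds into one ratio is an assumption made for illustration.

```rust
/// One statistics entry, mirroring {"count": .., "bytes": ..}.
pub struct StatItem {
    pub count: u64,
    pub bytes: u64, // not used for progress at the moment
}

/// Mirrors the layout of both "stat" and "complete".
pub struct RestoreStat {
    pub objects: StatItem,
    pub chunks: StatItem,
    pub files: StatItem,
}

/// Fraction (0.0..=1.0) of items restored so far, based on counts only.
pub fn count_progress(stat: &RestoreStat, complete: &RestoreStat) -> f64 {
    let total = stat.objects.count + stat.chunks.count + stat.files.count;
    let done = complete.objects.count + complete.chunks.count + complete.files.count;
    if total == 0 {
        return 0.0;
    }
    (done as f64 / total as f64).min(1.0)
}
```

For the sample output above this gives 100 / 11100, a little under 1%. Per-kind ratios can be computed the same way when a per-stage display is preferred.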

4. Cancel restore task

Before a restore finishes, you can forcibly terminate it with the following request

DELETE http://{ood-control-addr}/restore/{task-id}

Note that cancelling a restore task does not clean up OOD data that has already been restored; it only removes the downloaded data files and the decompressed data files (if any). After cancelling, you can start a new restore task, which will directly overwrite the old state (generally an incomplete state in which the OOD is not usable).

5. Query the current restore task

You can query the list of currently executing restore tasks (at present there can be at most one)

GET http://{ood-control-addr}/restore/tasks

If a restore task is already executing, the upper layer must either wait for it to finish or cancel it before a new restore task can be created.

@lurenpluto lurenpluto moved this from 🚧In Progress to 🧪To Test in CYFS-Stack & Services Apr 27, 2023
@lurenpluto lurenpluto added the Backup & Restore The OOD backup and restore related label Apr 27, 2023
@lurenpluto
Member Author

When testing remote restore, you can set up a simple HTTP server locally to cover both types of remote_archive.
I recommend the Chrome plugin "Web Server for Chrome", which makes it easy to run your own local server for file and folder downloads. @lizhihongTest

@lizhihongTest
Collaborator

Testing of the ood-control based restore logic is finished.

Test environment

OOD : Nightly 1.1.0.756

Create OOD random data

source code
The random data contains:

// Number of NameObjects = nameobject_thread * nameobject_number
const nameobject_thread = 10;
const nameobject_number = 10;

// Number of key-value entries for each ObjectMap type = ket_data_thread * ket_data_number; the types are (root_state, local_cache) * (Path, IsolatePath, Single)
const ket_data_thread = 5;
const ket_data_number = 20;

// Total chunk data size = chunk_thread * chunk_number * chunk_size (10 MB)
const chunk_thread = 5;
const chunk_number = 10;

// Total file data size = file_thread * file_number * file_size (10 MB per file)
const file_thread = 5;
const file_number = 10;

Backup ood data

Backup ood data without password

./cyfs-backup --mode backup --id 001 --target-dir /backup/001 --root /cyfs

Backup ood data with password

./cyfs-backup --mode backup --id 002 --target-dir /backup/002 --root /cyfs --password token-dhjfkfsfsaf --file-max-size 100000000000

Use nginx as a simple HTTP file server

server {
	listen       192.168.200.151:80;
	server_name  192.168.200.151;
	location / {
		root /backup;
		autoindex on;
		autoindex_exact_size on;
		autoindex_localtime on;
	}
}

Serve the backup in zip package format

cd /backup/001
zip -r data.zip ./*

The zip package is available at: http://192.168.200.151/001/data.zip

Serve the backup in directory format

The directory is available at: http://192.168.200.151/002

Restore ood data by ood-daemon

Create restore task

  • task 001
POST http://192.168.100.205:1320/restore
{
	"id": "001",
	"remote_archive": "http://192.168.200.151/001/data.zip"
}
  • task 002
POST http://192.168.100.205:1320/restore
{
	"id": "002",
	"remote_archive": "http://192.168.200.151/002/${filename}",
	"cyfs_root": "/cyfs",
	"password": "token-dhjfkfsfsaf"
}

Get restore task status

GET http://192.168.100.205:1320/restore/001

Cancel restore task

DELETE http://192.168.100.205:1320/restore/001

Query the current restore task

GET http://192.168.100.205:1320/restore/tasks

Clean data

rm -rf /cyfs
kill -9 $(pidof ood-daemon)
kill -9 $(pidof gateway)
kill -9 $(pidof file-manager)
kill -9 $(pidof chunk-manager)
kill -9 $(pidof app-manager)
/ood_test/CYFSOOD-x86-64-1.0.0.524-nightly.bin

Test Case List

source code
Create restore task 001 and cancel it
  √ Use must params create restore task (608 ms)
  √ Get restore task status,check it is beging (2029 ms)
  √ Cancel restore task 001 (5043 ms)
Create restore task 001 ,and wait finished
  √ Use must params create restore task (10 ms)
  √ Get restore task status,check it is beging (56555 ms)
  √ Cancel restore task 001 (5010 ms)
Create restore task 002 set error password
  √ Create restore task 002 set error password,it will run async,begin create restore task (621 ms)
  √ Get restore task status check task is error (20207 ms)
  √ Cancel error restore task 002 (5019 ms)
Create restore task 002 set custome cyfs_root,and wait finished
  √ Create restore task 002 set custome cyfs_root : /test_root (14 ms)
  √ Get restore task status.set task cancle when state = Download (20209 ms)
Create restore task 002,and wait finished
  √ Use all params create restore task (624 ms)
  √ Get restore task status check task finished (48408 ms)
Query current restore task
  √ Query the current restore task (11 ms)

@lizhihongTest lizhihongTest moved this from 🧪To Test to ✅Done in CYFS-Stack & Services May 5, 2023
@lurenpluto
Member Author

Very detailed test cases, thanks for the help! This feature will be released in the next version.

streetycat pushed a commit to streetycat/CYFS that referenced this issue May 12, 2023