Implements storage type independent retry policy #397

Closed
mmohan2399 opened this issue Feb 18, 2022 · 3 comments · Fixed by #541

Comments

@mmohan2399

mmohan2399 commented Feb 18, 2022

We run clickhouse-backup on 4 shards and upload the backups to Google Cloud Storage. One of the shards is stuck because of the error below. We upload almost 100 GB per shard.

2022/02/18 14:01:41.010787 error CompressedStreamUpload return error: Post "https://storage.googleapis.com/upload/storage/v1/b/clickhouse_backup/2Fchi-spdb-clickhouse-spdb-clickhouse-1-0-full-2022-02-18-14-00-03%2Fshadow%2Frecipe%2Frecipe_event_pv%2Fdefault_20181202_2903_2954_1.tar&prettyPrint=false&projection=full&uploadType=resumable&upload_id=ADPycdt-GrUeYdMFLV3LopZ2Aj1ls_TTNl2nPOI2oz9D1da11fPcutb8B1A25k8VPZW0aXgZLBjdoMmdDX7psyzsdNxTYE3e1g": http2: client connection force closed via ClientConn.Close
2022/02/18 14:01:41.010990 error can't acquire semaphore during Upload: context canceled

{"name":"","created":"0001-01-01 00:00:00","location":"remote","required":"","desc":"broken (can't stat metadata.json)"}

GCP Config

env:
                 - name: LOG_LEVEL
                   value: "debug"
                 - name: ALLOW_EMPTY_BACKUPS
                   value: "true"
                 - name: API_LISTEN
                   value: "0.0.0.0:7171"
                 - name: API_CREATE_INTEGRATION_TABLES
                   value: "true"
                 - name: REMOTE_STORAGE
                   value: "gcs"
                 - name: GCS_BUCKET
                   value: clickhouse_backup
                 - name: GCS_PATH
                   value: backups
                 - name: GCS_CREDENTIALS_FILE
                   value: /etc/gcscloud.json/

/tmp/.clickhouse-backup-metadata.cache.GCS

"chi-spdb-clickhouse-spdb-clickhouse-1-0-full-2022-02-18-14-00-03": {                       
                "backup_name": "chi-spdb-clickhouse-spdb-clickhouse-1-0-full-2022-02-18-14-00-03",  
                "disks": null,                                                                      
                "version": "",                                                                      
                "creation_date": "0001-01-01T00:00:00Z",                                            
                "metadata_size": 0,                                                                 
                "tables": null,                                                                     
                "data_format": "",                                                                  
                "Legacy": false,                                                                    
                "FileExtension": "",                                                                
                "Broken": "broken (can't stat metadata.json)",                                      
                "UploadDate": "0001-01-01T00:00:00Z"                                                
        },
@Slach
Collaborator

Slach commented Feb 18, 2022

Which clickhouse-backup version do you use?

It looks like the connection broke during a data part upload. We currently have no storage-independent retry mechanism.

I think we can try to implement a storage-specific retry policy, something like https://github.com/GoogleCloudPlatform/golang-samples/blob/HEAD/storage/retry/configure_retries.go for GCS

and https://pkg.go.dev/github.com/aws/aws-sdk-go-v2/aws/retry for S3.

{"name":"","created":"0001-01-01 00:00:00","location":"remote","required":"","desc":"broken (can't stat metadata.json)"}

The empty name looks weird. Try a hard restart of the clickhouse-backup server
and then execute POST /backup/actions?command=delete remote <broken_backup_name>

What do you see in GET /backup/actions (SELECT * FROM system.backup_actions WHERE command LIKE '%backup_name%') for your upload <backup_name> or create_remote <backup_name>?

@Slach Slach changed the title from "Clickhouse-backup GCS Bucket(Google cloud ) upload is struck" to "Implements storage type independent retry policy" Feb 18, 2022
@Slach Slach self-assigned this Feb 18, 2022
@Slach Slach added this to the 2.0.0 milestone Feb 18, 2022
@mmohan2399
Author

mmohan2399 commented Feb 19, 2022

I am using the clickhouse-backup image below:

Image : altinity/clickhouse-backup:master

bash-5.1# clickhouse-backup --version
Version: 40955d0
Git Commit: 40955d0
Build Date: 2022-02-16
bash-5.1#

I sometimes get the error below during upload:

error full-2022-02-19-14-00-15 on chi-spdb-clickhouse-spdb-clickhouse-0-0
SELECT status,error FROM system.backup_actions WHERE command='upload chi-spdb-clickhouse-spdb-clickhouse-0-0-full-2022-02-19-14-00-15'
error one of upload go-routine return error: one of uploadTableData go-routine return error: can't check uploaded file: key not found

@Slach
Collaborator

Slach commented Sep 3, 2022

Could you check the latest release, 1.6.2, in which we updated the Google SDK to the latest version?

@Slach Slach modified the milestones: 2.0.0, 2.1.0 Sep 3, 2022
@Slach Slach mentioned this issue Oct 20, 2022