Commit

Merge pull request #3870 from gchq/3831-table-docs
Issue 3831 - Rearrange scripts documentation
patchwork01 authored Dec 6, 2024
2 parents 38653a3 + 0f3d0f3 commit 16f03eb
Showing 3 changed files with 89 additions and 83 deletions.
63 changes: 46 additions & 17 deletions docs/02-deployment-guide.md
@@ -455,11 +455,14 @@ sleeper.optional.stacks=CompactionStack,IngestStack,QueryStack

Note that the system test stacks do not need to be specified. They will be included if you use the system test CDK app.

### Utility Scripts
## Administration clients

There are scripts in the `scripts/deploy` directory that can be used to manage an existing instance.
There are clients and scripts in the `scripts/deploy` and `scripts/utility` directories that can be used to work with an
existing instance.

#### Update Existing Instance
Also see the [tables documentation](04-tables.md#addedit-a-table) for scripts to add/edit Sleeper tables.

### Update Existing Instance

The `deployExisting.sh` script can be used to bring an existing instance up to date. This will upload any jars
that have changed, update all the docker images, and perform a `cdk deploy`.
@@ -472,27 +475,53 @@ We are planning to add support to this script for declarative deployment, so tha
tables configuration in a folder structure and pass it to this script to apply any changes. Currently such changes must
be done with the admin client.

#### Tables
### Sleeper Administration Client

We have provided a command line client that will enable you to:

Scripts can be used to add, rename and delete tables in a Sleeper instance.
1) List Sleeper instance properties
2) List Sleeper table names
3) List Sleeper table properties
4) Change an instance/table property
5) Get status reports (also see [checking the status of the system](06-status.md))

The `addTable.sh` script will create a new table with properties defined in `templates/tableproperties.template`, and a
schema defined in `templates/schema.template`. Currently any changes must be done in those templates or in the admin
client. We will add support for declarative deployment in the future.
This client will prompt you for things like your instance ID as mentioned above and/or the name of the table you want to
look at. To adjust property values it will open a text editor for a temporary file.

You can run this client with the following command:

```bash
cd scripts
editor templates/tableproperties.template
editor templates/schema.template
./utility/addTable.sh <instance-id> <table-name>
./utility/renameTable.sh <instance-id> <old-table-name> <new-table-name>
./utility/deleteTable.sh <instance-id> <table-name>
./scripts/utility/adminClient.sh ${INSTANCE_ID}
```

### Pausing and Restarting the System

If there is no ingest in progress, and all compactions have completed, then Sleeper will go to sleep, i.e. the only
significant ongoing charges are for data storage. However, there are several lambda functions that are scheduled to
run periodically using EventBridge rules. These lambda functions look for work to do, such as compactions to run.
The execution of these should have very small cost, but it is best practice to pause the system,
i.e. turn these rules off, if you will not be using it for a while. Note that the system can still be queried when
it is paused.

```bash
# Pause the System
./scripts/utility/pauseSystem.sh ${INSTANCE_ID}

# Restart the System
./scripts/utility/restartSystem.sh ${INSTANCE_ID}
```
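
As a rough sketch of how the two scripts might be wrapped, the function below picks the pause or restart script by action name. Only the two script paths come from the text above; the function name `sleeper_system` and the `DRY_RUN` variable are assumptions made for illustration. With `DRY_RUN=1` it prints the command instead of running it.

```bash
# Hypothetical helper, not part of Sleeper: choose the pause or restart script.
# Assumption: run from the repository root, as in the commands above.
sleeper_system() {
  local action="$1" instance_id="$2" script
  case "$action" in
    pause)   script=./scripts/utility/pauseSystem.sh ;;
    restart) script=./scripts/utility/restartSystem.sh ;;
    *)
      echo "usage: sleeper_system pause|restart <instance-id>" >&2
      return 2
      ;;
  esac
  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Dry run (an assumption of this sketch): only show what would be executed
    echo "$script $instance_id"
  else
    "$script" "$instance_id"
  fi
}
```

For example, `DRY_RUN=1 sleeper_system pause "${INSTANCE_ID}"` prints the pause command so it can be checked before running it for real.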

You can also pass `--force` as an additional argument to `deleteTable.sh` to skip the prompt to confirm you wish to delete
all the data. This will permanently delete all data held in the table, as well as metadata.
### Compact all files

If you want to fully compact all files in leaf partitions, but the compaction strategy is not compacting files in a
partition, you can run the following script to force compactions to be created for files in leaf partitions that were
skipped by the compaction strategy:

```bash
./scripts/utility/compactAllFiles.sh ${INSTANCE_ID} <table-name-1> <table-name-2> ...
```
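
When there are many tables, the table names could be kept in a file rather than typed out. The sketch below assumes a plain text file with one table name per line; that file format and the function name `compact_all_files_cmd` are illustrative assumptions, not part of Sleeper. It only prints the resulting command so it can be reviewed first.

```bash
# Hypothetical helper, not part of Sleeper: build the compactAllFiles.sh
# command line from a file of table names (assumed format: one name per line).
compact_all_files_cmd() {
  local instance_id="$1" list_file="$2" name
  local tables=()
  # Read each line; the extra test also picks up a final line with no newline
  while IFS= read -r name || [ -n "$name" ]; do
    if [ -n "$name" ]; then   # skip blank lines
      tables+=("$name")
    fi
  done < "$list_file"
  # Print the command rather than running it
  echo ./scripts/utility/compactAllFiles.sh "$instance_id" "${tables[@]}"
}
```

Running `compact_all_files_cmd "${INSTANCE_ID}" tables.txt` prints a single command in the same form as the one above, which can then be copied and run.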

## Tear Down
### Tear Down

Once you're finished with your Sleeper instance, you can delete it, i.e. remove all the resources
associated with it.
41 changes: 41 additions & 0 deletions docs/04-tables.md
@@ -45,3 +45,44 @@ is better tested, and is likely to be the best option if you have a large
number of processes inserting data in parallel, however there is a
potential issue in that it may see an outdated view of the files in a table.
The `S3StateStore` does not have this problem.

## Add/edit a table

Scripts can be used to add, rename and delete tables in a Sleeper instance.

The `addTable.sh` script will create a new table with properties defined in `templates/tableproperties.template`, and a
schema defined in `templates/schema.template`. Currently any changes must be done in those templates or in the admin
client. We will add support for declarative deployment in the future.

```bash
cd scripts
editor templates/tableproperties.template
editor templates/schema.template
./utility/addTable.sh <instance-id> <table-name>
./utility/renameTable.sh <instance-id> <old-table-name> <new-table-name>
./utility/deleteTable.sh <instance-id> <table-name>
```

You can also pass `--force` as an additional argument to `deleteTable.sh` to skip the prompt to confirm you wish to delete
all the data. This will permanently delete all data held in the table, as well as metadata.
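
As an illustration, the sketch below builds the `deleteTable.sh` command line with or without `--force`. It assumes the flag is appended after the table name, which the text above does not state explicitly; the function name `delete_table_cmd` and its `force` mode argument are hypothetical. It prints the command rather than running it, since a forced delete cannot be undone.

```bash
# Hypothetical helper, not part of Sleeper: print the deleteTable.sh command.
# Assumption: --force goes after the table name. Paths are relative to the
# scripts directory, as in the example above.
delete_table_cmd() {
  local instance_id="$1" table_name="$2" mode="${3:-prompt}"
  local cmd=(./utility/deleteTable.sh "$instance_id" "$table_name")
  if [ "$mode" = "force" ]; then
    # Skips the confirmation prompt; data and metadata are deleted permanently
    cmd+=(--force)
  fi
  echo "${cmd[*]}"
}
```

Calling `delete_table_cmd <instance-id> <table-name>` prints the interactive form, and adding `force` as a third argument prints the form that skips the confirmation prompt.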

## Reinitialise a table

Reinitialising a table means deleting all its contents. This can sometimes be useful when you are experimenting
with Sleeper or if you created a table with the wrong schema.

You can reinitialise the table quickly by running the following command:

```bash
./scripts/utility/reinitialiseTable.sh <instance-id> <table-name> <optional-delete-partitions-true-or-false> <optional-split-points-file-location> <optional-split-points-file-base64-encoded-true-or-false>
```

For example

```bash
./scripts/utility/reinitialiseTable.sh sleeper-my-sleeper-config my-sleeper-table true /tmp/split-points.txt false
```

If you want to change the table schema you'll need to change it directly in the table properties file in the S3 config
bucket, and then reinitialise the table. An alternative is to delete the table and create a new table with the same
name.
68 changes: 2 additions & 66 deletions docs/06-status.md
@@ -43,7 +43,8 @@ All status reports can be run using the scripts in the `utility` directory, [her
your Sleeper instance id. Some of the reports also require a table name. Some offer a standard option and a verbose
option.

The available reports are as follows, with the corresponding commands to run them:
The available reports are as follows. They can be accessed through the admin client
with `./scripts/utility/adminClient.sh ${INSTANCE_ID}`, or with the commands below:

| Report Name | Description | Command | Defaults |
|-------------------------------|---------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------|
@@ -74,68 +75,3 @@ Here's an example:
java -cp scripts/jars/clients-*-utility.jar \
sleeper.clients.status.report.RetryMessages ${INSTANCE_ID} ingest 1000
```

## Pausing and Restarting the System

If there is no ingest in progress, and all compactions have completed, then Sleeper will go to sleep, i.e. the only
significant ongoing charges are for data storage. However, there are several lambda functions that are scheduled to
run periodically using EventBridge rules. These lambda functions look for work to do, such as compactions to run.
The execution of these should have very small cost, but it is best practice to pause the system,
i.e. turn these rules off, if you will not be using it for a while. Note that the system can still be queried when
it is paused.

```bash
# Pause the System
./scripts/utility/pauseSystem.sh ${INSTANCE_ID}

# Restart the System
./scripts/utility/restartSystem.sh ${INSTANCE_ID}
```

## Reinitialise a Table

Reinitialising a table means deleting all its contents. This can sometimes be useful when you are experimenting
with Sleeper or if you created a table with the wrong schema.

You can reinitialise the table quickly by running the following command:

```bash
./scripts/utility/reinitialiseTable.sh <instance-id> <table-name> <optional-delete-partitions-true-or-false> <optional-split-points-file-location> <optional-split-points-file-base64-encoded-true-or-false>
```

For example

```bash
./scripts/utility/reinitialiseTable.sh sleeper-my-sleeper-config my-sleeper-table true /tmp/split-points.txt false
```

If you want to change the table schema make sure you change the schema in the table properties file in the S3
config bucket.

## Sleeper Administration Client

We have provided a command line client that will enable you to:

1) List Sleeper instance properties
2) List Sleeper table names
3) List Sleeper table properties
4) Change an instance/table property

This client will prompt you for things like your instance id as mentioned above and/or
the name of the table you want to look at, the name of the property you want to update and its new value.

To run this client you can run the following command:

```bash
./scripts/utility/adminClient.sh ${INSTANCE_ID}
```

## Compact all files

If you want to fully compact all files in leaf partitions, but the compaction strategy is not compacting files in a
partition, you can run the following script to force compactions to be created for files in leaf partitions that were
skipped by the compaction strategy:

```bash
./scripts/utility/compactAllFiles.sh ${NSTANCE_ID}
```
