adding minio info to readme #8

Merged
merged 1 commit into from
Apr 26, 2024
2 changes: 2 additions & 0 deletions data-processing-lib/doc/using_s3_transformers.md
@@ -80,6 +80,8 @@ mc cp --recursive universal/tokenization/test-data/ds02/input/ local/test/tokeni
*Note* that once the data is copied, MinIO stores it on the local file system, so you do not need to
copy it again after a cluster restart.

## Creating an access key and secret key for MinIO

The last step is to create an access key and a secret key for MinIO. The following command:

```shell
...
```
3 changes: 3 additions & 0 deletions transforms/code/code_quality/README.md
@@ -76,4 +76,7 @@ the following command line arguments are available in addition to
* "--tokenizer" - the tokenizer used to convert the data into tokens. The default tokenizer is `codeparrot/codeparrot`.
* "--hf_token" - the Hugging Face auth token used to download the tokenizer. This option is only required for tokenizers whose access is restricted on Hugging Face.
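
The two options above could be declared roughly as follows. This is a minimal sketch using Python's standard `argparse`, not the transform's actual code: the `build_parser` name is hypothetical, and the `None` default for `--hf_token` is an assumption (the README only states the default for `--tokenizer`).

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser covering only the two options listed above."""
    parser = argparse.ArgumentParser(description="illustrative code_quality options")
    parser.add_argument(
        "--tokenizer",
        type=str,
        default="codeparrot/codeparrot",  # default stated in the README
        help="tokenizer used to convert the data into tokens",
    )
    parser.add_argument(
        "--hf_token",
        type=str,
        default=None,  # assumption: no token unless the tokenizer is restricted
        help="Hugging Face auth token, needed only for restricted tokenizers",
    )
    return parser


# Parse an explicit (empty) argument list so the example does not read sys.argv.
args = build_parser().parse_args([])
```

With no arguments, `args.tokenizer` falls back to `codeparrot/codeparrot`, so a token only has to be supplied when a restricted tokenizer is selected.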

## Executing S3 examples

To execute the S3 examples, refer to this [document](../../../data-processing-lib/doc/using_s3_transformers.md)
for instructions on setting up MinIO and `mc` before running the example.
5 changes: 5 additions & 0 deletions transforms/code/malware/README.md
@@ -148,3 +148,8 @@ the following command line arguments are available in addition to
--malware_output_column MALWARE_OUTPUT_COLUMN
output column name
```

### Executing S3 examples

To execute the S3 examples, refer to this [document](../../../data-processing-lib/doc/using_s3_transformers.md)
for instructions on setting up MinIO and `mc` before running the example.
7 changes: 7 additions & 0 deletions transforms/code/proglang_select/README.md
@@ -61,3 +61,10 @@ the following command line arguments are available in addition to
secret_key: secret key help text
url: optional s3 url
region: optional s3 region
```
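
The help text above lists `secret_key`, `url`, and `region` as credential fields, with the last two optional. A minimal sketch of building and validating such a credentials dictionary is shown below. The function names are hypothetical, the `access_key` field is an assumption (it is not visible in this hunk), and parsing with `ast.literal_eval` is one plausible way to read a dictionary passed as a command-line string, not necessarily what the transform does.

```python
import ast
from typing import Optional


def make_s3_credentials(
    access_key: str,
    secret_key: str,
    url: Optional[str] = None,
    region: Optional[str] = None,
) -> dict:
    """Build a credentials dict, leaving out the optional fields when unset."""
    cred = {"access_key": access_key, "secret_key": secret_key}
    if url is not None:
        cred["url"] = url
    if region is not None:
        cred["region"] = region
    return cred


def parse_s3_credentials(text: str) -> dict:
    """Parse a dict literal from a string and check the required keys."""
    cred = ast.literal_eval(text)
    if not isinstance(cred, dict):
        raise ValueError("credentials must be a dictionary literal")
    missing = {"access_key", "secret_key"} - set(cred)
    if missing:
        raise ValueError(f"missing credential keys: {sorted(missing)}")
    return cred


# Placeholder values only; a local MinIO endpoint would go in `url`.
example = make_s3_credentials("my-access-key", "my-secret-key", url="http://localhost:9000")
```

Leaving out unset optional fields keeps the dictionary small and lets the consumer apply its own defaults for `url` and `region`.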

### Executing S3 examples

To execute the S3 examples, refer to this [document](../../../data-processing-lib/doc/using_s3_transformers.md)
for instructions on setting up MinIO and `mc` before running the example.

5 changes: 5 additions & 0 deletions transforms/universal/doc_id/Readme.md
@@ -91,3 +91,8 @@ the following command line arguments are available in addition to
Compute unique integer id and place in the given named column
```
These correspond to the configuration keys described above.
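
The help text above describes computing a unique integer id and placing it in a named column. A minimal single-process sketch of that idea follows. The `add_int_ids` name, the `int_id` column, and the `start` parameter are hypothetical. A distributed run would also need a shared counter or per-worker id ranges to keep ids unique, which this sketch does not attempt.

```python
from itertools import count
from typing import Iterable


def add_int_ids(rows: Iterable[dict], column: str, start: int = 0) -> list[dict]:
    """Return copies of the rows with a sequential integer id under `column`."""
    ids = count(start)
    # Copy each row so the caller's input is left unchanged.
    return [{**row, column: next(ids)} for row in rows]


rows = [{"contents": "first doc"}, {"contents": "second doc"}]
with_ids = add_int_ids(rows, column="int_id")
```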

### Executing S3 examples

To execute the S3 examples, refer to this [document](../../../data-processing-lib/doc/using_s3_transformers.md)
for instructions on setting up MinIO and `mc` before running the example.
4 changes: 4 additions & 0 deletions transforms/universal/ededup/Readme.md
@@ -102,4 +102,8 @@ the following command line arguments are available in addition to

These correspond to the configuration keys described above.

### Executing S3 examples

To execute the S3 examples, refer to this [document](../../../data-processing-lib/doc/using_s3_transformers.md)
for instructions on setting up MinIO and `mc` before running the example.

5 changes: 5 additions & 0 deletions transforms/universal/fdedup/Readme.md
@@ -210,3 +210,8 @@ the following command line arguments are available in addition to
```

These correspond to the configuration keys described above.

### Executing S3 examples

To execute the S3 examples, refer to this [document](../../../data-processing-lib/doc/using_s3_transformers.md)
for instructions on setting up MinIO and `mc` before running the example.
6 changes: 6 additions & 0 deletions transforms/universal/filter/README.md
@@ -258,3 +258,9 @@ the following command line arguments are available in addition to
logical operator (AND or OR) that joins filter criteria

```
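
The help text above mentions a logical operator (AND or OR) that joins the filter criteria. One way to picture this is a helper that combines several criteria strings into a single expression. This is a sketch only: the `join_criteria` name is hypothetical, and treating each criterion as a SQL-like expression string is an assumption about how the criteria are written.

```python
def join_criteria(criteria: list[str], operator: str = "AND") -> str:
    """Combine criteria into one expression joined by AND or OR."""
    op = operator.strip().upper()
    if op not in ("AND", "OR"):
        raise ValueError(f"unsupported logical operator: {operator!r}")
    if not criteria:
        raise ValueError("at least one criterion is required")
    # Parenthesize each criterion so operator precedence inside it is preserved.
    return f" {op} ".join(f"({c})" for c in criteria)


# Column names here are placeholders, not columns the transform defines.
expression = join_criteria(["word_count > 100", "score < 0.5"], "AND")
```

With `AND`, a row must satisfy every criterion to be kept; with `OR`, satisfying any one of them is enough.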
### Executing S3 examples

To execute the S3 examples, refer to this [document](../../../data-processing-lib/doc/using_s3_transformers.md)
for instructions on setting up MinIO and `mc` before running the example.
4 changes: 4 additions & 0 deletions transforms/universal/noop/README.md
@@ -63,6 +63,10 @@ In addition, there are some useful `make` targets (see conventions above):
* `make help` - displays the available `make` targets and help text.


## Executing S3 examples

To execute the S3 examples, refer to this [document](../../../data-processing-lib/doc/using_s3_transformers.md)
for instructions on setting up MinIO and `mc` before running the example.



5 changes: 5 additions & 0 deletions transforms/universal/tokenization/README.md
@@ -97,3 +97,8 @@ the following command line arguments are available in addition to
--tkn_chunk_size TKN_CHUNK_SIZE
Specify >0 value to tokenize each row/doc in chunks of characters (rounded in words)
```
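
The `--tkn_chunk_size` help text above describes splitting each row/doc into chunks of characters rounded to words. A minimal sketch of that kind of splitting is shown below. The `split_in_chunks` name is hypothetical, and this version rounds down, so a chunk never exceeds the size unless a single word is longer. The help text does not say which way the transform rounds. Splitting on whitespace also normalizes runs of spaces, which the real transform may not do.

```python
def split_in_chunks(text: str, chunk_size: int) -> list[str]:
    """Split text into chunks of at most chunk_size characters on word boundaries."""
    if chunk_size <= 0:
        # A non-positive size means no chunking: the whole text is one chunk.
        return [text]
    chunks: list[str] = []
    current: list[str] = []
    current_len = 0
    for word in text.split():
        # Account for the joining space when the chunk already has words.
        added = len(word) + (1 if current else 0)
        if current and current_len + added > chunk_size:
            chunks.append(" ".join(current))
            current = [word]
            current_len = len(word)
        else:
            current.append(word)
            current_len += added
    if current:
        chunks.append(" ".join(current))
    return chunks


chunks = split_in_chunks("the quick brown fox jumps", 10)
```

Each chunk can then be tokenized on its own and the resulting token lists concatenated, which bounds the size of the input passed to the tokenizer in a single call.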

### Executing S3 examples

To execute the S3 examples, refer to this [document](../../../data-processing-lib/doc/using_s3_transformers.md)
for instructions on setting up MinIO and `mc` before running the example.