diff --git a/patchwork/steps/GenerateEmbeddings/README.md b/patchwork/steps/GenerateEmbeddings/README.md
index 45400c437..66c267c7f 100644
--- a/patchwork/steps/GenerateEmbeddings/README.md
+++ b/patchwork/steps/GenerateEmbeddings/README.md
@@ -1,19 +1,19 @@
-## `patchwork/steps/GenerateEmbeddings/GenerateEmbeddings.py`
+# Summary of `GenerateEmbeddings.py`
 
-### Inputs:
-- `inputs` dictionary with keys `"embedding_name"` and `"documents"`.
+## Inputs:
+- The code imports necessary modules and functions.
+- Defines a function `filter_by_extension` to filter files by their extensions.
+- Defines a function `split_text` to split a document text into chunks.
+- Creates a class `GenerateEmbeddings` that inherits from `Step`.
+  - Constructor `__init__` initializes the class instance with required data.
+  - Method `run` processes documents for embedding generation.
 
-### Code:
-- Defines `filter_by_extension` function to filter files by extension.
-- Defines `split_text` function to chunk text based on given parameters.
-- Class `GenerateEmbeddings(Step)` inheriting from `Step`.
-- Checks for required keys in the input dictionary.
-- Initializes the step with input data and sets up a client connection to a vector database.
-- Runs the step by processing documents and embeddings, splitting document texts if needed, and upserting data into the vector database.
-
-### Outputs:
+## Outputs:
+- The `GenerateEmbeddings` class processes document texts and embeddings, generates embeddings, and saves them to a database collection.
 - Returns an empty dictionary.
 
-## `patchwork/steps/GenerateEmbeddings/__init__.py`
-
-- Empty file.
\ No newline at end of file
+## Usage:
+1. Import the `GenerateEmbeddings` class from the module.
+2. Create an instance of `GenerateEmbeddings` with the required input dictionary.
+3. Call the `run` method to generate embeddings for the provided documents and store them in the database collection.
+4. Receive the output dictionary indicating the completion of the embeddings generation process.
\ No newline at end of file
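
The README text in this diff names two helpers, `filter_by_extension` and `split_text`, without showing their behavior. The following is a minimal sketch of what such helpers might look like. The signatures, the `extensions`, `chunk_size`, and `overlap` parameters, and the character-based chunking strategy are all assumptions made for illustration; the actual code in `GenerateEmbeddings.py` may differ.

```python
from pathlib import Path


def filter_by_extension(file_path: str, extensions: list[str]) -> bool:
    """Return True if file_path ends with one of the given extensions.

    Hypothetical signature: the diff only says the function filters
    files by extension. Matching here is case-insensitive.
    """
    allowed = {ext.lower() for ext in extensions}
    return Path(file_path).suffix.lower() in allowed


def split_text(text: str, chunk_size: int = 1000, overlap: int = 100) -> list[str]:
    """Split text into character chunks of at most chunk_size.

    Hypothetical parameters: consecutive chunks share `overlap`
    characters so that content cut at a boundary appears in both.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap
    chunks: list[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current chunk reaches the end of the text,
        # so no trailing chunk is fully contained in the previous one.
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks
```

In the step itself, per the removed `### Code:` lines, chunks like these would be upserted into a vector database collection; that part is omitted here because the diff does not name the client library. The removed `### Inputs:` line indicates the step expects an `inputs` dictionary with the keys `"embedding_name"` and `"documents"`.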