ubc-cirrus-lab · arshiamoghimi · Oct 21, 2023 · Jul 29, 2023
diff --git a/CONTIBUTION.md b/CONTIBUTION.md
@@ -0,0 +1,59 @@
+# Welcome to the Parrotfish contributing guide <!-- omit in toc -->
+
+Thank you for investing your time in contributing to our project! :sparkles:
+
+## New contributor guide
+
+To get an overview of the project, read the [README](README.md). Here are some resources to help you get started:
+
+- [Set up Git](https://docs.github.com/en/get-started/quickstart/set-up-git)
+- [GitHub flow](https://docs.github.com/en/get-started/quickstart/github-flow)
+- [Collaborating with pull requests](https://docs.github.com/en/github/collaborating-with-pull-requests)
+- [AWS Lambda Documentation](https://docs.aws.amazon.com/lambda/latest/dg/welcome.html)
+- [Google Cloud Functions documentation](https://cloud.google.com/functions/docs)
+
+
+## Getting started
+
+### Steps to run parrotfish while making changes:
+1. Create and activate a virtualenv.
+```bash
+python3 -m venv src-env
+source src-env/bin/activate
+```
+
+2. Install required packages.
+```bash
+pip install -r requirements.txt -r requirements-dev.txt
+```
+
+3. Install Parrotfish as an editable package.
+```bash
+export PACKAGE_VERSION="dev"
+pip install -e .
+```
+
+4. Run it!
+```bash
+parrotfish -h
+```
+
+## Running tests:
+```bash
+pytest -v --cov 
+```
+
+## Versioning:
+We use [commitizen](https://commitizen-tools.github.io/commitizen/) to generate the CHANGELOG.md, bump the version and create a tag.
+
+### Configuring commitizen:
+```bash
+cz init
+```
+
+### Bump version:
+```bash
+cz bump
+```
+A tag is created with the new version. Pushing this tag triggers a packaging pipeline that creates
+a release, packages the parrotfish tool, and attaches the package to the release as an asset.
diff --git a/README.md b/README.md
@@ -9,9 +9,15 @@ You can learn more about the architecture and performance of Parrotfish in our r
 ## Setup
 
 ### Requirements
-- Python 3
-- AWS CLI (configured with `aws configure`)
-- MongoDB (by default, accessible on `localhost:27017`)
+- Python >= 3.8
+
+#### Requirements to run Parrotfish for AWS Lambda functions:
+- AWS CLI (configured with `aws configure`)
+
+#### Requirements to run Parrotfish for Google Cloud Functions:
+- gcloud CLI (authenticate with your credentials: `gcloud auth application-default login`)
+- The Cloud Billing API must be enabled in your account.
+
 
 ### Steps
 1. Create and activate a virtualenv.
@@ -20,48 +26,24 @@ python3 -m venv src-env
 source src-env/bin/activate
 ```
 
-2. Install required packages.
-```
-pip install -r requirements.txt
-```
-
-3. Install SPOT as an editable package.
+2. Install the parrotfish package from the latest release.
 ```bash
-pip install -e .
+pip install ${path to parrotfish-version.whl}
 ```
 
-4. Run it!
-```bash
-parrotfish
-```
+3. Create the parrotfish configuration file.  
+See the [configuration JSON object](src/configuration/README.md) for the available configuration options.
 
-## Running new benchmark fucntions
-### Add a new function
-Follow the instructions [here](src/serverless_functions/README.md)
-### Prepare and train
-1. Profile to get initial data 
+4. Run it!
 ```bash
-parrotfish <function_name> -p
+parrotfish ${path to the configuration file}
 ```
-2. fetch new logs from CloudWatch
-```bash
-parrotfish <function_name> -f
-```
-3. train with selected model
-```bash
-parrotfish <function_name> -tm polynomial
-```
-4. You can get recommendation without updating config file at or after the previous step with `-r`
-5. update the config file and calculate error rate
-```bash
-parrotfish -um polynomial
-```
-Graphs for error and prediction vs epoch can be found corresponding folders in `serverless_functions/<function>/`
-
-### Profiling alternative
-To invoke only with the configurations defined in `config.json`, use `-i` flag
-```bash
-parrotfish <function_name> -i
+```text
+optional arguments:
+  -h, --help            show this help message and exit
+  --path PATH, -p PATH  Path to the configuration file
+  --verbose, -v         Set the logging level to INFO
+  --apply               Apply optimized configuration
 ```
 
 ## Acknowledgments

diff --git a/benchmarks/README.md b/benchmarks/README.md
diff --git a/cz.json b/cz.json
@@ -0,0 +1,9 @@
+{
+  "commitizen": {
+    "name": "cz_conventional_commits",
+    "tag_format": "$version",
+    "version_scheme": "semver",
+    "version": "0.0.1",
+    "update_changelog_on_bump": true
+  }
+}
diff --git a/setup.py b/setup.py
@@ -2,7 +2,7 @@
 
 from setuptools import setup, find_packages
 
-version = os.environ['PACKAGE_VERSION']
+version = os.environ["PACKAGE_VERSION"]
 
 with open("README.md", "r") as f:
     long_description = f.read()
@@ -13,7 +13,7 @@
 with open("requirements-dev.txt", "r") as rdev:
     requirements_dev = rdev.read()
 
-#TODO: Update licence attribute
+# TODO: Update licence attribute
 setup(
     name="parrotfish",
     version=version,

diff --git a/src/configuration/README.md b/src/configuration/README.md
@@ -0,0 +1,70 @@
+# Parrotfish configuration file:
+
+## Basic Configuration:
+
+```
+{
+    "function_name": The serverless function's name (Required),
+    "vendor": The cloud provider "AWS" or "GCP" (Required),
+    "region": The serverless function's region (Required),
+    "payload": Payload to invoke the serverless function with (Required if payloads attribute not provided),
+    "payloads": [
+        {
+            "payload": Payload to invoke the serverless function with (Required),
+            "weight": Impact of the exploration with this payload on the weighted average cost. (Required; must be in [0, 1]),
+            "execution_time_threshold": The execution time threshold constraint. We leverage the execution time model to recommend a configuration 
+                                        that minimizes cost while adhering to the specified execution time constraint. (Optional),
+        }...
+    ] (Constraint: sum of weights must be equal to 1!),
+}
+```
+
+## Advanced configuration:
+```
+{
+    ...
+
+    "memory_bounds": Array containing two memory values that represent the memory configuration bounds (Optional),
+    "termination_threshold": When the knowledge value for the optimal memory configuration reaches this threshold the recommendation algorithm terminates. (Optional),
+    "max_sample_count": The maximum size of the sample. (Optional),
+    "number_invocations": The minimum number of invocations per iteration. (Optional),
+    "dynamic_sampling_params": {
+        "max_sample_count": The maximum number of samples gathered through dynamic sampling,
+        "coefficient_of_variation_threshold": We sample dynamically until the measurements are consistent enough. Consistency is measured by the coefficient of variation, 
+                                              and dynamic sampling terminates once the calculated coefficient of variation reaches this threshold,
+    } (Optional),
+    "max_number_of_invocation_attempts": The maximum number of attempts per invocation; an error is raised when this number is reached. (Optional),
+    "execution_time_threshold":  The execution time threshold constraint. We leverage the execution time model to recommend a configuration that minimizes cost while adhering to the specified
+                                 execution time constraint. In case of multiple payloads, this value is applied to all payloads for which no execution_time_threshold attribute is present.
+}
+```
+
+
+## Example single payload:
+```json
+{
+    "function_name": "example_function",
+    "vendor": "AWS",
+    "region": "example_region",
+    "payload": "payload"
+}
+```
+
+## Example multiple payloads:
+```json
+{
+    "function_name": "example_function",
+    "vendor": "AWS",
+    "region": "example_region",
+    "payloads": [
+      {
+        "payload": "payload",
+        "weight": 0.3
+      },
+      {
+        "payload": "payload",
+        "weight": 0.7
+      }
+    ]
+}
+```
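Note on the configuration README above: the constraints it states (required attributes, weights in [0, 1] that sum to 1, and the top-level `execution_time_threshold` acting as a fallback) could be checked with something like the sketch below. This is only an illustration under stated assumptions, not Parrotfish's actual loader: the function name `validate_config` is hypothetical, treating a single `payload` as one entry with weight 1.0 is an assumption, and reading the fallback rule as "per payload that lacks its own threshold" is one interpretation of the README text.

```python
import math


def validate_config(config: dict) -> list:
    """Hypothetical helper: check the documented constraints and return payload entries.

    Not part of Parrotfish; it only mirrors the rules described in the README.
    """
    # The three attributes marked (Required) in the basic configuration.
    for key in ("function_name", "vendor", "region"):
        if key not in config:
            raise ValueError(f"missing required attribute: {key}")
    if config["vendor"] not in ("AWS", "GCP"):
        raise ValueError('vendor must be "AWS" or "GCP"')

    if "payloads" in config:
        # Copy each entry so the caller's configuration is left untouched.
        entries = [dict(entry) for entry in config["payloads"]]
        for entry in entries:
            if "payload" not in entry or "weight" not in entry:
                raise ValueError("each payloads entry needs 'payload' and 'weight'")
            if not 0 <= entry["weight"] <= 1:
                raise ValueError("weight must be in [0, 1]")
        total = sum(entry["weight"] for entry in entries)
        # Compare with a tolerance because weights are floating-point values.
        if not math.isclose(total, 1.0):
            raise ValueError(f"sum of weights must be equal to 1, got {total}")
    elif "payload" in config:
        # Assumption: a single payload behaves like one entry with weight 1.0.
        entries = [{"payload": config["payload"], "weight": 1.0}]
    else:
        raise ValueError("either 'payload' or 'payloads' must be provided")

    # Assumed reading of the README: the top-level threshold applies only to
    # entries that do not define their own execution_time_threshold.
    default_threshold = config.get("execution_time_threshold")
    if default_threshold is not None:
        for entry in entries:
            entry.setdefault("execution_time_threshold", default_threshold)
    return entries
```

Calling `validate_config` on the "Example multiple payloads" object would return two entries with weights 0.3 and 0.7, while a pair of weights such as 0.3 and 0.3 would raise `ValueError`.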
diff --git a/src/parrotfish.py b/src/parrotfish.py
@@ -62,8 +62,12 @@ def __init__(self, config: any):
     def optimize(self, apply: bool = None) -> None:
         collective_costs = np.zeros(len(self.explorer.memory_space))
         min_memories = []
+        i = 1
 
         for entry in self.config.payloads:
+            if len(self.config.payloads) != 1:
+                print(f"Explorations for payload {i}:")
+                i += 1
             # Run recommender for the specific payload
             min_memories.append(self._optimize_one_payload(entry, collective_costs))
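Note on the hunk above: the manual counter `i` could also be expressed with `enumerate`, which keeps the numbering next to the loop. The sketch below shows the same labeling logic in isolation; `exploration_headers` is a hypothetical name used only for illustration, and it returns the header strings instead of printing them so the behavior is easy to check.

```python
def exploration_headers(payloads: list) -> list:
    """Hypothetical helper: build the per-payload header lines.

    Mirrors the hunk's behavior: no header when there is exactly one payload,
    otherwise one 1-based header per payload.
    """
    if len(payloads) == 1:
        return []
    # enumerate(..., start=1) replaces the manually incremented counter.
    return [
        f"Explorations for payload {index}:"
        for index, _ in enumerate(payloads, start=1)
    ]
```

Inside `optimize`, the equivalent change would be iterating with `for index, entry in enumerate(self.config.payloads, start=1)` and printing the header when more than one payload is configured.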