Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable building specified components #265

Merged
merged 14 commits into from
Jul 6, 2023

Conversation

PhilippeMoussalli
Copy link
Contributor

PR that enables specifying components during the build and tagging process, this makes it easier to build and push specific components during remote development.

Copy link
Member

@RobbeSneyders RobbeSneyders left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @PhilippeMoussalli

I would like to be able to build the build_components.sh script for example components as well. So I would propose to add the following arguments:

  • -d, --components-dir directory with components to build, defaults to components
  • -co, --component specific component to build in the components dir, defaults to all

Note that currently we are passing in the first tag as the fondant version as well, which is used as pip install git+https://github.com/ml6team/fondant@$tag. So this will not work with arbitrary tags. They actually need to be refs available on the github repository. This script was made for usage in CI/CD and I'm not sure how well it will work outside it.

scripts/build_components.sh Outdated Show resolved Hide resolved
@PhilippeMoussalli
Copy link
Contributor Author

Thanks @PhilippeMoussalli

I would like to be able to build the build_components.sh script for example components as well. So I would propose to add the following arguments:

  • -d, --components-dir directory with components to build, defaults to components
  • -co, --component specific component to build in the components dir, defaults to all

Note that currently we are passing in the first tag as the fondant version as well, which is used as pip install git+https://github.com/ml6team/fondant@$tag. So this will not work with arbitrary tags. They actually need to be refs available on the github repository. This script was made for usage in CI/CD and I'm not sure how well it will work outside it.

I can add the component directory argument in this PR. Regarding the other issue, I am still tagging them with the latest release tag and then using the tag_component script to give them a desired tag and test them out (dev or latest). It's fine for now for testing but perhaps we should later on look into building directly from the current source code.

@RobbeSneyders
Copy link
Member

I can add the component directory argument in this PR. Regarding the other issue, I am still tagging them with the latest release tag and then using the tag_component script to give them a desired tag and test them out (dev or latest). It's fine for now for testing but perhaps we should later on look into building directly from the current source code.

Not sure if I understand what you mean with this, but outside of the CI/CD pipeline, we should never tag images with:

  • release versions
  • dev
  • latest

As these have special meaning in our release flow.

@PhilippeMoussalli PhilippeMoussalli force-pushed the enable-building-specified-components branch from bcd4fca to c08a6d8 Compare July 5, 2023 07:20
@PhilippeMoussalli
Copy link
Contributor Author

PhilippeMoussalli commented Jul 5, 2023

I can add the component directory argument in this PR. Regarding the other issue, I am still tagging them with the latest release tag and then using the tag_component script to give them a desired tag and test them out (dev or latest). It's fine for now for testing but perhaps we should later on look into building directly from the current source code.

Not sure if I understand what you mean with this, but outside of the CI/CD pipeline, we should never tag images with:

  • release versions
  • dev
  • latest

As these have special meaning in our release flow.

Updated to add the directory as an argument to enable building custom images.

I see, so if we want to avoid tagging image with the release_version we have to uncouple the fact that the tag is used to specify the fondant version to build and also specifies the image tag. For development before merging to main, we can give arbitrary tags (e.g. dev_philippe)

@RobbeSneyders
Copy link
Member

I don't think arbitrary tags will work, but (remote) commit hashes will.

Eg. if you run ./scripts/build_components.sh -t d23e23, it will:

  • Tag images with d23e23
  • Pin the fondant dependency to git+https://github.com/ml6team/fondant@d23e23
  • Update fondant_component.yaml to use the d23e23 image

@PhilippeMoussalli
Copy link
Contributor Author

I don't think arbitrary tags will work, but (remote) commit hashes will.

Eg. if you run ./scripts/build_components.sh -t d23e23, it will:

  • Tag images with d23e23
  • Pin the fondant dependency to git+https://github.com/ml6team/fondant@d23e23
  • Update fondant_component.yaml to use the d23e23 image

yes, the tagging I mentioned was mainly related to running the tag_component.sh script but I think this one is mainly reserved for releases to tag with the latest tag. In that case, we can still use the build_components.sh script for:

  • developing before merging to main (using the commit as the tag) -> requires running the script manually
  • start developing on the current main branch before release (using the dev as the tag) -> handled by CI/CD upon merge to main
  • using the latest images after the release (using the latest or release tag) -> handled by CI/CD upon release

Do you agree with this workflow? Anything else that needs to be done in this PR?

Copy link
Member

@RobbeSneyders RobbeSneyders left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the nits on the help message, but I want to make sure we are talking about the same things :)

echo " -t, --tag=<value> Tag to add to image, repeatable
The first tag is set in the component specifications"
echo " -h, --help Display this help message"
echo " -t, --tag=<value> Tag to add to image, repeatable
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
echo " -t, --tag=<value> Tag to add to image, repeatable
echo " -t, --tag <value> Tag to add to image, repeatable

I don't think we handle = correctly. This was already documented incorrectly in the past.

echo " -t, --tag=<value> Tag to add to image, repeatable
The first tag is set in the component specifications"
echo " -c, --cache <value> Use registry caching when building the components (default:false)"
echo " -d, --component-dirs <value> Directory where the component is built from the root directory (default:components)"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
echo " -d, --component-dirs <value> Directory where the component is built from the root directory (default:components)"
echo " -d, --component-dirs <value> Directory containing components to build as subdirectories. The path should be relative to the root directory (default:components)"

The first tag is set in the component specifications"
echo " -c, --cache <value> Use registry caching when building the components (default:false)"
echo " -d, --component-dirs <value> Directory where the component is built from the root directory (default:components)"
echo " -n, --namespace <value> Set the namespace (default: ml6team)"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
echo " -n, --namespace <value> Set the namespace (default: ml6team)"
echo " -n, --namespace <value> The namespace for the built images, should match the github organization (default: ml6team)"

-d |--components-dir ) components_dir="$2"; shift;;
-r |--repo) repo="$2"; shift;;
-t |--tag) tags+=("$2"); shift;;
-co|--component) component="$2"; shift;;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we make this repeatable so you can specify multiple components?

Comment on lines 12 to 14
echo " -co, --component <value> Set the component name. Pass the component folder name to build
a certain components or 'all' to build all components in the components
directory (default: all)"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
echo " -co, --component <value> Set the component name. Pass the component folder name to build
a certain components or 'all' to build all components in the components
directory (default: all)"
echo " -co, --component <value> Specific component to build. Pass the component subdirecotry name
to build a certain components or 'all' to build all components in the components
directory (default: all)"

pushd "$dir"

BASENAME=${dir%/}
BASENAME=${BASENAME##*/}
if [[ "$BASENAME" == "${component}" ]] || [[ "${component}" == "all" ]]; then
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we do this a bit different? Python pseudo code:

components = [dir in components_dir] if component == "all" else [component]
print(f"building components {components}")
for component in components:
    ...

I think this would be easier to understand and debug.

Copy link
Member

@RobbeSneyders RobbeSneyders left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @PhilippeMoussalli! Looks good to me apart from some formatting issues.

@@ -75,5 +99,6 @@ for dir in "$component_dir"/*/; do
--label org.opencontainers.image.source=https://github.com/${namespace}/${repo} \
.

popd
popd
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wrong indentation

cache_name=ghcr.io/${namespace}/${BASENAME}:build-cache
echo "Caching from/to ${cache_name}"
echo "Caching from/to ${cache_name}"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wrong indentation

@@ -61,10 +84,11 @@ for dir in "$component_dir"/*/; do
args+=(-t "$tag")
done

# Add cache arguments if caching is enabled
# # Add cache arguments if caching is enabled
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Double comment

*) echo "Unknown parameter passed: $1"; exit 1;;
esac; shift; done

# Check for required argument
if [ -z "${tags}" ]; then
if [ ${#tags[@]} -eq 0 ]; then
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why was this needed?

Copy link
Contributor Author

@PhilippeMoussalli PhilippeMoussalli Jul 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

both achieve the same purpose but was getting suggestion to change it in my IDE so I opted for checking if the array length is zero instead. I'll revert it back.
image

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My question was mainly meant to learn, not to ask you to revert 😅
But if they're equivalent, both are fine for me.

@RobbeSneyders RobbeSneyders linked an issue Jul 5, 2023 that may be closed by this pull request
@RobbeSneyders
Copy link
Member

Seems like you have a conflict @PhilippeMoussalli. Feel free to merge after you resolve it.

@PhilippeMoussalli PhilippeMoussalli merged commit bc5b7ce into main Jul 6, 2023
@PhilippeMoussalli PhilippeMoussalli deleted the enable-building-specified-components branch July 6, 2023 07:51
Hakimovich99 pushed a commit that referenced this pull request Oct 16, 2023
PR that enables specifying components during the build and tagging
process, this makes it easier to build and push specific components
during remote development.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Make build_components.sh script generic for example components
2 participants