Add KO_DATA_DATE_EPOCH env var to set the modification time for files in kodata #372

Merged: 2 commits, Jun 15, 2021

skirsten (Contributor) commented May 30, 2021

Example:

export KO_DATA_DATE_EPOCH=$(git log -1 --format='%ct')
ko publish .

This allows static file servers mounted at KO_DATA_PATH to send the Last-Modified header and handle revalidation requests.
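
To illustrate the revalidation behavior described above, here is a minimal Go sketch. It is not code from this PR: the `revalidate` helper, the file name, and the sample epoch are hypothetical, and it uses `http.ServeContent` with an explicit modification time in place of a real `http.FileServer` over `KO_DATA_PATH`.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
	"time"
)

// revalidate (hypothetical helper) serves a fixed body whose modification
// time is the given Unix epoch. It returns the Last-Modified header of a
// plain GET and the status code of a follow-up conditional GET.
func revalidate(epoch int64) (lastModified string, status int) {
	modTime := time.Unix(epoch, 0)
	handler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		// ServeContent derives Last-Modified from modTime and evaluates
		// If-Modified-Since on conditional requests.
		http.ServeContent(w, r, "index.html", modTime, strings.NewReader("<h1>hello</h1>"))
	})

	// First request: no validators, so the full body is served.
	first := httptest.NewRecorder()
	handler.ServeHTTP(first, httptest.NewRequest(http.MethodGet, "/index.html", nil))
	lastModified = first.Header().Get("Last-Modified")

	// Second request: replay the validator the way a client cache would.
	req := httptest.NewRequest(http.MethodGet, "/index.html", nil)
	req.Header.Set("If-Modified-Since", lastModified)
	second := httptest.NewRecorder()
	handler.ServeHTTP(second, req)
	return lastModified, second.Code
}

func main() {
	// 0 stands for a zeroed timestamp; 1622505600 is an arbitrary sample epoch.
	for _, epoch := range []int64{0, 1622505600} {
		lm, status := revalidate(epoch)
		fmt.Printf("epoch=%d Last-Modified=%q revalidation status=%d\n", epoch, lm, status)
	}
}
```

With a zero (Unix epoch) modification time, `net/http` treats the time as unknown and omits `Last-Modified`, so the follow-up request gets a full 200 response. With a real timestamp the conditional request is answered with 304 Not Modified.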

@google-cla google-cla bot added the cla: yes label May 30, 2021
codecov-commenter commented Jun 1, 2021

Codecov Report

Merging #372 (1c1f3b4) into main (2ba8bb2) will increase coverage by 0.12%.
The diff coverage is 62.96%.

@@            Coverage Diff             @@
##             main     #372      +/-   ##
==========================================
+ Coverage   43.92%   44.04%   +0.12%     
==========================================
  Files          33       33              
  Lines        1589     1605      +16     
==========================================
+ Hits          698      707       +9     
- Misses        764      769       +5     
- Partials      127      129       +2     
| Impacted Files | Coverage Δ |
|---|---|
| pkg/build/options.go | 79.31% <0.00%> (-9.16%) ⬇️ |
| pkg/commands/resolver.go | 28.49% <20.00%> (-0.23%) ⬇️ |
| pkg/commands/config.go | 54.87% <75.00%> (+1.12%) ⬆️ |
| pkg/build/gobuild.go | 62.63% <86.66%> (+0.59%) ⬆️ |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 2ba8bb2...1c1f3b4.

imjasonh (Member) commented Jun 1, 2021

Nice! I really like this.

I'd like to add more to the README to describe why you might want to do this, and what kinds of values you might want to give it. Depending on the use case, users might want to report a modtime of the last Git commit, or the time the image was built (KO_DATA_DATE_EPOCH=$(date +%s)), or something else. Are you planning to use this in your own build process?

Would it make sense to have a special value for the env var that we interpret to mean "passthrough the files' actual modtimes"? (That wouldn't have to happen in this PR, just curious if it might be useful)

jonjohnsonjr (Collaborator) commented

Would it make sense to just use SOURCE_DATE_EPOCH to set these values instead? Is there any situation where you want the container's created time to be different from the kodata and binary timestamps?

skirsten (Contributor, Author) commented Jun 1, 2021

Hi @imjasonh and @jonjohnsonjr,
thanks for your feedback.

> I'd like to add more to the README to describe why you might want to do this, and what kinds of values you might want to give it.

Yes, I can do that.

> Are you planning to use this in your own build process?

I am using this (fork) in a GitHub Action to deploy an API that also has some static assets. Some files are generated at build time and served directly from kodata using http.FileServer, and clients revalidate their local caches.

> Would it make sense to have a special value for the env var that we interpret to mean "passthrough the files' actual modtimes"?

For me at least, this would not make sense. Most of my builds and deploys happen in pipelines so the actual modtimes will always be the "time of checkout" in the pipeline. But maybe for users building directly on their machine this might be useful.

> Would it make sense to just use SOURCE_DATE_EPOCH to set these values instead? Is there any situation where you want the container's created time to be different from the kodata and binary timestamps?

Yes, let's say we have an app that requires some sort of static storage (e.g. an index) baked into the container.
The container is automatically rebuilt and redeployed on a cron schedule, and the data may or may not change between deploys.

I would expect that only the modtime of the entries in the tar file of the kodata layer will be updated, based on the env var the pipeline sets. The pipeline could be built to detect the last-modified timestamp of the input data for the index and supply that via the env var.
I want to keep the reproducibility (and 0 timestamp) of the binary layer and the metadata.

So in this case, the changes and pushes required are minimized to the one kodata layer.
Depending on where the image is deployed the platform might choose to skip the deploy, in the case that the image hash did not change.
Furthermore this keeps the CI completely stateless and does not require any information about previous runs or existing images.
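
The reproducibility argument above can be sketched in Go. This is an illustration under stated assumptions, not ko's implementation: `layerDigest`, the entry name `kodata/index.json`, and the sample epochs are hypothetical, and the digest is taken over an uncompressed tar stream, whereas a registry would hash the layer blob as it is actually stored.

```go
package main

import (
	"archive/tar"
	"bytes"
	"crypto/sha256"
	"fmt"
	"time"
)

// layerDigest (hypothetical helper) writes a single file into an in-memory
// tar archive, stamps the entry with the given Unix epoch, and returns the
// hex-encoded SHA-256 of the archive bytes.
func layerDigest(content string, epoch int64) (string, error) {
	var buf bytes.Buffer
	tw := tar.NewWriter(&buf)
	hdr := &tar.Header{
		Name:     "kodata/index.json", // illustrative entry name
		Typeflag: tar.TypeReg,
		Mode:     0444,
		Size:     int64(len(content)),
		ModTime:  time.Unix(epoch, 0), // the value a pipeline would supply
	}
	if err := tw.WriteHeader(hdr); err != nil {
		return "", err
	}
	if _, err := tw.Write([]byte(content)); err != nil {
		return "", err
	}
	if err := tw.Close(); err != nil {
		return "", err
	}
	return fmt.Sprintf("%x", sha256.Sum256(buf.Bytes())), nil
}

func main() {
	const data = `{"entries": 3}`

	// Two builds with the same data and the same epoch.
	first, err := layerDigest(data, 1622505600)
	if err != nil {
		panic(err)
	}
	second, err := layerDigest(data, 1622505600)
	if err != nil {
		panic(err)
	}
	// A build where only the supplied epoch moved forward by one day.
	later, err := layerDigest(data, 1622592000)
	if err != nil {
		panic(err)
	}

	fmt.Println("same data, same epoch, same digest:", first == second)
	fmt.Println("same data, new epoch, same digest: ", first == later)
}
```

Because the modification time is part of each tar header, the digest stays stable only while both the data and the supplied epoch stay the same. That is why deriving the epoch from the input data's last-modified time, rather than from the build time, avoids needless pushes.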

jonjohnsonjr (Collaborator) commented

> I am using this (fork) in a GitHub Action to deploy an API that also has some static assets. Some files are generated at build time and served directly using http.FileServer from kodata and clients revalidate their local caches.

> I want to keep the reproducibility (and 0 timestamp) of the binary layer and the metadata.

Brilliant. This makes a lot of sense. Thanks for the explanation.

README.md Outdated
@@ -344,6 +344,8 @@ or to the latest git commit's timestamp with:
export SOURCE_DATE_EPOCH=$(git log -1 --format='%ct')
```

The same applies to `KO_DATA_DATE_EPOCH` which sets the last modified time of all files in `kodata`.
Collaborator

Can you explain a little more why this is useful (e.g. mention modtime for https://golang.org/pkg/net/http/#ServeFile)?

skirsten (Contributor, Author) commented Jun 11, 2021

I improved the docs a bit.
As for examples for the timestamp source I could not think of anything useful.
Maybe I could add something more complex like https://stackoverflow.com/questions/4561895/how-to-recursively-find-the-latest-modified-file-in-a-directory.

Let me know what you think.

@skirsten skirsten force-pushed the kodata-date-epoch branch from 6dbd6cb to 46ff248 Compare June 11, 2021 21:08
@skirsten skirsten force-pushed the kodata-date-epoch branch from 46ff248 to 1c1f3b4 Compare June 11, 2021 21:09
@jonjohnsonjr jonjohnsonjr merged commit ee23538 into ko-build:main Jun 15, 2021
jonjohnsonjr (Collaborator) commented

Thanks!
