Introduce feature tracking #13775

Closed
serathius opened this issue Mar 10, 2022 · 5 comments

@serathius
Member

serathius commented Mar 10, 2022

When looking through some recent reports of data inconsistencies in etcd v3.5, I was surprised that all of them were detected manually by users. Manual detection means that users find out quite late, when they no longer have logs from the event. It is very worrying that we are unable to effectively reproduce or fix such cases.

I wondered why etcd doesn't have any mechanism to detect data inconsistencies. However, after more digging I found that there is a feature called "corrupt checks" for exactly that purpose, and it was implemented in v3.3 (#7125). Oddly, I haven't found it being used by the Kubernetes repo, etcdadm, or any other public project using etcd. This is a big problem: how do etcd developers expect to reproduce and fix data corruption issues when users are not aware of, and do not use, the feature intended to detect them?

After more digging, the reason became clear: it's an experimental feature that was meant to graduate in v3.4 (#9190), got pushed to v3.5 (#10893), and was then forgotten and never included in v3.5. Without proper feature tracking, such essential features don't get enough attention, which has long-term negative consequences for the whole project. This didn't happen just once: if you look through the experimental flags, there are a lot of features that never graduated.

Problem:

  • Over the years etcd has collected a lot of experimental features that are abandoned and not tracked for the next release.
  • Work on features is not properly tracked between releases. If the original owner is no longer actively contributing, work on the feature stops and the issue is closed by the stale bot.
  • Important etcd features have low adoption and are not known to users.

Proposal:

  • Establish a basic feature graduation process and a way to track it. This process will ensure that features are not dropped between releases, that there is an expectation they keep moving forward, and that there is a plan to enable each feature by default. We need to agree on:
    • Stages: experimental -> optional -> default, plus deprecated if we plan to remove a feature.
    • Expectations about progress: a feature should stay in the experimental or deprecated stage for only one release. We should also plan to make optional features enabled by default, but that could take longer.
    • Method to track work: we could have an issue per feature and use a GitHub project board with the different stages. Such issues should have a special label that prevents them from being closed by the stale bot.
  • Use the process for the v3.6 release. We should go through all experimental features in the codebase, create an issue for each of them, and decide whether we want to graduate or deprecate it.
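
To make the stages and expectations above concrete, here is a minimal sketch of how they could be modelled. This is only an illustration, not existing etcd tooling: the names `Stage`, `Feature`, `ALLOWED`, and `overdue` are hypothetical, and it assumes that deprecation is reachable from any other stage and that "one release" means a feature is overdue once it has shipped in more than one release without changing stage.

```python
from dataclasses import dataclass
from enum import Enum


class Stage(Enum):
    EXPERIMENTAL = "experimental"
    OPTIONAL = "optional"
    DEFAULT = "default"
    DEPRECATED = "deprecated"


# Allowed moves, following "experimental -> optional -> default".
# Assumption: any non-deprecated stage may move to deprecated.
ALLOWED = {
    Stage.EXPERIMENTAL: {Stage.OPTIONAL, Stage.DEPRECATED},
    Stage.OPTIONAL: {Stage.DEFAULT, Stage.DEPRECATED},
    Stage.DEFAULT: {Stage.DEPRECATED},
    Stage.DEPRECATED: set(),
}

# Stages that the proposal expects to last only one release.
ONE_RELEASE_STAGES = {Stage.EXPERIMENTAL, Stage.DEPRECATED}


@dataclass
class Feature:
    name: str
    stage: Stage
    releases_in_stage: int = 0  # releases shipped while in the current stage

    def ship_release(self) -> None:
        """Record that a release shipped with the feature in its current stage."""
        self.releases_in_stage += 1

    def move_to(self, new_stage: Stage) -> None:
        """Move to a new stage, rejecting transitions outside ALLOWED."""
        if new_stage not in ALLOWED[self.stage]:
            raise ValueError(
                f"{self.name}: cannot move from {self.stage.value} to {new_stage.value}"
            )
        self.stage = new_stage
        self.releases_in_stage = 0

    def is_overdue(self) -> bool:
        """True if a one-release stage has lasted for more than one release."""
        return self.stage in ONE_RELEASE_STAGES and self.releases_in_stage > 1


def overdue(features):
    """Return the names of features that need a graduate-or-deprecate decision."""
    return [f.name for f in features if f.is_overdue()]
```

Running `overdue` over one `Feature` per tracked issue before each release would list the features that have been sitting in experimental or deprecated for too long, which is the case this proposal wants to avoid.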
@serathius
Member Author

cc @ptabor @ahrtr @spzala

@serathius
Member Author

For now, I created the stage/tracked label to start preventing issues from being closed.

@spzala
Member

spzala commented Mar 10, 2022

@serathius +1, good thinking on the proposal. I would suggest creating a PR (let me know if you want me to create one) that adds the initial proposal for handling new features and graduation to CONTRIBUTING.md, under a sub-topic such as "Adding new feature(s)", if that sounds good to you. We can review it there, have it merged, and keep it permanently for easy reference. Thanks!!

@kkkkun
Contributor

kkkkun commented Mar 21, 2022

Does the flag experimental-memory-mlock need to be tracked?

@stale stale bot added the stale label Jun 19, 2022
@ahrtr ahrtr added stage/tracked and removed stale labels Jun 19, 2022
@etcd-io etcd-io deleted a comment from stale bot Jun 20, 2022
spzala added a commit to spzala/etcd that referenced this issue Sep 6, 2022
Related etcd-io#13775

Signed-off-by: Sahdev Zala <spzala@us.ibm.com>
spzala added a commit to spzala/etcd that referenced this issue Sep 7, 2022
Related etcd-io#13775

Signed-off-by: Sahdev Zala <spzala@us.ibm.com>
spzala added a commit to spzala/etcd that referenced this issue Sep 10, 2022
Addressed feedback with some added thoughts. Also, added
Unsafe features.

Related etcd-io#13775

Signed-off-by: Sahdev Zala <spzala@us.ibm.com>
spzala added a commit to spzala/etcd that referenced this issue Sep 14, 2022
Add an overview and initial development guidelines. Restructured
the doc for better readability and easier review, and per the
previous review feedback. The TODOs will be addressed iteratively.

Related etcd-io#13775

Signed-off-by: Sahdev Zala <spzala@us.ibm.com>
@serathius
Member Author

Marking as done via #14045
