Ensuring the development of CM-based automation for MLPerf complies with MLCommons bylaws #641

Open
4 of 8 tasks
gfursin opened this issue Dec 9, 2024 · 7 comments

Comments

@gfursin
Contributor

gfursin commented Dec 9, 2024

Several changes made last week disrupted my Collective Mind approach of reusing CM automations and artifacts rather than duplicating or remixing them across different repositories and forks. This also affected several ongoing projects involving various MLCommons members. While addressing these issues with MLCommons, I outlined a set of tasks to resolve them and to ensure that collaborative development of MLPerf automations can continue.

  • Avoid creating forks in the MLCommons GitHub organization with stripped CM automations and artifacts, such as those in mlcommons/mlperf-automations or mlcommons/ck/tree/mlperf-inference. These forks have caused numerous issues currently under discussion with MLCommons. Any such forks or branches involving CM automations must be reviewed and approved via tickets to ensure they do not disrupt or negatively impact other MLCommons members' projects.
  • Eliminate the above branches and forks and consolidate development back into the cm4mlops repository, using the mlperf-inference and main branches as before.
  • Restore these files in the mlperf-inference and main branches of cm4mlops that were removed from the parent mlcommons/ck project without any discussion or explanation.
  • Discuss acknowledgments and the truncated history of contributions in this repository with MLCommons.
  • Add a link to the parent mlcommons/ck project and white paper in the README.
  • Use the stable main branch instead of mlperf-inference and checkouts in the PyPI package to be utilized in other MLCommons repositories. See here
  • Use a stable CM version from PyPI instead of checkouts: see here
  • Reevaluate the use of the cm4mlops PyPI package, as it obscures the underlying repository with CM automations for MLPerf. Instead, revert to explicitly using pip install cmind and cm pull repo {url} {branch|checkout} for greater transparency.
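
As a rough illustration of the explicit two-step setup in the last task, here is a minimal Python sketch that only composes the commands from the template above. The helper name `cm_setup_commands`, the repository URL, and the `--branch=` / `--checkout=` flag spelling are assumptions for illustration and should be checked against the CM documentation before use.

```python
import shlex


def cm_setup_commands(repo_url, branch=None, checkout=None):
    """Compose the explicit install and pull commands as argument lists.

    Hypothetical helper: it follows the template
    `cm pull repo {url} {branch|checkout}` from the task list.
    The flag spelling below is an assumption, not confirmed CM syntax.
    """
    if branch and checkout:
        raise ValueError("pass either branch or checkout, not both")
    pull = ["cm", "pull", "repo", repo_url]
    if branch:
        pull.append(f"--branch={branch}")      # assumed flag name
    if checkout:
        pull.append(f"--checkout={checkout}")  # assumed flag name
    return [["pip", "install", "cmind"], pull]


if __name__ == "__main__":
    # The URL is an assumed location of the cm4mlops repository.
    url = "https://github.com/mlcommons/cm4mlops"
    for cmd in cm_setup_commands(url, branch="main"):
        print(shlex.join(cmd))
```

Keeping the repository URL and branch visible in the command, rather than hidden inside a wrapper package, is the transparency the task asks for.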
gfursin added a commit to ctuning/cm4mlops that referenced this issue Dec 9, 2024
@arjunsuresh
Contributor

arjunsuresh commented Dec 9, 2024

@gfursin Can you please point to the section of the MLCommons bylaws that is violated by the creation of a "fork"? A fork by itself serves as an acknowledgement and endorsement, and the MLCommons bylaws clearly state the following.

Section 13.3 No Obligation to Endorse
No Member shall, by reason of its Membership or participation in the Association or otherwise,
be obligated to license from the Association, use or endorse any Intellectual Property (as defined
in Section 2.7) developed or endorsed by the Association, or to conform any of its products to
any Guidance Materials developed or adopted by the Association, nor shall any such Member be
precluded from independently licensing, using or endorsing similar intellectual property,
software, specifications or documentation developed by it or by others.

Meanwhile, when two people jointly develop a project, it is unfair to write a white paper under one person's name and publicize it. If anyone does this, no sane person will ever collaborate with him/her. No law matters here.

"I donated CK, CM and CM4MLOps to MLCommons" is mentioned here. As far as I know, and as the GitHub history shows, CM and CM4MLOps development started within MLCommons, and I was part of it from the start. But I don't want to argue over the ownership of an open source project - I'm just stating my reasons for not collaborating on such personal projects.

Also, I'm no longer using this repository, and thanks for removing my name as the maintainer here. You can fix and maintain this repository as you prefer. If you feel we are violating any laws in the repositories we maintain, please point to the exact violation. Otherwise, we'll see what MLCommons has to say.

ctuning-admin added a commit that referenced this issue Dec 9, 2024
@gfursin
Contributor Author

gfursin commented Dec 9, 2024

@arjunsuresh, what you are stating is false and easy to check via the commit history (in the original repository, not in this one with its truncated history) - checking that with MLCommons ...

@arjunsuresh
Contributor

arjunsuresh commented Dec 9, 2024

Sorry - what exactly is false? This file was created by you alone, and you can see its development history in case you have forgotten.

@gfursin
Contributor Author

gfursin commented Dec 10, 2024

We will discuss all of this with MLCommons. In the meantime, I have restored the deleted files and the core functionality necessary for other projects and will consult with MLCommons to decide what to do with this repository, branches and forks.

@arjunsuresh
Contributor

Sure. We are no longer using this repository, as we are not working on any child projects. Below are the suggestions from ChatGPT.

When managing an open-source repository, ensuring that forks don't negatively impact the original project involves a combination of proper licensing, governance, and clear communication. Here are steps to manage this effectively:


1. Use an Open Source License

Why:

A well-defined license ensures users know their rights and obligations when forking your repository.

How:

  • Choose the Right License:
    • Use a permissive license like MIT or Apache 2.0 to encourage use and attribution.
    • Use a copyleft license like GPL to require forks to remain open source.
  • Specify the License Clearly:
    • Add a LICENSE file to the root of your repository.

Impact:

A proper license gives legal clarity, ensuring forks align with your project’s values.


2. Set Contributor Guidelines

Why:

Forks often occur because contributors find it hard to work within the main repository.

How:

  • Add a CONTRIBUTING.md file outlining:
    • Coding standards.
    • How to submit pull requests.
    • Branching and issue management policies.
  • Use templates for issues and pull requests to streamline contributions.

Impact:

This reduces forks caused by communication gaps and encourages collaboration.


3. Build a Strong Governance Model

Why:

A clear governance structure helps maintain control over the direction of the project.

How:

  • Define decision-making processes in a GOVERNANCE.md file.
  • Clarify roles (maintainers, contributors, users).
  • Ensure decisions are transparent (e.g., via public discussions or GitHub issues).

Impact:

A strong governance model ensures contributors trust the process, reducing unnecessary forks.


4. Maintain High-Quality Documentation

Why:

Forks often arise due to unclear goals, vision, or usage instructions.

How:

  • Include a detailed README.md with:
    • Project overview.
    • Usage instructions.
    • Contribution guidelines.
  • Maintain a changelog (CHANGELOG.md) to track updates.

Impact:

Good documentation keeps contributors aligned with your vision, reducing fragmented forks.


5. Monitor and Engage With Forks

Why:

Understanding why forks exist helps you address potential issues proactively.

How:

  • Regularly check GitHub forks:
    • Use the "Forks" section to view popular forks.
    • Analyze changes in forks to see if they provide useful improvements.
  • Engage with maintainers of forks:
    • Offer collaboration opportunities if their changes align with your vision.
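
The fork check described above could be scripted against the GitHub REST API's list-forks endpoint. The following is only a sketch under stated assumptions: the helper names are hypothetical, requests are unauthenticated (so rate limits apply), and only the first page of results is read.

```python
import json
import urllib.request


def forks_url(owner, repo, per_page=30):
    """Build the list-forks URL, asking GitHub to sort by stargazers."""
    return (
        f"https://api.github.com/repos/{owner}/{repo}/forks"
        f"?sort=stargazers&per_page={per_page}"
    )


def fetch_forks(owner, repo):
    """Fetch the first page of forks as a list of repository objects."""
    req = urllib.request.Request(
        forks_url(owner, repo),
        headers={"Accept": "application/vnd.github+json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def summarize_forks(forks):
    """Reduce fork objects to (full_name, stars, last_push), most-starred first."""
    rows = [
        (f["full_name"], f["stargazers_count"], f["pushed_at"])
        for f in forks
    ]
    return sorted(rows, key=lambda row: row[1], reverse=True)
```

For example, `summarize_forks(fetch_forks("mlcommons", "ck"))` would return the most-starred forks of the parent project along with their last push dates, which gives a starting point for deciding which forks are worth reviewing by hand.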

Impact:

Proactive engagement prevents forks from diverging in ways that harm the ecosystem.


6. Trademark or Branding Controls

Why:

Prevent unauthorized use of your project's name or brand in forks.

How:

  • Use trademark policies for your project name (if applicable).
  • Require forks to rebrand if they deviate significantly from your project.

Impact:

This ensures forks don’t confuse users or dilute the reputation of your project.


7. Encourage Upstream Contributions

Why:

Fork maintainers might bypass your repository if contributing upstream is difficult.

How:

  • Foster a welcoming culture for pull requests.
  • Provide timely reviews and feedback.
  • Highlight contributors in release notes or documentation.

Impact:

Forks are more likely to contribute improvements back to the original project.


8. Automate Repository Maintenance

Why:

A well-maintained repository is less likely to be forked unnecessarily.

How:

  • Use tools like Dependabot or Renovate for dependency updates.
  • Automate CI/CD pipelines to ensure high code quality.
  • Regularly close stale issues with automation, such as a scheduled GitHub Actions workflow.

Impact:

A well-maintained repository signals professionalism, attracting contributors instead of forkers.


9. Be Open to Collaboration

Why:

Community engagement reduces competition between forks and the main project.

How:

  • Host regular discussions or public meetings with contributors.
  • Maintain an active presence in community forums or GitHub Discussions.

Impact:

Collaborative practices strengthen the main repository and minimize fragmentation.


10. Mitigate Harmful Forks

Why:

Sometimes, forks can misrepresent or misuse your work.

How:

  • Report Forks Violating Licenses:
    • If a fork violates your license, file a DMCA takedown request through GitHub.
  • Alert the Community:
    • If a fork introduces harmful code, inform users via announcements or warnings.

Impact:

This protects your project’s integrity and user trust.


Summary

To ensure forks don't negatively impact your open-source repository:

  1. Use a proper open-source license.
  2. Set clear contributor guidelines.
  3. Monitor and engage with forks.
  4. Maintain high-quality documentation and governance.
  5. Proactively encourage collaboration and upstream contributions.

By fostering a welcoming and organized ecosystem, forks are more likely to enhance rather than harm your project. Let me know if you’d like examples for any specific step!

@gfursin
Contributor Author

gfursin commented Dec 11, 2024

Discussing that with MLCommons ...

@gfursin
Contributor Author

gfursin commented Dec 20, 2024

My assessments have been confirmed, along with a few additional actions:

  • Create a CM HISTORY file to document key events and milestones.
  • Retain the original cm-mlops repository.
  • Include links to the license and copyright files in each automation script.

2 participants