Discuss Project Design and Initial Version Proposal #13

Closed
Karanjot786 opened this issue Jun 23, 2024 · 4 comments

@Karanjot786
Member

Summary

This issue is to discuss the overall design of the project and propose an initial version. The goal is to create a well-organized, maintainable, and scalable codebase.

Objectives

  1. Define Clear Objectives and Scope

    • Clearly define what the project aims to achieve.
    • Identify the scope of the project to prevent feature creep.
  2. Use Object-Oriented Design (OOP)

    • Encapsulation: Group related data and methods into classes.
    • Inheritance: Use inheritance to extend functionality.
    • Polymorphism: Design methods that can process objects differently based on their data type or class.
  3. Modular Architecture

    • Separation of Concerns: Break the project into distinct modules, each responsible for a specific functionality.
    • Reusability: Write modular code that can be reused across different parts of the project or in future projects.
  4. Design Patterns

    • Utilize common design patterns like Singleton, Factory, Strategy, Observer, etc., where appropriate to solve design problems in a standardized way.
  5. Define Models for TES and WES

    • Use Pydantic models to validate and manage TES and WES request payloads and responses.
    • Ensure proper error handling and data validation.
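
As a rough illustration of objective 5, the sketch below validates a TES request payload with Pydantic. The class names `TESExecutor` and `TESTask`, the helper `parse_tes_task`, and the reduced field set are assumptions made for this example; the real TES schema has many more fields.

```python
from typing import List, Optional

from pydantic import BaseModel, ValidationError


class TESExecutor(BaseModel):
    """Simplified subset of a TES executor (illustrative, not the full schema)."""

    image: str
    command: List[str]


class TESTask(BaseModel):
    """Simplified subset of a TES task (illustrative, not the full schema)."""

    id: Optional[str] = None
    name: Optional[str] = None
    executors: List[TESExecutor]


def parse_tes_task(payload: dict) -> TESTask:
    """Validate a raw payload and re-raise failures with a readable message."""
    try:
        return TESTask(**payload)
    except ValidationError as exc:
        raise ValueError(f"Invalid TES task payload: {exc}") from exc


task = parse_tes_task(
    {"name": "echo", "executors": [{"image": "alpine", "command": ["echo", "hi"]}]}
)
print(task.executors[0].image)
```

A payload without `executors` fails validation here, so callers get one clear error instead of a missing-key failure deeper in the conversion code.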

Proposed Project Structure

CrateGen/
├── src/
│   ├── converters/
│   │   ├── tes_to_wrroc.py
│   │   └── wes_to_wrroc.py
│   ├── models/
│   │   ├── tes_models.py
│   │   └── wes_models.py
│   ├── utils/
│   │   ├── validation.py
│   │   └── formatting.py
│   └── cli.py
├── tests/
│   ├── test_converters.py
│   ├── test_models.py
│   └── test_utils.py
├── docs/
│   └── index.rst
├── .github/
│   └── workflows/
│       └── ci.yml
├── pyproject.toml
├── mypy.ini
├── README.md
└── LICENSE

Key Components

  1. Models: Define data models for TES and WES using Pydantic.
  2. Converters: Implement the conversion logic in separate modules.
  3. Utils: Utility functions for validation and formatting.
  4. CLI: Command-line interface for the tool.
  5. Tests: Unit and integration tests for all components.
  6. Docs: Documentation files.
  7. CI/CD: GitHub Actions workflow for CI.

Tasks

  1. Define Models

    • Create Pydantic models for TES and WES.
  2. Implement Converters

    • Write functions to convert TES to WRROC and WES to WRROC.
  3. Set Up CLI

    • Create a CLI for the tool using Click or another framework.
  4. Write Tests

    • Implement unit and integration tests for all components.
  5. Documentation

    • Write comprehensive documentation for the codebase, APIs, and overall project.
  6. CI/CD Pipeline

    • Set up a CI/CD pipeline to automate testing and deployment.
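
To make task 2 more concrete, here is a minimal sketch of what a TES to WRROC conversion function might look like. The function name `tes_to_wrroc` and the field mapping (task to `CreateAction`, inputs to `object`, outputs to `result`) are assumptions for illustration only; the actual mapping still has to be agreed on.

```python
import json


def tes_to_wrroc(task: dict) -> dict:
    """Map a TES task dict to a minimal RO-Crate-style JSON-LD document.

    The mapping is a hypothetical starting point, not an agreed design.
    """
    action = {
        "@id": f"#{task['id']}",
        "@type": "CreateAction",
        "name": task.get("name", ""),
        # TES inputs and outputs each carry a "url" field.
        "object": [{"@id": item["url"]} for item in task.get("inputs", [])],
        "result": [{"@id": item["url"]} for item in task.get("outputs", [])],
    }
    # Every referenced file also needs its own entity in the graph.
    files = [
        {"@id": ref["@id"], "@type": "File"}
        for ref in action["object"] + action["result"]
    ]
    return {
        "@context": "https://w3id.org/ro/crate/1.1/context",
        "@graph": [
            {
                "@id": "ro-crate-metadata.json",
                "@type": "CreativeWork",
                "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
                "about": {"@id": "./"},
            },
            {"@id": "./", "@type": "Dataset", "mentions": {"@id": action["@id"]}},
            action,
            *files,
        ],
    }


example_task = {
    "id": "task-1",
    "name": "echo",
    "inputs": [{"url": "s3://bucket/in.txt", "path": "/data/in.txt"}],
    "outputs": [{"url": "s3://bucket/out.txt", "path": "/data/out.txt"}],
}
crate = tes_to_wrroc(example_task)
print(json.dumps(crate, indent=2))
```

Keeping the converter a pure function from dict to dict makes it easy to unit test without touching the file system.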

Next Steps

  1. Discuss and Refine Design

    • Review the proposed design and provide feedback.
    • Refine the design based on feedback.
  2. Break Down Work into Small Work Packages

    • Define small, manageable work packages for implementation.
    • Ensure each work package includes tests and documentation.
  3. Start Implementation

    • Begin with the abstraction layer and proceed with the TES to WRROC conversion, followed by WES to WRROC.

Please provide your feedback and suggestions on this proposed design.

Thank you!

@uniqueg
Member

uniqueg commented Jul 2, 2024

This looks very reasonable.

A few points:

  • Please name your code folder after the (short) name of the project, not src.
  • You will probably need a class that coordinates everything and which can be imported from another Python project. This should not be in cli.py. Rather, cli.py should instantiate that class and run its methods to do what needs to be done. In other words, we need a library entry point on top of the CLI entry point - which will just be a slim wrapper around the library entry point.
  • Add some abstraction layers for your converters. You want to make these pluggable and, to ease that, you want to create an abstract class with a generic converter interface that you then implement for the WES and TES converters. Note that it might make sense to have different abstract classes for converters from and to RO-Crates. You may want to have different subpackages for these two directions.
  • Talk to Javed about trying his new Python project Cookiecutter to generate the boilerplate code (it will replace some of what we already have, so it might make sense to start from scratch; however, we will get a lot of goodies for free)
  • In tests/, it might be good to have tests/unit/ and tests/integration/ subpackages. The cookiecutter will take care of that.
  • For the WES and TES models, we will eventually reuse the (Rust) models (with Python bindings) that @aaravm is creating for his project. For now, let's not spend too much time on that and reuse the models we already defined for proTES (here is an upgrade from @athith-g) and/or TESK (talk to @JaeAeich). Btw, no need to call the modules tes_models if they are already in a package models. Better to write from src.models import tes rather than from src.models import tes_models, which is redundant.
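
The points above about a library entry point and pluggable converters could be sketched roughly as follows. All names here (`AbstractConverter`, `TESConverter`, `WESConverter`, `ConverterManager`, `main`, the `crategen` program name) are hypothetical, and the converter bodies are placeholders rather than real mapping logic.

```python
import argparse
import json
from abc import ABC, abstractmethod
from typing import Dict, List, Optional


class AbstractConverter(ABC):
    """Generic interface that every converter implements (hypothetical)."""

    @abstractmethod
    def convert(self, data: dict) -> dict:
        """Return the converted representation of ``data``."""


class TESConverter(AbstractConverter):
    def convert(self, data: dict) -> dict:
        # Placeholder mapping; the real conversion logic is out of scope here.
        return {"@type": "CreateAction", "name": data.get("name", "")}


class WESConverter(AbstractConverter):
    def convert(self, data: dict) -> dict:
        # Placeholder mapping keyed on the WES run identifier.
        return {"@type": "CreateAction", "name": data.get("run_id", "")}


class ConverterManager:
    """Library entry point that other Python projects can import."""

    def __init__(self) -> None:
        self._converters: Dict[str, AbstractConverter] = {
            "tes": TESConverter(),
            "wes": WESConverter(),
        }

    def register(self, kind: str, converter: AbstractConverter) -> None:
        """Plug in an additional converter under a new key."""
        self._converters[kind] = converter

    def convert(self, kind: str, data: dict) -> dict:
        try:
            converter = self._converters[kind]
        except KeyError:
            raise ValueError(f"Unknown conversion type: {kind}") from None
        return converter.convert(data)


def main(argv: Optional[List[str]] = None) -> int:
    """CLI entry point: a slim wrapper around ConverterManager."""
    parser = argparse.ArgumentParser(prog="crategen")
    parser.add_argument("kind", choices=["tes", "wes"])
    parser.add_argument("input", help="path to a JSON file to convert")
    args = parser.parse_args(argv)
    with open(args.input, encoding="utf-8") as handle:
        data = json.load(handle)
    print(json.dumps(ConverterManager().convert(args.kind, data), indent=2))
    return 0
```

In this layout `main` would live in cli.py and only parse arguments, while `ConverterManager` stays importable on its own. The sketch uses argparse to stay within the standard library; the same wrapper shape would apply with another CLI framework.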

@Karanjot786
Member Author

I have renamed the src folder to CrateGen. Now, I am working on refactoring the conversion functions according to the feedback. This includes creating abstract classes for the converters and separating the library entry point from the CLI.

I'll update you once I have made significant progress on these changes.

@uniqueg
Member

uniqueg commented Jul 3, 2024

Thanks a lot!

Please rename it to crategen (good name) and let's give the package the same name. By convention, capital letters should only be used in exceptional cases for package names.

We can still name the project and repo (I have already renamed it) CrateGen. Also make sure to update the title in the README.md.

You can do the renaming in a single PR.

Rest sounds good :)

@Karanjot786
Member Author

Hi @uniqueg ,

I wanted to let you know that I’m going to open a PR shortly. This PR will include the initial project structure along with the implemented files. This is the initial version of the project and is not yet complete. Please review the structure and let me know if you have any feedback or suggestions for improvement.

Thank you!

@Karanjot786 Karanjot786 closed this as completed by moving to Done in gsoc-24-rocrate-ga4gh Jul 16, 2024

3 participants