Best way to organize multiple similar projects #11103
There's not one good way to do this, and I think everyone you ask will give a slightly different answer for it. While I have not personally had to design a system to deal with an array of models like you describe, I would start with something very basic, like a folder with small scripts you can re-use or customize between projects. So each language would have its own directory and project file, but anything you would want to update later would be in a script in a centralized directory you could refer to. You could then make separate scripts by language, or have language-specific logic in the scripts. So something like...
Your project files would then have something like...
If your customizations for a given language are drastic, this is not the best way to do it. But if you have a few small changes for many languages, this keeps all the code in one place. Unfortunately, I think the details of what is best depend a lot on the specifics.
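As a rough illustration of the "one shared script with language-specific logic" idea, here is a minimal sketch of what such a centralized script might look like. The settings, config paths, and language codes are hypothetical placeholders, not taken from the original projects; only the `python -m spacy train` command and its `--output` and `--gpu-id` options are real spaCy CLI features.

```python
import shlex

# Hypothetical settings shared by every language project.
DEFAULTS = {"config": "configs/base.cfg", "gpu_id": -1}

# Hypothetical per-language tweaks: only the differences are listed here.
LANG_OVERRIDES = {
    "de": {"config": "configs/de.cfg"},
    "ja": {"gpu_id": 0},
}


def settings_for(lang):
    """Merge the shared defaults with the overrides for one language."""
    merged = dict(DEFAULTS)  # copy, so DEFAULTS is never mutated
    merged.update(LANG_OVERRIDES.get(lang, {}))
    return merged


def train_command(lang):
    """Build the argument list for `python -m spacy train` for one language."""
    s = settings_for(lang)
    return [
        "python", "-m", "spacy", "train", s["config"],
        "--output", f"training/{lang}",
        "--gpu-id", str(s["gpu_id"]),
    ]


if __name__ == "__main__":
    # Print the command each language would run, without executing it.
    for lang in ("en", "de", "ja"):
        print(shlex.join(train_command(lang)))
```

Each per-language project file could then call a script like this with its language code, so a change to the shared logic only has to be made in one place.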
To provide an example of a different answer / at a different scale: For the trained pipelines (https://spacy.io/models), we generate projects from templates, with templates for both
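To make the template idea concrete, here is a minimal sketch of generating one project file per language from a single template. This is not the actual tooling used for the trained pipelines; the template contents, the `@` placeholder delimiter, and the function name are assumptions made for illustration. A custom delimiter is used so that spaCy's own `${vars.…}` references pass through untouched.

```python
from string import Template


class ProjectTemplate(Template):
    # Use "@" for our placeholders so "${vars.x}" is left alone for spaCy.
    delimiter = "@"


# Hypothetical project file template; only @lang and @config are filled in.
PROJECT_TEMPLATE = ProjectTemplate("""\
title: "NER pipeline (@lang)"
vars:
  lang: "@lang"
  config: "@config"
commands:
  - name: train
    script:
      - "python -m spacy train configs/${vars.config} --output training/${vars.lang}"
""")


def render_project(lang, config):
    """Return the text of a project file for one language."""
    return PROJECT_TEMPLATE.substitute(lang=lang, config=config)


if __name__ == "__main__":
    # Show the generated project file for one hypothetical language.
    print(render_project("de", "de.cfg"))
```

With this layout the command list lives only in the template, and each generated project differs only in the values passed to `render_project`. Regenerating all projects after a template change keeps them from drifting apart.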
Hi!
I could use some advice about spaCy project management.
Some time ago I had to train a bunch of custom NER models for different languages. The models are mostly similar, but the details differ: some models need extra steps, and some have different config files. At first I thought I could create a separate directory for each language, with its own `project.yaml` file and everything else. But I have ended up with 10 projects that have different `vars` sections but share mostly the same list of commands. Some commands also differ between files because I changed something along the way. So now I have a complete mess and have to remember how one project differs from another, because they look so similar.
I wish I could store all the commands and everything common in one place and share them between projects somehow. All my commands are already generalized to work with variables from the `vars` section, but I'm not sure whether I can "import" a command from another file in `project.yaml`. I think I could use a DVC project file to launch the spaCy projects from a DVC project, using environment variable overrides or something like that.
But I haven't arrived at an elegant solution yet, which is why I want to ask whether you have already had this problem or can advise me.