Support the dot character in task names and/or parameters #3542

Closed
matthewrmshin opened this issue Mar 30, 2020 · 13 comments

@matthewrmshin
Contributor

Describe exactly what you would like to see in an upcoming release

It would be nice to be able to have the dot character (.) in task names and/or parameters.

My use case is to use Cylc to build and run multiple versions of a model. E.g. I want to build multiple versions X.Y of a piece of software. The natural thing to do would be to parameterise on the version strings of the software, so I can have tasks called build_software_x.y, build_software_x.z and so on, and then simply feed the version string from the parameters to the build scripts.

This is currently not possible because the dot character is used as the delimiter between task name and cycle point in the task ID.

If the logic can assume that cycle points never have the dot (.) character, then it should be possible to safely split a Task ID into the task name and cycle point components without issues. However, there may be other subtle ambiguity elsewhere.
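
A minimal sketch of the split described above, assuming the ID has the form NAME.CYCLE and that the cycle point never contains a dot. The function name `split_task_id` is hypothetical and is not part of Cylc; the point is only that splitting on the last dot leaves any earlier dots in the task name.

```python
from typing import Tuple


def split_task_id(task_id: str) -> Tuple[str, str]:
    """Split NAME.CYCLE on the LAST dot, so NAME may itself contain dots.

    Assumption (from the proposal above): cycle points never contain ".".
    """
    name, sep, point = task_id.rpartition(".")
    if not sep or not name or not point:
        raise ValueError(f"not a NAME.CYCLE task ID: {task_id!r}")
    return name, point


print(split_task_id("build_software_1.2.20200330T0000Z"))
# → ('build_software_1.2', '20200330T0000Z')
```

One ambiguity this does not resolve: where the cycle point is optional, a bare name such as build_software_1.2 would be split into build_software_1 and 2.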

Additional context

As discussed in this Discourse thread:
https://cylc.discourse.group/t/support-the-dot-character-in-task-names-and-or-parameters/217

Pull requests welcome!

@matthewrmshin matthewrmshin added this to the some-day milestone Mar 30, 2020
@oliver-sanders
Member

oliver-sanders commented Mar 30, 2020

I think the task/cycle delimiter is something we will need to look at soonish as we currently support a strange myriad of approaches.

  • CLI
    • [CYCLE-POINT-GLOB/]TASK-NAME-GLOB[:TASK-STATE]
    • [CYCLE-POINT-GLOB/]FAMILY-NAME-GLOB[:TASK-STATE]
    • TASK-NAME-GLOB[.CYCLE-POINT-GLOB][:TASK-STATE]
    • FAMILY-NAME-GLOB[.CYCLE-POINT-GLOB][:TASK-STATE]
  • GraphQL/protobuf
    • workflow[|cycle[|(family|task)[|job]]]

(personal preference: the GraphQL/protobuf approach is the most universal and arranges the components in the correct hierarchical order like ISO8601)

Sadly we are baked into the current task.cycle approach, it will be a big hit on users to change.
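
To illustrate why the two CLI forms listed above are awkward to support side by side, here is a rough sketch of a parser that accepts both. It is not Cylc's actual parser; `TaskGlob` and `parse_task_glob` are hypothetical names, and the sketch assumes that neither the cycle point nor the name contains ":" or "/".

```python
from typing import NamedTuple, Optional


class TaskGlob(NamedTuple):
    cycle: Optional[str]
    name: str
    state: Optional[str]


def parse_task_glob(text: str) -> TaskGlob:
    """Accept CYCLE/NAME[:STATE] or NAME[.CYCLE][:STATE]."""
    # Peel off the optional ":STATE" suffix first.
    body, _, state = text.partition(":")
    if "/" in body:
        # Slash form: cycle point comes first.
        cycle, _, name = body.partition("/")
    else:
        # Dot form: name comes first, split on the FIRST dot.
        name, _, cycle = body.partition(".")
    return TaskGlob(cycle or None, name, state or None)


print(parse_task_glob("2020*/foo*:failed"))
print(parse_task_glob("foo.20200330T0000Z"))
```

In the dot form the sketch splits on the first dot, which is only unambiguous while task names contain no dot; in the slash form a dot in the name would pass through untouched.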

@hjoliver
Member

> Sadly we are baked into the current task.cycle approach, it will be a big hit on users to change.

Yeah that's the main problem.

But I also agree that our new GraphQL/protobuf approach is cleanest, and perhaps we can think about how to do it without enraging users 😠

@dwsutherland
Member

dwsutherland commented Mar 30, 2020

My only concern with | is how it plays out in the shell world WRT CLI.. i.e.

(flow) sutherlander@cortex-vbox:~$ echo hello|goodbye
goodbye: command not found

so we'd always need quotes now (which I always do anyway):

(flow) sutherlander@cortex-vbox:~$ echo hello\|goodbye
hello|goodbye
(flow) sutherlander@cortex-vbox:~$ echo "hello|goodbye"
hello|goodbye

Of course we can change | to anything else (as it's centralised), but the UI will be dependent soon, so if this is not desirable perhaps we can choose another (or combination of characters).
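
As a rough way to compare candidate delimiters, Python's `shlex.quote` can be used as a conservative check for whether a token would need quoting in a POSIX shell. The helper name `needs_quoting` is hypothetical, and `shlex.quote` errs on the side of quoting, so treat this as a sketch rather than a full statement of shell rules.

```python
import shlex


def needs_quoting(token: str) -> bool:
    """True if shlex.quote would have to wrap the token in quotes."""
    return shlex.quote(token) != token


for delimiter in "|/.:":
    example = f"cycle{delimiter}task"
    print(delimiter, needs_quoting(example))
# → | True
# → / False
# → . False
# → : False
```

By this check the pipe is the only one of the four that cannot be typed unquoted.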

@hjoliver
Member

Yep, pipe character in the shell.

I don't think users would appreciate having to quote strings that don't contain spaces.

This is one of those problems that seems like it should be easier than it is!

@matthewrmshin
Contributor Author

matthewrmshin commented Mar 31, 2020

If we can assume that cycle point does not have . in it, then it should be possible to support the dot character in task names.

Other considerations

I remember choosing the cycle-point/task-name/submit-number syntax because the slash / does not require quoting in CLI and also maps conveniently to relative directory hierarchy under the log output directory.

However, the slash / cannot be used now because we also have hierarchical suite names that may contain / characters?

In case no one noticed, there is yet another syntax in the suite.rc scheduling graph: task-name[cycle-point-ish] as in my-task<param>[-P1D] => .... We can, in theory, have a syntax in graph expression similar to the Task ID syntax like my-task<param>.-P1D or the slash syntax -P1D/my-task<param>.

(Unfortunately, the pipe | character cannot be used here either because it is already the OR operator.)

(We also have a syntax, which I can no longer remember, to specify inter-suite dependency in the scheduling graph.)

It would be nice if there is some rethink of all these. (But please don't let me lead you to this distraction for now while you are concentrating on other things!)

@dwsutherland
Member

dwsutherland commented Apr 1, 2020

> (Unfortunately, the pipe | character cannot be used here either because it is already the OR operator.)

@matthewrmshin - We/I weren't advocating for | as a task name character but as an ID delimiter. However, we need a universal delimiter that works with owner|suite|cycle|task_name|submit_num and in the shell/CLI without quoting (which | doesn't, as you've mentioned before). / was ruled out because of its use in nested suite run dirs.

@matthewrmshin
Contributor Author

@dwsutherland My comment above on the use of the slash / and pipe | characters as delimiters was meant to be an observation. We have tried our best in a quest to find the perfect delimiter character so we can express all the relevant information in a single path-like syntax. I remember the debates and the decisions well. It is only unfortunate that we seem to have ended up with a lot of inconsistency over the years. (No doubt including a lot of my bad!)

But hopefully someone will eventually have the time to reorganise and rethink.

@dwsutherland
Member

Hmm... Maybe migrating to a two character delimiter for CLI/IDs (internally and externally) is something to look into, then we can probably jump through all shell, datetime, suite/task name hoops.

Otherwise restrictions may have to stay...

@hjoliver
Member

hjoliver commented May 4, 2020

Closing as superseded by #3592 (universal delimiter).

@hjoliver hjoliver closed this as completed May 4, 2020
@oliver-sanders oliver-sanders removed this from the some-day milestone Jun 8, 2020
@hippalectryon-0

hippalectryon-0 commented Feb 8, 2024

> Closing as superseded by #3592 (universal delimiter).

#3592 was closed as completed; however, dots still aren't supported in task parameters. Can we reopen this issue?

@hjoliver
Member

hjoliver commented Feb 8, 2024

Hi @hippalectryon-0 - it was a while ago, but at first glance closing this Issue might have been an error on my part.

#3592 got rid of the . in the task ID (Cylc 7's task.cycle became Cylc 8's cycle/task as part of the wider-scoped universal ID). Technically that should make supporting the dot in task names easier, although we still have back-compat support to deal with.

I'll reopen this and see what others have to say about it.

@hjoliver hjoliver reopened this Feb 8, 2024
@oliver-sanders
Member

The dot character is currently reserved for future use; IMO we should not permit its use in task names. However, the following characters can be used: _, -, +, % and @.

@hippalectryon-0

hippalectryon-0 commented Feb 9, 2024

Maybe we should provide in the docs an official workaround example for users (like me) who have parameters with dots in their name (which is very common for, e.g. climate model names/versions).

For now I've resorted to replacing "." by "@dot", and using sed in the parameterized scripts to replace it, but it's not great.
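
For reference, the substitution described above can be sketched in Python as follows. The comment uses sed inside the task scripts; this is only an equivalent illustration, and `DOT_TOKEN`, `encode_param` and `decode_param` are hypothetical names, not a Cylc feature.

```python
DOT_TOKEN = "@dot"  # placeholder used in the workaround above


def encode_param(value: str) -> str:
    """Make a dotted value safe to use as a parameter value."""
    return value.replace(".", DOT_TOKEN)


def decode_param(value: str) -> str:
    """Recover the original dotted value inside the task script."""
    return value.replace(DOT_TOKEN, ".")


print(encode_param("1.2"))
# → 1@dot2
print(decode_param("1@dot2"))
# → 1.2
```

The round trip only holds if the original values never contain the literal text "@dot" themselves.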
