From 93e655cbb3feeecd49e21c18db75c0a65b860471 Mon Sep 17 00:00:00 2001 From: Colin Leach Date: Sat, 26 Oct 2024 12:43:57 -0700 Subject: [PATCH] WIP draft of function-composition concept --- .../function-composition/.meta/config.json | 7 + concepts.wip/function-composition/about.md | 149 ++++++++++++++++++ .../function-composition/introduction.md | 1 + concepts.wip/function-composition/links.json | 6 + 4 files changed, 163 insertions(+) create mode 100644 concepts.wip/function-composition/.meta/config.json create mode 100644 concepts.wip/function-composition/about.md create mode 100644 concepts.wip/function-composition/introduction.md create mode 100644 concepts.wip/function-composition/links.json diff --git a/concepts.wip/function-composition/.meta/config.json b/concepts.wip/function-composition/.meta/config.json new file mode 100644 index 00000000..75930499 --- /dev/null +++ b/concepts.wip/function-composition/.meta/config.json @@ -0,0 +1,7 @@ +{ + "authors": [ + "colinleach" + ], + "contributors": [], + "blurb": "Julia supports composing multiple functions into one, and piping data through a chain of functions." +} diff --git a/concepts.wip/function-composition/about.md b/concepts.wip/function-composition/about.md new file mode 100644 index 00000000..162b8daf --- /dev/null +++ b/concepts.wip/function-composition/about.md @@ -0,0 +1,149 @@ +# About + +Julia encourages programmers to put as much code as possible inside functions that can be JIT-compiled, and creating many small functions is, by design, performant. + +That tends to leave many small, simple functions, which need to be combined to carry out non-trivial tasks. + +One obvious approach is to nest function calls. +The following example is very contrived, but illustrates the point. 
+ +```julia-repl +julia> first.(titlecase.(reverse.(["my", "test", "strings"]))) +3-element Vector{Char}: + 'Y': ASCII/Unicode U+0059 (category Lu: Letter, uppercase) + 'T': ASCII/Unicode U+0054 (category Lu: Letter, uppercase) + 'S': ASCII/Unicode U+0053 (category Lu: Letter, uppercase) +``` + +The disadvantage of this approach is that readability drops rapidly as nesting gets deeper. + +We need a simpler and more flexible approach. + +## Composition + +This is the technique beloved of mathematicians, and Julia copies the mathematical syntax. + +An arbitrary number of functions can be [`composed`][comp] together with `∘` operators (entered as `\circ` then tab). +The result can be used as a single function. + +```julia-repl +julia> compfunc = first ∘ titlecase ∘ reverse +first ∘ titlecase ∘ reverse + +julia> compfunc.(["my", "test", "strings"]) +3-element Vector{Char}: + 'Y': ASCII/Unicode U+0059 (category Lu: Letter, uppercase) + 'T': ASCII/Unicode U+0054 (category Lu: Letter, uppercase) + 'S': ASCII/Unicode U+0053 (category Lu: Letter, uppercase) + +# alternative syntax, giving the same result +julia> (first ∘ titlecase ∘ reverse).(["my", "test", "strings"]) +``` + +A couple of points to note: + +- The starting functions appear in the same order as when nesting, and are executed in right-to-left order. +- Broadcasting is not simple to use when composing, but can be applied when calling the composed function. + +## Pipelining + +An alternative might be thought of as the _programmers'_ approach, rather than the _mathematicians'_. + +[`Pipelines`][comp] have long been used in Unix shell scripts, and more recently became popular in mainstream programming languages (F# is sometimes credited with pioneering their adoption). + +The basic concept is to start with some data, then pipe it through a sequence of functions to get the result. + +The pipe operator is `|>` (as in F# and recent versions of R), though Julia also has a broadcast version `.|>`. 
+ +```julia-repl +julia> ["my", "test", "strings"] .|> reverse .|> titlecase .|> first +3-element Vector{Char}: + 'Y': ASCII/Unicode U+0059 (category Lu: Letter, uppercase) + 'T': ASCII/Unicode U+0054 (category Lu: Letter, uppercase) + 'S': ASCII/Unicode U+0053 (category Lu: Letter, uppercase) +``` + +Execution is now strictly left-to-right, with the output of each function flowing in the direction of the arrow to become the input for the next function. + +## Limitations, workarounds, and other options + +It is no coincidence that the functions used to illustrate composition and pipelining all take a _single_ argument. + +Some purely-functional languages pipe the _first_ argument into a function but allow others to be included. + +In contrast, Julia only expects function _names_ (or something equivalent) in a pipeline, without any additional arguments. + +There are important technical reasons for this (related to the fact that [`currying`][currying] is not a standard part of the language design). +The _many_ people who have no understanding of currying should merely accept that this limitation is not a careless oversight, and is not likely to change in future Julia versions. + +### Workarounds + +We need single-argument functions that do whatever is needed. +Fortunately, defining new functions in Julia is easy. + +Most simply, we could use an [`anonymous function`][anonymous-function]. +For example, if we have a single input string and we want to split on underscores: + +```julia-repl +julia> "my_test_strings" |> (s -> split(s, '_')) +3-element Vector{SubString{String}}: + "my" + "test" + "strings" +``` + +That vector could then be piped to other functions, as before. + +Enclosing the anonymous function in parentheses is optional in this case, but more generally is a useful way to reduce ambiguity. + +Equally, we could create a named function, earlier in the program, and reuse it as needed. 
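+
+As a minimal sketch of the named-function option (the name `split_underscores` is hypothetical, chosen only for this illustration), a one-line definition gives a single-argument function that can be reused in any later pipeline:
+
+```julia
+# Hypothetical helper: a single-argument wrapper around the two-argument `split`
+split_underscores(str) = split(str, '_')
+
+# Pipe a single string through the named function
+words = "my_test_strings" |> split_underscores
+
+# The result is a vector, so the broadcast pipe can continue element by element
+initials = words .|> titlecase .|> first
+```
+
+Here `words` holds the three substrings `"my"`, `"test"` and `"strings"`, and `initials` holds the characters `'M'`, `'T'` and `'S'`.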
+ +[`Closures`][closures] are beyond the scope of this Concept, but anyone familiar with them from other languages will recognise that they offer a more flexible way to create single-argument functions. + +```julia-repl +julia> function makesplit(sep) + fs(str) = split(str, sep) + fs + end +makesplit (generic function with 1 method) + +julia> f_us = makesplit('_') +(::var"#fs#32"{Char}) (generic function with 1 method) + +julia> "my_test_strings" |> f_us +3-element Vector{SubString{String}}: + "my" + "test" + "strings" + +# alternatively: +julia> "my_test_strings" |> makesplit('_') +3-element Vector{SubString{String}}: + "my" + "test" + "strings" +``` + +Once `makesplit()` is defined, it can be used to work with any separator. +Note that `makesplit('_')` is a _function call_ that evaluates to another function, which in turn receives input from the pipe. + +If this seems confusing, that is normal at first (but it becomes clearer with practice). + +### Other options + +There has been a long discussion about making pipes more versatile in base Julia, but the various suggestions are mutually incompatible and no agreement has been reached. + +Meanwhile, users have taken the usual approach of creating various installable packages that address specific needs. 
+None will work within Exercism, but take a look at these if you are interested: + +- [`Chain.jl`][chain] +- [`Underscores.jl`][underscores] +- [`DataPipes.jl`][datapipes] + +[comp]: https://docs.julialang.org/en/v1/manual/functions/#Function-composition-and-piping +[chain]: https://github.com/jkrumbiegel/Chain.jl +[underscores]: https://c42f.github.io/Underscores.jl/stable/ +[datapipes]: https://github.com/JuliaAPlavin/DataPipes.jl +[closures]: https://en.wikipedia.org/wiki/Closure_(computer_programming) +[currying]: https://en.wikipedia.org/wiki/Currying +[anonymous-function]: https://docs.julialang.org/en/v1/manual/functions/#man-anonymous-functions diff --git a/concepts.wip/function-composition/introduction.md b/concepts.wip/function-composition/introduction.md new file mode 100644 index 00000000..e10b99d0 --- /dev/null +++ b/concepts.wip/function-composition/introduction.md @@ -0,0 +1 @@ +# Introduction diff --git a/concepts.wip/function-composition/links.json b/concepts.wip/function-composition/links.json new file mode 100644 index 00000000..34752bc9 --- /dev/null +++ b/concepts.wip/function-composition/links.json @@ -0,0 +1,6 @@ +[ + { + "url": "https://docs.julialang.org/en/v1/manual/functions/#Function-composition-and-piping", + "description": "Manual section on function composition and piping." + } +]