[Design question] I don't like `factory`. Why don't we just use modules? #2741

cshaa · 2020-09-26T15:48:04Z

cshaa
Sep 26, 2020
Collaborator

I'm working on this PR and the current dependency system of math.js seems really frustrating to me.

^^ the first lines of a more complicated script look like this with factory

To be more precise, I don't understand why we don't use ES6 modules more instead of the factory function, which to me seems worse in many ways (the code isn't DRY at all, worse IntelliSense, problems with circular dependencies).

I have a vague understanding that math.js has custom bundling support and some package-wide settings and that's the reason why factory was invented, but I think these should be doable with modules too.

So what's the reason why we use factory again? 😁️

josdejong · 2020-09-27T08:29:13Z

josdejong
Sep 27, 2020
Maintainer

Good question. Thanks for asking.

One side-note about the screenshot you shared for createEigs: maybe there you should inject realSymmetric and complex instead of creating them inside createEigs and having to manage their dependencies too in createEigs?

To explain the context: What I wanted to achieve with mathjs is an environment where you can do calculations with mixed data types, like multiplying a regular number to a Complex number or a BigNumber, and work with all of those in matrices. And I wanted to make it possible to add a new data type, like say BigInt, with little effort.

The solution that we have in mathjs now is a combination of two things:

typed-function, which makes it easier to (dynamically) create and extend a single function with new data types, automatically do type conversions on function inputs, etc. So, if you create function multiply for two numbers, you can extend it with support for multiplying two BigInts, and if you define a conversion from BigInt to number, the typed-function will automatically allow you to multiply a BigInt with a number.
Dependency injection using factory. When I have extended my function multiply with support for BigInt, thanks to the dependency injection, the function prod and many others will automatically support BigInt too, since it uses multiply under the hood. This also works the other way around: if I don't need the heavyweight multiply (which supports BigNumbers, matrices, etc), and I just need plain and simple number support, I can use a lightweight implementation of multiply for numbers, and inject that in prod and other functions.
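
To make the dependency injection point concrete, here is a minimal sketch using plain factory functions. It does not use the real mathjs factory helper; createProd, multiplyNumber and multiplyBigInt are hypothetical names chosen only for illustration.

```javascript
// Hypothetical factory: prod receives its multiply dependency instead of importing it
function createProd ({ multiply }) {
  return function prod (values) {
    return values.reduce((acc, value) => multiply(acc, value))
  }
}

// A lightweight multiply that only supports plain numbers
const multiplyNumber = (a, b) => a * b

// A multiply that additionally supports BigInt
const multiplyBigInt = (a, b) => {
  if (typeof a === 'bigint' || typeof b === 'bigint') {
    return BigInt(a) * BigInt(b)
  }
  return a * b
}

// The same prod code, once with the lightweight and once with the extended dependency
const prodLight = createProd({ multiply: multiplyNumber })
const prodBig = createProd({ multiply: multiplyBigInt })

console.log(prodLight([2, 3, 4])) // 24
console.log(prodBig([2n, 3n, 4n])) // 24n
```

The body of prod never changes; only the injected multiply decides which data types it supports.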

The dependency injection indeed complicates the code hugely. If you or anyone can come up with a simpler approach I would love to hear it! Maybe we can do something smart during a compile/pre-build step instead.


cshaa · 2020-09-28T00:14:35Z

cshaa
Sep 28, 2020
Collaborator Author

Thanks for your reply!

One side-note about the screenshot you shared for createEigs: maybe there you should inject realSymmetric and complex instead of creating them inside createEigs and having to manage their dependencies too in createEigs?

You mean I should bundle complex and createEigs using factory in the same way as eg. identity is bundled? I was thinking about that... but it would make them visible for the end users, right? And that in turn would mean that the functions have to have a reasonable, user-proof API, which isn't the case right now 😁️ Or are you talking about a different type of injection?

typed-function, which makes it easier to (dynamically) create and extend a single function with new data types, automatically do type conversions on function inputs, etc.

I think I understand typed quite well already. It seems pretty smart to me 👍️ and there doesn't seem to be much space for further improvement... except maybe adding TypeScript typings to it once I figure out how to implement #1779

Dependency injection using factory. When I have extended my function multiply with support for BigInt, thanks to the dependency injection, the function prod and many others will automatically support BigInt too, since it uses multiply under the hood. [...] Maybe we can do something smart during a compile/pre-build step instead or so.

Hmm... I was convinced that it was possible to add call signatures to a typed function, but I couldn't find that anywhere in the examples. Okay, I take back my last statement, maybe there are things to improve in typed after all 😁️. Specifically, if something like this was possible:

const fn4 = typed({
  'number': function (a) {
    return 'a is a number';
  }
});

fn4.addSignature({
  'number, number': function (a, b) {
    return 'a is a number, b is a number';
  }
})

fn4(1,2) // a is a number, b is a number

Then the use_bigint example could be rewritten into modules as simply as:

import math from 'mathjs'


math.typed.addType({
  name: 'BigInt',
  test: (x) => typeof x === 'bigint'
})

math.bigint = (x) => BigInt(x)

math.add.addSignature({
  'BigInt, BigInt': (a, b) => a + b
})

math.pow.addSignature({
  'BigInt, BigInt': (a, b) => a ** b
})

export default math

This is arguably much simpler than the original, and no compile-time magic was needed – just vanilla modules.

This also works the other way around: if I don't need the heavyweight multiply (which supports BigNumbers, matrices, etc), and I just need plain and simple number support, I can use a lightweight implementation of multiply for numbers, and inject that in prod and other functions.

This one sounds a lot more difficult to achieve with modules. Generally speaking, it's impossible to make a module "unlearn" some dependency – at least without some compile-time magic. But at the same time, it seems quite impractical to manually remove dependencies on something like BigNumbers... Since more than half of the code directly mentions them, one would have to rewrite almost all functions to remove the dependency – I struggle to understand why anyone would do that. Are there any more practical examples of where such "unlearning" might be used?


josdejong · 2020-10-03T13:00:11Z

josdejong
Oct 3, 2020
Maintainer

Hm, good point. We should probably not expose realSymmetric publicly. On the other hand... maybe that's no problem at all? We could think about being able to inject a function but not having it exposed publicly, but in my experience if you want to internally create realSymmetric as a separate function, such a function can often be useful for others too. I.e. there should be nothing to hide.

So far, typed-function is created in an immutable way. So you can merge typed functions like const fnMerged = typed(fn1, fn2) into new ones, but not change an existing typed function like fn4.addSignature. You could of course do something like fn4 = typed(fn4, typed({ ...new signatures.... })).
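
A toy illustration of this immutable style follows. The typed function below is a deliberately simplified stand-in written for this sketch (it dispatches on typeof only) and is not the real typed-function implementation; it only shows that merging yields a new function and leaves the inputs untouched.

```javascript
// Toy stand-in for typed-function: dispatches on the typeof of each argument
function typed (...sources) {
  // Merge plain signature maps and the signatures of existing toy typed functions
  const signatures = Object.assign({}, ...sources.map(s => s.signatures || s))

  const fn = function (...args) {
    const key = args.map(arg => typeof arg).join(',')
    const impl = signatures[key]
    if (!impl) throw new TypeError('No signature for (' + key + ')')
    return impl(...args)
  }
  fn.signatures = Object.freeze(signatures)
  return fn
}

const fn1 = typed({ number: a => 'a is a number' })
const fn2 = typed({ 'number,number': (a, b) => 'a is a number, b is a number' })

// Merging creates a new function; fn1 itself is unchanged
const fnMerged = typed(fn1, fn2)

console.log(fnMerged(1, 2)) // a is a number, b is a number
console.log(Object.keys(fn1.signatures)) // [ 'number' ]
```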

The world would be simple if we had a single mathjs instance and could extend functions there like you describe, and also change the config there. However, suppose you're using two libraries in your application, library A and library B, and both use mathjs. If those two libraries need different configs and extend functions in different ways, they would conflict with each other. Therefore, I think it's important that you can create your own instance of mathjs with your own config and extensions, and that the global instance is immutable.
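
As a rough sketch of why separate instances avoid that conflict: the create function and the precision option below are simplified, hypothetical stand-ins and not the actual mathjs API.

```javascript
// Hypothetical, heavily simplified create(): every call builds fresh functions around its own config
function create (config = {}) {
  const settings = Object.freeze({ precision: 4, ...config })
  return Object.freeze({
    config: settings,
    format: value => value.toFixed(settings.precision)
  })
}

// Library A and library B each get their own instance and cannot affect each other
const mathA = create({ precision: 2 })
const mathB = create({ precision: 6 })

console.log(mathA.format(Math.PI)) // 3.14
console.log(mathB.format(Math.PI)) // 3.141593
```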

Are there any more practical examples of where such "unlearning" might be used?

I suppose you mean loading light-weight functions with just number support instead of the full package? Reasons are performance (typed-function does introduce overhead) and bundle size for the browser. Currently, functions like add contain implementations for each supported data type. One idea I have is to rewrite this to split the implementation per data type. That would result in a huge amount of tiny functions, which are merged together when loading a specific selection of functions and data types.


cshaa · 2020-10-06T00:18:58Z

cshaa
Oct 6, 2020
Collaborator Author

Huh, I didn't realize mathjs is immutable now and making it mutable could break some real world code... This makes things infinitely more complicated. After some time of listening to music and thinking hard, I've come up with this design idea. It's probably still full of holes, so if you notice something that wouldn't work, doubt that something is a good idea, or simply don't understand my explanation, please do comment on that. It's definitely more complicated than “just exporting and importing” as I initially hoped, so please be patient with my explanation 😅️

Proposal v0

In any mathjs bundle, according to this proposal, there would always be two abstract types:

Real
- all number types that correspond to real numbers
- default subtypes: number, BigNumber, Fraction
Scalar
- all “tensors of order zero”, ie. elements of a division ring, ie. things you can multiply a vector with
- default subtypes: Real, Complex, Unit

The subtypes of these can be modified (you can remove some, or add your own), but these two abstract types shall always remain.

In any mathjs bundle, the math object will always have these core methods:

addScalar, subtractScalar, multiplyScalar, divideScalar, squareScalar
absScalar (returns a Real (what about Unit?))
compareReal, equalReal, largerReal, largerEqReal, smallerReal, smallerEqReal, unequalReal
expReal, logReal, powReal, sqrtReal, signReal
roundReal, floorReal and other rounding functions
sinReal, cosReal and other trigonometric functions

If a user wants to add their own Real type or Scalar type, they should provide these methods – for the best results they should provide all of them.

Ordinary methods (like dotPow, gcd, factorial, ...) are in individual files, exported, only wrapped in typed (no factory). If such a method depends on core methods, it can access them using this, eg. this.addScalar. Similarly the configuration can be accessed with this.config. If the method depends on a more complicated/niche function, it imports it directly from the corresponding file. If the method has overloads that are reasonable to include in some bundles, but exclude from others, those overloads should be in separate files – for example multiply should be split into multiply_base, multiply_DenseMatrix, multiply_SparseMatrix.

Pseudo-code to showcase this convention:

// file: multiply_DenseMatrix.js

import { typed } from "../core/typed.js"
import { DenseMatrix, pointwise } from "../type/matrix/DenseMatrix.js"

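// regular functions instead of arrow functions, so that `this` can be bound to the math instance
export const multiply_DenseMatrix = typed('multiply', {
  'Scalar, DenseMatrix': function (a, B) {
    return pointwise(B, e => this.multiplyScalar(a, e))
  },
  'DenseMatrix, DenseMatrix': function (A, B) {
    return new DenseMatrix([ /* actual multiplication code */ ])
  }
})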

There are various pre-made bundles like all.js or number/all.js that set up the two abstract types, import all the important (hehe) methods and re-export them in one math object. This object is not yet ready to be served to the user, as it contains various overloads separated into different methods (multiply_base, multiply_DenseMatrix, ...) and doesn't contain any configuration. It is the job of the create method to do this.

This is a pseudo-code implementation of create:

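// bind and mergeAllOverloads are helpers that are not spelled out here
export function create (methods, config = {})
{
  const math = { config }

  for (const fname of Object.keys(methods))
  {
    if (!fname.includes('_'))
    {
      math[fname] = bind(methods[fname], math)
    }
    else
    {
      // merge multiply_base, multiply_DenseMatrix, ... into a single multiply
      const name = fname.split('_')[0]
      const method = mergeAllOverloads(name, methods)
      math[name] = bind(method, math)
    }
  }

  return math
}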

The result of create(all) is the math object the user wants.
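
A self-contained toy version of this flow might look as follows. It replaces typed with plain maps from a typeof-based signature to an implementation, and mergeAllOverloads is a hypothetical helper written only for this sketch.

```javascript
// Hypothetical helper: combine every entry named `${name}_*` into one dispatcher.
// Each overload file is modelled as a map from a signature string to an implementation.
function mergeAllOverloads (name, methods) {
  const signatures = {}
  for (const key of Object.keys(methods)) {
    if (key.includes('_') && key.split('_')[0] === name) {
      Object.assign(signatures, methods[key])
    }
  }
  return function (...args) {
    const key = args.map(a => Array.isArray(a) ? 'Array' : typeof a).join(',')
    const impl = signatures[key]
    if (!impl) throw new TypeError('No overload of ' + name + ' matches (' + key + ')')
    return impl.apply(this, args) // forward the bound math instance
  }
}

function create (methods, config = {}) {
  const math = { config }
  for (const fname of Object.keys(methods)) {
    if (!fname.includes('_')) {
      math[fname] = methods[fname].bind(math)
    } else {
      const name = fname.split('_')[0]
      math[name] = mergeAllOverloads(name, methods).bind(math)
    }
  }
  return math
}

const bundle = {
  multiplyScalar (a, b) { return a * b },
  multiply_base: {
    'number,number' (a, b) { return this.multiplyScalar(a, b) }
  },
  multiply_Array: {
    'number,Array' (a, B) { return B.map(e => this.multiplyScalar(a, e)) }
  }
}

const math = create(bundle)
console.log(math.multiply(2, 3)) // 6
console.log(math.multiply(2, [1, 2, 3])) // [ 2, 4, 6 ]
```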

How a user uses this new API

import { math } from 'mathjs' – use the default (immutable) instance of the default bundle
import { create, all } from 'mathjs'; const math = create(all) – make your own instance of the default bundle
make an instance of a different pre-made bundle
put together your own bundle and instantiate it
create a bundle with custom extensions and instantiate it
- one can either extend the two abstract types and thus modify the behavior of existing methods
- or create new methods and new overloads for existing methods, which adds new features while leaving the pre-existing functionality unchanged
- or combine these two approaches

Pros & Cons

Cons
- less flexible – originally you had the power to remove any one method and extend any method including its internal use
- the code might internally use methods that aren't in the math object – the user has less control over the “size” of the bundle
- the change requires a lot of refactoring
Pros
- there's a clear division between which internally used functions can and cannot be altered by the user
- no compile-time magic needed
- dependency between methods is more clear – if you're extending eg. multiply to handle some different type but you also need the original multiply as a dependency, you put the new overload in a new file and import from the existing file
- problems with circular dependency solved
- the act of extending mathjs is hopefully more tangible and requires less knowledge of its internal workings
- the standard module structure makes it possible to automatically generate TypeScript definitions


josdejong · 2020-10-07T10:06:31Z

josdejong
Oct 7, 2020
Maintainer

Mathjs currently has a complicated hybrid: the lowest building blocks, functions, are immutable. On a higher level, you can create a mathjs instance and change configuration. That will result in all functions being re-created with the new configuration.

Interesting idea to separate required core functions (like addScalar) from other functions like dotPow, and having a similar separation between the data types. It's a good distinction to know what functions to implement when adding a new data type. I'm not sure though whether there really is such a clear distinction in two groups: the higher-level functions not only depend on "core" functions, but also on other higher-level functions.

So if I understand your idea correctly, when creating a mathjs instance, the functions are simply bound to the new math instance. So they would not be immutable but rely on say this.config to get the right config, and could use this.addScalar etc? Something like this:

function add (a, b) {
  return a + b
}

function sum (values) {
  let total = 0
  values.forEach(value => {
    total = this.add(total, value) // <-- here we use 'this'
  })
  return total
}

const instance = {}
instance.add = add.bind(instance)
instance.sum = sum.bind(instance)

console.log('sum', instance.sum([1, 2, 3]))

Relying on a this context would indeed remove the need for dependency injection and make things much simpler (except the dynamic binding may be tricky here and there). We've had this solution in the past, where all functions simply used their dependencies from math.*. The difficulty with that was that you could not see how functions depended on other functions, making it very hard to cherry pick functions.

It would be nice if it would be possible to export already bound functions which can be used directly, in such a way that tree-shaking would work and you should be able to re-bind the function to a different context later. Something like this works (but having to wrap it is still ugly):

// file: sum.js
// because we import all dependencies here, tree-shaking and cherry picking a function just works
import { add } from './add.js'

export const sum = (function () {
  // default binding of all dependencies
  const defaults = { add }

  return function sum (values) {
    // in an ES module `this` is undefined for a plain call, so fall back to the defaults
    const context = this || defaults
    let total = 0
    values.forEach(value => {
      total = context.add(total, value)
    })
    return total
  }
})()

Which can be used like:

import { sum } from './sum.js'

console.log('sum', sum([1, 2, 3]))

And you can bind it to a different context like:

const instance = {}
instance.add = // ... some different implementation
instance.sum = sum.bind(instance)
// now, instance.sum uses the new add function

I think though that it is not possible to bind instance.sum again to a different context, so maybe this is not the right approach. It would be great though if we can find a solution to allow for automatic resolving of the dependencies somehow. An alternative is indeed simply not allowing cherry picking functions, but offering a couple of 'bundles' where we make sure the bundles do work and contain all dependencies.
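
The re-binding limitation is standard JavaScript behavior and can be checked with a small sketch. The two context objects are made up for illustration, and the second add is deliberately different so the effect is visible.

```javascript
function sum (values) {
  let total = 0
  values.forEach(value => {
    total = this.add(total, value)
  })
  return total
}

const first = { add: (a, b) => a + b }
const second = { add: (a, b) => a + 2 * b } // deliberately different from first.add

const boundToFirst = sum.bind(first)
// Binding an already bound function does not replace its `this`
const reboundAttempt = boundToFirst.bind(second)
// Binding the original, unbound function again does work
const boundToSecond = sum.bind(second)

console.log(boundToFirst([1, 2, 3])) // 6
console.log(reboundAttempt([1, 2, 3])) // 6, still uses first.add
console.log(boundToSecond([1, 2, 3])) // 12
```

So an instance would need access to the original unbound function in order to bind it to a new context.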

Really interesting to think this through 😎


cshaa · 2021-03-31T23:28:37Z

cshaa
Mar 31, 2021
Collaborator Author

Bump! I'm interested in continuing this discussion and possibly making a few prototypes to test the possibilities!

Regarding your last comment: I think I have an idea how to solve this. I'll assume we'll be using TypeScript, but the idea would work without it too.

Each “area” of math.js (by that I mean matrix, parse & symbolic expressions, statistics ...) would have a fixed set of minimalistic functions that are needed for pretty much everything in that area. Eg. for matrix it would be multiply, transpose, inv, column... From now on, I will call these necessary sets of functions “an essential bundle” of that specific area.
Every function in an area would then have a typed this, which contains all these essential functions in its type. If someone wants to cherry-pick functions from different areas, they'd have to include at least this essential bundle, or else they get a descriptive compile-time error (or a runtime error if they use plain JS). This way, all functions from the essential bundles will be automatically extensible (or “rewritable”), because they are in this, and not imported directly from a specific file. This would also mean that we wouldn't have to import all the essential functions again and again in our codebase – the code will be shorter and easier to read.
If your function depends on a more specialized method from its area, a decision has to be made: would the code work even with an extended version of that function provided by the user, or is the algorithm closely tied to the datatypes you expect? If you don't think an extended function for a different datatype would work out-of-the-box, you can just import the function directly. If you think the function should be extensible, you can do something like this.add = this.add || add.
A more sophisticated way to have extensible non-essential methods would be to construct an extras object where we either copy a function from this or fall back to an imported one. This way we wouldn't have to mutate this and we could know its exact properties at compile time. An example in the (pseudo)code below:

// file: matrix/essential.ts
export { matrix } from './matrix'
export { add } from './add'
export { multiply } from './multiply'
export { transpose } from './transpose'
...

// file: matrix/solve.ts
import * as core from '../core'
import * as essential from './essential'
import { createExtras } from '../utils/createExtras'

// non-essential functions; for the sake of example, only one of them is made extensible
import { lup } from './lup' // extensible
import { usolve } from './usolve' // not extensible
import { lsolve } from './lsolve' // not extensible

// core and essential stand for the types of the two bundles (typeof core & typeof essential
// in real TypeScript); the Matrix and Vector types are assumed to be declared elsewhere
export function solve(this: core & essential, M: Matrix, b: Vector) {
  const extras = createExtras(this, { lup }) // only creates the object once, then caches it

  const { L, U, P } = extras.lup(M)

  // forward substitution with L first, then back substitution with U
  // (applying the row permutation P to b is omitted in this sketch)
  const c = lsolve(L, b)
  const d = usolve(U, c)

  return d
}

Then if you cherry-pick solve for your custom bundle:

import { create } from 'mathjs/custom'
import * as core from 'mathjs/core'
import * as matrix from 'mathjs/matrix/essential'
import { solve } from 'mathjs/matrix'

const fn = { ...core, ...matrix, solve, lup: () => { /* my code */ }, usolve: () => { /* my code */ } }
const math = create(fn)

math.lup === fn.lup // true
math.usolve === fn.usolve // true

math.solve([[1]], [1])
// uses your custom lup
// but doesn't use your custom usolve
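
One possible shape for the createExtras helper is sketched below in plain JavaScript. Everything here is an assumption made for illustration: the proposal does not define how createExtras caches its result, and this sketch passes a module-level defaults object so that the cache can be keyed on it.

```javascript
// Hypothetical createExtras: take a function from the instance if it provides one,
// otherwise fall back to the imported default. Results are cached per instance
// and per defaults object, so each extras object is built only once.
const extrasCache = new WeakMap()

function createExtras (instance, defaults) {
  let perInstance = extrasCache.get(instance)
  if (!perInstance) {
    perInstance = new WeakMap()
    extrasCache.set(instance, perInstance)
  }
  let extras = perInstance.get(defaults)
  if (!extras) {
    extras = {}
    for (const name of Object.keys(defaults)) {
      extras[name] = typeof instance[name] === 'function' ? instance[name] : defaults[name]
    }
    perInstance.set(defaults, extras)
  }
  return extras
}

// Defaults live at module level so the same object is passed on every call
const solveDefaults = { lup: M => 'default lup' }

function solve (M) {
  const extras = createExtras(this, solveDefaults)
  return extras.lup(M)
}

console.log(solve.call({}, [[1]])) // default lup
console.log(solve.call({ lup: M => 'custom lup' }, [[1]])) // custom lup
```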

What do you think about the proposal? Should I make a prototype repo to test this in practice?


cshaa · 2021-04-01T23:42:31Z

cshaa
Apr 1, 2021
Collaborator Author

An argument for designing a new architecture: In the current system, creating one function means modifying five different files in different parts of the codebase. I think this number should go down :^)


josdejong · 2021-04-07T11:56:04Z

josdejong
Apr 7, 2021
Maintainer

Your proposal can be interesting @m93a . I guess a main difference with the proposal I made before (see #1975 (comment)) is that you're relying on a create function again to "bind and create" the functions. I'm not sure whether the function solve(this: core & essential,...) syntax would work out nicely, but that's something to try out in a prototype, right? 😄 . So the groups like essential and core are just for convenience, right? So you don't have to import a lot of individual functions?

I'll reply on the other issues in mathjs that you commented on soon but I'm not managing to keep up with all the issues at this moment.


gwhitney · 2022-03-25T10:09:43Z

gwhitney
Mar 25, 2022
Collaborator

For a working tiny prototype using many of the ideas in the discussion of this issue, but which does not use any manipulation of "this" and takes a more simpleminded approach to bundle-gathering than the "areas" and "createExtras" later in the conversation, but nevertheless seems as though it may well be scalable to the size of mathjs, see: https://code.studioinfinity.org/glen/picomath


josdejong · 2022-03-25T15:16:06Z

josdejong
Mar 25, 2022
Maintainer

Oohh, I'm really curious to have a look at your PoC 😎 ! Will be next week I expect though.


gwhitney · 2022-03-25T21:54:52Z

gwhitney
Mar 25, 2022
Collaborator

Great. I also went ahead and (in an additional feat/lazy branch, so as not to complicate the base PoC) implemented invalidation and lazy reloading to handle changes to a math instance's global configuration object. I feel that also worked out pretty smoothly (although probably the API for an implementation registering itself for reloading could be smoothed out somewhat). Looking forward to your thoughts!


gwhitney · 2022-07-18T05:04:02Z

gwhitney
Jul 18, 2022
Collaborator

Unfortunately the picomath proof-of-concept relies on a possible version of typed-function in which typed-functions are mutable (e.g. they can have additional signatures added to them after initial creation, without changing object identity). That is possible, but the experiments in josdejong/typed-function#138 indicate that it comes at too heavy a performance cost. So I think that approach is a dead end for now.


josdejong · 2022-07-18T13:19:26Z

josdejong
Jul 18, 2022
Maintainer

Thanks for adding the pointer here to the experiments and benchmarking that you did in this regard. That was indeed quite a bummer. We have to keep experimenting and trying out new ideas :)


gwhitney · 2022-07-18T14:24:39Z

gwhitney
Jul 18, 2022
Collaborator

Yes, I have a new concept: in typed-function v3, we allow implementations to flag that they refer to another signature of themselves (or their entire self). I am imagining a variant of typed-function in which individual implementations can flag that they refer to a signature of any other typed-function (or the entire other typed-function) as well. Then we just load modules gathering up all implementations of all typed-functions (creating a huge directed graph among implementations, that is hopefully acyclic). But no actual typed-functions are instantiated yet, because we don't know if any will get more implementations as we load more modules. Then just before we actually start to compute (maybe triggered by the first attempt to call a typed function, rather than define one), the whole web of definitions is swept through in topological sort order, finally instantiating all of the typed-functions, so that all of their references have been instantiated as well, and they can all be compiled down to code that doesn't need to call functions through possibly changing variables, which is the slow operation.

This would have the likely effect of slowing down initialization (but perhaps it can be done incrementally so that this burden is spread out) but hopefully keeping computational performance once everything is initialized as high or higher than it currently is.
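
The "gather first, instantiate later in topological order" idea can be sketched roughly as follows. The register and instantiateAll names are hypothetical, the sketch works on whole functions rather than individual signatures, and it simply throws on a cycle instead of handling it.

```javascript
// Hypothetical registry: implementations are only collected, nothing is instantiated yet
const registry = new Map()

function register (name, dependencies, factory) {
  registry.set(name, { dependencies, factory })
}

// Instantiate everything in dependency order (depth-first topological sort)
function instantiateAll () {
  const instances = Object.create(null)
  const visiting = new Set()

  function visit (name) {
    if (name in instances) return
    if (visiting.has(name)) throw new Error('Circular dependency at ' + name)
    const entry = registry.get(name)
    if (!entry) throw new Error('Unknown dependency ' + name)
    visiting.add(name)
    entry.dependencies.forEach(visit)
    visiting.delete(name)
    // All dependencies exist now, so the factory can capture them directly
    instances[name] = entry.factory(instances)
  }

  for (const name of registry.keys()) visit(name)
  return instances
}

// Registration order does not matter: sum is registered before add
register('sum', ['add'], ({ add }) => values => values.reduce((a, b) => add(a, b), 0))
register('add', [], () => (a, b) => a + b)

const math = instantiateAll()
console.log(math.sum([1, 2, 3])) // 6
```

Because each factory runs only after its dependencies exist, the resulting functions hold direct references and never look anything up through a variable that might change later.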

This description may be a bit vague; as time permits I will do a "pocomath" proof of concept and post here.


gwhitney · 2022-07-19T19:51:38Z

gwhitney
Jul 19, 2022
Collaborator

OK, the new proof-of-concept Pocomath is working. It does everything the original picomath did, and more. (I have not implemented the lazy-reloading of functions when a configuration option changes. I think it's quite clear from what's there that Pocomath can easily handle this capability, but I would be happy to do a specific reference implementation of it if anyone would like.)

Specifically, it uses the current, non-mutable, typed functions of typed-function v3, but allows gathering of implementations solely via module imports, it has no factory functions, and it should easily allow tree-shaking.

To show that this latest architecture makes it very easy to implement new types and more generic types, I have implemented a bigint type in Pocomath and in combination with the Complex type there it gives you Gaussian integers (ie. Complex numbers whose real and imaginary parts are bigints) automatically.

I am very encouraged by this proof-of-concept, and would be delighted for everyone interested to look it over. I humbly submit that adopting this basic architecture would make organizing the mathjs code as desired and adding new types and functions much easier. (In particular on @m93a's criterion of the number of files you have to touch to add a new operation.) Looking forward to your thoughts, @josdejong.


gwhitney · 2022-07-30T15:21:33Z

gwhitney
Jul 30, 2022
Collaborator

But, it isn't really simpler than number: () => Math.sqrt so I'm not sure if it is really needed. Shall we only implement something for that when the need arises?

Sure, I am perfectly happy with a "wait and see" approach on an "adapter for a plain-function implementation".

Or is your concern that for some operations you don't want a typed-function at all

That was a second concern actually :). The use case I'm thinking about is that it is very powerful if you can just import some statistics or numerical library from npm straight into pocomath. These are no typed-functions. Here is a mathjs example demonstrating that in mathjs: https://github.com/josdejong/mathjs/blob/develop/examples/import.js

Yes, it does seem powerful to import external libraries. I will add an issue for installing ordinary JavaScript functions to the list I am implementing in Pocomath.

One other thought: in mathjs, over the years, there grew a need to have a distinction between functions that operate on scalars vs the functions that accept matrices too. So now you have addScalar and add, divideScalar and divide, equalScalar and equal, and a few more. It was needed for performance reasons mostly, and in some cases to prevent circular references. It may be worth thinking through how to identify a type as a scalar (i.e. number, bigint, BigNumber, Fraction, Complex, Unit, string, boolean). I'm not sure if a special provisioning would be needed or helpful to deal with this. So, maybe the function add could have a signature Matrix,Matrix which requires a dependency like add(scalar,scalar) or so. Any ideas?

This is an excellent point. It is not yet clear to me if the reasons that addScalar is needed in the current architecture will arise with a Pocomath-based architecture. One question that has been kicking around in my head is why is there a Matrix type in mathjs in addition to Arrays? I assume the primary answer is so there can be both DenseMatrix and SparseMatrix. But it has also been kicking around in my head that another item that makes Matrix powerful is that it is (or at least can be) type-homogeneous, i.e., all entries are the same type. And if this is part of the sauce, then it seems to me that it would be pleasant if Matrix<number> were actually a distinct type from Matrix<Complex> -- in other words, for Matrix to be a template type in the usual meaning of that.

It is indeed possible to build a concept of template types on top of typed-function as it exists (or eventually it could perhaps be incorporated internally therein). Then we might have something like (in pseudo-code):

export const add = {
  'Matrix<T>,Matrix<T>': ({'add(T,T)': eltAdd}) => (M,N) => {
    // check that dimensions match
    const result = M.clone()
    for (const index of result.indices) {
      result.set(index, eltAdd(result.at(index), N.at(index)))
    }
    return result
  }
}

(Hence, an item to implement templates is next up on my list of things to try in Pocomath.) I think an approach like this might make it unlikely that the need for a dependency on add(scalar,scalar) would come up.
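
As a rough, runnable toy of how such a template could be resolved for a concrete T: plain arrays stand in for matrices, and scalarAdd and instantiate are hypothetical names. This is not how Pocomath implements it.

```javascript
// Toy lookup table for the dependency 'add(T,T)', keyed by the concrete type T
const scalarAdd = {
  number: (a, b) => a + b,
  bigint: (a, b) => a + b
}

// Mirrors the shape of the pseudo-code above: dependencies in, implementation out
const matrixAddTemplate = ({ 'add(T,T)': eltAdd }) => (M, N) => {
  if (M.length !== N.length) throw new RangeError('Dimension mismatch')
  return M.map((value, index) => eltAdd(value, N[index]))
}

// Instantiate the template for one element type T
function instantiate (template, T) {
  const eltAdd = scalarAdd[T]
  if (!eltAdd) throw new TypeError('No add(' + T + ',' + T + ') available')
  return template({ 'add(T,T)': eltAdd })
}

const addNumberVectors = instantiate(matrixAddTemplate, 'number')
const addBigintVectors = instantiate(matrixAddTemplate, 'bigint')

console.log(addNumberVectors([1, 2], [3, 4])) // [ 4, 6 ]
console.log(addBigintVectors([1n, 2n], [3n, 4n])) // [ 4n, 6n ]
```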

On the other hand, if we find ourselves needing a way to refer to "add for scalar types", maybe to deal with potentially inhomogeneous collections like Arrays, I think it would be possible to introduce essentially "typedefs" into Pocomath in which we install a type 'scalar' into an instance with a designation that it simply means number | Complex | Fraction | etc. etc. and a way to construct a new typed-function on the fly (with something like a dependency on add(scalar, scalar)) that filters the installed implementations of add to include only those that mention types included in scalar.
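
A minimal sketch of that filtering step, with made-up data: the typedefs table and the restrictTo helper are hypothetical and only illustrate selecting implementations by the types their signatures mention.

```javascript
// Installed implementations of add, keyed by signature (toy data)
const addImplementations = {
  'number,number': (a, b) => a + b,
  'bigint,bigint': (a, b) => a + b,
  'Array,Array': (A, B) => A.map((a, i) => a + B[i])
}

// Hypothetical typedef: 'scalar' simply means one of these types
const typedefs = { scalar: ['number', 'bigint'] }

// Keep only the implementations whose parameter types all belong to the typedef
function restrictTo (implementations, typedefName) {
  const allowed = new Set(typedefs[typedefName])
  return Object.fromEntries(
    Object.entries(implementations).filter(([signature]) =>
      signature.split(',').every(type => allowed.has(type))
    )
  )
}

const addScalarImplementations = restrictTo(addImplementations, 'scalar')
console.log(Object.keys(addScalarImplementations)) // [ 'number,number', 'bigint,bigint' ]
```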

(After all, the current version of Pocomath always filters implementations to include only those that mention a defined type, in order to allow operation-centric files like mathjs has where a single operation is defined for numerous types, but only the implementations for types that have actually been installed in the instance are employed. Pocomath is trying very deliberately to be completely agnostic as to how operations and their implementations are organized into source files, even though the demo so far uses one operation for one type per file. I plan to add an example of many operations for one type in a single file, and I will also add an example of one operation for many types in a single file just to verify it works.)

However, this feels like it may end up being a little complicated so I would definitely put this concept on the same list as plain-function-implementation-adapters, i.e., things to be implemented if the actual need arises.


josdejong · 2022-07-31T09:04:52Z

josdejong
Jul 31, 2022
Maintainer

It is not yet clear to me if the reasons that addScalar is needed in the current architecture will arise with a Pocomath-based architecture.

I'm not sure either, but I think the same issues would pop up: it helps performance and also gives better error messages to have functions that only work for scalars. There was one (quite) specific circular dependency: divide needs inv, and inv needs divide. It may be possible to implement that differently though.

The reasons for a Matrix shell around a nested Array were:

Create support for one interface Matrix and multiple implementations (DenseMatrix, SparseMatrix).
In the constructor you can do validation checks once, so you don't have to do consecutive checks on all operations. Calculate the size, validate that all rows/cols have the same size.
In the constructor, you can determine once whether the matrix is type-homogeneous, and if so, select a specific operation for that type. I.e. if you know that two matrices only contain numbers, and you want to add them, you can pick the signature addScalar(number, number) and use that instead of using the slower typed function addScalar.
Have a nice, user friendly API with methods like .size(), and be able to hide stuff behind the interface.
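
The third reason can be illustrated with a small toy. ToyMatrix and the signature table are invented for this sketch and do not reflect the real DenseMatrix or typed-function internals.

```javascript
const addScalarSignatures = {
  'number,number': (a, b) => a + b,
  'bigint,bigint': (a, b) => a + b
}

// Generic path: dispatches on the types of every pair of elements
function addScalar (a, b) {
  const impl = addScalarSignatures[typeof a + ',' + typeof b]
  if (!impl) throw new TypeError('Unsupported types')
  return impl(a, b)
}

class ToyMatrix {
  constructor (values) {
    this.values = values
    // Determined once: the common element type, or undefined for mixed contents
    const types = new Set(values.map(v => typeof v))
    this.datatype = types.size === 1 ? [...types][0] : undefined
  }
}

function addMatrices (A, B) {
  const sameType = A.datatype !== undefined && A.datatype === B.datatype
  // Resolve the implementation once instead of once per element
  const add = (sameType && addScalarSignatures[A.datatype + ',' + B.datatype]) || addScalar
  return new ToyMatrix(A.values.map((a, i) => add(a, B.values[i])))
}

console.log(addMatrices(new ToyMatrix([1, 2]), new ToyMatrix([3, 4])).values) // [ 4, 6 ]
```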

When rethinking matrices, here are some thoughts:

The current "hybrid" solution of Matrix and Array is not ideal, it is a lot of back and forth. Life would be much simpler if it would if we would just have Arrays and no ceremony around it. But like I explained above, there are/where good reasons for it.
Supporting a matrix with mixed contents may be needlessly complex. Maybe we should not support that at all, and like you say, only support Matrix<number> and Matrix<Complex>. Maybe it would be better to convert the full matrix with numbers to complex numbers as soon as one of its values becomes complex. It's easier to reason about and optimize. I like your idea of 'Matrix<T>,Matrix<T>': ({'add(T,T)': eltAdd}) => (M,N) => { ...}, that makes sense indeed! And it would make the whole reason for addScalar redundant.
I regularly got feedback that using nested arrays to hold the contents of a DenseMatrix is not ideal. If we would instead store the contents as a flat list with values and an index, all kinds of algorithms would be much simpler and faster. And many implementations in say C/C++ use that structure, so it would be very easy to port them. This is something I would love to figure out further.
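As a rough illustration of the flat-storage idea in the last point, the sketch below keeps all values in one row-major list and computes the position from the row and column. FlatMatrix and its methods are hypothetical names; nothing here is existing mathjs code.

```javascript
// Hypothetical sketch of a flat, row-major dense matrix; not mathjs code.
class FlatMatrix {
  constructor (rows, cols, values) {
    if (values.length !== rows * cols) {
      throw new RangeError('values.length must equal rows * cols')
    }
    this.rows = rows
    this.cols = cols
    this.values = values // one flat list, row after row
  }

  // Translate a (row, column) index into a position in the flat list.
  offset (i, j) { return i * this.cols + j }

  get (i, j) { return this.values[this.offset(i, j)] }

  static fromNested (nested) {
    return new FlatMatrix(nested.length, nested[0].length, nested.flat())
  }

  // Algorithms become simple index arithmetic over one array.
  transpose () {
    const out = new Array(this.values.length)
    for (let i = 0; i < this.rows; i++) {
      for (let j = 0; j < this.cols; j++) {
        out[j * this.rows + i] = this.values[this.offset(i, j)]
      }
    }
    return new FlatMatrix(this.cols, this.rows, out)
  }
}

const m = FlatMatrix.fromNested([[1, 2, 3], [4, 5, 6]])
m.get(1, 2) // 6
m.transpose().values // [1, 4, 2, 5, 3, 6]
```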

[...] construct a new typed-function on the fly (with something like a dependency on add(scalar, scalar))

That is a very interesting idea 🤔. So basically, that would allow not only to inject a function, or a single function signature, or self reference, but it would allow you to filter a set of signatures out of all of the signatures of a function and create a new, optimized function. It sounds very cool. I agree with you though, that something like this could open a lot of tricky complexities, so maybe best to keep it as an idea for now.

0 replies

gwhitney · 2022-07-31T18:20:28Z

gwhitney
Jul 31, 2022
Collaborator

On scalar functions:

it helps for performance and also better error messaging to have functions that only work for scalars.

I am thinking/hoping that the template implementations will be as good or better on both of these counts.

There was one (quite) specific circular dependency: divide needs inv, and inv needs divide. It may be possible to implement that differently though.

The Pocomath-style of infrastructure has no problem with co-recursion (as long as it bottoms out at some point). For example, the exact circular dependency you mention currently exists in Pocomath as it stands right now: the generic implementation of divide is via the invert operation (invert and multiply), and Complex relies on the generic, supplying a definition of invert that in turn uses divide (on its components). No special handling/syntax/anything is needed for such references, they are all just resolved at "bundling" time, as I call it (constructing a typed-function from the accumulated implementations for different signatures via the _bundle method, nothing to do with build-time bundling of JavaScript with webpack etc.) So that aspect will not require separate scalar functions.
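A minimal sketch of why such co-recursion is unproblematic when references are resolved late. The bundle function and the implementations object below are illustrative stand-ins, not Pocomath's actual _bundle method or data layout, and type dispatch is reduced to a `typeof` check.

```javascript
// Hypothetical sketch; the names below are illustrative, not Pocomath's API.
const implementations = {
  // Generic divide: multiply by the inverse.
  divide: ({ multiply, invert }) => (a, b) => multiply(a, invert(b)),
  multiply: () => (a, b) =>
    typeof a === 'number' && typeof b === 'number'
      ? a * b
      : { re: a.re * b.re - a.im * b.im, im: a.re * b.im + a.im * b.re },
  // invert bottoms out for plain numbers; for complex values it calls
  // divide on the (numeric) components, closing the cycle.
  invert: ({ divide }) => z => {
    if (typeof z === 'number') return 1 / z
    const norm = z.re * z.re + z.im * z.im
    return { re: divide(z.re, norm), im: divide(-z.im, norm) }
  }
}

function bundle (impls) {
  const resolved = {}
  // Each dependency is a thin wrapper that looks up the final function
  // only when called, so the order in which factories run does not matter.
  const deps = {}
  for (const name of Object.keys(impls)) {
    deps[name] = (...args) => resolved[name](...args)
  }
  for (const [name, factory] of Object.entries(impls)) {
    resolved[name] = factory(deps)
  }
  return resolved
}

const math = bundle(implementations)
math.divide(8, 4) // 2
math.divide({ re: 1, im: 0 }, { re: 0, im: 2 }) // re 0, im -0.5
```

The cycle terminates because the complex case of invert only calls divide on numbers, which is the "bottoms out at some point" condition.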

0 replies

gwhitney · 2022-07-31T18:28:34Z

gwhitney
Jul 31, 2022
Collaborator

On Matrix and friends:
Sounds to me like that reconsideration is close to orthogonal to switching to a Pocomath-style infrastructure. Presuming that goes through, it should not be too hard to add a third StrideMatrix implementation of the Matrix interface that is strictly type-homogeneous, dense, and stores all matrices linearly. Then if it in fact works better and appears adequate for mathjs needs, eventually the current DenseMatrix could be dropped. I think Pocomath-style infrastructure would certainly make such a switch no harder, and might well ease the implementation and transition. But just to set expectations, I doubt such efforts in Matrix would be something I would take on. But hopefully we can leave all of the tools in place for someone who came along interested in carrying that torch.

0 replies

gwhitney · 2022-08-01T10:23:04Z

gwhitney
Aug 1, 2022
Collaborator

OK, template implementations (but not yet template types) are working in Pocomath now. I've switched as many of the generic operations to use them now as I could figure out how to. Also, there's a nifty built-in implementation of typeOf in every PocomathInstance that comes basically for free -- I definitely think at least this should be moved into typed-function, and possibly other features if it looks like this is the direction mathjs is headed.

0 replies

gwhitney · 2022-08-01T15:33:56Z

gwhitney
Aug 1, 2022
Collaborator

And now I have added an example of defining the 'floor' function in an operation-centric way. Note I am not proposing that in a putative reorganization of mathjs to use a Pocomath-like core that both operation-centric and type-centric code layouts would be mixed, I just wanted to demonstrate that the Pocomath core is completely agnostic as to the organization of the code. (Except for importDependencies which assumes a certain layout, but it shouldn't really be in the core, it's more of a tool, and a different version could be written for a different layout.)

0 replies

gwhitney · 2022-08-01T23:43:47Z

gwhitney
Aug 1, 2022
Collaborator

Pocomath now has a working Chain type (with the methods only being chainified when called through a chain and updating when they are modified on the underlying Pocomath instance). However, the chain methods are not quite as completely "lazily" updated in that at every extraction of a method, this implementation of Chain checks whether the underlying operation has changed on the Pocomath instance. If you think that's enough of a problem for performance (as opposed to the Pocomath instance invalidating the chainified version when it updates the underlying, so that the chainified version can simply be called directly whenever it's valid, without checking for an update) to be worth it, I can modify it to work that way as well. The only cost is somewhat tighter coupling between the instance and the Chain. (Right now, the instance provides a place for Chains to store their chainified functions, since of course the underlying items that are being chainified could vary from instance to instance. So that repository has to be associated with the specific instance. But the contents of that place are completely managed by Chain. To make things even lazier, Chain and PocomathInstance would have to have shared knowledge of the format used to store chainified functions in the repository so that PocomathInstance could do the invalidation directly when it was changing one of its operations.) Let me know if it seems OK for the proof of concept or if you'd like the invalidations to be "pushed" from the instance.
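For readers following along, the "check at every extraction" variant described above might look roughly like this. TinyInstance, TinyChain and chainCache are invented names for the sketch and do not reflect how Pocomath actually stores chainified functions.

```javascript
// Hypothetical sketch; class and field names are illustrative, not Pocomath's.
class TinyInstance {
  constructor () {
    this.operations = {} // name -> current implementation
    this.chainCache = {} // storage owned by the instance, managed by TinyChain
  }

  install (name, fn) { this.operations[name] = fn }
}

class TinyChain {
  constructor (instance, value) {
    this.instance = instance
    this.value = value
    // A Proxy routes every lookup of an operation name through _method.
    return new Proxy(this, {
      get: (target, prop) =>
        prop in target || typeof prop !== 'string' || !(prop in instance.operations)
          ? target[prop]
          : target._method(prop)
    })
  }

  _method (name) {
    const current = this.instance.operations[name]
    const cached = this.instance.chainCache[name]
    // "Pull" check: compare against the instance on every extraction,
    // and rebuild the chainified version only if the operation changed.
    if (!cached || cached.source !== current) {
      this.instance.chainCache[name] = {
        source: current,
        chainified: (chain, ...args) =>
          new TinyChain(chain.instance, current(chain.value, ...args))
      }
    }
    const { chainified } = this.instance.chainCache[name]
    return (...args) => chainified(this, ...args)
  }
}

const inst = new TinyInstance()
inst.install('add', (a, b) => a + b)
inst.install('negate', a => -a)
new TinyChain(inst, 2).add(3).negate().value // -5
```

In the "pushed" alternative, install would delete the matching chainCache entry itself, so _method could skip the comparison; that is the tighter coupling mentioned above.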

0 replies

gwhitney · 2022-08-02T07:57:54Z

gwhitney
Aug 2, 2022
Collaborator

Import of plain functions is working now.

0 replies

gwhitney · 2022-08-07T03:40:42Z

gwhitney
Aug 7, 2022
Collaborator

Template operations and template types are both working in the prototype (Pocomath) now. There is a type-homogeneous vector type Tuple<T> where all operations are componentwise, and Complex has been templatized into Complex<T> to force its real and imaginary parts to have the same type. I even added a cute little demo that this scheme provides quaternions with absolutely no additional code, as Complex<Complex<number>> (for the components to be of number type).
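One way to see how quaternions can fall out of Complex<Complex<number>> is the Cayley-Dickson construction: if complex multiplication is written generically in terms of add, subtract, multiply and conjugate on the component type, applying the same construction twice gives quaternion multiplication. The sketch below is an independent illustration of that idea; complexOver and realOps are made-up names, and the actual Pocomath code may be organized quite differently.

```javascript
// Hypothetical sketch of the idea, not Pocomath's implementation.
// Arithmetic on plain numbers; conjugation is the identity for reals.
const realOps = {
  add: (a, b) => a + b,
  subtract: (a, b) => a - b,
  multiply: (a, b) => a * b,
  negate: a => -a,
  conjugate: a => a
}

// Given the operations on T, build the operations on Complex<T>,
// representing a value as { re: T, im: T }.
function complexOver (T) {
  return {
    add: (z, w) => ({ re: T.add(z.re, w.re), im: T.add(z.im, w.im) }),
    subtract: (z, w) => ({
      re: T.subtract(z.re, w.re),
      im: T.subtract(z.im, w.im)
    }),
    negate: z => ({ re: T.negate(z.re), im: T.negate(z.im) }),
    conjugate: z => ({ re: T.conjugate(z.re), im: T.negate(z.im) }),
    // Cayley-Dickson product: (a, b)(c, d) = (ac - conj(d) b, d a + b conj(c))
    multiply: (z, w) => ({
      re: T.subtract(
        T.multiply(z.re, w.re),
        T.multiply(T.conjugate(w.im), z.im)
      ),
      im: T.add(
        T.multiply(w.im, z.re),
        T.multiply(z.im, T.conjugate(w.re))
      )
    })
  }
}

const complexOps = complexOver(realOps) // Complex<number>
const quaternionOps = complexOver(complexOps) // Complex<Complex<number>>

const c = (re, im) => ({ re, im })
const qi = c(c(0, 1), c(0, 0))
const qj = c(c(0, 0), c(1, 0))

// i * j gives k, while j * i gives -k: the product is not commutative.
quaternionOps.multiply(qi, qj)
quaternionOps.multiply(qj, qi)
```

For real components the conjugations are no-ops and the formula reduces to the familiar complex product, which is why a single generic definition can serve both levels.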

0 replies

gwhitney · 2022-08-07T17:12:16Z

gwhitney
Aug 7, 2022
Collaborator

And now there is an adapter that just sucks in fraction.js (well actually bigfraction.js, because I think it's clear mathjs and fraction.js should move to the bigint version) with no additional code added elsewhere in Pocomath.

With that, my feeling is that Pocomath is "feature-complete" as a proof-of-concept: it demonstrates all of the aspects that I would be proposing to bring to mathjs with a reorganization along these lines. There are a couple of very minor concerns mentioned on the issues page, but I think they would easily come out in the wash in an integration with typed-function/mathjs so I don't feel they are worth running down now.

So I think the only remaining "proof" the concept needs for evaluation is some benchmarks, to ensure that the additions to typed-function do not bog the system down. I believe they won't, because in fact Pocomath tries very hard to bypass typed-function dispatch as much as possible and bundle in direct references to individual implementations rather than to typed-function entry points. So please just point me to some benchmarks for mathjs that you think might be reasonable for evaluating Pocomath in this regard, and I will duplicate them in Pocomath and post the results. Thanks!

0 replies

josdejong · 2022-08-18T10:07:44Z

josdejong
Aug 18, 2022
Maintainer

The Pocomath-style of infrastructure has no problem with co-recursion

just 😎

Sounds to me like that reconsideration is close to orthogonal to switching to a Pocomath-style infrastructure.

Yes you're right. I should better separate the different discussions (and at the same time, the discussions related to TypeScript and the pocomath architecture are a bit scattered right now).

template implementations (but not yet template types) are working in Pocomath now

The generics are impressive, it makes sense. The Complex<T> example is very powerful. I think we should maybe just call this "generics" instead of "templates"?

I just wanted to demonstrate that the Pocomath core is completely agnostic as to the organization of the code.

👍

Pocomath now has a working Chain type

😎 again. I think how to implement the lazy or less lazy updating of functions is an implementation detail. I think we can just start with the current approach, and do some benchmarks to see if there is a performance issue in the first place, and if so try out alternative solutions.

I even added a cute little demo that this scheme provides quaternions with absolutely no additional code, as Complex<Complex<number>>

🤯 ...Shortcut in my head... 😂

And now there is an adapter that just sucks in fraction.js

yeeees, that's where we want to go, easily import a new data type that has a set of built-in functions/methods. Fraction.js, Complex.js, BigNumber.js, UnitMath, ...

Optional dependencies
I also see the sqrt function now being able to return either number output or mixed number/complex output depending on the config. It also checks whether there is a complex implementation (if (config.predictable || !cplx) { ... }). I think in general it would be good to throw an exception when a dependency is not satisfied; currently Pocomath is just silent, which can make debugging hard. Currently in mathjs you can explicitly mark a dependency as optional with a ? notation, like '?complex(number,number)': cplx,. What do you think?
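A minimal sketch of the behaviour being asked for: required dependencies throw when missing, and a leading ? marks a dependency as optional. resolveDependencies and createSqrt are hypothetical helpers written for this illustration; they are not the mathjs factory or Pocomath API, and dependencies are keyed by plain names rather than full signatures to keep it short.

```javascript
// Hypothetical sketch: resolve named dependencies, treating a leading '?'
// as "optional". Not the real mathjs or Pocomath resolver.
function resolveDependencies (names, available) {
  const resolved = {}
  for (const rawName of names) {
    const optional = rawName.startsWith('?')
    const name = optional ? rawName.slice(1) : rawName
    if (Object.prototype.hasOwnProperty.call(available, name)) {
      resolved[name] = available[name]
    } else if (!optional) {
      // Fail loudly instead of silently passing undefined along.
      throw new Error(`Unsatisfied dependency "${name}"`)
    }
  }
  return resolved
}

// sqrt falls back to plain number output when complex is unavailable
// or when config.predictable is set.
const createSqrt = ({ config, complex }) => x => {
  if (x >= 0 || config.predictable || !complex) return Math.sqrt(x)
  return complex(0, Math.sqrt(-x))
}

const available = {
  config: { predictable: false },
  complex: (re, im) => ({ re, im })
}
const sqrt = createSqrt(resolveDependencies(['config', '?complex'], available))
sqrt(-4) // { re: 0, im: 2 }
```

Listing 'complex' without the ? against an instance that lacks it would throw at resolution time, which surfaces the problem before any computation runs.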

Benchmarks
I think the only difference is that dependencies are now resolved on each individual signature instead of on the typed-function as a whole. I do not expect this has a performance impact, but would be good to verify. To test this, I think we can take a function with a reasonable number of dependencies, like distance, and inside createDistance create a temporary (second) dependency injection function to mimic the behavior of pocomath. And then verify the runtime performance.

With that, my feeling is that Pocomath is "feature-complete" as a proof-of-concept

Thanks a lot for working this concept out Glen. I really appreciate this. I think this concept addresses all the pain points of the current architecture, I really believe this is the way to go. There is one big open challenge: TypeScript support discussed in #2076. I want to find a satisfying solution for that first, before implementing this architecture for real in mathjs.

0 replies

gwhitney · 2022-08-30T20:01:52Z

gwhitney
Aug 30, 2022
Collaborator

Return type annotations are now supported, and in fact supplied for all operations, in the Pocomath proof of concept. This moves the POC closer, I suppose, to the sort of organization needed for a conceivable TypeScript switch/unification. The change took a while both because I physically relocated, but also because it was a very good bench for strengthening the underlying infrastructure. Type checking and template instantiations are now significantly more robust.

As soon as I can I will get to the final piece of the proof of concept, namely some reasonable benchmark. I am virtually certain that building an instance with all of its dependencies satisfied will be slower in Pocomath; there's just a great deal to do as far as instantiating templates and resolving every individual implementation, which is clearly less efficient than resolving the dependencies for an entire source file all at once, which may have many implementations in it. What we're getting is a great deal more flexibility in organizing code. So in a benchmarking of bootstrapping time, Pocomath is bound to lose by a significant margin. But that's just a one-time initialization cost. The aspect I feel needs benchmarking is the performance of the resulting implementations of the operations, and there I think I can reasonably hope that Pocomath will noticeably outperform current mathjs because with the templating (or generics if you prefer that terminology), a nested Pocomath operation should perform many fewer typed-function dispatches.

I will report back here when I have results.

0 replies

gwhitney · 2022-09-01T03:48:00Z

gwhitney
Sep 1, 2022
Collaborator

Hmm, I looked into this but currently the only benchmarks that do a significant amount of numerical computation are the matrix benchmarks (but reimplementing matrices in Pocomath doesn't make sense) and isPrime (but that test just calls a single typed-function that has no dependencies, so there really isn't any difference to test). So I will need to add a new benchmark on the mathjs side. I need something fairly realistic that calls a variety of basic arithmetic functions. Please does anyone following this issue have a suggestion for an algorithm or calculation I might use as a benchmark for this purpose? I could solve a quadratic and a cubic using the respective general formulas... That's reasonably attractive because the cubic goes back and forth into complex numbers. But it's fairly simplistic. Any better suggestions?

0 replies

josdejong · 2022-09-02T14:44:41Z

josdejong
Sep 2, 2022
Maintainer

I see the Returns is used dynamically, inside the function body. I'm not sure, but I have the feeling that that will make it hard or impossible to statically analyse with TypeScript. What do you think?

About benchmarks:
It is indeed the runtime performance that counts; bootstrapping performance is less important (though still relevant of course). I did a small benchmark to verify whether there is a performance penalty for doing dependency injection on a per-signature basis, and that is not the case:

branch: https://github.com/josdejong/mathjs/tree/experiment/benchmark_dependency_injection
commit: ef2631f

I thought I had reported on this, but I probably forgot or wasn't happy with it yet, sorry.

This benchmark that I parked in this separate branch is quite limited, it would be better indeed to have a real expression with multiple operations.

2 replies

gwhitney Sep 2, 2022
Collaborator

I see the Returns is used dynamically, inside the function body. I'm not sure, but I have the feeling that that will make it hard or impossible to statically analyse with TypeScript. What do you think?

It's not in the inner implementation body, but the Returns type in the Pocomath prototype is dynamic in terms of the dependencies, i.e., it depends on them, too, and can change as the dependencies change.
There are two reasons it worked out that way:

To replicate current mathjs functionality, the return type of sqrt (for example) depends on the value of the config object, so it can only be determined once the dependencies are supplied, and it might change as the config object is changed.
The type-language of Pocomath is fairly rudimentary, and it had no way of expressing the return type of absolute value, which for complex numbers and quaternions can be described in ordinary language as "the underlying real numeric type of the complex instance I am being instantiated on". To get around this, the code just looks at the return types of its dependencies, from which it can easily figure out what that numeric type is. This is not a true dynamism, in the sense that the return type of abs on Complex<Complex<BigNumber>> is never actually going to change from BigNumber. But since I already had point (1), it was far easier to implement this way.

So if one were willing to abandon cases like (1), and say that the config is specified at build time for a given bundle and can't change, and one provided some amount more of type-computation machinery, I think it would be possible to rearrange the type specification for a typical implementation from (schematically speaking):

signature: ({dependencies}) => Returns(returnType(signature, dependencies), args => value)

to

signature: Returns(returnType(signature), ({dependencies}) => args => value)

(Although it would then also be possible to move the return type into the signature key, I would still advocate against that because as far as I can tell it is nonsense in the mathjs context to specify

'number -> boolean': () => n => (n>0),
'number -> number': () => n => n+1

as two implementations of the same operation. In other words, implementations must be distinct as to input signatures, so input signature is the natural key to use.)
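To make the two schematic shapes above easier to compare, here is a small sketch. The Returns helper below is a stand-in that merely tags a function with a returns property; it is an assumption for illustration and may not match the real Pocomath helper, and the type names are plain strings.

```javascript
// Hypothetical sketch; this Returns helper only mimics the shape discussed
// above by tagging a function with its return type.
function Returns (type, fn) {
  fn.returns = type
  return fn
}

// Shape 1: the return type is computed after the dependencies are supplied,
// so it can differ from one config to another.
const sqrtDynamic = {
  number: ({ config }) =>
    Returns(
      config.predictable ? 'number' : 'number|Complex',
      x => (x >= 0 || config.predictable)
        ? Math.sqrt(x)
        : { re: 0, im: Math.sqrt(-x) }
    )
}

// Shape 2: the return type is attached to the factory itself and is known
// from the signature alone, before any dependency is resolved.
const negateStatic = {
  number: Returns('number', () => x => -x)
}

sqrtDynamic.number({ config: { predictable: true } }).returns // 'number'
sqrtDynamic.number({ config: { predictable: false } }).returns // 'number|Complex'
negateStatic.number.returns // 'number'
```

In the second shape a tool can read the return type without running any factory, which is presumably what a static analysis would need.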

As to whether such a shift would enable an overall conversion to a math.ts, or whether actually if math.ts is possible at all then having the return type depend on the dependencies would be no extra block to TypeScript conversion, I am not sufficiently experienced with TypeScript to hazard a guess.... Hopefully those more keen on moving mathjs overall to TypeScript (beyond just providing an automated way to get the correct index.d.ts for it which I think we can already see our way to) will be able to determine some answers to these questions so we can move on with the enterprise one way or another.

josdejong Sep 7, 2022
Maintainer

Ahh, yes now I see the Returns is not inside the function implementation but inside the factory function of sqrt. That makes sense. One thought: this means that a function must be recreated when a config property changes, else you have a function implementation based on old config. I think that is fine. I'm indeed not sure if TypeScript is able to handle something like this but it will be interesting to figure out. It may be possible when the config is defined statically, as a constant known at build time.

One other thought: if it simplifies things (or makes things possible in the first place), it may be acceptable to not adjust the return types based on config, and instead always return the most conservative return type. Or maybe we could use two different implementations of sqrt instead of having a single very smart and dynamic sqrt. We can probably be pragmatic in that regard. Just thinking aloud here 🤔

At this point I'm not concerned with the exact syntax, but I first would like to see if this stuff can work at all 😄

gwhitney · 2022-09-02T15:33:44Z

gwhitney
Sep 2, 2022
Collaborator

On benchmarking: OK, I will add functions for roots of quadratics and cubics and use computation of several roots with both real and complex cases to both mathjs and Pocomath and report on performance results.
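As a rough idea of what the quadratic half of such a workload could look like, here is a sketch in plain JavaScript. quadraticRoots and timeIt are illustrative names; an actual comparison would route each arithmetic step through mathjs or Pocomath operations rather than native operators, and would add the cubic case.

```javascript
// Hypothetical benchmark workload in plain JavaScript; a real comparison
// would replace the native arithmetic with library operations.
function quadraticRoots (a, b, c) {
  if (a === 0) throw new RangeError('Leading coefficient must be nonzero')
  const disc = b * b - 4 * a * c
  if (disc >= 0) {
    // Two real roots (equal when disc is 0).
    const s = Math.sqrt(disc)
    return [(-b + s) / (2 * a), (-b - s) / (2 * a)]
  }
  // Negative discriminant: a complex-conjugate pair.
  const re = -b / (2 * a)
  const im = Math.sqrt(-disc) / (2 * a)
  return [{ re, im }, { re, im: -im }]
}

// Crude wall-clock timer; returns elapsed milliseconds.
function timeIt (fn, iterations) {
  const start = Date.now()
  for (let i = 0; i < iterations; i++) fn(i)
  return Date.now() - start
}

quadraticRoots(1, -3, 2) // [2, 1]
quadraticRoots(1, 2, 5) // [{ re: -1, im: 2 }, { re: -1, im: -2 }]

// Vary the coefficients so that both real and complex cases occur.
const elapsedMs = timeIt(i => quadraticRoots(1, (i % 7) - 3, (i % 5) - 2), 100000)
```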

0 replies
