Kyle Daruwalla edited this page Nov 10, 2020 · 3 revisions

Note: This wiki has now been moved to https://github.com/FluxML/ML-Coordination-Tracker/wiki

A central component of FastAI.jl is the training loop. This routine accepts some initial state (e.g. a Learner) and maintains this state throughout the training process. A callback system allows a user or developer extending FastAI.jl to insert their own code into the training loop.
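To make the idea concrete, here is a minimal sketch of a training loop with callback hooks. Every name in it (`AbstractCallback`, `TrainingState`, `on_epoch_begin`, `on_batch_end`, `train!`, `EpochLogger`) is hypothetical and chosen for illustration; it is not FastAI.jl's actual API, and the loop body omits the real loss and gradient computation.

```julia
# Hypothetical sketch; none of these names are FastAI.jl's actual API.
abstract type AbstractCallback end

# Default hooks do nothing, so a callback only overrides the events it needs.
on_epoch_begin(::AbstractCallback, state) = nothing
on_batch_end(::AbstractCallback, state) = nothing

# Stand-in for the state carried through training (e.g. a Learner).
mutable struct TrainingState
    epoch::Int
    batches_seen::Int
    log::Vector{String}
end

# An example user-defined callback that records the start of each epoch.
struct EpochLogger <: AbstractCallback end
on_epoch_begin(::EpochLogger, state) = push!(state.log, "epoch $(state.epoch)")

function train!(state::TrainingState, callbacks; epochs = 2, batches = 3)
    for epoch in 1:epochs
        state.epoch = epoch
        foreach(cb -> on_epoch_begin(cb, state), callbacks)
        for _ in 1:batches
            # A real loop would compute the loss and update parameters here.
            state.batches_seen += 1
            foreach(cb -> on_batch_end(cb, state), callbacks)
        end
    end
    return state
end

state = train!(TrainingState(0, 0, String[]), [EpochLogger()])
```

In this sketch the loop owns the iteration order, while callbacks receive the state at fixed points and may read or modify it. That is exactly where the questions of who mutates state, and when, arise.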

The bulk of this section is concerned with what constitutes state, how it is modified, when it is modified, and by whom it is modified.

What is state?

The following table lists examples of state, how each can be mutated, and where it is typically located (its owner).

| State | Example of mutation | Owner |
| --- | --- | --- |
| Model (architecture) | Pruning methods | Learner |
| Model (parameters) | Pruning methods, weight clipping | Learner |
| Data | Adversarial training | Learner/Data loading |
| Optimizer | Hyperparameter tuning | Learner |
| Augmentation Flow | Hyperparameter tuning | Data loading |
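As a rough illustration of two rows of the table, the sketch below mutates model parameters (weight clipping) and optimizer state (a learning-rate change standing in for hyperparameter tuning). `ToyLearner`, `clip_weights!`, and `decay_lr!` are hypothetical names invented for this example; the real Learner is a richer structure.

```julia
# Hypothetical stand-in for a Learner that owns parameters and optimizer state.
mutable struct ToyLearner
    params::Vector{Float64}  # model parameters
    lr::Float64              # optimizer hyperparameter
end

# Weight clipping: mutates the model parameters in place.
function clip_weights!(learner::ToyLearner, limit)
    clamp!(learner.params, -limit, limit)
    return learner
end

# Hyperparameter tuning: mutates the optimizer state.
function decay_lr!(learner::ToyLearner, factor)
    learner.lr *= factor
    return learner
end

learner = ToyLearner([-2.0, 0.5, 3.0], 0.1)
clip_weights!(learner, 1.0)  # params are now within [-1, 1]
decay_lr!(learner, 0.5)      # learning rate is halved
```

Both functions act on state owned by the same object, which is why the owner column matters when deciding who is allowed to perform a given mutation.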

Notions of mutability

The concept of mutability with respect to state is well studied in functional programming languages (and in courses that introduce them in contrast to OOP). Julia is not a strictly functional language, so it does not share the same language semantics. It is therefore important to distinguish what we mean by mutability.

  1. The first notion of mutability is tied to physical memory. State is encoded in a data structure, and that structure is stored in physical memory. Mutating state in this sense means modifying memory. Here, proper modification of state by many functions is referred to as safety[^1].
  2. The second notion of mutability is related to the correctness of state. We refer to this as semantically mutating the state: a mutation in this sense changes the meaning of the state. Here, proper modification of state by many functions is referred to as correctness or semantic validity[^2].

Note that these concepts are related but independent of each other. In functional languages, (1) is guaranteed by pure functions. On the other hand, if two pure functions are applied in composition to some state (i.e. f(g(x))), it is still possible that g semantically invalidates the state such that the input to f is uninterpretable. Conversely, a series of functions applied to some state may produce semantically valid output at each step, whether they do so by mutating memory in-place or by returning new copies of the state.
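The distinction can be sketched with a toy state: a vector of class probabilities that is only interpretable when it sums to 1. The functions `is_valid`, `scale`, `renormalize`, and `renormalize!` are hypothetical helpers written for this example.

```julia
# Toy state: class probabilities, interpretable only if they sum to 1.
is_valid(p) = all(>=(0), p) && isapprox(sum(p), 1.0)

# Pure (allocates a new array) but semantically invalidating.
scale(p) = 2 .* p

# Pure and semantically valid.
renormalize(p) = p ./ sum(p)

# Mutates memory in place, yet leaves the state semantically valid.
function renormalize!(p)
    p ./= sum(p)
    return p
end

p = [0.2, 0.3, 0.5]
q = scale(p)            # p is untouched in memory, but q is not a distribution
r = renormalize(q)      # new copy, valid again
s = renormalize!(q)     # same memory as q, valid again
```

Here `scale` plays the role of g: it never touches its input's memory, yet any f that expects a probability vector cannot interpret its output. `renormalize` and `renormalize!` reach the same valid result, one by copying and one by mutating memory.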

We define both notions of mutability because FastAI.jl must guarantee safety (for multi-threaded code) and should guarantee correctness, and we want to do both without compromising flexibility.
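As one possible illustration of safety in multi-threaded code, the sketch below guards a piece of shared state with a lock so that concurrent updates do not interfere. `SharedCounter` and `increment!` are hypothetical names; this is only one way to achieve safety and is not a statement about how FastAI.jl does it.

```julia
# Hypothetical shared state guarded by a lock.
struct SharedCounter
    lk::ReentrantLock
    value::Base.RefValue{Int}
end
SharedCounter() = SharedCounter(ReentrantLock(), Ref(0))

# Only one task at a time may run the body of the `lock` block.
function increment!(c::SharedCounter)
    lock(c.lk) do
        c.value[] += 1
    end
    return c
end

counter = SharedCounter()
Threads.@threads for i in 1:1000
    increment!(counter)
end
```

Without the lock, the read-modify-write in `c.value[] += 1` could lose updates when Julia runs with more than one thread. The lock buys safety at the cost of serializing access, which is the kind of trade-off against flexibility that the design has to weigh.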

[^1]: These terms are made-up; you can learn about "safety" for mutable structures by searching for it online, but there is no guarantee that any of the defined terms are commonly used. We define them in order to refer to the concepts throughout the design process.

[^2]: Like "safety", the terms "correctness" and "semantic validity" are made-up; there is no guarantee that they are commonly used. We define them in order to refer to the concepts throughout the design process.