copy develop over to master #657
Merged
Sync develop with master
* custom_kernels: check `out` shape
  - Non-elementwise kernels: check the shape of `out`.
  - Elementwise kernels: make the interface consistent by using `inplace` rather than `out`.
* custom_kernels: assert that floating point arrays are float32
* custom_kernels: check that length/which arrays are correct
  - These arrays should have dtype int32.
  - Lengths should be >= 0.
  - Which should be >= 0 and <= the innermost dimension.
  - Lengths should sum up to the batch size.
* custom_kernels: factor out output array checks
* custom_kernels: check that dY and X shape match in backprop
* custom_kernels: improve assertion error message
* Fix invalid identifier
* custom_kernels: fix check for reduce_max
* Raise an exception on shape/lengths/index errors
  Before this change, shapes, lengths, and (which) indices were checked with an assertion. However, these can be user errors, so we should raise an exception.
* CupyOps: dispatch some functions to super for unsupported dtypes
  When our custom kernel cannot handle a dtype (e.g. because it is float64), pass it to the more generic implementation in the superclass.
* _custom_kernels: remove out kwarg from kernels
  The `out` arguments are currently not used or tested.
* _custom_kernel: replace first arg dtype asserts by _check_array
* _custom_kernels: type fix
* _custom_kernels: some cleanups
* _custom_kernels: raise IndexError where applicable
  Raise IndexError in place of ValueError when it would also lead to an IndexError in NumPy:
  - Using a wrong maximum index (which) would result in indexing that is incompatible with the shape.
  - Using incorrect lengths would also result in invalid indexing, since the lengths are added to index into the array.
* _custom_kernels: speed up _check_which_maxout with reduction kernel
* Fix incorrect call to reduce_mean in backprop_reduce_mean
* Rename _check_array to _check_compatible
* _custom_kernels: fix formatting of assertions
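Taken together, these commits amount to validating the auxiliary arrays before launching a kernel. The sketch below is illustrative only: the helper name `_check_seq_arrays` and the exact error messages are invented, and it assumes float32 inputs; it is not the library's actual code.

```python
import numpy as np

def _check_seq_arrays(X: np.ndarray, lengths: np.ndarray, which: np.ndarray) -> None:
    # Hypothetical helper mirroring the checks listed above.
    if X.dtype != np.float32:
        raise ValueError("kernel requires float32 arrays")
    if lengths.dtype != np.int32 or which.dtype != np.int32:
        raise ValueError("lengths/which arrays must have dtype int32")
    if lengths.size and int(lengths.min()) < 0:
        raise ValueError("all sequence lengths must be >= 0")
    if int(lengths.sum()) != X.shape[0]:
        # Incorrect lengths would index past the end of the array,
        # so this is an IndexError rather than a ValueError.
        raise IndexError("lengths must sum to the batch size")
    if which.size and (int(which.min()) < 0 or int(which.max()) >= X.shape[-1]):
        # Out-of-range indices are incompatible with the array's shape.
        raise IndexError("which contains indices outside the innermost dimension")
```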
Merge master into develop
* label smoothing initial
* smoothing in to_categorical as @danieldk suggested
* more efficient suggestion from @danieldk
* check range
* exception when label-smoothing is not applied
* mypy typing fix
* avoid numpy 32 --> 64 casting
* mypy ignore
* always return float32
* @danieldk efficiency suggestion
* formatting
* formatting
* change boundary values for smoothing
* include smoothing in to_categorical tests
* label smoothing arg for sequence cross entropy
* docs for cross-entropy loss
* prettier formatting
* Update website/docs/api-loss.md Co-authored-by: Daniël de Kok <me@github.danieldk.eu>
* new versions for cross-entropy losses
* loss: undo two type removals
* loss: test v3 registry functions
* Update website/docs/api-loss.md Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* label smoothing keyword only
* raise instead of assert
* reformat
* changes suggested by @svlandeg
* Update website/docs/api-loss.md Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Update thinc/util.py Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* grammar
* check for number of classes
* handle None

Co-authored-by: Kádár Ákos <akos@onyx.uvt.nl>
Co-authored-by: Daniël de Kok <me@github.danieldk.eu>
Co-authored-by: Daniël de Kok <me@danieldk.eu>
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
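The label-smoothing change tracked above distributes probability mass from the target class to the remaining classes when building one-hot targets. A minimal sketch of the idea, written as a standalone function rather than the library's actual `to_categorical` signature, with the 0.5 upper bound assumed for illustration:

```python
import numpy as np

def smoothed_one_hot(labels: np.ndarray, n_classes: int, label_smoothing: float = 0.0) -> np.ndarray:
    # The target class gets 1 - label_smoothing; the remaining mass is
    # spread evenly over the other classes. The [0, 0.5) range is an
    # assumption here; see the "check range" commit above.
    if not 0.0 <= label_smoothing < 0.5:
        raise ValueError("label_smoothing must be in [0.0, 0.5)")
    off_value = label_smoothing / max(n_classes - 1, 1)
    out = np.full((labels.shape[0], n_classes), off_value, dtype="float32")
    out[np.arange(labels.shape[0]), labels] = 1.0 - label_smoothing
    return out
```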
explosion#605)
* incomplete docs for new activations and left out initializers
* docs for all new activations
* cite fix
* documenting new activations
* prettier formatting
* Update website/docs/api-layers.md Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Update website/docs/api-backends.md Co-authored-by: Daniël de Kok <me@github.danieldk.eu>
* Update website/docs/api-backends.md Co-authored-by: Daniël de Kok <me@github.danieldk.eu>
* Update website/docs/api-backends.md Co-authored-by: Daniël de Kok <me@github.danieldk.eu>
* Update website/docs/api-backends.md Co-authored-by: Daniël de Kok <me@github.danieldk.eu>
* Update website/docs/api-layers.md Co-authored-by: Daniël de Kok <me@github.danieldk.eu>
* docs: may be modified -> is modified for inplace kwarg
* docs: new activation functions take FloatsXd, not just Floats2d
* docs: do not write activation names using typewriter font
* docs: consistency improvements for new activations
* docs: more consistency fixes in new activation descriptions
* Update website/docs/api-backends.md Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Update website/docs/api-backends.md Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Update website/docs/api-initializers.md Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Update website/docs/api-initializers.md Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Update website/docs/api-initializers.md Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Update website/docs/api-initializers.md Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Update website/docs/api-layers.md Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Update website/docs/api-layers.md Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Update website/docs/api-layers.md Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Update website/docs/api-layers.md Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Update website/docs/api-layers.md Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Update website/docs/api-layers.md Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Update website/docs/api-layers.md Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Update website/docs/api-layers.md Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Update website/docs/api-layers.md Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Update website/docs/api-layers.md Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Update website/docs/api-backends.md Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* removed some links
* resolve conflict
* docs: fix up some GELU/Swish mentions
* docs: run prettier

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
Co-authored-by: Daniël de Kok <me@github.danieldk.eu>
Co-authored-by: Daniël de Kok <me@danieldk.eu>
…osion#607) Also add tests to verify that the n_classes kwarg is correctly checked.
CI fails with pytest 7.1.0.
* Fix compatibility with Torch without torch.cuda.amp.common
* Disable PyTorch-based activation tests pre-PyTorch 1.9.0
* Don't use gradient scaling unconditionally in PyTorch wrapper test
* Disable gradient scaling tests on older PyTorch versions
* Set minimum required PyTorch version to 1.6.0
* Check that torch>=1.9.0 for mixed-precision training
  Torch versions prior to 1.9.0 do not have the functionality that we need for mixed-precision training and gradient scaling.
* Refine exception message for mixed-precision training
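The version gate described in these commits reduces to a simple check before enabling mixed precision. A sketch of the idea, assuming `packaging` is available; the function name and message are illustrative, not Thinc's exact code:

```python
import torch
from packaging.version import Version

def check_mixed_precision_support() -> None:
    # Gradient scaling relies on torch.cuda.amp functionality that is
    # only complete from PyTorch 1.9.0 onwards.
    if Version(torch.__version__) < Version("1.9.0"):
        raise ValueError(
            "Mixed-precision training requires torch>=1.9.0, "
            f"but torch {torch.__version__} is installed"
        )
```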
* custom_kernels: make all CUDA kernels generic
  This change makes all CUDA kernels generic using C++ templates, so that they can be used for float and double arrays.
* custom_kernels: fix comment typo Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* custom_kernels: add a bunch of fixes from @svlandeg
* test_ops: test float64 implementations
* Ops.maxout: the keepdims argument of argmax requires numpy>=1.22
  We didn't trigger this issue before because we were not testing Ops.maxout and the implementation was overridden by the NumpyOps and CupyOps implementations.

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* NumpyOps: check `which` indices for backprop_{maxout,reduce_max}
* NumpyOps: check that dY and X shapes match in backprop
* NumpyOps: check that lengths are valid in backprop_reduce_{max,mean,sum}
* test_ops: fix duplicate function name
* NumpyOps: check lengths in reduce_{max,mean,sum}
* _custom_kernels: make all arguments kwarg-only
  Some kernels take multiple arguments, which are all arrays with the same shape. It's easy to accidentally mix up the order and introduce bugs that way. This change makes all arguments kwarg-only, so that we always specify what we intend to pass at the call site. Since _custom_kernels is an internal module, this does not result in any public API changes.
* Make X positional for forward and dY positional for backward kernels
* Only make non-required args keyword-only
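The mechanism is Python's bare `*` separator. The signature below is hypothetical, showing the pattern from the first commit (the later commits relax it so only non-required arguments stay keyword-only); the gradient-scatter body is a rough sketch, not the module's actual kernel:

```python
import numpy as np

def backprop_reduce_max(dY: np.ndarray, *, which: np.ndarray, lengths: np.ndarray) -> np.ndarray:
    # dY stays positional; the same-shaped auxiliary arrays are
    # keyword-only, so they cannot be swapped silently at the call site.
    dX = np.zeros((int(lengths.sum()), dY.shape[1]), dtype=dY.dtype)
    offset = 0
    for i, length in enumerate(lengths):
        # Route each sequence's gradient to the rows that won the max,
        # one winning row index per feature column.
        dX[offset + which[i], np.arange(dY.shape[1])] = dY[i]
        offset += int(length)
    return dX

# backprop_reduce_max(dY, which, lengths)                 # TypeError
# backprop_reduce_max(dY, which=which, lengths=lengths)   # OK
```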
…0220331 Merge master into develop
* Make NumpyOps CPU kernels generic
  This PR makes most CPU kernels generic, so that they can take both float32 and float64 arrays (and hopefully, in the future, float16). I experimented with kernels in Cython + fused types and kernels as C++ with templates, and found the C++ template route more promising:
  - More compact/ergonomic implementations with fewer compile-time conditionals.
  - Opens up the possibility to easily use SIMD intrinsics in the future.
  To allow genericity in the NumpyOps method arguments, we use:
  - Fused types when we require a specific dimensionality;
  - np.ndarray otherwise.
  Some of the kernels are not made generic:
  - cpu_scatter_add: needs tests to verify that the op still works correctly.
  - cpu_position_encode: the position_encode op doesn't take float array(s).
  - lstm kernels: I need to look more deeply into them.
* Include C++ headers in sdist
* NumpyOps: Use workaround for cython/cython#4697
* Namespace-qualify memcpy
* ReLU kernel: never output -0.0
* Add fixes suggested by @svlandeg
* Add support to allocate uninitialized arrays in Ops, NumpyOps
  Replace superfluous zero-init'd allocs with uninit'd ones.
* Use uninit'd allocs in the following NumPy ops: maxout, reduce_max
  Minor optimization to Ops.gelu_approx.
* Replace kwarg `uninitialized` with `zeros`
* Update docs
* Run prettier
* Cupy: Replace zero'd allocs with empty allocs
* Refactor alloc code into separate functions
* Make zero'd allocs more explicit
* Replace zero-allocs in backprop_reduce_sum/mean
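The `zeros` keyword mentioned above lets callers skip zero-initialization when every element will be overwritten anyway. A minimal sketch of the idea, not the library's exact allocator:

```python
import numpy as np

def alloc2f(d0: int, d1: int, *, zeros: bool = True) -> np.ndarray:
    # zeros=True returns a safe zero-filled buffer; zeros=False returns
    # uninitialized memory and is only valid when the caller writes every
    # element before reading it, saving a memset on large arrays.
    if zeros:
        return np.zeros((d0, d1), dtype="float32")
    return np.empty((d0, d1), dtype="float32")
```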
. (explosion#636)
* `NumpyOps`: Revert uninit'd alloc in `reduce_max`
* Test `which` in `test_reduce_max`
…0220422 Merge master into develop
…anges (explosion#646)
* Add sanity checks to handle cases of hidden GPU devices (`CUDA_VISIBLE_DEVICES=-1`)
  Add `has_tensorflow_gpu`
* Allow `...2xp` utility methods to accept a target `Ops` object for conversions
  Refactor 3rd-party framework GPU tensor detection
* Handle `cupy.fromDlpack` deprecation for `cupy >= 10.0.0`
* Make `ops` arg in `...2xp` functions keyword-only
* Short-circuit `..._gpu_array` functions
  Add `Ops` type to `..2xp` functions
  Defer `cupy` deprecation-related changes to a different PR
* Fix type annotation
* Defer sanity check in `_custom_kernels.py` to another PR
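Handling the `cupy.fromDlpack` deprecation comes down to preferring the new spelling when it exists. A sketch assuming only that `cupy >= 10.0.0` exposes `from_dlpack`; the wrapper name is hypothetical:

```python
import cupy

def dlpack_to_cupy(dltensor):
    # cupy >= 10.0.0 exposes from_dlpack() and deprecates the older
    # fromDlpack() spelling; fall back to the old name on older versions.
    if hasattr(cupy, "from_dlpack"):
        return cupy.from_dlpack(dltensor)
    return cupy.fromDlpack(dltensor)
```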
* Fix reductions when applied to zero-length sequences
* reduce_max: do not accept zero-length sequences
* Document behavior of reduce_{max,mean,sum} for zero-length sequences
* Apply fixes by @shadeMe Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>
* reduce_{mean,sum}: explicitly handle length == 0 for clarity
* docs: a -> the zero vector

Co-authored-by: Madeesh Kannan <shadeMe@users.noreply.github.com>
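The documented behaviour can be sketched as follows: `reduce_mean` (and `reduce_sum`) emit the zero vector for an empty sequence, while `reduce_max` has no sensible value and rejects such input. Illustrative only, not the ops' actual implementations:

```python
import numpy as np

def reduce_mean(X: np.ndarray, lengths: np.ndarray) -> np.ndarray:
    Y = np.zeros((len(lengths), X.shape[1]), dtype=X.dtype)
    offset = 0
    for i, length in enumerate(lengths):
        if length > 0:
            Y[i] = X[offset : offset + length].mean(axis=0)
        # length == 0: the row stays the zero vector, handled explicitly
        # rather than relying on a division producing NaN.
        offset += int(length)
    return Y

def reduce_max(X: np.ndarray, lengths: np.ndarray):
    if (np.asarray(lengths) == 0).any():
        # No maximum exists for an empty sequence.
        raise ValueError("reduce_max does not accept zero-length sequences")
    ...
```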
* Tests: Correctly handle GPU-resident Tensorflow tensors
* Simplify `is_supported_backend_array`
update develop with latest from master
* Fixed typing problems, mostly in with_... methods
* Sorted out flatten and unflatten
* Iterable and Concatenatable types
* Corrections
* More corrections
* Corrections to layers
* Moved type definitions from types to ops
* Updated documentation
* Simplified ops type declarations
* Fixed mypy backwards-compatibility issue
* Correct type-ignore comment
* Updated mypy version in azure-pipelines
* Added CI checks with Python 3.7
* Any as first parameter of with_... layers
* Revert "Any as first parameter of with_... layers"
  This reverts commit aa55834.
* Tidied up init methods
* Removed unnecessary imports
* Put import statement on one line
* Changes based on PR review comments
* Improvements after PR feedback
* Went through ignore statements in layers
* Removed unnecessary covariance
* Improvements based on PR review
* Remove Python 3.7 additions
* Reverted lstm_tagger.py changes
* Added ArrayTXd_co
* Final changes before review
* Cast in main rather than in type-specific forward methods
* Added empty line
* Corrections
* More corrections
* Corrections
* Returned to ListXd types
* More corrections
* Further corrections
* Corrected model typing
* Further corrections
* Corrections
* Tidying up
* Corrections
* Removed line
* Made imports clearer
* Readded line
* Reformatted
* Readded line
* Corrected residual.py
* Changed imports back to original order
* Changes in response to review comments
* Update thinc/layers/dropout.py Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Update thinc/layers/embed.py Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Changes responding to GitHub review
* Reversed changes to init() return types
* Reversed changes to init() return types
* Corrected embed.py and hashembed.py
* Corrections based on GitHub review
* Fixed chain.py
* Further correction to chain.py
* Removed unnecessary cast
* Updated documentation
* Changes based on review
* Added @overload signatures in ops
* Added comment
* Changes based on review comments
* Final corrections
* Bumped mypy version
* Changes based on review comments
* Added space to trigger CI
* Corrected Pydantic version ranges
* Fixed mypy version range
* Correct documentation for clone

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
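Among these changes, the `@overload` signatures are the most transferable pattern: a single runtime implementation exposes precise per-type signatures to the checker instead of a widened union. A toy illustration of the pattern, not the library's actual ops types:

```python
from typing import Union, overload

@overload
def scale(X: int, factor: int) -> int: ...
@overload
def scale(X: float, factor: float) -> float: ...
def scale(X: Union[int, float], factor: Union[int, float]) -> Union[int, float]:
    # One runtime implementation; mypy picks the matching overload, so
    # callers passing ints get int back rather than Union[int, float].
    return X * factor
```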
[DON'T SQUASH]

Copying everything from develop to master, after which we'll retire develop for now.