Releases: explosion/thinc

v7.0.1: Fix import errors

23 Feb 11:52

🔴 Bug fixes

  • Fix import errors introduced when dropping dependencies in v7.0.0.

v7.0.0: Overhaul package dependencies

15 Feb 11:23

⚠️ Backwards incompatibilities

  • Thinc v7.0 drops support for Python 2.7 on Windows. Python 2.7 remains supported on Linux and OSX. Support could be restored in future. We're currently unable to build our new dependency, blis, for Windows on Python 2.7. If you can assist with this, please let us know.

✨ New features and improvements

  • Use blis for matrix multiplication. Previous versions delegated matrix multiplication to platform-specific libraries via numpy. This led to inconsistent results, especially around multi-threading. We now provide a standalone package with the Blis linear algebra routines. Importantly, we've built Blis to be single-threaded. This makes it much easier to do efficient inference, as the library will no longer spawn threads underneath you.

  • Use srsly for serialization. We now provide a single package with forks of our preferred serialization libraries – specifically, msgpack, ujson and cloudpickle. This allows us to provide a single binary wheel for these dependencies, and to maintain better control of our dependency tree, preventing breakages.

  • Update versions of cymem, preshed and murmurhash. Thinc is compiled against our memory pool and hash table libraries, cymem and preshed. Changing these build-time dependencies requires Thinc to be recompiled. This is one reason the major version number needed to be incremented for this release.

v6.12.1: Fix messagepack pin

30 Nov 17:21

🔴 Bug fixes

  • Fix issue explosion/spaCy#2995: Pin msgpack to version <0.6.0, to avoid the low message-length limit introduced in v0.6.0, which breaks spaCy. We will relax the pin once spaCy is updated to set the max_xx_len argument to msgpack.dumps().

v6.12.0: Wheels and separate GPU ops

15 Oct 11:52

✨ New features and improvements

  • Update dependencies to be able to provide binary wheels.
  • Move GPU ops to separate package, thinc_gpu_ops.
  • Support pip specifiers for GPU installation, e.g. pip install thinc[cuda92].

🔴 Bug fixes

  • Update murmurhash pin to accept newer version.

v6.10.3: Python 3.7 support and dependency updates

21 Jul 13:55

✨ New features and improvements

  • Update cytoolz version pin to make Thinc compatible with Python 3.7.
  • Only install old pathlib backport on Python 2 (see #69).
  • Use msgpack instead of msgpack-python.
  • Drop termcolor dependency.

v6.11.2: Improve GPU installation

22 May 09:26

✨ New features and improvements

You can now require GPU capability using the pip "extras" syntax. Thinc also now expects CUDA to be installed at /usr/local/cuda by default. If you've installed it elsewhere, you can specify the location with the CUDA_HOME environment variable. Once Thinc is able to find CUDA, you can tell pip to install Thinc with cupy, as follows:

  • thinc[cuda]: Install cupy from source (compatible with a range of CUDA versions)
  • thinc[cuda80]: Install the cupy-cuda80 wheel
  • thinc[cuda90]: Install the cupy-cuda90 wheel
  • thinc[cuda91]: Install the cupy-cuda91 wheel
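
For example, if CUDA lives somewhere other than the default /usr/local/cuda, you can point Thinc at it before installing one of the extras above (the /opt/cuda-9.0 path is just a placeholder for wherever CUDA is actually installed):

# Tell Thinc where to find CUDA, then install with a CUDA 9.0 wheel
export CUDA_HOME=/opt/cuda-9.0
pip install thinc[cuda90]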

If you're installing Thinc from a local wheel file, the syntax for adding an "extras" specifier is a bit unintuitive. The trick is to make the file path into a URL, so you can use an #egg clause, as follows:

pip install file://path/to/wheel#egg=thinc[cuda]

v6.11.1: Support direct linkage to BLAS libraries

20 May 16:56

✨ New features and improvements

  • Thinc now vendorizes OpenBLAS's cblas_sgemm function, and delegates matrix multiplications to it by default. The provided function is single-threaded, making it easy to call Thinc from multiple processes. The default sgemm function can be overridden using the THINC_BLAS environment variable (see below).
  • thinc.neural.util.get_ops now understands device integers, e.g. 0 for GPU 0, as well as strings like "cpu" and "cupy".
  • Update the StaticVectors model to make use of spaCy v2.0's Vectors class.
  • New .gemm() method on NumpyOps and CupyOps classes, allowing matrix and vector multiplication to be handled with a simple function. Example usage:
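
A minimal sketch of what this looks like in practice (illustrative only; the trans2 keyword for transposing the second operand is an assumption about the API, not something documented in these notes):

import numpy
from thinc.neural.ops import NumpyOps

ops = NumpyOps()
inputs = numpy.ones((16, 300), dtype="f")    # batch of 16 input vectors
weights = numpy.ones((128, 300), dtype="f")  # (output, input) weight matrix
# Multiply the batch by the transposed weights: (16, 300) x (300, 128)
outputs = ops.gemm(inputs, weights, trans2=True)
print(outputs.shape)  # (16, 128)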

Customizing the matrix multiplication backend

Previous versions of Thinc have relied on numpy for matrix multiplications. When numpy is installed via wheel using pip (the default), it is usually linked against a suboptimal matrix multiplication kernel. This made it difficult to ensure that Thinc was well optimized for the target machine.

To fix this, Thinc now provides its own matrix multiplications, by bundling the source code for OpenBLAS's sgemm kernel within the library. To change the default BLAS library, you can specify an environment variable, giving the location of the shared library you want to link against:

THINC_BLAS=/opt/openblas/lib/libopenblas.so pip install thinc --no-cache-dir --no-binary :all:
export LD_LIBRARY_PATH=/opt/openblas/lib
# On OSX:
# export DYLD_LIBRARY_PATH=/opt/openblas/lib

If you want to link against the Intel MKL instead of OpenBLAS, the easiest way is to install Miniconda. For instance, if you installed Miniconda to /opt/miniconda, the command to install Thinc linked against MKL would be:

THINC_BLAS=/opt/miniconda/numpy-mkl/lib/libmkl_rt.so pip install thinc --no-cache-dir --no-binary :all:
export LD_LIBRARY_PATH=/opt/miniconda/numpy-mkl/lib
# On OSX:
# export DYLD_LIBRARY_PATH=/opt/miniconda/numpy-mkl/lib

If the library file ends in a .a extension, it is linked statically; if it ends in .so, it's linked dynamically. If you use dynamic linking, make sure the directory is on your LD_LIBRARY_PATH at runtime.

🔴 Bug fixes

  • Fix pickle support for FeatureExtracter class.
  • Fix unicode error in Quora dataset loader.
  • Fix batch normalization bugs. Now supports batch "renormalization" correctly.
  • Models now reliably distinguish predict vs. train modes, using the convention drop=None. Previously, layers such as BatchNorm relied on having their predict() method called, which didn't work when they were called by layers that didn't implement a predict() method. We now set drop=None to make this more reliable.
  • Fix bug that caused incorrect data types to be produced by FeatureExtracter.

👥 Contributors

Thanks to @dvsrepo, @justindujardin, @alephmelo and @darkdreamingdan for the pull requests and contributions.

v6.10.2: Efficiency improvements and bug fixes

06 Dec 11:42

✨ New features and improvements

  • Improve GPU utilisation for attention layer.
  • Improve efficiency of Maxout layer on CPU.

🔴 Bug fixes

  • Bug fix to foreach combinator, useful for hierarchical models.
  • Bug fix to batch normalization.

📖 Documentation and examples

  • Update imdb_cnn text classification example.

v6.10.1: Fix GPU install and minor memory leak

15 Nov 13:57

🔴 Bug fixes

  • Fix installation with CUDA 9.
  • Fix minor memory leak in beam search.
  • Fix dataset readers.

v6.10.0: CPU efficiency improvements, refactoring

28 Oct 17:04

✨ Major features and improvements

  • Provisional CUDA 9 support. CUDA 9 removes a compilation flag we require for CUDA 8. As a temporary workaround, you can build on CUDA 9 by setting the environment variable CUDA9=1. For example:
CUDA9=1 pip install thinc==6.10.0
  • Improve efficiency of NumpyOps.scatter_add when the indices only have a single dimension. This function was previously a bottleneck for spaCy.
  • Remove redundant copies in backpropagation of the maxout non-linearity.
  • Call floating-point versions of sqrt, exp and tanh functions.
  • Remove calls to tensordot, instead reshaping to make 2d dot calls.
  • Improve efficiency of Adam optimizer on CPU.
  • Eliminate redundant code in thinc.optimizers. There's now a single Optimizer class. For backwards compatibility, the SGD and Adam functions are kept as shortcuts that create an Optimizer configured with the vanilla SGD or Adam recipe (see the sketch below).
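
As an illustrative sketch of the backwards-compatible helpers (the module path and argument order here are assumptions; the Adam and SGD helpers are assumed to take an ops object and a learning rate):

# NOTE: module path and call signatures assumed for illustration
from thinc.neural.ops import NumpyOps
from thinc.neural.optimizers import Adam, SGD

# Both helpers now return the same Optimizer class, configured
# with the corresponding update recipe.
adam_optimizer = Adam(NumpyOps(), 0.001)
sgd_optimizer = SGD(NumpyOps(), 0.001)
print(type(adam_optimizer).__name__, type(sgd_optimizer).__name__)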

👥 Contributors

Thanks to @RaananHadar for the pull request!