Feature description
Right now, `factors-dim-emb` takes a single INT. Then, `Layers::Embedding` creates a matrix where every embedding has the same dimension, and `embedWithConcat` (and maybe `data::factored_vocab`?) takes this into account.

This is not ideal when dealing with factors that have very different vocabulary sizes, for example capitalization of a word (vocab size 3) versus word inflection (vocab size ~100 for some languages). It forces either a too-small embedding for the second factor or a too-large embedding for the first, which seems wasteful.
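For concreteness, here is a standalone C++ sketch (illustrative only, not Marian code; all vocab sizes and dimensions are made up) of the embedding parameter count with one shared dimension versus one dimension per factor:

```cpp
#include <cstdio>
#include <vector>

// Sketch: why a single factors-dim-emb is wasteful when factor
// vocab sizes differ a lot. Numbers below are purely illustrative.
int main() {
  std::vector<int> vocabSizes = {3, 100}; // capitalization, inflection

  // Current behavior: one shared embedding dimension for all factors.
  int sharedDim = 64;
  long currentParams = 0;
  for(int v : vocabSizes)
    currentParams += (long)v * sharedDim;

  // Proposed behavior: one dimension per factor, analogous to dim-vocabs.
  std::vector<int> perFactorDims = {8, 64}; // small vocab gets a small dim
  long proposedParams = 0;
  for(size_t i = 0; i < vocabSizes.size(); ++i)
    proposedParams += (long)vocabSizes[i] * perFactorDims[i];

  std::printf("embedding parameters, shared dim:     %ld\n", currentParams);
  std::printf("embedding parameters, per-factor dim: %ld\n", proposedParams);
  return 0;
}
```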
Example
`factors-dim-emb` should behave like `dim-vocabs` when `--factors-combine=concat`, i.e. accept one embedding dimension per factor, as sketched below.
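Under this proposal an invocation might look like the following (the list syntax mirrors today's `--dim-vocabs`; the dimensions are illustrative, and currently `--factors-dim-emb` only accepts a single INT):

```
# today: one dimension shared by every factor
--factors-combine=concat --factors-dim-emb 64

# proposed: one dimension per factor (e.g. capitalization, inflection)
--factors-combine=concat --factors-dim-emb 8 64
```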
Comments
This seems easy enough to implement. Famous last words.
I'd appreciate it if somebody with good knowledge of the codebase could gauge the size of the footgun beforehand.