
Seq2Seq Transformer Tutorial #1225

Closed
mmwebster opened this issue Nov 8, 2020 · 5 comments · Fixed by #2451
Comments

@mmwebster

mmwebster commented Nov 8, 2020

I'm having difficulty understanding a few aspects of the Seq2Seq transformer tutorial (https://pytorch.org/tutorials/beginner/transformer_tutorial.html):

  1. The tutorial says that it implements the architecture from Attention Is All You Need, but I don't see a TransformerDecoder used anywhere. It instead looks like only a TransformerEncoder is used. How does this example work without the decoder?
  2. The tutorial says that it uses a softmax to output probabilities over the dictionary, but I only see a linear output layer. Where is the softmax applied?
  3. Is this model learning to predict one word ahead (e.g. [hi how are you] -> [how are you doing])? I can't find the actual task described anywhere; only the inputs and targets are described, in terms of an alphabet.

Appreciate any help.

cc @pytorch/team-text-core @Nayef211

@holly1238 holly1238 added the Text Issues relating to text tutorials label Jul 27, 2021
@svekars svekars added module: torchtext and removed Text Issues relating to text tutorials labels Mar 16, 2023
@svekars svekars added easy docathon-h1-2023 A label for the docathon in H1 2023 labels May 31, 2023
@HemanthSai7
Contributor

The tutorial uses criterion = nn.CrossEntropyLoss(), which internally applies log-softmax before computing the loss, so the model only needs a linear output layer; no explicit softmax is applied inside the model.
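
To make that concrete, here is a minimal sketch (illustrative only, not code from the tutorial) showing that CrossEntropyLoss on raw logits is the same as LogSoftmax followed by NLLLoss, which is why the model ends with a plain nn.Linear layer:

```python
import torch
import torch.nn as nn

# Illustrative shapes: a batch of 5 predictions over a 10-word vocabulary.
logits = torch.randn(5, 10)            # raw output of the final nn.Linear layer
targets = torch.randint(0, 10, (5,))   # ground-truth token indices

# nn.CrossEntropyLoss = LogSoftmax + NLLLoss, so no softmax layer is needed in the model.
ce = nn.CrossEntropyLoss()(logits, targets)
nll = nn.NLLLoss()(nn.LogSoftmax(dim=-1)(logits), targets)
assert torch.allclose(ce, nll)

# If actual probabilities are wanted (e.g. at inference time), apply softmax explicitly:
probs = torch.softmax(logits, dim=-1)
```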

@QasimKhan5x
Contributor

The tutorial is supposed to be about using the nn.Transformer module, but it actually uses an nn.TransformerEncoder with an nn.Linear layer acting as the decoder. Rather than defining a custom TransformerModel class, shouldn't an nn.Transformer object be instantiated directly? I think this ambiguity ought to be resolved.

Second, the task seems to be language modeling, i.e., given a sequence of tokens, predict the next token(s) using the preceding tokens as context. This ought to be mentioned explicitly, since it's a beginner tutorial.
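
For reference, a rough sketch of the shape the model actually takes (simplified, with illustrative names and positional encoding omitted; not the exact tutorial code). The "decoder" is just a linear projection onto the vocabulary, and the target is the input shifted by one token:

```python
import torch
import torch.nn as nn

class EncoderOnlyLM(nn.Module):
    """Embedding -> nn.TransformerEncoder -> nn.Linear head over the vocabulary."""
    def __init__(self, vocab_size, d_model=200, nhead=2, num_layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        enc_layer = nn.TransformerEncoderLayer(d_model, nhead)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers)
        self.head = nn.Linear(d_model, vocab_size)  # the only "decoder" is this projection

    def forward(self, src, src_mask):
        # src: (seq_len, batch) token ids; src_mask: causal (seq_len, seq_len) mask
        return self.head(self.encoder(self.embed(src), src_mask))

# Language-modeling setup: the target is the input sequence shifted by one position.
tokens = torch.arange(10)              # pretend token ids
data, target = tokens[:-1], tokens[1:]
```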

Also, I am unable to run the tutorial on Google Colab. At first, I got a dependency error when running the following line:

/usr/local/lib/python3.10/dist-packages/torchdata/datapipes/iter/util/cacheholder.py in _assert_portalocker()
37     try:
---> 38         import portalocker  # noqa: F401
39     except ImportError as e:

ModuleNotFoundError: No module named 'portalocker'

During handling of the above exception, another exception occurred:

ModuleNotFoundError                       Traceback (most recent call last)
6 frames
<ipython-input-4-b02c7921f3b1> in <cell line: 5>()
      3 from torchtext.vocab import build_vocab_from_iterator
      4 
----> 5 train_iter = WikiText2(split='train')
      6 tokenizer = get_tokenizer('basic_english')
      7 vocab = build_vocab_from_iterator(map(tokenizer, train_iter), specials=['<unk>'])

/usr/local/lib/python3.10/dist-packages/torchtext/data/datasets_utils.py in wrapper(root, *args, **kwargs)
    191             if not os.path.exists(new_root):
    192                 os.makedirs(new_root, exist_ok=True)
--> 193             return fn(root=new_root, *args, **kwargs)
    194 
    195         return wrapper

/usr/local/lib/python3.10/dist-packages/torchtext/data/datasets_utils.py in new_fn(root, split, **kwargs)
    153         result = []
    154         for item in _check_default_set(split, splits, fn.__name__):
--> 155             result.append(fn(root, item, **kwargs))
    156         return _wrap_datasets(tuple(result), split)
    157 

/usr/local/lib/python3.10/dist-packages/torchtext/datasets/wikitext2.py in WikiText2(root, split)
     75     url_dp = IterableWrapper([URL])
     76     # cache data on-disk
---> 77     cache_compressed_dp = url_dp.on_disk_cache(
     78         filepath_fn=partial(_filepath_fn, root),
     79         hash_dict={_filepath_fn(root): MD5},

/usr/local/lib/python3.10/dist-packages/torch/utils/data/datapipes/datapipe.py in class_function(cls, enable_df_api_tracing, source_dp, *args, **kwargs)
    137 
    138         def class_function(cls, enable_df_api_tracing, source_dp, *args, **kwargs):
--> 139             result_pipe = cls(source_dp, *args, **kwargs)
    140             if isinstance(result_pipe, IterDataPipe):
    141                 if enable_df_api_tracing or isinstance(source_dp, DFIterDataPipe):

/usr/local/lib/python3.10/dist-packages/torchdata/datapipes/iter/util/cacheholder.py in __init__(self, source_datapipe, filepath_fn, hash_dict, hash_type, extra_check_fn)
    205         extra_check_fn: Optional[Callable[[str], bool]] = None,
    206     ):
--> 207         _assert_portalocker()
    208 
    209         self.source_datapipe = source_datapipe

/usr/local/lib/python3.10/dist-packages/torchdata/datapipes/iter/util/cacheholder.py in _assert_portalocker()
     45             raise
     46         else:
---> 47             raise ModuleNotFoundError(
     48                 "Package `portalocker` is required to be installed to use this datapipe."
     49                 "Please use `pip install 'portalocker>=2.0.0'` or"

ModuleNotFoundError: Package `portalocker` is required to be installed to use this datapipe.Please use `pip install 'portalocker>=2.0.0'` or`conda install -c conda-forge 'portalocker>=2.0.0'`to install the package

After installing the package with `pip install 'portalocker>=2.0.0'`, I got another error:

  AttributeError                            Traceback (most recent call last)
<ipython-input-6-b02c7921f3b1> in <cell line: 7>()
      5 train_iter = WikiText2(split='train')
      6 tokenizer = get_tokenizer('basic_english')
----> 7 vocab = build_vocab_from_iterator(map(tokenizer, train_iter), specials=['<unk>'])
      8 vocab.set_default_index(vocab['<unk>'])
      9 


61 frames
/usr/local/lib/python3.10/dist-packages/torchdata/datapipes/iter/util/cacheholder.py in _cache_check_fn(data, filepath_fn, hash_dict, hash_type, extra_check_fn, cache_uuid)
    260                 os.makedirs(dirname)
    261 
--> 262             with portalocker.Lock(promise_filepath, "a+", flags=portalocker.LockFlags.EXCLUSIVE) as promise_fh:
    263                 promise_fh.seek(0)
    264                 data = promise_fh.read()


AttributeError: 'NoneType' object has no attribute 'Lock'
This exception is thrown by __iter__ of _MemoryCellIterDataPipe(remember_elements=1000, source_datapipe=_ChildDataPipe)

I didn't paste the entire trace since it has 61 frames.

@HemanthSai7
Contributor

@QasimKhan5x I got this error too. It may be due to the versions of torchtext and its supporting libraries.

@NM512
Contributor

NM512 commented Jun 4, 2023

/assigntome

@NM512
Contributor

NM512 commented Jun 8, 2023

I found it confusing that the Transformer diagram used in the tutorial includes the decoder. I have therefore made it clear that the task focuses on predicting the next word in a sequence and that nn.TransformerDecoder is not used.

I also faced the same error mentioned earlier. In my case the cause was re-running the same cell after installing portalocker, without restarting the runtime (exit()). Installing portalocker alongside torchdata before running the tutorial code avoided the error entirely.
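
For anyone following along, this is roughly the workaround on a fresh Colab runtime (a sketch, assuming the only problem is the missing portalocker package; in a notebook the usual fix is simply a separate `!pip install 'portalocker>=2.0.0' torchdata` cell followed by a runtime restart):

```python
# Sketch: make sure portalocker is available BEFORE torchtext/torchdata build their
# datapipes. The programmatic fallback below is illustrative only.
import importlib
import subprocess
import sys

try:
    import portalocker  # noqa: F401
except ImportError:
    subprocess.check_call([sys.executable, "-m", "pip", "install", "portalocker>=2.0.0"])
    importlib.invalidate_caches()

from torchtext.datasets import WikiText2  # now the dataset datapipe can be constructed
train_iter = WikiText2(split='train')
```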
