Penn Treebank: #33

Open
jonathanrobie opened this issue Nov 19, 2019 · 1 comment

Comments

@jonathanrobie
Member

We need to consider what conventions to use for the categories discussed in "Penn Treebank: Empty Categories and Resumptive Elements":

https://www.ling.upenn.edu/ppche/ppche-release-2016/annotation/

@ErwinKomen

ErwinKomen commented Dec 17, 2019

The empty categories used in the (historical!) Penn treebanks are:

  1. Subjects elided under conjunction - *con*
  2. General empty subjects - *pro*
  3. Dislocated constituents - *ICH*-n
  4. Expletive subjects - *exp*
  5. ECM infinitive subjects - *arb*
  6. Traces - *T*-n

Relevant for Greek are probably only 1, 2 and 3.

Type 3 occurs in the 'surfaced' variants of the lowfat GNT and the lowfat HebOT, so visualizing that in lowfat should perhaps have priority.
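To make the label conventions concrete, here is a minimal sketch of recognizing these empty-category tokens and their optional `-n` index. The labels and glosses are copied from the list above; the dictionary, the regular expression, and the function name `parse_empty_category` are assumptions made for illustration and are not part of lowfat or of any Penn tooling.

```python
import re

# Labels and glosses taken from the list in this comment.
# The dictionary itself is only an illustration.
EMPTY_CATEGORIES = {
    "con": "subject elided under conjunction",
    "pro": "general empty subject",
    "ICH": "dislocated constituent",
    "exp": "expletive subject",
    "arb": "ECM infinitive subject",
    "T": "trace",
}

# Matches *label* with an optional numeric index, e.g. *con* or *ICH*-3.
EMPTY_RE = re.compile(r"\*(?P<label>[A-Za-z]+)\*(?:-(?P<index>\d+))?")


def parse_empty_category(token):
    """Return (label, index) for an empty-category token, else None.

    Hypothetical helper: index is None when the token has no -n suffix.
    """
    m = EMPTY_RE.fullmatch(token)
    if m is None or m.group("label") not in EMPTY_CATEGORIES:
        return None
    index = m.group("index")
    return (m.group("label"), int(index) if index is not None else None)
```

A visualizer could use a check like this to decide which leaves to render as empty elements rather than as surface words.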

If Greek (or Hebrew) really has 'subjects elided under conjunction' (type 1) or 'general empty subjects' (type 2), then these kinds of empty categories would need to be added to the texts. However, I have two questions:

  1. Do these types really exist in Greek/Hebrew?
  2. Would it be possible to add them automatically?

As for (1), their existence: Greek and Hebrew mark subject person/number on the finite verb. English does that too, but the difference is that English normally requires an overt subject, whereas Greek/Hebrew default to 'no pronoun' when the subject is the same as in a previous clause. Here is an example from Luke 4:1:

cl
   s Ἰησοῦς δὲ 
   cl πλήρης πνεύματος ἁγίου 
   v ὑπέστρεψεν
   pp ἀπὸ τοῦ Ἰορδάνου , 
conj καὶ 
cl
   s *con*
   v ἤγετο
   pp ἐν τῷ πνεύματι
   loc ἐν τῇ ἐρήμῳ 
   time ἡμέρας τεσσεράκοντα
   cl πειραζόμενος ὑπὸ τοῦ διαβόλου .

The grammatical subject stays the same from clause 1 to clause 2.
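As for question (2), a rough sketch of what automatic insertion might look like for this example is given below. It is only a naive heuristic under stated assumptions: the list-of-tuples representation and the function name `add_elided_subjects` are hypothetical and do not reflect the actual lowfat format, and the rule (a clause after a conjunction that has a verb but no subject gets `*con*`) does not verify that the subject really is the same as in the previous clause.

```python
def add_elided_subjects(units):
    """Insert an ("s", "*con*") constituent into subjectless conjoined clauses.

    units is a hypothetical flat representation: a list of
    ("cl", [(role, text), ...]) and ("conj", text) items.
    """
    result = []
    seen_clause = False   # a clause has already occurred
    after_conj = False    # the previous unit was a conjunction
    for kind, content in units:
        if kind == "cl":
            roles = [role for role, _ in content]
            # Naive rule: conjoined clause with a verb but no overt subject.
            if seen_clause and after_conj and "v" in roles and "s" not in roles:
                content = [("s", "*con*")] + list(content)
            seen_clause = True
            after_conj = False
        elif kind == "conj":
            after_conj = True
        result.append((kind, content))
    return result


# Luke 4:1 as in the example above, without the *con* subject.
luke_4_1 = [
    ("cl", [("s", "Ἰησοῦς δὲ"), ("cl", "πλήρης πνεύματος ἁγίου"),
            ("v", "ὑπέστρεψεν"), ("pp", "ἀπὸ τοῦ Ἰορδάνου")]),
    ("conj", "καὶ"),
    ("cl", [("v", "ἤγετο"), ("pp", "ἐν τῷ πνεύματι"),
            ("loc", "ἐν τῇ ἐρήμῳ"), ("time", "ἡμέρας τεσσεράκοντα"),
            ("cl", "πειραζόμενος ὑπὸ τοῦ διαβόλου")]),
]

marked = add_elided_subjects(luke_4_1)
```

A real implementation would at least need to compare person/number marking on the two verbs before choosing `*con*` over `*pro*`; this sketch leaves that out.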
