Skip to content

ErgLeTypes

DanFlickinger edited this page Aug 25, 2015 · 12 revisions

ERG Lexical Types

Each lexical entry in the ERG is assigned to exactly one (leaf) lexical type, which determines most of its properties. These leaf types themselves inherit their properties from the full hierarchy of types defined in the ERG, but only the leaf types and those identified in the SEM-I (semantic interface) should be relevant for external applications.

Names of types

All leaf types have names consisting of three or four fields, as follows, each separated by an underscore:

Part-of-Speech Complements (optional) Annotations "le"

The Part-of-Speech field identifies the broad lexical category, distinguishing e.g. verbs from nouns. Current values in this field include only the following:

v verb
n noun
aj adjective
av adverb
p preposition
pp prep phrase
d determiner
c conjunction
cm complementizer
x miscellaneous
pt punctuation

The second field identifies the ordered sequence of complements which this type selects for, with each complement identified by a category abbreviation drawn from the following inventory, and with a hyphen used to separate multiple complement identifiers. Lexical types that do not select for any complements indicate this with a single hyphen in this field. Complements marked with an asterisk ("*") are optional, but note that for brevity not all optional complements are so marked. The order of complements is taken to be the unmarked order.

np noun phrase, no longer obligatorily awaiting a specifier
vp verb phrase, not yet subject-saturated
cp sentential complement, with or without an overt complementizer
pp prepositional phrase, complement-saturated
ap adjective phrase
prd predicative phrase, including VP, AP, PP
p particle, semantically empty
it expletive "it"
nb nominal phrase, not yet specifier-saturated
adv adverb phrase
vpslnp verb phrase containing an NP gap
xp underspecified category

The third field, which is optional, provides annotations of the finer-grained distinctions among types with the same part-of-speech and complement selection. These annotations are briefly illuminated in the middle field of the table of all types below.

The last field is always simply the suffix "le" to avoid name space conflicts, and to support regular-expression searches within the grammar source files.

As an example, the lexical type name "v_np-cp_q_le" indicates that verbs of this type select for two obligatory complements (a noun phrase and a sentential complement), and the "q" annotation in the third field is a mnemonic that the sentential complement must be an embedded question. As a second example, the type name "aj_pp-vp_i-it_le" is for adjectives like "tough" which select for two complements (a prepositional phrase and a verb phrase) as well as an expletive "it" subject.

Documentation on each lexical type for the most recent stable version of the ERG (currently "1214"), along with examples from manually-annotated treebanks, can be found here.

Clone this wiki locally