JSON LD

The revised JSON-LD output format for Eidos is documented in the tables below, and an example follows to illustrate some of the notation. The notation mimics JSON: [] indicates an array, {} indicates a map, and @id(Word) expands to something like "@id" : "_:Word_1".
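
As a rough illustration of the @id(...) shorthand, the sketch below builds the map that a table entry such as {@id(Word)} stands for. The helper name `id_ref` is hypothetical, and the `_:Type_N` numbering is only inferred from the "_:Word_1" example above, not taken from the Eidos code.

```python
import json

def id_ref(type_name: str, index: int) -> dict:
    """Build the map that the tables abbreviate as {@id(Type)}."""
    # Assumption: identifiers follow the "_:Type_N" pattern of "_:Word_1".
    return {"@id": f"_:{type_name}_{index}"}

# A Dependency lists its source and destination as {@id(Word)} in the tables.
dependency = {
    "@type": "Dependency",
    "source": id_ref("Word", 3),
    "destination": id_ref("Word", 1),
    "relation": "discourse",
}
print(json.dumps(dependency, indent=2))
```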

Note: Recent changes are indicated with bold text. While many values are optional, they are written in italics where it is particularly important to keep that in mind. The example further below does not yet match the tables.

Name	Property	Type	Description
Corpus	@type	"Corpus"	A corpus is typed.
	documents	[Document]	It has a list of documents
	extractions	[Extraction]	and a set of mixed extractions.
Document	@type	"Document"	A document is typed
	@id	IRI	and provided an ID.
	id	string	It has an ID determined by the source of the document,
	title	string	a title,
	text	string	some text,
	location	string	a human-interpretable indication of where the text was found,
	dct	DCT	a document creation time,
	sentences	[Sentence]	and a list of sentences.
DCT	@type	"DCT"	A document creation time is typed
	@id	IRI	and provided an ID.
	text	string	It has a text.
	start	string	It starts (in ISO-8601 format)
	end	string	and ends (in ISO-8601 format).
Sentence	@type	"Sentence"	A sentence is typed
	@id	IRI	and provided an ID.
	text	string	It has the tokenized text with tokens, particularly punctuation marks, separated by spaces
	rawText	string	and a raw version which matches the document text,
	relevance	float	a relevance value,
	words	[Word]	a list of words,
	dependencies	[Dependency]	a set of universal enhanced dependencies,
	timexes	[TimeExpression]	a list of time expressions,
	geolocs	[GeoLocation]	a list of geographic location records,
	~~counts~~	~~[Count]~~	~~and a list of counts.~~
Word	@type	"Word"	A word is typed
	@id	IRI	and provided an ID.
	text	string	It has a text,
	tag	string	a tag from the Penn Treebank tag set,
	entity	string	an entity type,
	startOffset	integer	an inclusive, 0-based index of the first letter of the word in the text,
	endOffset	integer	an exclusive, 0-based index just past the last letter of the word,
	lemma	string	a lemma,
	chunk	string	a chunk,
	norm	string	and a norm, including "B-TIME" or "I-TIME".
Dependency	@type	"Dependency"	A dependency is typed.
	type	string	It belongs to one of several (sub)types,
	source	{@id(Word)}	it has a source ID referring to a Word,
	destination	{@id(Word)}	a destination ID referring to a Word,
	relation	string	and a relation.
Extraction	@type	"Extraction"	An extraction is typed
	@id	IRI	and provided an ID.
	type	string	It has a description of type from the table below,
	subtype	string	a description of subtype from the table below,
	labels	[string]	a list of labels,
	text	[string]	a text,
	rule	[string]	a rule,
	canonicalName	string	a canonical name,
	relevance	float	a relevance value from the corresponding sentence,
	groundings	[Groundings]	a list of groundings,
	provenance	[Provenance]	a set of provenance values,
	trigger	Trigger	a trigger,
	states	[State]	a set of states,
	arguments	[Argument]	and a list of arguments.
Groundings	@type	"Groundings"	A groundings is typed.
	name	string	It has a name such as "un", "wdi", or "fao",
	version	string	an indication of the ontology version used,
	versionDate	string	the date of the ontology version (in ISO-8601 format),
	values	[Grounding\|PredicateGrounding]	and a list of grounding values.
Grounding	@type	"Grounding"	A grounding is typed.
	ontologyConcept	string	It has an ontology concept
	value	float	and a matching numeric value.
PredicateGrounding	@type	"PredicateGrounding"	A predicate grounding is typed.
	theme	[Grounding]	It has groundings for a theme,
	themeProperties	[Grounding]	for theme properties,
	themeProcess	[Grounding]	for theme process,
	themeProcessProperties	[Grounding]	and for theme process properties.
	value	float	Furthermore, it has a numeric value
	display	string	and text for display.
Provenance	@type	"Provenance"	A "provenance" is typed.
	document	{@id(Document)}	It has a document ID referring to a Document,
	documentCharPositions	[Interval]	a set of intervals for characters within the document,
	sentence	{@id(Sentence)}	a sentence ID referring to a Sentence,
	sentenceWordPositions	[Interval]	and a set of intervals for words within the sentence.
Interval	@type	"Interval"	An interval is typed.
	start	integer	For sentenceWordPositions, this is an inclusive, 1-based index of the first word of the interval in the sentence. For documentCharPositions, it is an inclusive, 0-based index of the first character of the interval in the document.
	end	integer	For sentenceWordPositions, this is an inclusive, 1-based index of the last word of the interval in the sentence. For documentCharPositions, it is an inclusive, 0-based index of the last character of the interval in the document.
State	@type	"State"	A state is typed.
	type	string	It has an Eidos type such as INC, DEC, QUANT, TIMEX, HEDGE, NEGATION, PROP, or LocationExp, or a BBN type such as TENSE, POLARITY, and so on. This list is non-exhaustive. See details below.
	text	string	a text,
	value	value_type	a value of type detailed below,
	provenance	[Provenance]	a set of provenance values,
	modifiers	[Modifier]	and a set of modifiers.
Modifier	@type	"Modifier"	A modifier is typed.
	text	string	It has a text,
	provenance	[Provenance]	a set of provenance values,
	intercept	double	an intercept,
	mu	double	a mu,
	sigma	double	and a sigma.
Trigger	@type	"Trigger"	A trigger is typed.
	text	string	It has a text for the head word
	provenance	[Provenance]	and a set of provenance values.
Argument	@type	"Argument"	An argument is typed.
	type	string	It has a description of type from the table below,
	value	{@id(Extraction)}	and an ID referring to the extraction.
TimeExpression	@type	"TimeExpression"	A time expression is typed
	@id	IRI	and provided with an ID.
	startOffset	integer	It has a starting, zero-based character offset into the sentence text,
	endOffset	integer	and an ending, exclusive character offset,
	text	string	along with the actual text determined by the offsets,
	intervals	[TimeInterval]	and an optional list of associated, concrete time intervals.
TimeInterval	@type	"TimeInterval"	A time interval is typed
	@id	IRI	and provided with an ID.
	start	string	It starts (in ISO-8601 format)
	end	string	and ends (in ISO-8601 format).
	~~duration~~	~~integer~~	~~and it lasts for a number of seconds.~~
GeoLocation	@type	"GeoLocation"	A geographic location record is typed
	@id	IRI	and provided with an ID.
	startOffset	integer	It has a starting, zero-based character offset into the sentence text,
	endOffset	integer	and an ending, exclusive character offset,
	text	string	along with the actual text determined by the offsets,
	geoID	string	and a geographic identifier.
~~Count~~	~~@type~~	~~"Count"~~	~~A count is typed~~
	~~@id~~	~~IRI~~	~~and provided with an ID.~~
	~~startOffset~~	~~integer~~	~~It has a starting, zero-based character offset into the sentence text,~~
	~~endOffset~~	~~integer~~	~~and an ending, exclusive character offset,~~
	~~text~~	~~string~~	~~along with the pertinent text between the offsets,~~
	~~value~~	~~double~~	~~and a numeric value,~~
	~~modifier~~	~~"NoModifier", "Approximate", "Min", or "Max"~~	~~a required modifier,~~
	~~unit~~	~~"Absolute", "Daily", "Weekly", "Monthly", or "Percentage"~~	~~and a required unit.~~
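
The table mixes three indexing conventions: startOffset/endOffset are 0-based with an exclusive end, sentenceWordPositions intervals are 1-based and inclusive at both ends, and documentCharPositions intervals are 0-based and inclusive at both ends. The sketch below shows one way a consumer might turn each into a Python slice. The function names are hypothetical, and the sample assumes the document text is identical to the single sentence "Hello, world!".

```python
def slice_exclusive(text: str, record: dict) -> str:
    """Word, TimeExpression, GeoLocation: 0-based start, exclusive end."""
    return text[record["startOffset"]:record["endOffset"]]

def words_in_interval(words: list, interval: dict) -> list:
    """sentenceWordPositions: 1-based, inclusive at both ends."""
    return words[interval["start"] - 1:interval["end"]]

def chars_in_interval(document_text: str, interval: dict) -> str:
    """documentCharPositions: 0-based, inclusive at both ends."""
    return document_text[interval["start"]:interval["end"] + 1]

text = "Hello, world!"
tokens = ["Hello", ",", "world", "!"]

print(slice_exclusive(text, {"startOffset": 7, "endOffset": 12}))  # world
print(words_in_interval(tokens, {"start": 3, "end": 3}))           # ['world']
print(chars_in_interval(text, {"start": 7, "end": 11}))            # world
```

Note that an inclusive interval of a single word or character has start equal to end, whereas an exclusive offset pair for a non-empty span always has endOffset greater than startOffset.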

Extraction type	Extraction subtype	Argument type	Notes
"concept"			This is just something being there.
	"entity"
	"event"	"actor"
		"place"
		"time"
"relation"			This is a link between concepts.
	"causation"		This is a directed relation.
		"source"	This source may appear multiple times.
		"destination"	This destination may appear multiple times.
	"precondition"		The "source" concept must have occurred/begun for the "destination" concept to happen. This is a directed relation.
		"source"	This source may appear multiple times.
		"destination"	This destination may appear multiple times.
	"catalyst"		The "source" concept increases the intensity of the "destination" concept, but does not cause the "destination" concept. This is a directed relation.
		"source"	This source may appear multiple times.
		"destination"	This destination may appear multiple times.
	"mitigation"		The "source" concept decreases the intensity of the "destination" concept, but does not cause the "destination" concept. This is a directed relation.
		"source"	This source may appear multiple times.
		"destination"	This destination may appear multiple times.
	"prevention"		The "source" concept will stop or prevent the "destination" concept. This is a directed relation.
		"source"	This source may appear multiple times.
		"destination"	This destination may appear multiple times.
	"temporallyPrecedes"		The "source" concept occurs before the "destination" concept. This is a directed relation.
		"source"	This source may appear multiple times.
		"destination"	This destination may appear multiple times.
	"correlation"		This is an undirected relation.
		"argument"	The argument must appear multiple times.
	"unification"		This is a directed relation.
		"group"	The argument must appear only once.
		"member"	The argument must appear only once.
	"coreference"		This is a directed relation.
		"anchor"	The argument must appear only once.
		"reference"	The argument must appear only once.
	~~"migration"~~		~~This is an undirected relation.~~
		~~"group"~~	~~Who,~~
		~~"groupModifier"~~	~~with modifications,~~
		~~"moveTo"~~	~~where to,~~
		~~"moveFrom"~~	~~where from,~~
		~~"moveThrough"~~	~~where through,~~
		~~"timeStart"~~	~~starting when,~~
		~~"timeEnd"~~	~~ending when,~~
		~~"time"~~	~~and?~~
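
The argument rules in this table can be checked mechanically. The following sketch transcribes them into a lookup and reports violations for one extraction. The names `RULES` and `check_arguments` are hypothetical, "must appear multiple times" is interpreted here as "at least twice", and arguments marked "may appear multiple times" are left without a count constraint; both readings are assumptions rather than documented Eidos behavior.

```python
from collections import Counter

# "any": may appear multiple times; "once": must appear only once;
# "multiple": must appear multiple times (assumed to mean at least twice).
DIRECTED = {"source": "any", "destination": "any"}
RULES = {
    "causation": DIRECTED,
    "precondition": DIRECTED,
    "catalyst": DIRECTED,
    "mitigation": DIRECTED,
    "prevention": DIRECTED,
    "temporallyPrecedes": DIRECTED,
    "correlation": {"argument": "multiple"},
    "unification": {"group": "once", "member": "once"},
    "coreference": {"anchor": "once", "reference": "once"},
}

def check_arguments(extraction: dict) -> list:
    """Return a list of problems; an empty list means no rule was violated."""
    rules = RULES.get(extraction.get("subtype"))
    if rules is None:
        return []  # subtype not covered by this sketch
    counts = Counter(arg["type"] for arg in extraction.get("arguments", []))
    problems = [f"unexpected argument type: {t}" for t in counts if t not in rules]
    for arg_type, rule in rules.items():
        n = counts[arg_type]
        if rule == "once" and n != 1:
            problems.append(f"{arg_type} must appear once, found {n}")
        elif rule == "multiple" and n < 2:
            problems.append(f"{arg_type} must appear multiple times, found {n}")
    return problems

correlation = {
    "subtype": "correlation",
    "arguments": [{"type": "argument"}, {"type": "argument"}],
}
coreference = {"subtype": "coreference", "arguments": [{"type": "anchor"}]}
print(check_arguments(correlation))
print(check_arguments(coreference))
```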

State type	Value type	Notes
INC		Indicates that the Concept has been increased.
DEC		Indicates that the Concept has been decreased.
QUANT		The gradable adjective associated with the Concept.
~~COUNT~~	~~{@id(Count)}~~	~~This is an ID referring to a count.~~
TIMEX	{@id(TimeExpression)}	This is an ID referring to a time expression.
LocationExp	{@id(GeoLocation)}	This is an ID referring to a geographic location.
HEDGE		Indicates that the Extraction has been hedged.
NEGATION		Indicates that the Extraction has been negated.
PROP		Indicates a property such as price, distance, weight, or volume.
polarity		Negative when it is explicitly indicated that the Event did not occur. All other Events are Positive.
modality		Asserted when the author or speaker makes reference to it as though it were a real occurrence. All other Events are Other.
genericity		Specific if it is a singular occurrence at a particular place and time, or a finite set of such occurrences. All other Events are Generic.
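
For the state types whose value is an ID, a reader has to look the ID up among the sentence's timexes or geolocs. The sketch below shows one possible lookup. The function names and the sample records (including the geoID value) are made up for illustration, and it assumes the state and the referenced record belong to the same sentence.

```python
def index_by_id(sentence: dict) -> dict:
    """Map @id -> record for the time expressions and geolocations of one sentence."""
    records = sentence.get("timexes", []) + sentence.get("geolocs", [])
    return {record["@id"]: record for record in records}

def resolve_state_value(state: dict, by_id: dict):
    """Return the record a TIMEX or LocationExp state refers to, else None."""
    if state.get("type") not in ("TIMEX", "LocationExp"):
        return None  # other state types carry no ID-valued reference here
    ref = state.get("value")
    return by_id.get(ref["@id"]) if isinstance(ref, dict) else None

# Fabricated sample data following the property names in the tables above.
sentence = {
    "timexes": [{"@type": "TimeExpression", "@id": "_:TimeExpression_1",
                 "startOffset": 3, "endOffset": 7, "text": "2017"}],
    "geolocs": [{"@type": "GeoLocation", "@id": "_:GeoLocation_1",
                 "startOffset": 11, "endOffset": 16, "text": "Sudan",
                 "geoID": "hypothetical-id"}],
}
state = {"@type": "State", "type": "TIMEX", "text": "2017",
         "value": {"@id": "_:TimeExpression_1"}}

by_id = index_by_id(sentence)
print(resolve_state_value(state, by_id)["text"])
```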

This example shows valid JSON-LD syntax and links between elements, even though the linguistic analysis is fabricated.

{
  "@context" : {
    "Argument" : "https://w3id.org/wm/cag/argument",
    "Corpus" : "https://w3id.org/wm/cag/corpus",
    "Dependency" : "https://w3id.org/wm/cag/dependency",
    "Document" : "https://w3id.org/wm/cag/document",
    "Extraction" : "https://w3id.org/wm/cag/extraction",
    "Interval" : "https://w3id.org/wm/cag/interval",
    "Modifier" : "https://w3id.org/wm/cag/modifier",
    "Provenance" : "https://w3id.org/wm/cag/provenance",
    "Sentence" : "https://w3id.org/wm/cag/sentence",
    "State" : "https://w3id.org/wm/cag/state",
    "Trigger" : "https://w3id.org/wm/cag/trigger",
    "Word" : "https://w3id.org/wm/cag/word"
  },
  "@type" : "Corpus",
  "documents" : [ {
    "@type" : "Document",
    "@id" : "_:Document_1",
    "title" : "Example Document",
    "sentences" : [ {
      "@type" : "Sentence",
      "@id" : "_:Sentence_1",
      "text" : "Hello, world!",
      "words" : [ {
        "@type" : "Word",
        "@id" : "_:Word_1",
        "text" : "Hello",
        "tag" : "UH",
        "entity" : "O",
        "startOffset": 0,
        "endOffset" : 5,
        "lemma" : "hello",
        "chunk" : "B-ADVP"
      }, {
        "@type" : "Word",
        "@id" : "_:Word_2",
        "text" : ",",
        "tag" : ",",
        "entity" : "O",
        "startOffset" : 5,
        "endOffset" : 6,
        "lemma" : ",",
        "chunk" : "O"
      }, {
        "@type" : "Word",
        "@id" : "_:Word_3",
        "text" : "world",
        "tag" : "NN",
        "entity" : "O",
        "startOffset" : 7,
        "endOffset" : 12,
        "lemma" : "world",
        "chunk" : "B-NP"
      }, {
        "@type" : "Word",
        "@id" : "_:Word_4",
        "text" : "!",
        "tag" : ".",
        "entity" : "O",
        "startOffset" : 12,
        "endOffset" : 13,
        "lemma" : "!",
        "chunk" : "O"
      } ],
      "dependencies" : [ {
        "@type" : "Dependency",
        "source" : {
          "@id" : "_:Word_3"
        },
        "destination" : {
          "@id" : "_:Word_1"
        },
        "relation" : "discourse"
      }, {
        "@type" : "Dependency",
        "source" : {
          "@id" : "_:Word_3"
        },
        "destination" : {
          "@id" : "_:Word_2"
        },
        "relation" : "punct"
      } ]
    } ]
  } ],
  "extractions" : [ {
    "@type" : "Extraction",
    "@id" : "_:Extraction_1",
    "type" : "concept",
    "subtype" : "entity",
    "labels" : [ "NounPhrase", "Entity" ],
    "text" : "world",
    "rule" : "simple-np",
    "canonicalName" : "world",
    "grounding" : [ {
      "@type" : "Grounding",
      "ontologyConcept" : "/entities/human/livelihood",
      "value" : 0.47524851930210044
    }, {
      "@type" : "Grounding",
      "ontologyConcept" : "/entities/human/financial/economic/economy",
      "value" : 0.4713680118187502
    } ],
    "provenance" : [ {
      "@type" : "Provenance",
      "document" : {
        "@id" : "_:Document_1"
      },
      "sentence" : {
        "@id" : "_:Sentence_1"
      },
      "positions" : {
        "@type" : "Interval",
        "start" : 3,
        "end" : 3
      }
    } ],
    "states" : [ {
      "@type" : "State",
      "type" : "INC",
      "text" : "Hello",
      "modifiers" : [ {
        "@type" : "Modifier",
        "text" : "world",
        "intercept" : 0.6154,
        "mu" : 1.034E-5,
        "sigma" : -0.001123
      } ]
    } ]
  }, {
    "@type" : "Extraction",
    "@id" : "_:Extraction_2",
    "type" : "relation",
    "subtype" : "causation",
    "labels" : [ "Causal", "DirectedRelation", "EntityLinker", "Event" ],
    "text" : "Hello",
    "rule" : "dueToSyntax1-Causal",
    "canonicalName" : "hello",
    "provenance" : [ {
      "@type" : "Provenance",
      "document" : {
        "@id" : "_:Document_1"
      },
      "sentence" : {
        "@id" : "_:Sentence_1"
      },
      "positions" : {
        "@type" : "Interval",
        "start" : 1,
        "end" : 1
      }
    } ],
    "trigger" : {
      "@type" : "Trigger",
      "text" : "world",
      "provenance" : [ {
        "@type" : "Provenance",
        "document" : {
          "@id" : "_:Document_1"
        },
        "sentence" : {
          "@id" : "_:Sentence_1"
        },
        "positions" : {
          "@type" : "Interval",
          "start" : 3,
          "end" : 3
        }
      } ]
    },
    "arguments" : [ {
      "@type" : "Argument",
      "type" : "source",
      "value" : {
        "@id" : "_:Extraction_1"
      }
    }, {
      "@type" : "Argument",
      "type" : "destination",
      "value" : {
        "@id" : "_:Extraction_3"
      }
    } ]
  }, {
    "@type" : "Extraction",
    "@id" : "_:Extraction_3",
    "type" : "relation",
    "subtype" : "correlation",
    "labels" : [ "Correlation", "UndirectedRelation", "EntityLinker", "Event" ],
    "text" : "world",
    "rule" : "dueToSyntax1-Causal",
    "canonicalName" : "world",
    "provenance" : [ {
      "@type" : "Provenance",
      "document" : {
        "@id" : "_:Document_1"
      },
      "sentence" : {
        "@id" : "_:Sentence_1"
      },
      "positions" : {
        "@type" : "Interval",
        "start" : 3,
        "end" : 3
      }
    } ],
    "trigger" : {
      "@type" : "Trigger",
      "text" : "Hello",
      "provenance" : [ {
        "@type" : "Provenance",
        "document" : {
          "@id" : "_:Document_1"
        },
        "sentence" : {
          "@id" : "_:Sentence_1"
        },
        "positions" : {
          "@type" : "Interval",
          "start" : 1,
          "end" : 1
        }
      } ]
    },
    "arguments" : [ {
      "@type" : "Argument",
      "type" : "argument",
      "value" : {
        "@id" : "_:Extraction_2"
      }
    }, {
      "@type" : "Argument",
      "type" : "argument",
      "value" : {
        "@id" : "_:Extraction_1"
      }
    } ]
  } ]
}
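
To follow the links in an example like the one above, a consumer can index every typed node by its @id and then resolve the bare {"@id": ...} references. The sketch below does this on a trimmed copy of the example. The function name `index_nodes` is hypothetical, and the approach relies on the assumption that definitions carry both "@type" and "@id" while references carry only "@id".

```python
import json

def index_nodes(node, index=None):
    """Recursively collect every map that carries both "@type" and "@id"."""
    if index is None:
        index = {}
    if isinstance(node, dict):
        if "@id" in node and "@type" in node:
            index[node["@id"]] = node
        for value in node.values():
            index_nodes(value, index)
    elif isinstance(node, list):
        for item in node:
            index_nodes(item, index)
    return index

# A trimmed copy of the example above, keeping only the linked parts.
corpus = json.loads("""
{
  "@type": "Corpus",
  "documents": [{
    "@type": "Document", "@id": "_:Document_1",
    "sentences": [{
      "@type": "Sentence", "@id": "_:Sentence_1", "text": "Hello, world!",
      "words": [
        {"@type": "Word", "@id": "_:Word_1", "text": "Hello"},
        {"@type": "Word", "@id": "_:Word_3", "text": "world"}
      ],
      "dependencies": [{
        "@type": "Dependency",
        "source": {"@id": "_:Word_3"},
        "destination": {"@id": "_:Word_1"},
        "relation": "discourse"
      }]
    }]
  }],
  "extractions": [
    {"@type": "Extraction", "@id": "_:Extraction_1", "text": "world"},
    {"@type": "Extraction", "@id": "_:Extraction_2", "text": "Hello",
     "arguments": [{"@type": "Argument", "type": "source",
                    "value": {"@id": "_:Extraction_1"}}]}
  ]
}
""")

nodes = index_nodes(corpus)

# Resolve a dependency's endpoints to the words they point at.
dep = corpus["documents"][0]["sentences"][0]["dependencies"][0]
source_word = nodes[dep["source"]["@id"]]["text"]
destination_word = nodes[dep["destination"]["@id"]]["text"]
print(f'{source_word} -{dep["relation"]}-> {destination_word}')

# Resolve an argument to the extraction it refers to.
arg = corpus["extractions"][1]["arguments"][0]
print(arg["type"], "=", nodes[arg["value"]["@id"]]["text"])
```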
