<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en"><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8" /><!--THIS FILE IS GENERATED FROM AN XML MASTER. DO NOT EDIT (5)--><title>TEI Simple: Performance Indicators</title><meta name="author" content="James Cummings and Sebastian Rahtz" /><meta name="generator" content="Text Encoding Initiative Consortium XSLT stylesheets" /><meta name="DC.Title" content="TEI Simple: Performance Indicators" /><meta name="DC.Type" content="Text" /><meta name="DC.Format" content="text/html" /><link href="http://www.tei-c.org/release/xml/tei/stylesheet/tei.css" rel="stylesheet" type="text/css" /><link rel="stylesheet" media="print" type="text/css" href="http://www.tei-c.org/release/xml/tei/stylesheet/tei-print.css" /></head><body class="simple" id="TOP"><!--TEI front--><div class="titlePage"><div class="docTitle"><div class="titlePart">TEI Simple: Performance Indicators</div></div><div class="docAuthor">James Cummings</div><div class="docAuthor">Sebastian Rahtz</div><div class="docDate">Version 0.3; October 2015</div></div><!--TEI body--><p>Although TEI Simple is designed to be very constrained, it still leaves many choices to the encoder. Do they choose, for example, to explicitly identify names of people and places? Will they mark where spelling has been normalized? Will all the words be marked with part-of-speech information for linguistic analysis? These choices affect the query potential of a corpus of texts, and they cannot be discovered simply by analyzing the markup. The TEI already has extensive provision in the metadata header for describing the encoding decisions which have been made, but this is largely aimed at storing human-readable notes and is thus not machine-readable. The objective here is to develop and implement an extra level of notation aimed at automatically profiling a text.
This notation is stored as machine-readable data in the <span class="gi"><teiHeader></span> using the <span class="gi"><interpretation></span> element in <span class="gi"><editorialDecl></span>. If mediated through a good visualizing tool, such data can alert an end user to the fact that texts in a corpus do or do not include linguistic annotation, named entity identification, etc. They are, if you will, metadata about metadata. TEI Simple has not developed that visualizing tool, but it lays the groundwork for data structures that routine visualizing tools can easily access and mediate, provided the data are there in the first place and in a machine-readable form.</p><div class="p">The TEI Simple Performance Indicators can be used in the <span class="gi"><editorialDecl></span> of any TEI Simple document by pointing to the file <a class="link_ref" href="http://raw.githubusercontent.com/TEIC/TEI-Simple/master/teisimple-pi.xml">http://raw.githubusercontent.com/TEIC/TEI-Simple/master/teisimple-pi.xml</a> and the <span class="att">xml:id</span> values of categories in its taxonomy. This is done by using the <span class="gi"><interpretation></span> element, one for each element or aspect being recorded. <div id="index.xml-egXML-d30e779" class="pre egXML_valid"><span class="element"><editorialDecl></span>
<span class="element"><interpretation <span class="attribute">ana</span>="<span class="attributevalue">simple-pi:name-person simple-pi:PI-Linked</span>"></span>
<span class="element"><p></span>Where present in the text names of people are marked.<span class="element"></p></span>
<span class="element"></interpretation></span>
<span class="element"><interpretation <span class="attribute">ana</span>="<span class="attributevalue">simple-pi:name-place simple-pi:PI-NotMarked</span>"></span>
<span class="element"><p></span>Where present in the text names of places are not marked.<span class="element"></p></span>
<span class="element"></interpretation></span>
<span class="element"><interpretation <span class="attribute">ana</span>="<span class="attributevalue">simple-pi:date</span>"></span>
<span class="element"><p></span>Where present in the text dates have been marked.<span class="element"></p></span>
<span class="element"></interpretation></span>
<span class="element"><interpretation <span class="attribute">ana</span>="<span class="attributevalue">simple-pi:prose</span>"></span>
<span class="element"><p></span>The work is primarily prose.<span class="element"></p></span>
<span class="element"></interpretation></span>
<span class="element"></editorialDecl></span></div></div><div class="p">Here each value of the <span class="att">ana</span> attribute of the <span class="gi"><interpretation></span> element consists of the private URI prefix simple-pi: followed by the <span class="att">xml:id</span> of the relevant category, and so points to that category. The prefix should be documented with a <span class="gi"><listPrefixDef></span> in the TEI Simple document instance. For example: <div id="index.xml-egXML-d30e812" class="pre egXML_valid"><span class="element"><listPrefixDef></span>
<span class="element"><prefixDef <span class="attribute">ident</span>="<span class="attributevalue">simple-pi</span>"
<span class="attribute">matchPattern</span>="<span class="attributevalue">([A-Za-z0-9-]+)</span>"
<span class="attribute">replacementPattern</span>="<span class="attributevalue">http://raw.githubusercontent.com/TEIC/TEI-Simple/master/teisimple-pi.xml#$1</span>"></span>
<span class="element"><p></span> Private URIs using the <span class="element"><code></span>simple-pi<span class="element"></code></span> prefix are pointers to <span class="element"><gi></span>category<span class="element"></gi></span> elements in
the teisimple-pi.xml file. For example, <span class="element"><code></span>simple-pi:name-person<span class="element"></code></span> dereferences to
<span class="element"><code></span>http://raw.githubusercontent.com/TEIC/TEI-Simple/master/teisimple-pi.xml#name-person<span class="element"></code></span>
and would indicate that names of people were intended to be marked. If another value in the same
<span class="element"><gi></span>interpretation<span class="element"></gi></span> element pointed to <span class="element"><code></span>simple-pi:PI-NotMarked<span class="element"></code></span>, this would indicate
that they have not been marked.<span class="element"></p></span>
<span class="element"></prefixDef></span>
<span class="element"></listPrefixDef></span></div></div><p>The list of category <span class="att">xml:id</span>s of TEI Simple Performance Indicators can be seen in the XML source for this file at: <a class="link_ref" href="https://github.com/TEIC/TEI-Simple/blob/master/teisimple-pi.xml">https://github.com/TEIC/TEI-Simple/blob/master/teisimple-pi.xml</a>.</p><!--TEI back--><div class="stdfooter autogenerated"><div class="footer"><!--standard links to project, institution etc--><a class="plain" href="/">Home</a> </div><address>James Cummings and Sebastian Rahtz.
Date: 2015-10-18<br /><!--
Generated from index.xml using XSLT stylesheets version 7.40.1
based on http://www.tei-c.org/Stylesheets/
on 2015-10-18T00:04:31Z.
SAXON HE 9.5.1.5.
--></address></div></body></html>