Process Units
- Process units documentation
- FlatTokenizer
- EnchantSpellingAlternatives
- RegexMatcher
- SimpleWord
- CoreferencesSolving
- WordSenseDisambiguation
- HyphenWordAlternatives
- GeoEntitiesTagger
- ApplyRecognizer
- DefaultProperties
- SimpleDefaultProperties
- ViterbiPosTagger
- SvmToolPosTagger
- DynamicSvmToolPosTagger
- SentenceBoundariesFinder
- SyntacticAnalyzerChains
- SyntacticAnalyzerNoChains
- SyntacticAnalyzerDisamb
- SyntacticAnalyzerDeps
- SyntacticAnalyzerSimplify
- SyntacticAnalyzerDepsHetero
- StatusLogger
- SpecificEntitiesXmlLogger
- FullTokenXmlLogger
- SentenceBoundariesXmlLogger
- WordSenseXmlLogger
- DisambiguatedGraphXmlLogger
- DebugSyntacticAnalysisLogger
- DotGraphWriter
- CorefSolvingLogger
- DotDependencyGraphWriter
- AnnotDotGraphWriter
- LinearTextRepresentationLogger
- SyntacticAnalysisXmlLogger
- Dumpers
The core of LIMA is the execution of process units in a pipeline. This page is the technical documentation of each process unit available in the standard LIMA distribution. Thanks to its plugin mechanism, LIMA can be extended with new features, which have their own documentation.
For each process unit, we describe:
- class: the identifier used to instantiate the corresponding C++ class;
- role: the description of this process unit role;
- inputs: the state of the LIMA data structures necessary to run this process unit and the parameters available to modify the behavior of this process unit;
- outputs: the kind of data written to files or standard output;
- preconditions: the state of the data structures that must be reached before running this process unit;
- effects: the changes made to the LIMA data structures by the execution of this process unit.
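The pipeline model and the preconditions/effects vocabulary used throughout this page can be pictured with a minimal Python sketch. It is not LIMA code: the `ProcessUnit` class, the `run_pipeline` function and the plain dictionary standing in for the AnalysisContent are hypothetical names chosen for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ProcessUnit:
    """Hypothetical stand-in for a LIMA process unit."""
    name: str
    preconditions: List[str]                    # analysis data that must already exist
    run: Callable[[Dict[str, object]], None]    # effect: mutates the analysis content


def run_pipeline(units: List[ProcessUnit], content: Dict[str, object]) -> Dict[str, object]:
    """Run each unit in order, checking its preconditions first."""
    for unit in units:
        missing = [key for key in unit.preconditions if key not in content]
        if missing:
            raise RuntimeError(f"{unit.name}: missing analysis data {missing}")
        unit.run(content)
    return content


def tokenize(content):
    # Effect: adds "AnalysisGraph" (here simply a list of tokens).
    content["AnalysisGraph"] = content["Text"].split()


def count_tokens(content):
    # Effect: adds "TokenCount", computed from the graph built upstream.
    content["TokenCount"] = len(content["AnalysisGraph"])


PIPELINE = [
    ProcessUnit("tokenizer", ["Text"], tokenize),
    ProcessUnit("counter", ["AnalysisGraph"], count_tokens),
]

result = run_pipeline(PIPELINE, {"Text": "LIMA runs process units"})
print(result["TokenCount"])
```

Running the second unit alone on a content that only holds `"Text"` raises an error, which is the situation the preconditions entries are meant to document.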
Class: FlatTokenizer
Role: The role of this process unit is to split the input text into tokens. To do so, it uses an automaton that allows much richer behavior than a simple split on white space. It is usually the first element of the pipeline.
Inputs: an AnalysisContent containing the initial text.
Parameters | |
---|---|
automatonFile | the path to the tokenization automaton file to use, relative to the main resources folder |
charChart | The name of a group in the Resources module. This defines a resource of class FlatTokenizerCharChart with a parameter named charFile giving the path to the chars chart file to use, relative to the main resources folder |
Outputs: an AnalysisContent
Preconditions: the AnalysisContent must contain an AnalysisData of type LimaStringText named "Text"
Effects: the AnalysisContent will contain an AnalysisData of type AnalysisGraph named "AnalysisGraph" which is a linear graph (a string) containing one vertex for each detected token.
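The shape of the resulting linear graph can be illustrated with a short Python sketch. This is an assumption-laden stand-in: LIMA's tokenizer is driven by an automaton and a chars chart, whereas the sketch uses a simple regular expression, and `build_linear_graph` is a hypothetical name.

```python
import re

# Stand-in for the tokenization automaton: a run of word characters,
# or a single punctuation mark.
TOKEN_PATTERN = re.compile(r"\w+|[^\w\s]")


def build_linear_graph(text):
    """Return (vertices, edges): one vertex per token, chained in text order."""
    vertices = [(match.group(), match.start()) for match in TOKEN_PATTERN.finditer(text)]
    edges = [(index, index + 1) for index in range(len(vertices) - 1)]
    return vertices, edges


vertices, edges = build_linear_graph("Hello, world!")
print(vertices)  # each vertex keeps the token and its offset in the text
print(edges)     # each vertex is linked only to the next one
```

Because every vertex has exactly one successor, the graph is a simple chain; later process units can add alternative paths to it.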
Class: EnchantSpellingAlternatives
Role: Uses the Enchant spell checker to find corrections for tokens not found in the dictionary.
Inputs: the AnalysisGraph.
Parameters | |
---|---|
dictionary | the LIMA dictionary resource (usually mainDictionary) in which the suggestions made by Enchant are looked up |
Outputs: the same AnalysisGraph enriched with spelling corrections
Preconditions: the AnalysisGraph must already exist
Effects: in the AnalysisGraph, tokens that had no linguistic information are enriched with spelling alternatives.
Notes:
- This process unit is available only if the Enchant spell checker has been found at compile time.
Class: RegexMatcher
Role: The role of this process unit is to .
Inputs:
Parameters | |
---|---|
Outputs:
Preconditions:
Effects:
<group name="regexmatcher" class="RegexMatcher">
<map name="regexes">
<entry key="[\w\-_]+(\.[\w\-_]+)*\@[\w\-_](\.[\w\-_]+)+" value="t_url"/>
<entry key="((mailto|http|ftp|https):\/\/)?[\w\-_]+(\.[\w\-_]+)+([\w\-\.,@?^=%&amp;:/~\+#]*[\w\-\@?^=%&amp;/~\+#])?" value="t_url"/>
</map>
</group>
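Since this page does not yet describe the behavior of RegexMatcher, the following Python sketch only shows one plausible reading of the configuration above: each key of the `regexes` map is treated as a regular expression and each value as the status given to a token that matches it entirely. That interpretation, the use of a full match, and the helper names `load_regexes` and `status_for` are assumptions, not documented LIMA behavior. The `&` characters are written `&amp;` so that the XML is well-formed.

```python
import re
import xml.etree.ElementTree as ET

CONFIG = r"""
<group name="regexmatcher" class="RegexMatcher">
  <map name="regexes">
    <entry key="((mailto|http|ftp|https):\/\/)?[\w\-_]+(\.[\w\-_]+)+([\w\-\.,@?^=%&amp;:/~\+#]*[\w\-\@?^=%&amp;/~\+#])?" value="t_url"/>
  </map>
</group>
"""


def load_regexes(xml_text):
    """Read the 'regexes' map as a list of (compiled pattern, status) pairs."""
    group = ET.fromstring(xml_text.strip())
    entries = group.findall("./map[@name='regexes']/entry")
    return [(re.compile(entry.get("key")), entry.get("value")) for entry in entries]


def status_for(token, regexes):
    """Return the status of the first pattern matching the whole token, else None."""
    for pattern, status in regexes:
        if pattern.fullmatch(token):
            return status
    return None


REGEXES = load_regexes(CONFIG)
print(status_for("http://example.com", REGEXES))
print(status_for("hello", REGEXES))
```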
Class: SimpleWord
Role: The role of this process unit is to .
Inputs:
Parameters | |
---|---|
Outputs:
Preconditions:
Effects:
<group name="simpleWord" class="SimpleWord">
<param key="dictionary" value="mainDictionary"/>
<param key="confidentMode" value="true"/>
<param key="charChart" value="flatcharchart"/>
<param key="parseConcatenated" value="false"/>
</group>
Class: CoreferencesSolving
Role: The role of this process unit is to .
Inputs:
Parameters | |
---|---|
Outputs:
Preconditions:
Effects:
<group name="coreferencesSolving" class="CoreferencesSolving">
<param key="scope" value="3" />
<param key="threshold" value="60" />
<param key="Resolve Definites" value="0" />
<param key="Resolve non third person pronouns" value="0" />
<map name="MacroCategories">
<entry key="PronMacroCategory" value="PRON"/>
<entry key="VerbMacroCategory" value="V" />
<entry key="PrepMacroCategory" value="PREP" />
<entry key="NomCommunMacroCategory" value="NC" />
<entry key="NomPropreMacroCategory" value="NP" />
</map>
<list name="LexicalAnaphora">
<item value="CLR"/>
</list>
<list name="UndefinitePronouns">
<!--item value="PRON_INDEFINI"/>
<item value="PRON_INDEFINI_VAL_NEG"/-->
</list>
<list name="PossessivePronouns">
<!--item value="PRON_POSSESSIF_SUJET" />
<item value="PRON_POSSESSIF_COD" />
<item value="PRON_POSSESSIF_COI" /-->
</list>
<list name="PrepRelation">
<item value="PREPSUB"/>
<item value="PrepDetInt"/>
<item value="PrepInf"/>
<item value="PrepPronRelCa"/>
<item value="PrepPron"/>
<item value="PrepPronRel"/>
<item value="PrepPronCliv"/>
<item value="PrepAdv"/>
</list>
<list name="PleonasticRelation">
<item value="Pleon"/>
</list>
<list name="DefiniteRelation">
<item value="DETSUB"/>
</list>
<list name="SubjectRelation">
<item value="SUJ_V" />
<item value="SUJ_V_REL" />
<item value="PronSujVerbe" />
<item value="SujInv" />
</list>
<list name="AttributeRelation">
<item value="ATB_S"/>
</list>
<list name="CODRelation">
<item value="COD_V" />
<item value="CodPrev" />
<item value="PronReflVerbe" />
</list>
<list name="COIRelation">
<item value="CPL_V" />
<item value="CoiPrev" />
</list>
<list name="AdjunctRelation">
<item value="CPLV_V" />
<item value="CC_TEMPS" />
<item value="CC_LIEU" />
<item value="CC_BUT" />
<item value="CC_MOYEN" />
<item value="CC_MANIERE" />
<item value="COMPADJ" />
<item value="COMPADV" />
</list>
<list name="AgentRelation">
<item value="COMPADJ" />
</list>
<list name="NPDeterminerRelation">
<item value="COMPDUNOM" />
<item value="COMPDUNOM2" />
<item value="SUBSUBJUX" />
<item value="COMP_N-N" />
<item value="COMPDUNOM_INC" />
</list>
<!-- Lappin & Leass salience factors -->
<map name="SalienceFactors">
<entry key="SentenceRecency" value="90"/>
<entry key="SubjEmph" value="90"/>
<entry key="ExistEmph" value="70"/>
<entry key="CodEmph" value="50"/>
<entry key="CoiCoblEmph" value="40"/>
<entry key="HeadEmph" value="80"/>
<entry key="NonAdvEmph" value="50"/>
<entry key="IsInSubordinate" value="-70"/>
<!-- local factors -->
<entry key="Cataphora" value="-120"/>
<entry key="SameSlot" value="90"/>
<entry key="Itself" value="-140"/>
</map>
<map name="SlotValues">
<entry key="SubjectRelation" value="4"/>
<entry key="AgentRelation" value="3"/>
<entry key="CODRelation" value="2"/>
<entry key="COIRelation" value="1"/>
<entry key="AdjunctRelation" value="1"/>
</map>
</group>
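The `SalienceFactors` map above is commented as holding Lappin & Leass salience factors. As a rough illustration of how such weights are typically combined, the Python sketch below sums the weights of the factors that apply to each antecedent candidate and discards candidates scoring under the configured `threshold`. How LIMA actually uses the threshold, and the candidates in the example, are assumptions; the per-sentence decay of the original algorithm is left out.

```python
# Weights copied from the SalienceFactors map of the configuration above.
SALIENCE_FACTORS = {
    "SentenceRecency": 90, "SubjEmph": 90, "ExistEmph": 70, "CodEmph": 50,
    "CoiCoblEmph": 40, "HeadEmph": 80, "NonAdvEmph": 50, "IsInSubordinate": -70,
    "Cataphora": -120, "SameSlot": 90, "Itself": -140,
}
THRESHOLD = 60  # the "threshold" parameter; its exact role in LIMA is assumed here


def salience(active_factors):
    """Sum the weights of the factors that hold for one candidate."""
    return sum(SALIENCE_FACTORS[name] for name in active_factors)


def best_antecedent(candidates):
    """Pick the highest-scoring candidate that reaches the threshold, if any."""
    scores = {name: salience(factors) for name, factors in candidates.items()}
    kept = {name: score for name, score in scores.items() if score >= THRESHOLD}
    return max(kept, key=kept.get) if kept else None


# Hypothetical candidates for a pronoun, with the factors that apply to each.
CANDIDATES = {
    "Marie": ["SentenceRecency", "SubjEmph", "HeadEmph"],
    "livre": ["SentenceRecency", "CodEmph", "Cataphora"],
}
print(best_antecedent(CANDIDATES))
```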
Class: WordSenseDisambiguation
Role: The role of this process unit is to .
Inputs:
Parameters | |
---|---|
Outputs:
Preconditions:
Effects:
<group name="wordSenseDisambiguation" class="WordSenseDisambiguation" >
<!--param key="mode" value="b_Romanseval_most_frequent"/>
<param key="sensesPath" value="/home/cm218888/opendata/romanseval_data/SenseInventory" /-->
<param key="mode" value="b_Jaws_most_frequent"/>
<param key="sensesPath" value="/home/cm218888/otherdata/jaws-1.0/SenseInventory" />
<!--param key="mode" value="s_Wsi_mrd"/>
<param key="sensesPath" value="/home/cm218888/otherdata/wsi/clustersBin" />
<param key="mapping" value="m_Jaws_senses" />
<param key="mappingFile" value="mapping.txt" /-->
<param key="dictionaryFile" value="/home/cm218888/otherdata/words.ids" />
<param key="bestNNDir" value="knnall" />
<list name="NounContextList">
<item value="COD_V"/>
<!--item value="SUJ_V"/>
<item value="COMPDUNOM"/>
<item value="COMPDUNOM.reverse"/>
<item value="SUBADJPOST.reverse"/>
<item value="ADJPRENSUB.reverse"/>
<item value="window5"/>
<item value="window20"/-->
</list>
<map name="knnsearchConfig">
<entry key="hashedDir" value="/home/cm218888/otherdata/hasheddb"/>
<entry key="totalPermutations" value="10" />
<entry key="beam" value="20" />
<entry key="k" value="50" />
</map>
</group>
Class: HyphenWordAlternatives
Role: The role of this process unit is to .
Inputs:
Parameters | |
---|---|
Outputs:
Preconditions:
Effects:
<group name="hyphenWordAlternatives" class="HyphenWordAlternatives">
<param key="dictionary" value="mainDictionary"/>
<param key="charChart" value="flatcharchart"/>
<param key="tokenizer" value="flattokenizer"/>
</group>
Class: GeoEntitiesTagger
Role: The role of this process unit is to .
Inputs:
Parameters | |
---|---|
Outputs:
Preconditions:
Effects:
<group name="geoEntities" class="GeoEntitiesTagger">
<param key="charChart" value="flatcharchart"/>
<param key="dbms" value="mysql"/>
<param key="dbConnection" value="dbname=GAZETIKI_DB user=gazetiki password=gazpwd"/>
<param key="maxEntityLength" value="10" />
<param key="graph" value="PosGraph"/>
<param key="fieldClass" value="CLASS_3"/>
<map name="Trigger">
<!--entry key="t_capital_1st" value="Status" unlessSatusBefore="t_sentence_brk" unlessMicroBefore="PONCTU_PARAGRAPHE" unlessFirstToken="YES"/-->
<entry key="t_capital_1st" value="Status" unlessSatusBefore="t_sentence_brk" unlessMicroBefore="" unlessFirstToken="YES"/>
<entry key="NP" value="Micro" unlessSatusBefore="" unlessMicroBefore="" unlessFirstToken="NO"/>
</map>
<map name="EndWord">
<!--entry key="PONCTU_PARAGRAPHE" value="Micro" unlessSatusBefore="" unlessMicroBefore="" unlessFirstToken="NO"/-->
<entry key="T_COMMA_NUMBER" value="Status" unlessSatusBefore="" unlessMicroBefore="" unlessFirstToken="NO"/>
</map>
</group>
Class: ApplyRecognizer
Role: The role of this process unit is to apply compiled recognition rules. The specification of the rules source format is described elsewhere.
Process units of this kind, together with their rules, are used extensively in LIMA, for example for idiomatic expressions and named entities, but also for parsing.
Inputs:
Parameters | |
---|---|
automaton | |
automatonList | |
useSentenceBounds | |
applyOnGraph | |
updateGraph | |
resolveOverlappingEntities | |
overlappingEntitiesStrategy | |
storeInData | |
testAllVertices | if true, test all vertices, otherwise, skip recognized expressions (default is false) |
stopAtFirstSuccess | if true, stop testing rules on the current node after one rule succeeded (default is true) |
onlyOneSuccessPerType | if true, stop testing rules with same type as a previously successful rule (only used if stopAtFirstSuccess is false) (default is false) |
returnAtFirstSuccess | if true, abort the search as soon as a rule is successful (if true, stopAtFirstSuccess will be set to true) (default is false) |
applySameRuleWhileSuccess | if true, when a rule succeeds, retry to apply it on the same vertex until the rule does not apply (use with care: setting this argument to true may cause loops if rules are not well written). Has no effect if stopAtFirstSuccess is true. (default is false) |
Outputs:
Preconditions:
Effects:
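The interplay of the rule-application flags listed in the parameter table (testAllVertices, stopAtFirstSuccess, onlyOneSuccessPerType, returnAtFirstSuccess) can be pictured with the following Python sketch. It works on a flat token list rather than on a LIMA graph, and the `Rule` class, `apply_rules` and `phrase` helpers are hypothetical; applySameRuleWhileSuccess is not modeled.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Rule:
    """Hypothetical rule: `match` returns how many tokens it covers at `pos` (0 = no match)."""
    name: str
    type: str
    match: Callable[[List[str], int], int]


def apply_rules(tokens, rules, test_all_vertices=False, stop_at_first_success=True,
                only_one_success_per_type=False, return_at_first_success=False):
    if return_at_first_success:
        stop_at_first_success = True   # as stated in the parameter table
    found = []
    pos = 0
    while pos < len(tokens):
        types_seen = set()
        longest = 0
        for rule in rules:
            if only_one_success_per_type and rule.type in types_seen:
                continue               # a rule of this type already succeeded here
            length = rule.match(tokens, pos)
            if length == 0:
                continue
            found.append((rule.name, pos, length))
            types_seen.add(rule.type)
            longest = max(longest, length)
            if return_at_first_success:
                return found           # abort the whole search
            if stop_at_first_success:
                break                  # stop testing rules on this position
        # Skip the recognized expression unless every vertex must be tested.
        pos += 1 if test_all_vertices or longest == 0 else longest
    return found


def phrase(*words):
    """Build a matcher for a fixed word sequence."""
    def match(tokens, pos):
        return len(words) if tokens[pos:pos + len(words)] == list(words) else 0
    return match


RULES = [
    Rule("new_york", "LOCATION", phrase("New", "York")),
    Rule("new_loc", "LOCATION", phrase("New")),
    Rule("new_org", "ORGANIZATION", phrase("New")),
    Rule("york", "LOCATION", phrase("York")),
]
TOKENS = ["New", "York", "City"]

print(apply_rules(TOKENS, RULES))
print(apply_rules(TOKENS, RULES, stop_at_first_success=False, only_one_success_per_type=True))
```

With the defaults, only the first successful rule is kept and the tokens it covers are skipped; relaxing stopAtFirstSuccess lets several rules succeed on the same position.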
Class: DefaultProperties
Role: The role of this process unit is to .
Inputs:
Parameters | |
---|---|
Outputs:
Preconditions:
Effects:
<group name="defaultProperties" class="DefaultProperties">
<param key="dictionary" value="mainDictionary"/>
<param key="charChart" value="flatcharchart"/>
<param key="defaultPropertyFile" value="LinguisticProcessings/fre/default-fre.dat"/>
<list name="skipUnmarkStatus">
<item value="t_dot_number"/>
<item value="t_capital_1st"/>
</list>
</group>
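All configuration groups on this page share the same layout: `param` elements with a key and a value, named `list` elements made of `item` values, and named `map` elements made of `entry` pairs. As an aid to reading them, here is a small Python sketch that loads such a group into dictionaries with the standard `xml.etree.ElementTree` module. The `read_group` function is a hypothetical helper and has nothing to do with LIMA's own C++ configuration reader.

```python
import xml.etree.ElementTree as ET

CONFIG = """
<group name="defaultProperties" class="DefaultProperties">
  <param key="dictionary" value="mainDictionary"/>
  <param key="charChart" value="flatcharchart"/>
  <param key="defaultPropertyFile" value="LinguisticProcessings/fre/default-fre.dat"/>
  <list name="skipUnmarkStatus">
    <item value="t_dot_number"/>
    <item value="t_capital_1st"/>
  </list>
</group>
"""


def read_group(xml_text):
    """Turn one <group> element into plain Python dictionaries and lists."""
    group = ET.fromstring(xml_text.strip())
    return {
        "name": group.get("name"),
        "class": group.get("class"),
        "params": {p.get("key"): p.get("value") for p in group.findall("param")},
        "lists": {
            lst.get("name"): [item.get("value") for item in lst.findall("item")]
            for lst in group.findall("list")
        },
        "maps": {
            mp.get("name"): {e.get("key"): e.get("value") for e in mp.findall("entry")}
            for mp in group.findall("map")
        },
    }


group = read_group(CONFIG)
print(group["class"], group["params"]["dictionary"], group["lists"]["skipUnmarkStatus"])
```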
Class: SimpleDefaultProperties
Role: The role of this process unit is to .
Inputs:
Parameters | |
---|---|
Outputs:
Preconditions:
Effects:
<group name="simpleDefaultProperties" class="SimpleDefaultProperties">
<list name="defaultCategories">
<item value="NP NP"/>
</list>
</group>
Class: ViterbiPosTagger
Role: The role of this process unit is to .
Inputs:
Parameters | |
---|---|
Outputs:
Preconditions:
Effects:
<group name="viterbiPostagger-freq" class="ViterbiPosTagger">
<param key="trigramFile" value="Disambiguation/trigramMatrix-fre.dat"/>
<param key="bigramFile" value="Disambiguation/bigramMatrix-fre.dat"/>
<param key="costFunction" value="FrequencyCost"/>
<param key="defaultCategory" value="PONCTU_FORTE"/>
<list name="stopCategories">
<item value="PONCTU_FORTE" />
</list>
</group>
<group name="viterbiPostagger-int" class="ViterbiPosTagger">
<param key="trigramFile" value="Disambiguation/trigramMatrix-fre.dat"/>
<param key="bigramFile" value="Disambiguation/bigramMatrix-fre.dat"/>
<param key="costFunction" value="IntegerCost"/>
<param key="defaultCategory" value="PONCTU_FORTE"/>
<list name="stopCategories">
<item value="PONCTU_FORTE" />
</list>
</group>
<group name="viterbiPostagger-int-none" class="ViterbiPosTagger">
<param key="trigramFile" value="Disambiguation/trigramMatrix-fre.dat"/>
<param key="bigramFile" value="Disambiguation/bigramMatrix-fre.dat"/>
<param key="costFunction" value="IntegerCost"/>
<param key="defaultCategory" value="NONE_1"/>
<list name="stopCategories">
<item value="PONCTU_FORTE" />
</list>
</group>
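The configurations above point to trigram and bigram matrices and to a cost function. As background, the Python sketch below shows the general Viterbi idea of choosing the cheapest tag sequence through a lattice of candidate tags. It is deliberately simplified: it uses bigram costs only, the tags and costs are invented for the example, and nothing here reflects the format of LIMA's matrix files or its FrequencyCost and IntegerCost functions.

```python
def viterbi(candidates, bigram_cost, default_cost=10):
    """Return (total cost, tag path) of the cheapest path through the lattice.

    candidates: one list of possible tags per token.
    bigram_cost: {(previous tag, tag): cost}; unknown pairs cost `default_cost`.
    """
    # Cheapest known path ending in each tag of the first token.
    best = {tag: (0, [tag]) for tag in candidates[0]}
    for tags in candidates[1:]:
        new_best = {}
        for tag in tags:
            # Extend every previous path with `tag` and keep the cheapest one.
            cost, path = min(
                ((prev_cost + bigram_cost.get((prev, tag), default_cost), prev_path)
                 for prev, (prev_cost, prev_path) in best.items()),
                key=lambda pair: pair[0],
            )
            new_best[tag] = (cost, path + [tag])
        best = new_best
    return min(best.values(), key=lambda pair: pair[0])


# Hypothetical costs: lower means a more plausible tag transition.
COSTS = {
    ("DET", "NC"): 1, ("DET", "V"): 8,
    ("NC", "V"): 1, ("NC", "NC"): 5,
    ("V", "NC"): 3, ("V", "V"): 9,
}
LATTICE = [["DET"], ["NC", "V"], ["V", "NC"]]

cost, path = viterbi(LATTICE, COSTS)
print(cost, path)
```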
Class: SvmToolPosTagger
Role: The role of this process unit is to .
Inputs:
Parameters | |
---|---|
Outputs:
Preconditions:
Effects:
<group name="SvmToolPosTagger" class="SvmToolPosTagger">
<param key="model" value="Disambiguation/SVMToolModel-fre/lima"/>
<param key="defaultCategory" value="PONCTU_FORTE"/>
<list name="stopCategories">
<item value="PONCTU_FORTE" />
</list>
</group>
Class: DynamicSvmToolPosTagger
Role: The role of this process unit is to .
Inputs:
Parameters | |
---|---|
Outputs:
Preconditions:
Effects:
<group name="DynamicSvmToolPosTagger" class="DynamicSvmToolPosTagger">
<param key="model" value="Disambiguation/SVMToolModel-fre/lima"/>
<param key="defaultCategory" value="PONCTU_FORTE"/>
<list name="stopCategories">
<item value="PONCTU_FORTE" />
</list>
</group>
Class: SentenceBoundariesFinder
Role: The role of this process unit is to .
Inputs:
Parameters | |
---|---|
Outputs:
Preconditions:
Effects:
<group name="sentenceBoundariesFinder" class="SentenceBoundariesFinder">
<param key="graph" value="PosGraph"/>
<list name="micros">
<item value="PONCTU_FORTE" />
</list>
</group>
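Judging only from its name and from the `micros` list in the configuration above, this unit presumably closes a sentence at each token whose micro category is listed. The Python sketch below illustrates that reading on a flat list of (form, micro category) pairs; the function name, the data layout and the inclusive bounds are assumptions made for the example.

```python
def find_sentence_bounds(tokens, boundary_micros):
    """Return (begin, end) token indexes, inclusive, for each sentence.

    tokens: list of (form, micro category) pairs.
    boundary_micros: micro categories that close a sentence.
    """
    bounds = []
    begin = 0
    for index, (_, micro) in enumerate(tokens):
        if micro in boundary_micros:
            bounds.append((begin, index))
            begin = index + 1
    if begin < len(tokens):
        # Trailing tokens without a closing punctuation form a last sentence.
        bounds.append((begin, len(tokens) - 1))
    return bounds


TOKENS = [
    ("Il", "PRON"), ("dort", "V"), (".", "PONCTU_FORTE"),
    ("Elle", "PRON"), ("lit", "V"),
]
print(find_sentence_bounds(TOKENS, {"PONCTU_FORTE"}))
```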
Class: SyntacticAnalyzerChains
Role: The role of this process unit is to .
Inputs:
Parameters | |
---|---|
Outputs:
Preconditions:
Effects:
<group name="syntacticAnalyzerChains" class="SyntacticAnalyzerChains">
<param key="chainMatrix" value="chainMatrix"/>
<param key="maxChainsNbByVertex" value="30"/>
<param key="maxChainLength" value="12"/>
</group>
Class: SyntacticAnalyzerNoChains
Role: The role of this process unit is to .
Inputs:
Parameters | |
---|---|
Outputs:
Preconditions:
Effects:
<!-- syntacticAnalyzerNoChains replaces syntacticAnalyzerChains. It is an
experimental module used to test whether LIMA analysis works without nominal
and verbal chains. It also allows building compounds using verbs and
heterosyntagmatic dependencies. For that, one has to add adequate relations
to CompoundRelations in mm-common. -->
<group name="syntacticAnalyzerNoChains" class="SyntacticAnalyzerNoChains">
<param key="chainMatrix" value="chainMatrix"/>
<param key="disambiguated" value="true"/>
<param key="maxChainsNbByVertex" value="30"/>
<param key="maxChainLength" value="12"/>
</group>
Class: SyntacticAnalyzerDisamb
Role: The role of this process unit is to .
Inputs:
Parameters | |
---|---|
Outputs:
Preconditions:
Effects:
<group name="syntacticAnalyzerDisamb" class="SyntacticAnalyzerDisamb">
<param key="depGraphMaxBranchingFactor" value="100"/>
</group>
Class: SyntacticAnalyzerDeps
Role: The role of this process unit is to .
Inputs:
Parameters | |
---|---|
Outputs:
Preconditions:
Effects:
<group name="syntacticAnalyzerDeps" class="SyntacticAnalyzerDeps">
<list name="actions">
<item value="pass0HomoSyntagmaticRelationRules"/>
<item value="pass1HomoSyntagmaticRelationRules"/>
<item value="pass2HomoSyntagmaticRelationRules"/>
<item value="pleonasticPronouns"/>
<item value="compoundTensesRules"/>
</list>
<param key="applySameRuleWhileSuccess" value="true"/>
</group>
Class: SyntacticAnalyzerSimplify
Role: The role of this process unit is to .
Inputs:
Parameters | |
---|---|
Outputs:
Preconditions:
Effects:
<group name="syntacticAnalyzerSimplifyFirst" class="SyntacticAnalyzerSimplify">
<param key="simplifyAutomaton" value="simplifyAutomatonFirst"/>
</group>
<group name="syntacticAnalyzerSimplify" class="SyntacticAnalyzerSimplify">
<param key="simplifyAutomaton" value="simplifyAutomaton"/>
</group>
<group name="syntacticAnalyzerSimplifyCoord" class="SyntacticAnalyzerSimplify">
<param key="simplifyAutomaton" value="simplifyAutomatonCoord"/>
</group>
<group name="syntacticAnalyzerSimplifyLast" class="SyntacticAnalyzerSimplify">
<param key="simplifyAutomaton" value="simplifyAutomatonLast"/>
</group>
Class: SyntacticAnalyzerDepsHetero
Role: The role of this process unit is to .
Inputs:
Parameters | |
---|---|
Outputs:
Preconditions:
Effects:
<group name="syntacticAnalyzerDepsHetero" class="SyntacticAnalyzerDepsHetero">
<param key="rules" value="heteroSyntagmaticRelationRules"/>
<param key="selectionalPreferences" value="selectionalPreferences"/>
<param key="unfold" value="true"/>
<param key="linkSubSentences" value="true"/>
<param key="applySameRuleWhileSuccess" value="true"/>
<map name="subSentencesRules">
<entry key="SubSent" value="heteroSyntagmaticRelationRules"/>
<entry key="SubordRel" value="heteroSyntagmaticRelationRules"/>
<entry key="Parent" value="heteroSyntagmaticRelationRules"/>
<entry key="Quotes" value="heteroSyntagmaticRelationRules"/>
<entry key="VirguleSeule" value="heteroSyntagmaticRelationRules"/>
<entry key="Appos" value="heteroSyntagmaticRelationRules"/>
<entry key="AdvSeul" value="heteroSyntagmaticRelationRules"/>
<entry key="AdvInit" value="heteroSyntagmaticRelationRules"/>
<entry key="CompAdv" value="heteroSyntagmaticRelationRules"/>
<entry key="Adverbe" value="heteroSyntagmaticRelationRules"/>
<entry key="ConjInfSecond" value="heteroSyntagmaticRelationRules"/>
<entry key="CCInit" value="heteroSyntagmaticRelationRules"/>
<entry key="Infinitive" value="heteroSyntagmaticRelationRules"/>
<entry key="SUBSUBJUX" value="heteroSyntagmaticRelationRules"/>
<entry key="CompDuNom1" value="heteroSyntagmaticRelationRules"/>
<entry key="CompDuNom2" value="heteroSyntagmaticRelationRules"/>
<entry key="CompAdj1" value="heteroSyntagmaticRelationRules"/>
<entry key="CompAdj2" value="heteroSyntagmaticRelationRules"/>
<entry key="SubordParticipiale" value="heteroSyntagmaticRelationRules"/>
<entry key="ElemListe" value="heteroSyntagmaticRelationRules"/>
<entry key="ConjSecond" value="heteroSyntagmaticRelationRules"/>
<entry key="InciseNom" value="heteroSyntagmaticRelationRules"/>
<entry key="CompCirc" value="heteroSyntagmaticRelationRules"/>
<entry key="SubordInit" value="heteroSyntagmaticRelationRules"/>
<entry key="ConjNominale" value="heteroSyntagmaticRelationRules"/>
</map>
</group>
<group name="syntacticAnalyzerDummy" class="SyntacticAnalyzerDeps">
<list name="actions">
<item value="l2rDummyRules"/>
</list>
</group>
Class: StatusLogger
Role: The role of this process unit is to .
Inputs:
Parameters | |
---|---|
Outputs:
Preconditions:
Effects:
<group name="beginStatusLogger" class="StatusLogger">
<param key="outputFile" value="beginStatus-fre.log"/>
<list name="toLog">
<item value="VmSize"/>
<item value="VmData"/>
</list>
</group>
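The `toLog` items in the configuration above, `VmSize` and `VmData`, are field names found in the Linux `/proc/<pid>/status` file. Assuming that this is where the values come from (this page does not say so), the following Python sketch shows how the requested fields could be extracted from such a status text. The sample text and the `read_status_fields` helper are made up for the example.

```python
def read_status_fields(status_text, to_log):
    """Extract the requested 'Key: value' fields from a /proc status text."""
    values = {}
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        if key in to_log:
            values[key] = value.strip()
    return values


# Shortened, invented sample in the layout of /proc/<pid>/status.
SAMPLE = "Name:\tlima\nVmSize:\t  123456 kB\nVmRSS:\t    2048 kB\nVmData:\t   65432 kB\n"

print(read_status_fields(SAMPLE, ["VmSize", "VmData"]))
```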
Class: SpecificEntitiesXmlLogger
Role: The role of this process unit is to .
Inputs:
Parameters | |
---|---|
Outputs:
Preconditions:
Effects:
<group name="specificEntitiesXmlLogger" class="SpecificEntitiesXmlLogger">
<param key="outputSuffix" value=".se.xml"/>
<param key="graph" value="AnalysisGraph"/>
</group>
<group name="specificEntitiesXmlLoggerForLimaserver" class="SpecificEntitiesXmlLogger">
<param key="outputSuffix" value=".se.xml"/>
<param key="graph" value="AnalysisGraph"/>
<param key="compactFormat" value="yes"/>
<param key="handler" value="se"/>
<param key="followGraph" value="true"/>
</group>
Class: FullTokenXmlLogger
Role: The role of this process unit is to .
Inputs:
Parameters | |
---|---|
Outputs:
Preconditions:
Effects:
<group name="fullTokenXmlLoggerTokenizer" class="FullTokenXmlLogger">
<param key="outputSuffix" value=".tokenizer.xml"/>
</group>
<group name="fullTokenXmlLoggerSimpleWord" class="FullTokenXmlLogger">
<param key="outputSuffix" value=".simpleword.xml"/>
</group>
<group name="fullTokenXmlLoggerHyphen" class="FullTokenXmlLogger">
<param key="outputSuffix" value=".hyphen.xml"/>
</group>
<group name="fullTokenXmlLoggerIdiomatic" class="FullTokenXmlLogger">
<param key="outputSuffix" value=".idiom.xml"/>
</group>
Class: SentenceBoundariesXmlLogger
Role: The role of this process unit is to .
Inputs:
Parameters | |
---|---|
Outputs:
Preconditions:
Effects:
<group name="sentenceBoundariesXmlLogger" class="SentenceBoundariesXmlLogger">
<param key="outputSuffix" value=".sentences.xml"/>
</group>
<group name="fullTokenXmlLoggerDefaultProperties" class="FullTokenXmlLogger">
<param key="outputSuffix" value=".default.xml"/>
</group>
Class: WordSenseXmlLogger
Role: The role of this process unit is to .
Inputs:
Parameters | |
---|---|
Outputs:
Preconditions:
Effects:
<group name="wordSenseXmlLogger" class="WordSenseXmlLogger">
<param key="outputSuffix" value=".senses.xml"/>
</group>
Class: DisambiguatedGraphXmlLogger
Role: The role of this process unit is to .
Inputs:
Parameters | |
---|---|
Outputs:
Preconditions:
Effects:
<group name="disambiguatedGraphXmlLogger" class="DisambiguatedGraphXmlLogger">
<param key="outputSuffix" value=".disambiguated.xml"/>
<param key="dictionaryCode" value="dictionaryCode"/>
</group>
Class: DebugSyntacticAnalysisLogger
Role: The role of this process unit is to .
Inputs:
Parameters | |
---|---|
Outputs:
Preconditions:
Effects:
<group name="debugSyntacticAnalysisLogger-chains" class="DebugSyntacticAnalysisLogger">
<param key="outputSuffix" value=".syntanal.chains.txt"/>
</group>
<group name="debugSyntacticAnalysisLogger-disamb" class="DebugSyntacticAnalysisLogger">
<param key="outputSuffix" value=".syntanal.disamb.txt"/>
</group>
<group name="debugSyntacticAnalysisLogger-deps" class="DebugSyntacticAnalysisLogger">
<param key="outputSuffix" value=".syntanal.deps.txt"/>
</group>
Class: DotGraphWriter
Role: The role of this process unit is to .
Inputs:
Parameters | |
---|---|
Outputs:
Preconditions:
Effects:
<group name="dotGraphWriter-beforepos" class="DotGraphWriter">
<param key="graph" value="AnalysisGraph"/>
<param key="outputSuffix" value=".bp.dot"/>
<param key="trigramMatrix" value="trigramMatrix"/>
<param key="bigramMatrix" value="bigramMatrix"/>
<list name="vertexDisplay">
<item value="text"/>
<item value="inflectedform"/>
<item value="symbolicmicrocategory"/>
<item value="numericmicrocategory"/>
<!--item value="genders"/>
<item value="numbers"/-->
</list>
</group>
<group name="dotGraphWriter" class="DotGraphWriter">
<param key="graph" value="PosGraph"/>
<param key="outputSuffix" value=".dot"/>
<param key="trigramMatrix" value="trigramMatrix"/>
<param key="bigramMatrix" value="bigramMatrix"/>
<list name="vertexDisplay">
<item value="text"/>
<item value="inflectedform"/>
<item value="symbolicmicrocategory"/>
<item value="numericmicrocategory"/>
<!--item value="genders"/>
<item value="numbers"/-->
</list>
</group>
Class: CorefSolvingLogger
Role: The role of this process unit is to .
Inputs:
Parameters | |
---|---|
Outputs:
Preconditions:
Effects:
<group name="corefLogger" class="CorefSolvingLogger">
<param key="outputSuffix" value=".wh"/>
</group>
<group name="dotGraphWriterAfterSA" class="DotGraphWriter">
<param key="outputSuffix" value=".afterSA.dot"/>
<param key="trigramMatrix" value="trigramMatrix"/>
<param key="bigramMatrix" value="bigramMatrix"/>
<list name="vertexDisplay">
<item value="lemme"/>
<item value="symbolicmicrocategory"/>
<item value="numericmicrocategory"/>
<!--item value="genders"/>
<item value="numbers"/-->
</list>
</group>
Class: DotDependencyGraphWriter
Role: The role of this process unit is to .
Inputs:
Parameters | |
---|---|
Outputs:
Preconditions:
Effects:
<group name="dotDepGraphWriter" class="DotDependencyGraphWriter">
<param key="outputMode" value="SentenceBySentence"/> <!-- Valid values: FullGraph,SentenceBySentence -->
<param key="writeOnlyDepEdges" value="false"/>
<param key="outputSuffix" value=".sa.dot"/>
<param key="trigramMatrix" value="trigramMatrix"/>
<param key="bigramMatrix" value="bigramMatrix"/>
<list name="vertexDisplay">
<item value="inflectedform"/>
<item value="symbolicmicrocategory"/>
<item value="numericmicrocategory"/>
<!--item value="genders"/>
<item value="numbers"/-->
</list>
<map name="graphDotOptions">
<entry key="rankdir" value="LR"/>
</map>
<map name="nodeDotOptions">
<entry key="shape" value="box"/>
</map>
</group>
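The `graphDotOptions` and `nodeDotOptions` maps of the configuration above carry Graphviz attributes (`rankdir`, `shape`). To show where such options end up in a DOT file, here is a small Python sketch that writes a dependency graph as DOT text. The `to_dot` function, the node identifiers and the example relation are hypothetical, and the exact output of the LIMA writer may differ.

```python
def to_dot(nodes, edges, graph_options, node_options):
    """Render a small dependency graph as Graphviz DOT text.

    nodes: list of (id, label); edges: list of (source id, target id, relation).
    """
    lines = ["digraph G {"]
    for key, value in graph_options.items():
        lines.append(f'  {key}="{value}";')
    if node_options:
        attributes = ", ".join(f'{key}="{value}"' for key, value in node_options.items())
        lines.append(f"  node [{attributes}];")
    for node_id, label in nodes:
        lines.append(f'  {node_id} [label="{label}"];')
    for source, target, relation in edges:
        lines.append(f'  {source} -> {target} [label="{relation}"];')
    lines.append("}")
    return "\n".join(lines)


dot = to_dot(
    nodes=[(1, "chat"), (2, "dort")],
    edges=[(1, 2, "SUJ_V")],
    graph_options={"rankdir": "LR"},   # lay the graph out left to right
    node_options={"shape": "box"},     # draw every vertex as a box
)
print(dot)
```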
Class: AnnotDotGraphWriter
Role: The role of this process unit is to .
Inputs:
Parameters | |
---|---|
Outputs:
Preconditions:
Effects:
<group name="annotDotGraphWriter" class="AnnotDotGraphWriter">
<param key="graph" value="PosGraph"/>
<param key="outputSuffix" value=".ag.dot"/>
</group>
Class: LinearTextRepresentationLogger
Role: The role of this process unit is to .
Inputs:
Parameters | |
---|---|
Outputs:
Preconditions:
Effects:
<group name="linearTextRepresentationLogger" class="LinearTextRepresentationLogger">
<param key="outputSuffix" value=".ltr"/>
</group>
Class: SyntacticAnalysisXmlLogger
Role: The role of this process unit is to .
Inputs:
Parameters | |
---|---|
Outputs:
Preconditions:
Effects:
<group name="syntacticAnalysisXmlLogger" class="SyntacticAnalysisXmlLogger">
<param key="outputSuffix" value=".sa.xml"/>
</group>
<group name="depTripletLogger" class="DepTripletLogger">
<param key="outputSuffix" value=".deptrip.txt"/>
<param key="stopList" value="stopList"/>
<param key="useStopList" value="no"/>
<param key="useEmptyMacro" value="no"/>
<param key="useEmptyMicro" value="no"/>
<map name="NEmacroCategories">
<entry key="DateTime.DATE" value="NC"/>
<entry key="Numex.NUMBER" value="NC"/>
<entry key="Numex.UNIT" value="NC"/>
<entry key="Numex.NUMEX" value="NC"/>
<entry key="Organization.ORGANIZATION" value="NP"/>
<entry key="Location.LOCATION" value="NP"/>
<entry key="Person.PERSON" value="NP"/>
<entry key="Product.PRODUCT" value="NP"/>
<entry key="Event.EVENT" value="NP"/>
</map>
<param key="properNounCategory" value="NP"/>
<param key="commonNounCategory" value="NC"/>
<param key="NEnormalization" value="useNENormalizedForm"/>
<list name="selectedDependency">
<item value="ADJPRENSUB"/>
<item value="APPOS"/>
<item value="ATB_O"/>
<item value="ATB_S"/>
<item value="COD_V"/>
<item value="COMPDUNOM"/>
<item value="COMPL"/>
<item value="CPL_V"/>
<item value="SUBADJPOST"/>
<item value="SUBSUBJUX"/>
<item value="SUJ_V"/>
</list>
</group>
Class: AbstractTextualAnalysisDumper
Role: This is an abstract class. You cannot use it directly. Instead, use the other dumper classes. It is documented here because all dumpers can use its parameters.
Inputs:
Parameters | |
---|---|
handler | the name of the handler process unit in the configuration file that will receive and handle the data written by the dumper |
temporaryFileMetadata | the name of the analysis metadata entry that contains the name of the file where to write. It supersedes the outputFile option and the suffix handling options |
outputFile | the name of the file where the dumper will write. It supersedes the suffix handling options |
stripInputSuffix | whether to remove the suffix of the input file before using it (when outputFile is not set) |
outputSuffix | the suffix to add to the name of the input file (when outputFile is not set) |
append | whether to append content to the output file or to overwrite its content if it already exists |
Outputs: No outputs. This is an abstract class. See the documentation of the other dumpers for details.
Preconditions: None
Effects: None
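The precedence between the file-related parameters in the table above (temporaryFileMetadata, then outputFile, then the suffix options) can be summarized with a short Python sketch. The `resolve_output_path` function is a hypothetical helper, and falling back to the next option when the metadata entry is absent is an assumption, not documented behavior.

```python
import os


def resolve_output_path(input_file, metadata, temporary_file_metadata=None,
                        output_file=None, strip_input_suffix=False, output_suffix=""):
    """Choose the output file name following the documented precedence."""
    # 1. The metadata entry named by temporaryFileMetadata wins over everything.
    if temporary_file_metadata and temporary_file_metadata in metadata:
        return metadata[temporary_file_metadata]
    # 2. outputFile wins over the suffix handling options.
    if output_file:
        return output_file
    # 3. Otherwise derive the name from the input file and the suffix options.
    base = os.path.splitext(input_file)[0] if strip_input_suffix else input_file
    return base + output_suffix


print(resolve_output_path("doc.txt", {}, output_suffix=".bow"))
print(resolve_output_path("doc.txt", {}, strip_input_suffix=True, output_suffix=".bow"))
print(resolve_output_path("doc.txt", {"tmp": "/tmp/out.xml"},
                          temporary_file_metadata="tmp", output_file="ignored.xml"))
```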
Class: AnnotationGraphXmlDumper
Role: The role of this process unit is to .
Inputs:
See AbstractTextualAnalysisDumper for parameters common to all dumpers.
Parameters | |
---|---|
Outputs:
Preconditions:
Effects:
<group name="agXmlDumper" class="AnnotationGraphXmlDumper">
<param key="handler" value="xmlSimpleStreamHandler"/>
</group>
<group name="normalizationBowDumper" class="BowDumper">
<param key="handler" value="bowTextWriter"/>
<param key="stopList" value="stopList"/>
<param key="useStopList" value="false"/>
<param key="useEmptyMacro" value="false"/>
<param key="useEmptyMicro" value="false"/>
<map name="NEmacroCategories">
<entry key="DateTime.DATE" value="NC"/>
<entry key="Numex.NUMBER" value="NC"/>
<entry key="Numex.UNIT" value="NC"/>
<entry key="Numex.NUMEX" value="NC"/>
<entry key="Organization.ORGANIZATION" value="NP"/>
<entry key="Location.LOCATION" value="NP"/>
<entry key="Person.PERSON" value="NP"/>
<entry key="Product.PRODUCT" value="NP"/>
<entry key="Event.EVENT" value="NP"/>
</map>
<param key="properNounCategory" value="NP"/>
<param key="commonNounCategory" value="NC"/>
<param key="NEnormalization" value="useNENormalizedForm"/>
</group>
Class: BowDumper
Role: The role of this process unit is to .
Inputs:
See AbstractTextualAnalysisDumper for parameters common to all dumpers.
Parameters | |
---|---|
Outputs:
Preconditions:
Effects:
<group name="bowDumper" class="BowDumper">
<param key="handler" value="bowTextWriter"/>
<param key="stopList" value="stopList"/>
<param key="useStopList" value="true"/>
<param key="useEmptyMacro" value="true"/>
<param key="useEmptyMicro" value="true"/>
<map name="NEmacroCategories">
<entry key="DateTime.DATE" value="NC"/>
<entry key="Numex.NUMBER" value="NC"/>
<entry key="Numex.UNIT" value="NC"/>
<entry key="Numex.NUMEX" value="NC"/>
<entry key="Organization.ORGANIZATION" value="NP"/>
<entry key="Location.LOCATION" value="NP"/>
<entry key="Person.PERSON" value="NP"/>
<entry key="Product.PRODUCT" value="NP"/>
<entry key="Event.EVENT" value="NP"/>
</map>
<param key="properNounCategory" value="NP"/>
<param key="commonNounCategory" value="NC"/>
<param key="NEnormalization" value="useNENormalizedForm"/>
</group>
<group name="bowTextHandler" class="BowDumper">
<param key="handler" value="bowTextHandler"/>
<param key="stopList" value="stopList"/>
<param key="useStopList" value="true"/>
<param key="useEmptyMacro" value="true"/>
<param key="useEmptyMicro" value="true"/>
<map name="NEmacroCategories">
<entry key="DateTime.DATE" value="NC"/>
<entry key="Numex.NUMBER" value="NC"/>
<entry key="Numex.UNIT" value="NC"/>
<entry key="Numex.NUMEX" value="NC"/>
<entry key="Organization.ORGANIZATION" value="NP"/>
<entry key="Location.LOCATION" value="NP"/>
<entry key="Person.PERSON" value="NP"/>
<entry key="Product.PRODUCT" value="NP"/>
<entry key="Event.EVENT" value="NP"/>
</map>
<param key="properNounCategory" value="NP"/>
<param key="commonNounCategory" value="NC"/>
<param key="NEnormalization" value="useNENormalizedForm"/>
</group>
<group name="textQueryHandler" class="BowDumper">
<param key="handler" value="bowTextHandler"/>
<!-- <param key="handler" value="bowTextWriter"/> -->
<param key="stopList" value="stopList"/>
<param key="useStopList" value="true"/>
<param key="useEmptyMacro" value="true"/>
<param key="useEmptyMicro" value="true"/>
<map name="NEmacroCategories">
<entry key="DateTime.DATE" value="NC"/>
<entry key="Numex.NUMBER" value="NC"/>
<entry key="Numex.UNIT" value="NC"/>
<entry key="Numex.NUMEX" value="NC"/>
<entry key="Organization.ORGANIZATION" value="NP"/>
<entry key="Location.LOCATION" value="NP"/>
<entry key="Person.PERSON" value="NP"/>
<entry key="Product.PRODUCT" value="NP"/>
<entry key="Event.EVENT" value="NP"/>
</map>
<param key="properNounCategory" value="NP"/>
<param key="commonNounCategory" value="NC"/>
<param key="NEnormalization" value="useNENormalizedForm"/>
</group>
<group name="bowDocumentDumper" class="BowDumper">
<param key="handler" value="bowDocumentHandler"/>
<param key="stopList" value="stopList"/>
<param key="useStopList" value="false"/>
<param key="useEmptyMacro" value="false"/>
<param key="useEmptyMicro" value="false"/>
<map name="NEmacroCategories">
<entry key="DateTime.DATE" value="NC"/>
<entry key="Numex.NUMBER" value="NC"/>
<entry key="Numex.UNIT" value="NC"/>
<entry key="Numex.NUMEX" value="NC"/>
<entry key="Organization.ORGANIZATION" value="NP"/>
<entry key="Location.LOCATION" value="NP"/>
<entry key="Person.PERSON" value="NP"/>
<entry key="Product.PRODUCT" value="NP"/>
<entry key="Event.EVENT" value="NP"/>
</map>
<param key="properNounCategory" value="NP"/>
<param key="commonNounCategory" value="NC"/>
<param key="NEnormalization" value="useNENormalizedForm"/>
</group>
<group name="bowTextDumper" class="BowDumper">
<param key="handler" value="bowTextHandler"/>
<param key="stopList" value="stopList"/>
<param key="useStopList" value="false"/>
<param key="useEmptyMacro" value="false"/>
<param key="useEmptyMicro" value="false"/>
<map name="NEmacroCategories">
<entry key="DateTime.DATE" value="NC"/>
<entry key="Numex.NUMBER" value="NC"/>
<entry key="Numex.UNIT" value="NC"/>
<entry key="Numex.NUMEX" value="NC"/>
<entry key="Organization.ORGANIZATION" value="NP"/>
<entry key="Location.LOCATION" value="NP"/>
<entry key="Person.PERSON" value="NP"/>
<entry key="Product.PRODUCT" value="NP"/>
<entry key="Event.EVENT" value="NP"/>
</map>
<param key="properNounCategory" value="NP"/>
<param key="commonNounCategory" value="NC"/>
<param key="NEnormalization" value="useNENormalizedForm"/>
</group>
Class: ConllDumper
Role: The role of this process unit is to dump to a stream the result of the syntactic analysis of sentences, following the CoNLL-U format.
See Universal Dependencies for details on the CoNLL-U format.
The UPOS and XPOS values of the LIMA ConllDumper are not exactly compliant with the vanilla format.
For each language, see in lima_linguisticdata/analysisDictionary/<lang>/code/code-<lang>.xml the possible values of the MACRO and MICRO tags that LIMA will dump respectively as the UPOS and XPOS tags.
The ConllDumper can be configured via the boolean withColsHeader option to write header lines giving the column names of the CoNLL-U format as a reminder.
Inputs:
See AbstractTextualAnalysisDumper for parameters common to all dumpers.
Parameters | |
---|---|
outputSuffix | default value is '.conll' |
Outputs:
Preconditions:
Effects:
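The options described above could be set in a configuration group like the following. This is a minimal sketch only: the group name conllDumper and the handler value are assumptions modeled on the other dumper groups on this page, and it assumes that withColsHeader and outputSuffix are given as param entries like the other options.

```xml
<!-- Hypothetical sketch: the group name and the handler value are assumptions -->
<group name="conllDumper" class="ConllDumper">
<param key="handler" value="simpleStreamHandler"/>
<!-- Assumed to be a param entry: write header lines with the CoNLL-U column names -->
<param key="withColsHeader" value="true"/>
<!-- '.conll' is the documented default; shown here only for illustration -->
<param key="outputSuffix" value=".conll"/>
</group>
```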
Class: DepTripleDumper
Role: The role of this process unit is to .
Inputs:
See AbstractTextualAnalysisDumper for parameters common to all dumpers.
Parameters | |
---|---|
Outputs:
Preconditions:
Effects:
<group name="depTripleDumper" class="DepTripleDumper">
<param key="handler" value="simpleStreamHandler"/>
<list name="selectedDependency">
<item value="ADJPRENSUB"/>
<!--item value="ADVADV"/-->
<!--item value="AdvSub"/-->
<item value="APPOS"/>
<item value="ATB_O"/>
<item value="ATB_S"/>
<item value="COD_V"/>
<!--item value="COMPADJ"/-->
<!--item value="COMPADV"/-->
<!--item value="CompDet"/-->
<item value="COMPDUNOM"/>
<item value="COMPL"/>
<!--item value="COORD1"/-->
<!--item value="COORD2"/-->
<item value="CPL_V"/>
<!--item value="DETSUB"/-->
<!--item value="MOD_A"/-->
<!--item value="MOD_N"/-->
<!--item value="MOD_V"/-->
<!--item value="Neg"/-->
<!--item value="PrepDet"/-->
<!--item value="PrepPron"/-->
<!--item value="PREPSUB"/-->
<item value="SUBADJPOST"/>
<item value="SUBSUBJUX"/>
<item value="SUJ_V"/>
</list>
</group>
Class: EasyXmlDumper
Role: The role of this process unit is to .
Inputs:
See AbstractTextualAnalysisDumper for parameters common to all dumpers.
Parameters | |
---|---|
Outputs:
Preconditions:
Effects:
<group name="easyXmlDumper" class="EasyXmlDumper">
<param key="handler" value="simpleStreamHandler"/>
<map name="typeMapping">
<entry key="COMPDUNOM" value="MOD-N"/>
<entry key="ADJPRENSUB" value="MOD-N"/>
<entry key="SUBADJPOST" value="MOD-N"/>
<entry key="SUBSUBJUX" value="MOD-N"/>
<entry key="TEMPCOMP" value="AUX-V"/>
<entry key="SujInv" value="SUJ-V"/>
<entry key="CodPrev" value="COD-V"/>
<entry key="CoiPrev" value="CPL-V"/>
<entry key="PronSujVerbe" value="SUJ-V"/>
<entry key="ADVADV" value="MOD-R"/>
<entry key="ADVADJ" value="MOD-A"/>
<entry key="NePas2" value="MOD-V"/>
<entry key="AdvVerbe" value="MOD-V"/>
<entry key="COMPADJ" value="MOD-A"/>
<!--entry key="Neg" value="MOD-V"/-->
<!--change '_' to '-' -->
<entry key="SUJ_V" value="SUJ-V"/>
<entry key="SUJ_V_REL" value="SUJ-V"/>
<entry key="COD_V" value="COD-V"/>
<entry key="CPL_V" value="CPL-V"/>
<entry key="CPLV_V" value="CPL-V"/>
<entry key="MOD_V" value="MOD-V"/>
<entry key="MOD_N" value="MOD-N"/>
<entry key="MOD_A" value="MOD-A"/>
<entry key="ATB_S" value="ATB-SO,s-o valeur=sujet"/>
<entry key="ATB_O" value="ATB-SO,s-o valeur=objet"/>
<entry key="COORD1" value="COORD"/>
<entry key="COORD2" value="COORD"/>
<entry key="COMPL" value="COMP"/>
<entry key="JUXT" value="JUXT"/>
</map>
<map name="srcTag">
<entry key="MOD-N" value="modifieur"/>
<entry key="MOD-V" value="modifieur"/>
<entry key="SUJ-V" value="sujet"/>
<entry key="AUX-V" value="auxiliaire"/>
<entry key="COD-V" value="cod"/>
<entry key="CPL-V" value="complement"/>
<entry key="MOD-R" value="modifieur"/>
<entry key="APPOS" value="premier"/>
<entry key="JUXT" value="suivant"/>
<entry key="ATB-SO" value="attribut"/>
<entry key="MOD-A" value="modifieur"/>
<entry key="COMP" value="complementeur"/>
<entry key="COORD" value="coordonnant"/>
</map>
<map name="tgtTag">
<entry key="MOD-N" value="nom"/>
<entry key="MOD-V" value="verbe"/>
<entry key="SUJ-V" value="verbe"/>
<entry key="AUX-V" value="verbe"/>
<entry key="COD-V" value="verbe"/>
<entry key="CPL-V" value="verbe"/>
<entry key="MOD-R" value="adverbe"/>
<entry key="APPOS" value="appose"/>
<entry key="JUXT" value="premier"/>
<entry key="ATB-SO" value="verbe"/>
<entry key="MOD-A" value="adjectif"/>
<entry key="COMP" value="verbe"/>
<entry key="COORD" value="coord-g"/>
</map>
</group>
Class: FullXmlDumper
Role: The role of this process unit is to .
Inputs:
See AbstractTextualAnalysisDumper for parameters common to all dumpers.
Parameters | |
---|---|
Outputs:
Preconditions:
Effects:
Class: GeoDumper
Role: The role of this process unit is to .
Inputs:
See AbstractTextualAnalysisDumper for parameters common to all dumpers.
Parameters | |
---|---|
graph | PosGraph |
Outputs:
Preconditions:
Effects:
Class: LTRDumper
Role: The role of this process unit is to .
Inputs:
See AbstractTextualAnalysisDumper for parameters common to all dumpers.
Parameters | |
---|---|
handler | default is 'simpleStreamHandler' |
Outputs:
Preconditions:
Effects:
Class: NullDumper
Role: The role of this process unit is to dump nothing.
Inputs: See AbstractTextualAnalysisDumper for parameters common to all dumpers.
Outputs: None.
Preconditions: None.
Effects: None.
Class: posGraphXmlDumper
Role: The role of this process unit is to .
Inputs:
See AbstractTextualAnalysisDumper for parameters common to all dumpers.
Parameters | |
---|---|
Outputs:
Preconditions:
Effects:
Class: SimpleXmlDumper
Role: The role of this process unit is to .
Inputs:
See AbstractTextualAnalysisDumper for parameters common to all dumpers.
Parameters | |
---|---|
Outputs:
Preconditions:
Effects:
Class: TextDumper
Role: The role of this process unit is to .
Inputs:
See AbstractTextualAnalysisDumper for parameters common to all dumpers.
Parameters | |
---|---|
outputSuffix | default value is '.out' |
Outputs:
Preconditions:
Effects: