Benjamin Labbe edited this page Mar 13, 2019 · 19 revisions

Process units documentation

The core of LIMA is the execution of process units in a pipeline. This page is the technical documentation of each process unit available in the standard LIMA distribution. Thanks to its plugin mechanism, LIMA can be extended with new features, which come with their own documentation.
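As an illustration of what "process units in a pipeline" can look like in a configuration file, here is a minimal sketch. It assumes that a pipeline is declared as a group of class ProcessUnitPipeline holding a list named processUnitSequence; neither name is confirmed by this page, so treat both as assumptions. The item values are group names taken from the examples on this page, and their order after the tokenizer is only a guess.

```xml
<!-- Sketch only: the class ProcessUnitPipeline and the list name
     processUnitSequence are assumptions, not confirmed by this page. -->
<group name="main" class="ProcessUnitPipeline">
  <list name="processUnitSequence">
    <!-- each item refers to a process unit group by its name attribute -->
    <item value="flattokenizer"/>
    <item value="simpleWord"/>
    <item value="hyphenWordAlternatives"/>
    <item value="defaultProperties"/>
  </list>
</group>
```

Each item would refer to a group such as the ones shown in the sections below, so the same class can be instantiated several times under different names and parameters.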

For each process unit, we describe:

  • class: the identifier used to instantiate the corresponding C++ class;
  • role: the description of this process unit role;
  • inputs: the state of the LIMA data structures necessary to run this process unit and the parameters available to modify the behavior of this process unit;
  • outputs: the kind of data written to files or standard output;
  • preconditions: the state of the data structures that must be reached before running this process unit;
  • effects: the changes made to the LIMA data structures by the execution of this process unit.

FlatTokenizer

Class: FlatTokenizer

Role: The role of this process unit is to split the input text into tokens. To do so, it uses an automaton that allows rich behavior, going well beyond simple tokenization on white space. It is usually the first element of the pipeline.

Inputs: an AnalysisContent containing the initial text.

Parameters
  • automatonFile: the path to the tokenization automaton file to use, relative to the main resources folder;
  • charChart: the name of a group in the Resources module. This group defines a resource of class FlatTokenizerCharChart with a parameter named charFile giving the path to the character chart file to use, relative to the main resources folder.

Outputs: an AnalysisContent

Preconditions: the AnalysisContent must contain an AnalysisData of type LimaStringText named "Text"

Effects: the AnalysisContent will contain an AnalysisData of type AnalysisGraph named "AnalysisGraph" which is a linear graph (a string) containing one vertex for each detected token.
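The two parameters above could be wired together as in the following sketch. The group names flattokenizer and flatcharchart appear as parameter values elsewhere on this page; both file paths are placeholders invented for illustration and must be replaced with real files from your resources folder.

```xml
<!-- Processors module: the tokenizer process unit.
     The automatonFile value is a placeholder path. -->
<group name="flattokenizer" class="FlatTokenizer">
  <param key="automatonFile" value="path/to/tokenizer-automaton-file"/>
  <!-- name of the resource group defined below -->
  <param key="charChart" value="flatcharchart"/>
</group>

<!-- Resources module: the character chart resource.
     The charFile value is a placeholder path. -->
<group name="flatcharchart" class="FlatTokenizerCharChart">
  <param key="charFile" value="path/to/chars-chart-file"/>
</group>
```

Note that charChart holds a resource group name, not a file path: the path is given by the charFile parameter of that resource group.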

EnchantSpellingAlternatives

Class: EnchantSpellingAlternatives

Role: Use the enchant spell checker to find corrections for tokens not found in the dictionary.

Inputs: the AnalysisGraph.

Parameters
  • dictionary: the LIMA dictionary resource (usually mainDictionary) in which to look up the suggestions made by Enchant.

Outputs: the same AnalysisGraph enriched with spelling corrections

Preconditions: the AnalysisGraph must already exist

Effects: in the AnalysisGraph, tokens that had no linguistic information are enriched with spelling alternatives.

Notes:

  • This process unit is available only if the Enchant spell checker has been found at compile time.
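A configuration for this process unit might look like the sketch below. The group name is hypothetical; the only parameter used is the dictionary parameter described above, set to the usual mainDictionary resource.

```xml
<!-- Sketch: the group name enchantSpellingAlternatives is hypothetical. -->
<group name="enchantSpellingAlternatives" class="EnchantSpellingAlternatives">
  <!-- LIMA dictionary resource used together with Enchant suggestions -->
  <param key="dictionary" value="mainDictionary"/>
</group>
```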

RegexMatcher

Class: RegexMatcher

Role: The role of this process unit is to .

Inputs:

Parameters

Outputs:

Preconditions:

Effects:

    <group name="regexmatcher" class="RegexMatcher">
      <map name="regexes">
        <entry key="[\w\-_]+(\.[\w\-_]+)*\@[\w\-_](\.[\w\-_]+)+" value="t_url"/>
        <entry key="((mailto|http|ftp|https):\/\/)?[\w\-_]+(\.[\w\-_]+)+([\w\-\.,@?^=%&amp;:/~\+#]*[\w\-\@?^=%&amp;/~\+#])?" value="t_url"/>
      </map>
    </group>

SimpleWord

Class: SimpleWord

Role: The role of this process unit is to .

Inputs:

Parameters

Outputs:

Preconditions:

Effects:

    <group name="simpleWord" class="SimpleWord">
        <param key="dictionary" value="mainDictionary"/>
        <param key="confidentMode" value="true"/>
        <param key="charChart" value="flatcharchart"/>
        <param key="parseConcatenated" value="false"/>
    </group>

CoreferencesSolving

Class: CoreferencesSolving

Role: The role of this process unit is to .

Inputs:

Parameters

Outputs:

Preconditions:

Effects:

    <group name="coreferencesSolving" class="CoreferencesSolving">
      <param key="scope" value="3" />
      <param key="threshold" value="60" />
      <param key="Resolve Definites" value="0" />
      <param key="Resolve non third person pronouns" value="0" />
      <map name="MacroCategories">
        <entry key="PronMacroCategory" value="PRON"/>
        <entry key="VerbMacroCategory" value="V" />
        <entry key="PrepMacroCategory" value="PREP" />
        <entry key="NomCommunMacroCategory" value="NC" />
        <entry key="NomPropreMacroCategory" value="NP" />
      </map>
      <list name="LexicalAnaphora">
        <item value="CLR"/>
      </list>
      <list name="UndefinitePronouns">
        <!--item value="PRON_INDEFINI"/>
        <item value="PRON_INDEFINI_VAL_NEG"/-->
      </list>
      <list name="PossessivePronouns">
        <!--item value="PRON_POSSESSIF_SUJET" />
        <item value="PRON_POSSESSIF_COD" />
        <item value="PRON_POSSESSIF_COI" /-->
      </list>
      <list name="PrepRelation">
        <item value="PREPSUB"/>
        <item value="PrepDetInt"/>
        <item value="PrepInf"/>
        <item value="PrepPronRelCa"/>
        <item value="PrepPron"/>
        <item value="PrepPronRel"/>
        <item value="PrepPronCliv"/>
        <item value="PrepAdv"/>
      </list>
      <list name="PleonasticRelation">
        <item value="Pleon"/>
      </list>
      <list name="DefiniteRelation">
        <item value="DETSUB"/>
      </list>
      <list name="SubjectRelation">
        <item value="SUJ_V" />
        <item value="SUJ_V_REL" />
        <item value="PronSujVerbe" />
        <item value="SujInv" />
      </list>
      <list name="AttributeRelation">
        <item value="ATB_S"/>
      </list>
      <list name="CODRelation">
        <item value="COD_V" />
        <item value="CodPrev" />
        <item value="PronReflVerbe" />
      </list>
      <list name="COIRelation">
        <item value="CPL_V" />
        <item value="CoiPrev" />
      </list>
      <list name="AdjunctRelation">
        <item value="CPLV_V" />
        <item value="CC_TEMPS" />
        <item value="CC_LIEU" />
        <item value="CC_BUT" />
        <item value="CC_MOYEN" />
        <item value="CC_MANIERE" />
        <item value="COMPADJ" />
        <item value="COMPADV" />
      </list>
      <list name="AgentRelation">
        <item value="COMPADJ" />
      </list>
      <list name="NPDeterminerRelation">
        <item value="COMPDUNOM" />
        <item value="COMPDUNOM2" />
        <item value="SUBSUBJUX" />
        <item value="COMP_N-N" />
        <item value="COMPDUNOM_INC" />
      </list>
      <!-- Lappin & Leass salience factors -->
      <map name="SalienceFactors">
        <entry key="SentenceRecency" value="90"/>
        <entry key="SubjEmph" value="90"/>
        <entry key="ExistEmph" value="70"/>
        <entry key="CodEmph" value="50"/>
        <entry key="CoiCoblEmph" value="40"/>
        <entry key="HeadEmph" value="80"/>
        <entry key="NonAdvEmph" value="50"/>
        <entry key="IsInSubordinate" value="-70"/>
        <!-- local factors -->
        <entry key="Cataphora" value="-120"/>
        <entry key="SameSlot" value="90"/>
        <entry key="Itself" value="-140"/>
      </map>
      <map name="SlotValues">
        <entry key="SubjectRelation" value="4"/>
        <entry key="AgentRelation" value="3"/>
        <entry key="CODRelation" value="2"/>
        <entry key="COIRelation" value="1"/>
        <entry key="AdjunctRelation" value="1"/>
     </map>
    </group>

WordSenseDisambiguation

Class: WordSenseDisambiguation

Role: The role of this process unit is to .

Inputs:

Parameters

Outputs:

Preconditions:

Effects:

    <group name="wordSenseDisambiguation" class="WordSenseDisambiguation" >
      <!--param key="mode" value="b_Romanseval_most_frequent"/>     
      <param key="sensesPath" value="/home/cm218888/opendata/romanseval_data/SenseInventory" /-->      
      <param key="mode" value="b_Jaws_most_frequent"/>
      <param key="sensesPath" value="/home/cm218888/otherdata/jaws-1.0/SenseInventory" />      
      <!--param key="mode" value="s_Wsi_mrd"/>
      <param key="sensesPath" value="/home/cm218888/otherdata/wsi/clustersBin" />      
      <param key="mapping" value="m_Jaws_senses" />      
      <param key="mappingFile" value="mapping.txt" /-->
      <param key="dictionaryFile" value="/home/cm218888/otherdata/words.ids" />
      <param key="bestNNDir" value="knnall" />  
      <list name="NounContextList">
        <item value="COD_V"/>
        <!--item value="SUJ_V"/>
        <item value="COMPDUNOM"/>
        <item value="COMPDUNOM.reverse"/>
        <item value="SUBADJPOST.reverse"/>
        <item value="ADJPRENSUB.reverse"/>
        <item value="window5"/>
        <item value="window20"/-->
      </list>
      <map name="knnsearchConfig">
        <entry key="hashedDir" value="/home/cm218888/otherdata/hasheddb"/>
        <entry key="totalPermutations" value="10" />
        <entry key="beam" value="20" />
        <entry key="k" value="50" />
      </map>
    </group>

HyphenWordAlternatives

Class: HyphenWordAlternatives

Role: The role of this process unit is to .

Inputs:

Parameters

Outputs:

Preconditions:

Effects:

    <group name="hyphenWordAlternatives" class="HyphenWordAlternatives">
      <param key="dictionary" value="mainDictionary"/>
      <param key="charChart" value="flatcharchart"/>
      <param key="tokenizer" value="flattokenizer"/>
    </group>

GeoEntitiesTagger

Class: GeoEntitiesTagger

Role: The role of this process unit is to .

Inputs:

Parameters

Outputs:

Preconditions:

Effects:

    <group name="geoEntities" class="GeoEntitiesTagger">
      <param key="charChart" value="flatcharchart"/>
      <param key="dbms" value="mysql"/>
      <param key="dbConnection" value="dbname=GAZETIKI_DB user=gazetiki password=gazpwd"/>
      <param key="maxEntityLength" value="10" />
      <param key="graph" value="PosGraph"/>
      <param key="fieldClass" value="CLASS_3"/>
      <map name="Trigger">
        <!--entry key="t_capital_1st" value="Status" unlessSatusBefore="t_sentence_brk" unlessMicroBefore="PONCTU_PARAGRAPHE" unlessFirstToken="YES"/-->
        <entry key="t_capital_1st" value="Status" unlessSatusBefore="t_sentence_brk" unlessMicroBefore="" unlessFirstToken="YES"/>
        <entry key="NP" value="Micro" unlessSatusBefore="" unlessMicroBefore="" unlessFirstToken="NO"/>
      </map>
      <map name="EndWord">
        <!--entry key="PONCTU_PARAGRAPHE" value="Micro" unlessSatusBefore="" unlessMicroBefore="" unlessFirstToken="NO"/-->
        <entry key="T_COMMA_NUMBER" value="Status" unlessSatusBefore="" unlessMicroBefore="" unlessFirstToken="NO"/>
      </map>
    </group>

ApplyRecognizer

Class: ApplyRecognizer

Role: The role of this process unit is to apply compiled recognition rules. The specification of the rules source format is described elsewhere.

This kind of process unit, together with its rules, is used extensively in LIMA, for example to recognize idiomatic expressions or named entities, but also for parsing and other tasks.

Inputs:

Parameters
  • automaton
  • automatonList
  • useSentenceBounds
  • applyOnGraph
  • updateGraph
  • resolveOverlappingEntities
  • overlappingEntitiesStrategy
  • storeInData
  • testAllVertices: if true, test all vertices; otherwise, skip recognized expressions (default is false)
  • stopAtFirstSuccess: if true, stop testing rules on the current node after one rule has succeeded (default is true)
  • onlyOneSuccessPerType: if true, stop testing rules of the same type as a previously successful rule (only used if stopAtFirstSuccess is false) (default is false)
  • returnAtFirstSuccess: if true, abort the search as soon as a rule is successful (if true, stopAtFirstSuccess will be set to true) (default is false)
  • applySameRuleWhileSuccess: if true, when a rule succeeds, try to apply it again on the same vertex until it no longer applies (use with care: setting this parameter to true may cause loops if rules are not well written). Has no effect if stopAtFirstSuccess is true. (default is false)

Outputs:

Preconditions:

Effects:
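The rule-application flags described above could be set as in the following sketch, which spells out their documented default values explicitly. The group name and the value of the automaton parameter are hypothetical, and the assumption that automaton takes the name of a compiled rules resource is not confirmed by this page.

```xml
<!-- Sketch: group name and automaton value are hypothetical. -->
<group name="myRecognizer" class="ApplyRecognizer">
  <!-- assumed to name a compiled rules resource -->
  <param key="automaton" value="myRecognizerRules"/>
  <!-- documented defaults, written out explicitly -->
  <param key="testAllVertices" value="false"/>
  <param key="stopAtFirstSuccess" value="true"/>
  <param key="onlyOneSuccessPerType" value="false"/>
  <param key="returnAtFirstSuccess" value="false"/>
  <param key="applySameRuleWhileSuccess" value="false"/>
</group>
```

With these values, onlyOneSuccessPerType is ignored, since it is only used when stopAtFirstSuccess is false.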

DefaultProperties

Class: DefaultProperties

Role: The role of this process unit is to .

Inputs:

Parameters

Outputs:

Preconditions:

Effects:

    <group name="defaultProperties" class="DefaultProperties">
      <param key="dictionary" value="mainDictionary"/>
      <param key="charChart" value="flatcharchart"/>
      <param key="defaultPropertyFile" value="LinguisticProcessings/fre/default-fre.dat"/>
      <list name="skipUnmarkStatus">
        <item value="t_dot_number"/>
        <item value="t_capital_1st"/>
      </list>
    </group>

SimpleDefaultProperties

Class: SimpleDefaultProperties

Role: The role of this process unit is to .

Inputs:

Parameters

Outputs:

Preconditions:

Effects:

    <group name="simpleDefaultProperties" class="SimpleDefaultProperties">
      <list name="defaultCategories">
        <item value="NP NP"/>
      </list>
    </group>

ViterbiPosTagger

Class: ViterbiPosTagger

Role: The role of this process unit is to .

Inputs:

Parameters

Outputs:

Preconditions:

Effects:

    <group name="viterbiPostagger-freq" class="ViterbiPosTagger">
      <param key="trigramFile" value="Disambiguation/trigramMatrix-fre.dat"/>
      <param key="bigramFile" value="Disambiguation/bigramMatrix-fre.dat"/>
      <param key="costFunction" value="FrequencyCost"/>
      <param key="defaultCategory" value="PONCTU_FORTE"/>
      <list name="stopCategories">
        <item value="PONCTU_FORTE" />
      </list>
    </group>
    <group name="viterbiPostagger-int" class="ViterbiPosTagger">
      <param key="trigramFile" value="Disambiguation/trigramMatrix-fre.dat"/>
      <param key="bigramFile" value="Disambiguation/bigramMatrix-fre.dat"/>
      <param key="costFunction" value="IntegerCost"/>
      <param key="defaultCategory" value="PONCTU_FORTE"/>
      <list name="stopCategories">
        <item value="PONCTU_FORTE" />
      </list>
    </group>
    <group name="viterbiPostagger-int-none" class="ViterbiPosTagger">
      <param key="trigramFile" value="Disambiguation/trigramMatrix-fre.dat"/>
      <param key="bigramFile" value="Disambiguation/bigramMatrix-fre.dat"/>
      <param key="costFunction" value="IntegerCost"/>
      <param key="defaultCategory" value="NONE_1"/>
      <list name="stopCategories">
        <item value="PONCTU_FORTE" />
      </list>
    </group>

SvmToolPosTagger

Class: SvmToolPosTagger

Role: The role of this process unit is to .

Inputs:

Parameters

Outputs:

Preconditions:

Effects:

    <group name="SvmToolPosTagger" class="SvmToolPosTagger">
      <param key="model" value="Disambiguation/SVMToolModel-fre/lima"/>
      <param key="defaultCategory" value="PONCTU_FORTE"/>
      <list name="stopCategories">
        <item value="PONCTU_FORTE" />
      </list>
    </group>

DynamicSvmToolPosTagger

Class: DynamicSvmToolPosTagger

Role: The role of this process unit is to .

Inputs:

Parameters

Outputs:

Preconditions:

Effects:

    <group name="DynamicSvmToolPosTagger" class="DynamicSvmToolPosTagger">
      <param key="model" value="Disambiguation/SVMToolModel-fre/lima"/>
      <param key="defaultCategory" value="PONCTU_FORTE"/>
      <list name="stopCategories">
        <item value="PONCTU_FORTE" />
      </list>
    </group>

SentenceBoundariesFinder

Class: SentenceBoundariesFinder

Role: The role of this process unit is to .

Inputs:

Parameters

Outputs:

Preconditions:

Effects:

    <group name="sentenceBoundariesFinder" class="SentenceBoundariesFinder">
      <param key="graph" value="PosGraph"/>
      <list name="micros">
        <item value="PONCTU_FORTE" />  
      </list>
    </group>

SyntacticAnalyzerChains

Class: SyntacticAnalyzerChains

Role: The role of this process unit is to .

Inputs:

Parameters

Outputs:

Preconditions:

Effects:

    <group name="syntacticAnalyzerChains" class="SyntacticAnalyzerChains">
      <param key="chainMatrix" value="chainMatrix"/>
      <param key="maxChainsNbByVertex" value="30"/>
      <param key="maxChainLength" value="12"/>
    </group>

SyntacticAnalyzerNoChains

Class: SyntacticAnalyzerNoChains

Role: The role of this process unit is to .

Inputs:

Parameters

Outputs:

Preconditions:

Effects:

    <!-- syntacticAnalyzerNoChains replaces syntacticAnalyzerChains. It is an
    experimental module used to test whether LIMA analysis works without nominal
    and verbal chains. It also allows building compounds using verbs and
    heterosyntagmatic dependencies. For that, one has to add adequate relations
    in CompoundRelations in mm-common. -->
    <group name="syntacticAnalyzerNoChains" class="SyntacticAnalyzerNoChains">
      <param key="chainMatrix" value="chainMatrix"/>
      <param key="disambiguated" value="true"/>
      <param key="maxChainsNbByVertex" value="30"/>
      <param key="maxChainLength" value="12"/>
    </group>

SyntacticAnalyzerDisamb

Class: SyntacticAnalyzerDisamb

Role: The role of this process unit is to .

Inputs:

Parameters

Outputs:

Preconditions:

Effects:

    <group name="syntacticAnalyzerDisamb" class="SyntacticAnalyzerDisamb">
      <param key="depGraphMaxBranchingFactor" value="100"/>
    </group>

SyntacticAnalyzerDeps

Class: SyntacticAnalyzerDeps

Role: The role of this process unit is to .

Inputs:

Parameters

Outputs:

Preconditions:

Effects:

    <group name="syntacticAnalyzerDeps" class="SyntacticAnalyzerDeps">
      <list name="actions">
          <item value="pass0HomoSyntagmaticRelationRules"/>
          <item value="pass1HomoSyntagmaticRelationRules"/>
          <item value="pass2HomoSyntagmaticRelationRules"/>
          <item value="pleonasticPronouns"/>
          <item value="compoundTensesRules"/>
      </list>
      <param key="applySameRuleWhileSuccess" value="true"/>
    </group>

SyntacticAnalyzerSimplify

Class: SyntacticAnalyzerSimplify

Role: The role of this process unit is to .

Inputs:

Parameters

Outputs:

Preconditions:

Effects:

    <group name="syntacticAnalyzerSimplifyFirst" class="SyntacticAnalyzerSimplify">
      <param key="simplifyAutomaton" value="simplifyAutomatonFirst"/>
    </group>
    <group name="syntacticAnalyzerSimplify" class="SyntacticAnalyzerSimplify">
      <param key="simplifyAutomaton" value="simplifyAutomaton"/>
    </group>
    <group name="syntacticAnalyzerSimplifyCoord" class="SyntacticAnalyzerSimplify">
      <param key="simplifyAutomaton" value="simplifyAutomatonCoord"/>
    </group>
    <group name="syntacticAnalyzerSimplifyLast" class="SyntacticAnalyzerSimplify">
      <param key="simplifyAutomaton" value="simplifyAutomatonLast"/>
    </group>

SyntacticAnalyzerDepsHetero

Class: SyntacticAnalyzerDepsHetero

Role: The role of this process unit is to .

Inputs:

Parameters

Outputs:

Preconditions:

Effects:

    <group name="syntacticAnalyzerDepsHetero" class="SyntacticAnalyzerDepsHetero">
      <param key="rules" value="heteroSyntagmaticRelationRules"/>
      <param key="selectionalPreferences" value="selectionalPreferences"/>
      <param key="unfold" value="true"/>
      <param key="linkSubSentences" value="true"/>
      <param key="applySameRuleWhileSuccess" value="true"/>
      <map name="subSentencesRules">
        <entry key="SubSent" value="heteroSyntagmaticRelationRules"/>
        <entry key="SubordRel" value="heteroSyntagmaticRelationRules"/>
        <entry key="Parent" value="heteroSyntagmaticRelationRules"/>
        <entry key="Quotes" value="heteroSyntagmaticRelationRules"/>
        <entry key="VirguleSeule" value="heteroSyntagmaticRelationRules"/>
        <entry key="Appos" value="heteroSyntagmaticRelationRules"/>
        <entry key="AdvSeul" value="heteroSyntagmaticRelationRules"/>
        <entry key="AdvInit" value="heteroSyntagmaticRelationRules"/>
        <entry key="CompAdv" value="heteroSyntagmaticRelationRules"/>
        <entry key="Adverbe" value="heteroSyntagmaticRelationRules"/>
        <entry key="ConjInfSecond" value="heteroSyntagmaticRelationRules"/>
        <entry key="CCInit" value="heteroSyntagmaticRelationRules"/>
        <entry key="Infinitive" value="heteroSyntagmaticRelationRules"/>
        <entry key="SUBSUBJUX" value="heteroSyntagmaticRelationRules"/>
        <entry key="CompDuNom1" value="heteroSyntagmaticRelationRules"/>
        <entry key="CompDuNom2" value="heteroSyntagmaticRelationRules"/>
        <entry key="CompAdj1" value="heteroSyntagmaticRelationRules"/>
        <entry key="CompAdj2" value="heteroSyntagmaticRelationRules"/>
        <entry key="SubordParticipiale" value="heteroSyntagmaticRelationRules"/>
        <entry key="ElemListe" value="heteroSyntagmaticRelationRules"/>
        <entry key="ConjSecond" value="heteroSyntagmaticRelationRules"/>
        <entry key="InciseNom" value="heteroSyntagmaticRelationRules"/>
        <entry key="CompCirc" value="heteroSyntagmaticRelationRules"/>
        <entry key="SubordInit" value="heteroSyntagmaticRelationRules"/>
        <entry key="ConjNominale" value="heteroSyntagmaticRelationRules"/>
      </map>
    </group>
    <group name="syntacticAnalyzerDummy" class="SyntacticAnalyzerDeps">
      <list name="actions">
        <item value="l2rDummyRules"/>
      </list>
    </group>

StatusLogger

Class: StatusLogger

Role: The role of this process unit is to .

Inputs:

Parameters

Outputs:

Preconditions:

Effects:

    <group name="beginStatusLogger" class="StatusLogger">
      <param key="outputFile" value="beginStatus-fre.log"/>
      <list name="toLog">
        <item value="VmSize"/>
        <item value="VmData"/>
      </list>
    </group>

SpecificEntitiesXmlLogger

Class: SpecificEntitiesXmlLogger

Role: The role of this process unit is to .

Inputs:

Parameters

Outputs:

Preconditions:

Effects:

    <group name="specificEntitiesXmlLogger" class="SpecificEntitiesXmlLogger">
      <param key="outputSuffix" value=".se.xml"/>
      <param key="graph" value="AnalysisGraph"/>
    </group>
    <group name="specificEntitiesXmlLoggerForLimaserver" class="SpecificEntitiesXmlLogger">
      <param key="outputSuffix" value=".se.xml"/>
      <param key="graph" value="AnalysisGraph"/>
      <param key="compactFormat" value="yes"/>
      <param key="handler" value="se"/>
      <param key="followGraph" value="true"/>
    </group>

FullTokenXmlLogger

Class: FullTokenXmlLogger

Role: The role of this process unit is to .

Inputs:

Parameters

Outputs:

Preconditions:

Effects:

    <group name="fullTokenXmlLoggerTokenizer" class="FullTokenXmlLogger">
      <param key="outputSuffix" value=".tokenizer.xml"/>
    </group>
    <group name="fullTokenXmlLoggerSimpleWord" class="FullTokenXmlLogger">
      <param key="outputSuffix" value=".simpleword.xml"/>
    </group>
    <group name="fullTokenXmlLoggerHyphen" class="FullTokenXmlLogger">
      <param key="outputSuffix" value=".hyphen.xml"/>
    </group>
    <group name="fullTokenXmlLoggerIdiomatic" class="FullTokenXmlLogger">
      <param key="outputSuffix" value=".idiom.xml"/>
    </group>

SentenceBoundariesXmlLogger

Class: SentenceBoundariesXmlLogger

Role: The role of this process unit is to .

Inputs:

Parameters

Outputs:

Preconditions:

Effects:

    <group name="sentenceBoundariesXmlLogger" class="SentenceBoundariesXmlLogger">
      <param key="outputSuffix" value=".sentences.xml"/>
    </group>
    <group name="fullTokenXmlLoggerDefaultProperties" class="FullTokenXmlLogger">
      <param key="outputSuffix" value=".default.xml"/>
    </group>    

WordSenseXmlLogger

Class: WordSenseXmlLogger

Role: The role of this process unit is to .

Inputs:

Parameters

Outputs:

Preconditions:

Effects:

    <group name="wordSenseXmlLogger" class="WordSenseXmlLogger">
      <param key="outputSuffix" value=".senses.xml"/>
    </group>

DisambiguatedGraphXmlLogger

Class: DisambiguatedGraphXmlLogger

Role: The role of this process unit is to .

Inputs:

Parameters

Outputs:

Preconditions:

Effects:

    <group name="disambiguatedGraphXmlLogger" class="DisambiguatedGraphXmlLogger">
      <param key="outputSuffix" value=".disambiguated.xml"/>
      <param key="dictionaryCode" value="dictionaryCode"/>
    </group>

DebugSyntacticAnalysisLogger

Class: DebugSyntacticAnalysisLogger

Role: The role of this process unit is to .

Inputs:

Parameters

Outputs:

Preconditions:

Effects:

    <group name="debugSyntacticAnalysisLogger-chains" class="DebugSyntacticAnalysisLogger">
      <param key="outputSuffix" value=".syntanal.chains.txt"/>
    </group>
    <group name="debugSyntacticAnalysisLogger-disamb" class="DebugSyntacticAnalysisLogger">
      <param key="outputSuffix" value=".syntanal.disamb.txt"/>
    </group>
    <group name="debugSyntacticAnalysisLogger-deps" class="DebugSyntacticAnalysisLogger">
      <param key="outputSuffix" value=".syntanal.deps.txt"/>
    </group>

DotGraphWriter

Class: DotGraphWriter

Role: The role of this process unit is to .

Inputs:

Parameters

Outputs:

Preconditions:

Effects:

    <group name="dotGraphWriter-beforepos" class="DotGraphWriter">
      <param key="graph" value="AnalysisGraph"/>
      <param key="outputSuffix" value=".bp.dot"/>
      <param key="trigramMatrix" value="trigramMatrix"/>
      <param key="bigramMatrix" value="bigramMatrix"/>
      <list name="vertexDisplay">
        <item value="text"/>
        <item value="inflectedform"/>
        <item value="symbolicmicrocategory"/>
        <item value="numericmicrocategory"/>
        <!--item value="genders"/>
        <item value="numbers"/-->
      </list>
    </group>
    <group name="dotGraphWriter" class="DotGraphWriter">
      <param key="graph" value="PosGraph"/>
      <param key="outputSuffix" value=".dot"/>
      <param key="trigramMatrix" value="trigramMatrix"/>
      <param key="bigramMatrix" value="bigramMatrix"/>
      <list name="vertexDisplay">
        <item value="text"/>
        <item value="inflectedform"/>
        <item value="symbolicmicrocategory"/>
        <item value="numericmicrocategory"/>
        <!--item value="genders"/>
        <item value="numbers"/-->
      </list>
    </group>

CorefSolvingLogger

Class: CorefSolvingLogger

Role: The role of this process unit is to .

Inputs:

Parameters

Outputs:

Preconditions:

Effects:

    <group name="corefLogger" class="CorefSolvingLogger">
      <param key="outputSuffix" value=".wh"/>
    </group>
    <group name="dotGraphWriterAfterSA" class="DotGraphWriter">
      <param key="outputSuffix" value=".afterSA.dot"/>
      <param key="trigramMatrix" value="trigramMatrix"/>
      <param key="bigramMatrix" value="bigramMatrix"/>
      <list name="vertexDisplay">
        <item value="lemme"/>
        <item value="symbolicmicrocategory"/>
        <item value="numericmicrocategory"/>
        <!--item value="genders"/>
        <item value="numbers"/-->
      </list>
    </group>

DotDependencyGraphWriter

Class: DotDependencyGraphWriter

Role: The role of this process unit is to .

Inputs:

Parameters

Outputs:

Preconditions:

Effects:

    <group name="dotDepGraphWriter" class="DotDependencyGraphWriter">
      <param key="outputMode" value="SentenceBySentence"/> <!-- Valid values: FullGraph,SentenceBySentence -->
      <param key="writeOnlyDepEdges" value="false"/>
      <param key="outputSuffix" value=".sa.dot"/>
      <param key="trigramMatrix" value="trigramMatrix"/>
      <param key="bigramMatrix" value="bigramMatrix"/>
      <list name="vertexDisplay">
        <item value="inflectedform"/>
        <item value="symbolicmicrocategory"/>
        <item value="numericmicrocategory"/>
        <!--item value="genders"/>
        <item value="numbers"/-->
      </list>
      <map name="graphDotOptions">
        <entry key="rankdir" value="LR"/>
      </map>
      <map name="nodeDotOptions">
        <entry key="shape" value="box"/>
      </map>
    </group>

AnnotDotGraphWriter

Class: AnnotDotGraphWriter

Role: The role of this process unit is to .

Inputs:

Parameters

Outputs:

Preconditions:

Effects:

    <group name="annotDotGraphWriter" class="AnnotDotGraphWriter">
      <param key="graph" value="PosGraph"/>
      <param key="outputSuffix" value=".ag.dot"/>
    </group>

LinearTextRepresentationLogger

Class: LinearTextRepresentationLogger

Role: The role of this process unit is to .

Inputs:

Parameters

Outputs:

Preconditions:

Effects:

    <group name="linearTextRepresentationLogger" class="LinearTextRepresentationLogger">
      <param key="outputSuffix" value=".ltr"/>
    </group>

SyntacticAnalysisXmlLogger

Class: SyntacticAnalysisXmlLogger

Role: The role of this process unit is to .

Inputs:

Parameters

Outputs:

Preconditions:

Effects:

    <group name="syntacticAnalysisXmlLogger" class="SyntacticAnalysisXmlLogger">
      <param key="outputSuffix" value=".sa.xml"/>
    </group>
    <group name="depTripletLogger" class="DepTripletLogger">
      <param key="outputSuffix" value=".deptrip.txt"/>
      <param key="stopList" value="stopList"/>
      <param key="useStopList" value="no"/>
      <param key="useEmptyMacro" value="no"/>
      <param key="useEmptyMicro" value="no"/>
      <map name="NEmacroCategories">
        <entry key="DateTime.DATE" value="NC"/>
        <entry key="Numex.NUMBER" value="NC"/>
        <entry key="Numex.UNIT" value="NC"/>
        <entry key="Numex.NUMEX" value="NC"/>
        <entry key="Organization.ORGANIZATION" value="NP"/>
        <entry key="Location.LOCATION" value="NP"/>
        <entry key="Person.PERSON" value="NP"/>
        <entry key="Product.PRODUCT" value="NP"/>
        <entry key="Event.EVENT" value="NP"/>
      </map>
      <param key="properNounCategory" value="NP"/>
      <param key="commonNounCategory" value="NC"/>
      <param key="NEnormalization" value="useNENormalizedForm"/>
      <list name="selectedDependency">
        <item value="ADJPRENSUB"/>
        <item value="APPOS"/>
        <item value="ATB_O"/>
        <item value="ATB_S"/>
        <item value="COD_V"/>
        <item value="COMPDUNOM"/>
        <item value="COMPL"/>
        <item value="CPL_V"/>
        <item value="SUBADJPOST"/>
        <item value="SUBSUBJUX"/>
        <item value="SUJ_V"/>
      </list>
    </group>

Dumpers

AbstractTextualAnalysisDumper

Class: AbstractTextualAnalysisDumper

Role: This is an abstract class. You cannot use it directly. Instead, use the other dumper classes. It is documented here because all dumpers can use its parameters.

Inputs:

Parameters
  • handler: the name of the handler process unit in the configuration file that will receive and handle the data written by the dumper
  • temporaryFileMetadata: the name of the analysis metadata entry that contains the name of the file to write to. It supersedes the outputFile option and the suffix handling options
  • outputFile: the name of the file to which the dumper will write. It supersedes the suffix handling options
  • stripInputSuffix: whether to remove the suffix of the input file name before using it (when outputFile is not set)
  • outputSuffix: the suffix to add to the name of the input file (when outputFile is not set)
  • append: whether to append content to the output file or to erase its content if it already exists

Outputs: No outputs. This is an abstract class. See the other dumpers documentation for details.

Preconditions: None

Effects: None
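As a sketch of how the file-naming parameters above might be combined on a concrete dumper, consider the group below. The group name, the suffix and the boolean spelling (true/false rather than yes/no) are assumptions made for illustration, and this page does not confirm that every dumper honors all of these parameters.

```xml
<!-- Sketch: group name, suffix and boolean spelling are hypothetical. -->
<group name="agXmlDumperToFile" class="AnnotationGraphXmlDumper">
  <!-- outputFile is not set, so the suffix handling options apply -->
  <param key="stripInputSuffix" value="true"/>
  <param key="outputSuffix" value=".ag.xml"/>
  <!-- do not append: erase the content of an existing output file -->
  <param key="append" value="false"/>
</group>
```

Read against the parameter descriptions, an input file named doc.txt would presumably yield an output file named doc.ag.xml. Setting outputFile would override both suffix options, and temporaryFileMetadata would override outputFile in turn.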

AnnotationGraphXmlDumper

Class: AnnotationGraphXmlDumper

Role: The role of this process unit is to .

Inputs:

See AbstractTextualAnalysisDumper for parameters common to all dumpers.

Parameters

Outputs:

Preconditions:

Effects:

    <group name="agXmlDumper" class="AnnotationGraphXmlDumper">
      <param key="handler" value="xmlSimpleStreamHandler"/>
    </group>
    <group name="normalizationBowDumper" class="BowDumper">
      <param key="handler" value="bowTextWriter"/>
      <param key="stopList" value="stopList"/>
      <param key="useStopList" value="false"/>
      <param key="useEmptyMacro" value="false"/>
      <param key="useEmptyMicro" value="false"/>
      <map name="NEmacroCategories">
        <entry key="DateTime.DATE" value="NC"/>
        <entry key="Numex.NUMBER" value="NC"/>
        <entry key="Numex.UNIT" value="NC"/>
        <entry key="Numex.NUMEX" value="NC"/>
        <entry key="Organization.ORGANIZATION" value="NP"/>
        <entry key="Location.LOCATION" value="NP"/>
        <entry key="Person.PERSON" value="NP"/>
        <entry key="Product.PRODUCT" value="NP"/>
        <entry key="Event.EVENT" value="NP"/>
      </map>
      <param key="properNounCategory" value="NP"/>
      <param key="commonNounCategory" value="NC"/>
      <param key="NEnormalization" value="useNENormalizedForm"/>
    </group>

BowDumper

Class: BowDumper

Role: The role of this process unit is to .

Inputs:

See AbstractTextualAnalysisDumper for parameters common to all dumpers.

Parameters

Outputs:

Preconditions:

Effects:

    <group name="bowDumper" class="BowDumper">
      <param key="handler" value="bowTextWriter"/>
      <param key="stopList" value="stopList"/>
      <param key="useStopList" value="true"/>
      <param key="useEmptyMacro" value="true"/>
      <param key="useEmptyMicro" value="true"/>
      <map name="NEmacroCategories">
        <entry key="DateTime.DATE" value="NC"/>
        <entry key="Numex.NUMBER" value="NC"/>
        <entry key="Numex.UNIT" value="NC"/>
        <entry key="Numex.NUMEX" value="NC"/>
        <entry key="Organization.ORGANIZATION" value="NP"/>
        <entry key="Location.LOCATION" value="NP"/>
        <entry key="Person.PERSON" value="NP"/>
        <entry key="Product.PRODUCT" value="NP"/>
        <entry key="Event.EVENT" value="NP"/>
      </map>
      <param key="properNounCategory" value="NP"/>
      <param key="commonNounCategory" value="NC"/>
      <param key="NEnormalization" value="useNENormalizedForm"/>
    </group>
    <group name="bowTextHandler" class="BowDumper">
      <param key="handler" value="bowTextHandler"/>
      <param key="stopList" value="stopList"/>
      <param key="useStopList" value="true"/>
      <param key="useEmptyMacro" value="true"/>
      <param key="useEmptyMicro" value="true"/>
      <map name="NEmacroCategories">
        <entry key="DateTime.DATE" value="NC"/>
        <entry key="Numex.NUMBER" value="NC"/>
        <entry key="Numex.UNIT" value="NC"/>
        <entry key="Numex.NUMEX" value="NC"/>
        <entry key="Organization.ORGANIZATION" value="NP"/>
        <entry key="Location.LOCATION" value="NP"/>
        <entry key="Person.PERSON" value="NP"/>
        <entry key="Product.PRODUCT" value="NP"/>
        <entry key="Event.EVENT" value="NP"/>
      </map>
      <param key="properNounCategory" value="NP"/>
      <param key="commonNounCategory" value="NC"/>
      <param key="NEnormalization" value="useNENormalizedForm"/>
    </group>
    <group name="textQueryHandler" class="BowDumper">
      <param key="handler" value="bowTextHandler"/>
<!--       <param key="handler" value="bowTextWriter"/> -->
      <param key="stopList" value="stopList"/>
      <param key="useStopList" value="true"/>
      <param key="useEmptyMacro" value="true"/>
      <param key="useEmptyMicro" value="true"/>
      <map name="NEmacroCategories">
        <entry key="DateTime.DATE" value="NC"/>
        <entry key="Numex.NUMBER" value="NC"/>
        <entry key="Numex.UNIT" value="NC"/>
        <entry key="Numex.NUMEX" value="NC"/>
        <entry key="Organization.ORGANIZATION" value="NP"/>
        <entry key="Location.LOCATION" value="NP"/>
        <entry key="Person.PERSON" value="NP"/>
        <entry key="Product.PRODUCT" value="NP"/>
        <entry key="Event.EVENT" value="NP"/>
      </map>
      <param key="properNounCategory" value="NP"/>
      <param key="commonNounCategory" value="NC"/>
      <param key="NEnormalization" value="useNENormalizedForm"/>
    </group>
    <group name="bowDocumentDumper" class="BowDumper">
      <param key="handler" value="bowDocumentHandler"/>
      <param key="stopList" value="stopList"/>
      <param key="useStopList" value="false"/>
      <param key="useEmptyMacro" value="false"/>
      <param key="useEmptyMicro" value="false"/>
      <map name="NEmacroCategories">
        <entry key="DateTime.DATE" value="NC"/>
        <entry key="Numex.NUMBER" value="NC"/>
        <entry key="Numex.UNIT" value="NC"/>
        <entry key="Numex.NUMEX" value="NC"/>
        <entry key="Organization.ORGANIZATION" value="NP"/>
        <entry key="Location.LOCATION" value="NP"/>
        <entry key="Person.PERSON" value="NP"/>
        <entry key="Product.PRODUCT" value="NP"/>
        <entry key="Event.EVENT" value="NP"/>
      </map>
      <param key="properNounCategory" value="NP"/>
      <param key="commonNounCategory" value="NC"/>
      <param key="NEnormalization" value="useNENormalizedForm"/>
    </group>
    <group name="bowTextDumper" class="BowDumper">
      <param key="handler" value="bowTextHandler"/>
      <param key="stopList" value="stopList"/>
      <param key="useStopList" value="false"/>
      <param key="useEmptyMacro" value="false"/>
      <param key="useEmptyMicro" value="false"/>
      <map name="NEmacroCategories">
        <entry key="DateTime.DATE" value="NC"/>
        <entry key="Numex.NUMBER" value="NC"/>
        <entry key="Numex.UNIT" value="NC"/>
        <entry key="Numex.NUMEX" value="NC"/>
        <entry key="Organization.ORGANIZATION" value="NP"/>
        <entry key="Location.LOCATION" value="NP"/>
        <entry key="Person.PERSON" value="NP"/>
        <entry key="Product.PRODUCT" value="NP"/>
        <entry key="Event.EVENT" value="NP"/>
      </map>
      <param key="properNounCategory" value="NP"/>
      <param key="commonNounCategory" value="NC"/>
      <param key="NEnormalization" value="useNENormalizedForm"/>
    </group>

ConllDumper

Class: ConllDumper

Role: The role of this process unit is to dump to a stream the result of the syntactic analysis of sentences in the CoNLL-U format.

See Universal Dependencies for details on the CoNLL-U format.

The UPOS and XPOS values written by the LIMA ConllDumper do not strictly comply with the standard CoNLL-U format.

For each language, lima_linguisticdata/analysisDictionary/<lang>/code/code-<lang>.xml lists the possible values of the MACRO and MICRO tags, which LIMA dumps as the UPOS and XPOS tags respectively.

The ConllDumper can be configured via the boolean withColsHeader option to write header lines giving the column names of the CoNLL-U format as a reminder.

Inputs:

See AbstractTextualAnalysisDumper for parameters common to all dumpers.

Parameters
outputSuffix default value is '.conll'

Outputs:

Preconditions:

Effects:
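
A minimal configuration sketch for this dumper, in the style of the other groups on this page. The group name conllDumper, the simpleStreamHandler handler value and the 'true' spelling of the boolean are assumptions for illustration; withColsHeader is the option described in the role above.

    <group name="conllDumper" class="ConllDumper">
      <!-- assumed handler name, borrowed from other dumper groups on this page -->
      <param key="handler" value="simpleStreamHandler"/>
      <!-- write header lines giving the CoNLL-U column names -->
      <param key="withColsHeader" value="true"/>
    </group>

For reference, the CoNLL-U format defines ten columns: ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS and MISC.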

DepTripleDumper

Class: DepTripleDumper

Role: The role of this process unit is to .

Inputs:

See AbstractTextualAnalysisDumper for parameters common to all dumpers.

Parameters

Outputs:

Preconditions:

Effects:

    <group name="depTripleDumper" class="DepTripleDumper">
      <param key="handler" value="simpleStreamHandler"/>
      <list name="selectedDependency">
        <item value="ADJPRENSUB"/>
        <!--item value="ADVADV"/-->
        <!--item value="AdvSub"/-->
        <item value="APPOS"/>
        <item value="ATB_O"/>
        <item value="ATB_S"/>
        <item value="COD_V"/>
        <!--item value="COMPADJ"/-->
        <!--item value="COMPADV"/-->
        <!--item value="CompDet"/-->
        <item value="COMPDUNOM"/>
        <item value="COMPL"/>
        <!--item value="COORD1"/-->
        <!--item value="COORD2"/-->
        <item value="CPL_V"/>
        <!--item value="DETSUB"/-->
        <!--item value="MOD_A"/-->
        <!--item value="MOD_N"/-->
        <!--item value="MOD_V"/-->
        <!--item value="Neg"/-->
        <!--item value="PrepDet"/-->
        <!--item value="PrepPron"/-->
        <!--item value="PREPSUB"/-->
        <item value="SUBADJPOST"/>
        <item value="SUBSUBJUX"/>
        <item value="SUJ_V"/>
      </list>
    </group>

EasyXmlDumper

Class: EasyXmlDumper

Role: The role of this process unit is to .

Inputs:

See AbstractTextualAnalysisDumper for parameters common to all dumpers.

Parameters

Outputs:

Preconditions:

Effects:

    <group name="easyXmlDumper" class="EasyXmlDumper">
      <param key="handler" value="simpleStreamHandler"/>
      <map name="typeMapping">
        <entry key="COMPDUNOM" value="MOD-N"/>
        <entry key="ADJPRENSUB" value="MOD-N"/>
        <entry key="SUBADJPOST" value="MOD-N"/>
        <entry key="SUBSUBJUX" value="MOD-N"/>
        <entry key="TEMPCOMP" value="AUX-V"/>
        <entry key="SujInv" value="SUJ-V"/>
        <entry key="CodPrev" value="COD-V"/>
        <entry key="CoiPrev" value="CPL-V"/>
        <entry key="PronSujVerbe" value="SUJ-V"/>
        <entry key="ADVADV" value="MOD-R"/>
        <entry key="ADVADJ" value="MOD-A"/>
        <entry key="NePas2" value="MOD-V"/>
        <entry key="AdvVerbe" value="MOD-V"/>
        <entry key="COMPADJ" value="MOD-A"/>
        <!--entry key="Neg" value="MOD-V"/-->
        <!--change '_' to '-' -->
        <entry key="SUJ_V" value="SUJ-V"/>
        <entry key="SUJ_V_REL" value="SUJ-V"/>
        <entry key="COD_V" value="COD-V"/>
        <entry key="CPL_V" value="CPL-V"/>
        <entry key="CPLV_V" value="CPL-V"/>
        <entry key="MOD_V" value="MOD-V"/>
        <entry key="MOD_N" value="MOD-N"/>
        <entry key="MOD_A" value="MOD-A"/>
        <entry key="ATB_S" value="ATB-SO,s-o valeur=sujet"/>
        <entry key="ATB_O" value="ATB-SO,s-o valeur=objet"/>
        <entry key="COORD1" value="COORD"/>
        <entry key="COORD2" value="COORD"/>
        <entry key="COMPL" value="COMP"/>
        <entry key="JUXT" value="JUXT"/>
      </map>
      <map name="srcTag">
        <entry key="MOD-N" value="modifieur"/>
        <entry key="MOD-V" value="modifieur"/>
        <entry key="SUJ-V" value="sujet"/>
        <entry key="AUX-V" value="auxiliaire"/>
        <entry key="COD-V" value="cod"/>
        <entry key="CPL-V" value="complement"/>
        <entry key="MOD-R" value="modifieur"/>
        <entry key="APPOS" value="premier"/>
        <entry key="JUXT" value="suivant"/>
        <entry key="ATB-SO" value="attribut"/>
        <entry key="MOD-A" value="modifieur"/>
        <entry key="COMP" value="complementeur"/>
        <entry key="COORD" value="coordonnant"/>
      </map>
      <map name="tgtTag">
        <entry key="MOD-N" value="nom"/>
        <entry key="MOD-V" value="verbe"/>
        <entry key="SUJ-V" value="verbe"/>
        <entry key="AUX-V" value="verbe"/>
        <entry key="COD-V" value="verbe"/>
        <entry key="CPL-V" value="verbe"/>
        <entry key="MOD-R" value="adverbe"/>
        <entry key="APPOS" value="appose"/>
        <entry key="JUXT" value="premier"/>
        <entry key="ATB-SO" value="verbe"/>
        <entry key="MOD-A" value="adjectif"/>
        <entry key="COMP" value="verbe"/>
        <entry key="COORD" value="coord-g"/>
      </map>
    </group>

FullXmlDumper

Class: FullXmlDumper

Role: The role of this process unit is to .

Inputs:

See AbstractTextualAnalysisDumper for parameters common to all dumpers.

Parameters

Outputs:

Preconditions:

Effects:

GeoDumper

Class: GeoDumper

Role: The role of this process unit is to .

Inputs:

See AbstractTextualAnalysisDumper for parameters common to all dumpers.

Parameters
graph PosGraph

Outputs:

Preconditions:

Effects:
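
A configuration sketch showing the graph parameter. The group name geoDumper and the simpleStreamHandler handler value are assumptions for illustration; PosGraph is the value listed for the graph parameter above.

    <group name="geoDumper" class="GeoDumper">
      <!-- assumed handler name, borrowed from other dumper groups on this page -->
      <param key="handler" value="simpleStreamHandler"/>
      <!-- graph parameter set to the value listed in the parameters above -->
      <param key="graph" value="PosGraph"/>
    </group>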

LTRDumper

Class: LTRDumper

Role: The role of this process unit is to .

Inputs:

See AbstractTextualAnalysisDumper for parameters common to all dumpers.

Parameters
handler default is 'simpleStreamHandler'

Outputs:

Preconditions:

Effects:

NullDumper

Class: NullDumper

Role: The role of this process unit is to dump nothing.

Inputs: See AbstractTextualAnalysisDumper for parameters common to all dumpers.

Outputs: None.

Preconditions: None.

Effects: None.

posGraphXmlDumper

Class: posGraphXmlDumper

Role: The role of this process unit is to .

Inputs:

See AbstractTextualAnalysisDumper for parameters common to all dumpers.

Parameters

Outputs:

Preconditions:

Effects:

SimpleXmlDumper

Class: SimpleXmlDumper

Role: The role of this process unit is to .

Inputs:

See AbstractTextualAnalysisDumper for parameters common to all dumpers.

Parameters

Outputs:

Preconditions:

Effects:

TextDumper

Class: TextDumper

Role: The role of this process unit is to .

Inputs:

See AbstractTextualAnalysisDumper for parameters common to all dumpers.

Parameters
outputSuffix default value is '.out'

Outputs:

Preconditions:

Effects:
