Option to enrich ALTO content description #83

Open
cipriandinu opened this issue Jan 27, 2023 · 2 comments

@cipriandinu

cipriandinu commented Jan 27, 2023

Some printed and handwritten materials cannot be properly described by ALTO, for example complex mathematical or chemical formulas, or musical notation. Specialized XML formats exist for these cases (MathML, MusicXML, etc.). Maybe we can add a new block type, "CustomBlock", that contains a custom content definition instead of text lines. The idea is to embed these custom definitions the same way metadata records are embedded in METS. Some possible samples:

<CustomBlock ID=... XPOS=... ...>
   <cbWrap MIMETYPE="text/xml" cbType="MathML">
       <xmlData>
          <math xmlns="http://www.w3.org/1998/Math/MathML">
             <mi>&#x03C0;<!-- π --></mi>
             <mo>&#x2062;<!-- &InvisibleTimes; --></mo>
             <msup>
                <mi>r</mi>
                <mn>2</mn>
             </msup>
          </math>
       </xmlData>
   </cbWrap>
</CustomBlock>

<CustomBlock ID=... XPOS=... ...>
   <cbWrap MIMETYPE="text/xml" cbType="MusicXML">
       <xmlData>
          <score-partwise version="4.0">
             <part-list>
                <score-part id="P1">
                   <part-name>Music</part-name>
                </score-part>
             </part-list>
             <part id="P1">
                <measure number="1">
                   <attributes>
                      <divisions>1</divisions>
                      <key>
                         <fifths>0</fifths>
                      </key>
                      <time>
                         <beats>4</beats>
                         <beat-type>4</beat-type>
                      </time>
                      <clef>
                         <sign>G</sign>
                         <line>2</line>
                      </clef>
                   </attributes>
                   <note>
                      <pitch>
                         <step>C</step>
                         <octave>4</octave>
                      </pitch>
                      <duration>4</duration>
                      <type>whole</type>
                   </note>
                </measure>
             </part>
          </score-partwise>
       </xmlData>
   </cbWrap>
</CustomBlock>
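
To illustrate how a consumer might read such a block, here is a minimal Python sketch. It assumes the element and attribute names proposed above (CustomBlock, cbWrap, cbType, xmlData), which are not part of any published ALTO schema. The `ID` value and the `extract_payloads` helper are made up for illustration, and the ALTO namespace is left out for brevity.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: CustomBlock, cbWrap, cbType and xmlData are the names
# proposed in this issue, not existing ALTO elements. The ID is invented.
SAMPLE = """
<CustomBlock ID="CB1">
  <cbWrap MIMETYPE="text/xml" cbType="MathML">
    <xmlData>
      <math xmlns="http://www.w3.org/1998/Math/MathML">
        <mi>&#x03C0;</mi>
        <mo>&#x2062;</mo>
        <msup><mi>r</mi><mn>2</mn></msup>
      </math>
    </xmlData>
  </cbWrap>
</CustomBlock>
"""


def extract_payloads(block_xml):
    """Return (cbType, payload root element) pairs for each cbWrap in a CustomBlock."""
    block = ET.fromstring(block_xml)
    payloads = []
    for wrap in block.iter("cbWrap"):
        data = wrap.find("xmlData")
        if data is None or len(data) == 0:
            continue  # skip wrappers that carry no embedded XML
        # The first child of xmlData is the root of the embedded document.
        payloads.append((wrap.get("cbType"), data[0]))
    return payloads


payloads = extract_payloads(SAMPLE)
for cb_type, root in payloads:
    print(cb_type, root.tag)
```

The embedded document keeps its own namespace, so a consumer can dispatch on either `cbType` or the namespace of the payload root.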

Alternatively, we may extract this content from ALTO and use a mechanism based on external files (again inspired by METS, which refers to several external files such as ALTO, images, etc. via fileGrp/file).
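
The external-file alternative could look roughly like the sketch below. The `cbRef` element is an invented name, loosely modelled on how METS `FLocat` points to a file with `LOCTYPE` and `xlink:href`. The `ID`, the file path, the example URL, and the `external_refs` helper are likewise assumptions for illustration only.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

XLINK_NS = "http://www.w3.org/1999/xlink"

# Hypothetical fragment: cbRef is an invented element, not part of ALTO or METS.
# It borrows LOCTYPE and xlink:href from the METS FLocat element.
SAMPLE = """
<CustomBlock xmlns:xlink="http://www.w3.org/1999/xlink" ID="CB2">
  <cbRef MIMETYPE="text/xml" cbType="MusicXML" LOCTYPE="URL"
         xlink:href="custom/score_0001.musicxml"/>
</CustomBlock>
"""


def external_refs(block_xml, alto_url):
    """Return (cbType, absolute URL) pairs for each cbRef in a CustomBlock."""
    block = ET.fromstring(block_xml)
    refs = []
    for ref in block.iter("cbRef"):
        href = ref.get(f"{{{XLINK_NS}}}href")
        if href:
            # Resolve relative references against the location of the ALTO file.
            refs.append((ref.get("cbType"), urljoin(alto_url, href)))
    return refs


refs = external_refs(SAMPLE, "https://example.org/alto/page_0001.xml")
print(refs)
```

Compared with the embedded variant, this keeps the ALTO file small, but the consumer has to fetch and validate a second document.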

@cipriandinu

It looks like my samples were also processed. I will try again to keep the real XML code.

@cipriandinu

samples.txt
