Skip to content

A Python library to support parsing regeXML expressions.

License

Notifications You must be signed in to change notification settings

kjwilcox/regeXML

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

12 Commits
 
 
 
 
 
 
 
 

Repository files navigation

regeXML

A Python library to support parsing regeXML expressions.

Tired of trying to wrap your tiny brain around difficult-to-understand regular expressions?

<[a-z]+>.*</[a-z]+>

What does that even mean? It's impossible to know!

What if I told you there was a human-readable method of expressing your regular expression?

<?xml version="1.0" ?>
<expression type="regular" dialect="posix"
 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
 xsi:noNamespaceSchemaLocation="regexml.xsd">
 <character encoding="utf-8" locale="en-US"><![CDATA[<]]></character>
 <repeat type="oneormore">
  <characterset>
   <character encoding="utf-8" locale="en-US">a</character>
   <character encoding="utf-8" locale="en-US">z</character>
  </characterset>
 </repeat>
 <character encoding="utf-8" locale="en-US"><![CDATA[>]]></character>
 <repeat type="zeroormore">
  <wildcard />
 </repeat>
 <character encoding="utf-8" locale="en-US"><![CDATA[<]]></character>
 <character encoding="utf-8" locale="en-US">/</character>
 <repeat type="oneormore">
  <characterset>
   <character encoding="utf-8" locale="en-US">a</character>
   <character encoding="utf-8" locale="en-US">z</character>
  </characterset>
 </repeat>
 <character encoding="utf-8" locale="en-US"><![CDATA[>]]></character>
</expression>

Obviously, this example is much easier to read in regeXML format. It has the following additional benefits:

  • Easier for a computer to read, parse, serialize and transmit.
  • Is more extensible than standard regex syntax.
  • Can be validated against a schema to ensure correctness.
  • Can be easily and safely embedded into other XML-based documents for composability.

Usage:

import regexml
regex_object = regexml.re_from_xml(some_xml_string)
regex_object.search(...)
regex_object.match(...)

About

A Python library to support parsing regeXML expressions.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages