Releases: caleb531/automata
v8.4.0
A minor release with updates from the recently-completed pyOpenSci review:
pyOpenSci/software-submission#152
New Features
- Added length limits to the
successor
family of functions
Fixes
- Fixed addition of unreachable states for some cases in the creation of partial DFAs, and made behavior more consistent (#228)
- Fixed bug in edge case for minify (#218)
Documentation
- Added documentation making the library easier to use for new users, with more detailed examples
v8.3.0
Notable Changes
- Added a new
DFA.from_substrings
method using the Aho-Corasick algorithm (#167) - Fixed support for Python 3.12
- The contributing guide has been reworked with clearer information on how you can contribute to the Automata library!
Miscellaneous
v8.2.0
New Features
- Added a
show_diagram
method for Push Down Automata (#177)
New Documentation Site!
- The Automata documentation has moved to a brand new documentation website with better navigation and a new search capability that should greatly improve the experience. Check it out at https://caleb531.github.io/automata/
Fixes
- Fixed the handling of the quantifier pattern in regular expressions by switching from greedy matching (
*
) to lazy matching (*?
) (#181)
v8.1.0
New Features
- Added an optional minify step (default-enabled) to the
DFA.to_partial()
method
Announcements
- We are currently in the process of submitting the Automata project to the Journal of Open Source Software (JOSS), which will introduce the project to a broader community of research using open source software
v8.0.0
Automata v8 is a performance and polish-focused release, providing improvements
under the hood that should improve the quality of life for most package users.
Huge thanks to @khoda81, and @EduardoGoulart1 for the new features in this release!
New Features
Jupyter Notebook Integration (#129)
All finite automata (subclasses of the FA class) now have native support for diagram
creation in Jupyter notebooks! This is enabled by installing the optional visual
dependency, which includes the
pygraphviz
and coloraide
dependencies used in the show_diagram()
methods for DFAs/NFAs.
The styling for the diagrams has been changed from previous releases and is
modeled after the diagrams used in the
visual automata package.
The show_diagram()
now returns an AGraph
corresponding to the given FA,
and the FA will render a diagram automatically inside of a Juyter notebook. See the
FA documentation for more details.
Type Hints (#131)
Type annotations have been added to the library, providing greater information
when using code completion tools like Pylance. This also makes it easier to type-check
application code using the library.
Better Partial DFA Support (#147)
Methods in the DFA class now have greater support for partial DFAs (ones with
some missing transitions), including support for more efficient binary operations
with partial DFAs. Partial DFAs have smaller description complexity, and thus
some algorithms are more efficient when operating on them.
In addition, the from_prefix
, from_finite_language
, and from_nfa
DFA creation
functions now return partial DFAs by default. The new to_partial
and to_complete
methods can be used to convert a DFA into an equivalent partial or complete DFA
respectively.
my_dfa.to_partial() # get equivalent partial DFA
my_dfa.to_complete() # get equivalent partial DFA
Please see the DFA documentation for the full details on these new
methods (outlined below).
Smaller Changes
Regex Changes
- Quantifiers were added to the regular expression parsing, representing the repetition
of the preceding expression. For example, the regexa{1,2}
would matcha
between 1
and 2 times (#121). - A pair of parenthesis
()
can now be used to represent the empty string (#136). - The default set of input symbols for
NFA.from_regex
was changed to all ascii letters and digits (#121). - State names of NFAs constructed from regexes no longer persist between runs of the regex engine (#130).
Pickling Support (#125)
All automata now natively support pickling.
Examples (#146)
A section has been added to the documentation demonstrating example uses of the package, can
be found here.
Breaking Changes
Dependency Changes
Python 3.7 support has been dropped. Please upgrade to Python 3.8 or later to
use Automata v8.
Diagrams are no longer being generated using pydot
; this dependency has been
dropped in favor of using the visual
optional dependency, which will install
pygraphviz
and coloraide
used for generating figures. You should install
this optional dependency if you wish to generate figures. This change was to
allow for native support for displaying finite automaton in Jupyter notebooks.
The style of the diagrams has been lifted from the visual automata package,
so you should take a look at the diagrams generated and see if they are still
satisfactory.
Other new dependencies have been added, but these will be installed automatically
along with v8 of the package.
Greater Support for Partial DFAs
There is now greater support for partial DFAs, which included changing the
DFA.from_nfa()
function to return a partial DFA instead of a complete one.
To obtain a complete DFA, you must now call DFA.from_nfa().to_complete(trap_state_name)
,
where trap_state_name
will be used as the name for a trap state if one needs to
be added.
Type Hints
Type hints have now been added, meaning that code which previously called functions
with incorrect types may not have been flagged. See output from your typechecker
for more information.
NFA.from_regex default input symbols
The default set of input symbols for NFA.from_regex
was changed to all ASCII letters and digits.
If needing to use a specific set of input symbols, use the input_symbols
parameter.
v7.1.0
- Added new
enable_mutable_automata
global configuration option if you need to maximize the performance of automaton creation- See the documentation for the Automaton class
- Optimizations to automaton creation even if you leave immutable automata enabled
- Added a new Shuffle operator (denoted with the caret symbol
^
) for regular expressions (e.g.a^b
represents all permutations ofa
andb
) (#112) - Fixed a bug with the conversion from GNFA to regular expression (#122 and #123)
- Minor improvements to the error messaging for the
automata.regex.parser
module - Other small optimizations and improvements
v7.0.1
- Minor style/performance improvements to GNFA class
- Corrections to documentation
DFA.cardinality()
actually raises anInfiniteLanguageException
for
infinite languages, rather than returningfloat('inf')
DFA.maximum_word_length()
actually returnsNone
for infinite languages,
rather than returningfloat('inf')
- Added missing documentation for
EmptyLanguageException
v7.0.0
Automata v7 is a packed release, introducing immutable automata instances, new
DFA/NFA operations, and performance optimizations across the board to make
automaton machines easier to construct, validate, and work with. The
documentation has also been reorganized to make browsing much easier.
Huge thanks to @eliotwrobson and @Tagl for their massive code contributions to
this release, as well as their feedback on v7's development!
New Features
Immutability
All Automaton instances are now fully immutable to protect against common
pitfalls, such as mutating an automaton to an invalid state after it's already
been validated.
This means that if you wish to make a change to an automaton instance, you must
retrieve its attributes as a dictionary (using the new input_parameters
property), make your desired change, then pass those parameters to the relevant
constructor. For example:
from automata.fa.dfa import DFA
dfa1 = DFA(
states={'q0', 'q1', 'q2'},
input_symbols={'0', '1'},
transitions={
'q0': {'0': 'q0', '1': 'q1'},
'q1': {'0': 'q0', '1': 'q2'},
'q2': {'0': 'q2', '1': 'q1'}
},
initial_state='q0',
final_states={'q1'}
)
# You can still copy an automaton just fine
dfa2 = dfa.copy()
# If you want to make a change, you must create a new instance; please note
# that dfa2.input_parameters is always a deep copy of the input parameters for
# dfa2 (in other words, mutating dfa2.input_parameters will not actually mutate
# dfa2)
params = dfa2.input_parameters
params['final_states'] = {'q2'}
dfa3 = DFA(**params)
DFA and NFA Equivalence via ==
You can now use ==
to check if two DFA accept the same language. You can also
use ==
to check if two NFA instances accept the same language.
dfa1 == dfa2
nfa1 == nfa2
New DFA Methods
DFAs now include more methods for working with accepted languages. For example,
the library can now build you a DFA from a language represented by any sequence
of strings. Similarly, you can now iterate over a DFA, which will yield a string
accepted by that DFA at each iteration.
len(my_dfa) # get the cardinality of a DFA
for word in my_dfa: # loop over every string accepted by the DFA
print(word)
Please see the DFA documentation for the full details on these new
methods (outlined below).
Click to expand new DFA methods
DFA.from_finite_language(cls, input_symbols, language)
Constructs the minimal DFA corresponding to the given finite language and input symbols.
DFA.from_finite_language(
language={'aa', 'aaa', 'aaba', 'aabbb', 'abaa', 'ababb', 'abbab',
'baa', 'babb', 'bbaa', 'bbabb', 'bbbab'},
input_symbols={'a', 'b'})
DFA.complement(self, retain_names=False, minify=True)
Returns a new DFA which accepts all input rejected by the DFA on which
complement()
is called.
minimal_complement_dfa = ~dfa
complement_dfa = dfa.complement(minify=False)
DFA.predecessor(self, input_str, strict=True, key=None)
Returns the first string accepted by the DFA that comes before the input string
in lexicographical order.
prev_word = dfa.predecessor('0110')
same_word = dfa.predecessor(prev_word, strict=False)
DFA.predecessors(self, input_str, strict=True, key=None)
Generates all strings that come before the input string in lexicographical
order.
# Generates all strings in a language in lexographical order
for word in dfa.predecessors(None):
print(word)
DFA.successor(self, input_str, strict=True, key=None)
Returns the first string accepted by the DFA that comes after the input string
in lexicographical order.
next_word = dfa.successor('0110')
same_word = dfa.predecessor(next_word, strict=False)
DFA.successors(self, input_str, strict=True, key=None, reverse=False)
Generates all strings that come after the input string in lexicographical order
# Generates all strings in a language in lexographical order
for word in dfa.successors(None):
print(word)
DFA.minimum_word_length(self)
Returns the length of the shortest word in the language represented by the DFA.
dfa.minimum_word_length()
DFA.maximum_word_length(self)
Returns the length of the longest word in the language represented by the DFA.
In the case of infinite languages, float('inf')
is returned.
dfa.maximum_word_length()
DFA.count_words_of_length(self, k)
Counts words of length k
accepted by the DFA.
dfa.count_words_of_length(3)
DFA.words_of_length(self, k)
Generates words of length k
accepted by the DFA.
for word in dfa.words_of_length(3):
print(word)
You can also iterate through all words accepted by the DFA.
Be aware that the generator may be infinite.
for word in dfa:
if len(word) > 10:
break
print(word)
DFA.cardinality(self)
Returns the cardinality of the language represented by the DFA. Note that
len(dfa)
raises a ValueError
for infinite languages, whereas
DFA.cardinality
will return float('inf')
.
dfa.cardinality()
len(dfa)
DFA.from_prefix(cls, input_symbols, prefix, contains=True)
Directly computes the minimal DFA recognizing strings with the given prefix.
contains_prefix_nano = DFA.from_prefix({'a', 'n', 'o', 'b'}, 'nano')
avoids_prefix_nano = DFA.from_prefix({'a', 'n', 'o', 'b'}, 'nano', contains=False)
DFA.from_suffix(cls, input_symbols, suffix, contains=True)
Directly computes the minimal DFA recognizing strings with the given prefix.
contains_suffix_nano = DFA.from_suffix({'a', 'n', 'o', 'b'}, 'nano')
avoids_suffix_nano = DFA.from_suffix({'a', 'n', 'o', 'b'}, 'nano', contains=False)
DFA.from_substring(cls, input_symbols, substring, contains=True, must_be_suffix=False)
Directly computes the minimal DFA recognizing strings containing the given
substring.
contains_substring_nano = DFA.contains_substring({'a', 'n', 'o', 'b'}, 'nano')
avoids_substring_nano = DFA.contains_substring({'a', 'n', 'o', 'b'}, 'nano', contains=False)
DFA.from_subsequence(cls, input_symbols, subsequence, contains=True)
Directly computes the minimal DFA recognizing strings containing the given
subsequence.
contains_substring_dcba = DFA.contains_subsequence({'a', 'b', 'c', 'd'}, 'dcba')
avoids_substring_dcba = DFA.contains_subsequence({'a', 'b', 'c', 'd'}, 'dcba', contains=False)
DFA.of_length(cls, input_symbols, min_length=0, max_length=None, symbols_to_count=None)
Directly computes the minimal DFA which accepts all words whose length is between min_length
and max_length
, inclusive. To allow infinitely long words, the value None
can be
passed in for max_length
. If symbols_to_count
is None
(default behavior), then counts
all symbols. Otherwise, only counts symbols present in the set symbols_to_count
and
ignores other symbols.
DFA.count_mod(cls, input_symbol, k, remainders=None, symbols_to_count=None)
Directly computes a DFA that counts given symbols and accepts all strings where
the remainder of division by k
is in the set of remainders
given. The
default value of remainders
is {0}
and all symbols are counted by default.
even_length_strings = DFA.count_mod({'0', '1'}, 2)
odd_number_of_ones = DFA.count_mod({'0', '1'}, 2, remainders={1}, symbols_to_count={'1'})
DFA.nth_from_start(cls, input_symbols, symbol, n)
Directly computes the minimal DFA which accepts all words whose n
-th character from the end is symbol
, where n
is a positive integer.
dfa = DFA.nth_from_start({'0', '1'}, '1', 4)
DFA.nth_from_end(cls, input_symbols, symbol, n)
Directly computes the minimal DFA which accepts all words whose n
-th character from the end is symbol
, where n
is a positive integer.
dfa = DFA.nth_from_end({'0', '1'}, '1', 4)
DFA.count_words_of_length()
Counts words (of the given length) that are accepted by the DFA
dfa.count_words_of_length(3)
DFA.universal_language(cls, input_symbols)
Returns a new DFA which accepts all strings formed from the given input symbols.
DFA.universal_language(input_symbols={'a', 'b'})
DFA.empty_language(cls, input_symbols)
Returns a new DFA which rejects all strings formed from the given input symbols.
DFA.empty_language(input_symbols={'a', 'b'})
New NFA Methods
Please see the NFA documentation for the full details on these new
methods.
Click to expand new NFA methods
NFA.left_quotient(self, other)
Returns the left quotient of one NFA with respect to another.
new_nfa = nfa1.left_quotient(nfa2)
NFA.right_quotient(self, other)
Returns the right quotient of one NFA with respect to another.
new_nfa = nfa1.right_quotient(nfa2)
NFA.shuffle_product(self, other)
Returns shuffle product of two NFAs. This produces an NFA that accepts all
interleavings of strings in the input NFAs.
new_nfa = nfa1.shuffle_product(nfa2)
NFA.edit_distance(cls, input_symbols, reference_str, max_edit_distance, insertion=True, deletion=True, substitution=True)
Constructs the NFA for the given reference_str for the given Leve...
v6.0.2
- Fixed a critical bug caused by v6.0.1, where v6.0.1 attempted to implement the missing
NFA.to_regex()
method. However, the addition of the method caused a circular import error that was forced to be remove for the sake of the library's stability. So sorry about that.