- Change: Significantly updated hyphenation patterns for:
  - German
  - German (Traditional)
  - German (Swiss Traditional)
- Change: Minimum PHP version increased to 7.4.0.
- Change: The following public methods have been deprecated:
  - `Strings::mb_str_split`
  - `Strings::uchr`
- Bugfix: No more deprecation warnings when running on PHP 8.1.
- Bugfix: The ASCII string functions actually get used on pure ASCII fragments (instead of falling back to the slower `mb_*` functions).
- Feature: Use native `mb_str_split` on PHP 7.4 and above.
- Change: Significantly updated hyphenation patterns for:
  - Amharic
  - Chinese pinyin (Latin)
  - German
  - German (Traditional)
  - German (Swiss Traditional)
  - Latin (Classical)
  - Latin (Liturgical)
  - Spanish
- Bugfix: Ambiguous `blase` removed from smart diacritics list for `de-DE`.
- Bugfix: PHP 7.4 compatibility.
- Bugfix: Parsing does not break anymore when the returned `DOMDocument` is invalid.
- Bugfix: The smart area and volume units fix now handles missing whitespace as well (e.g. `5m2` is transformed into `5 m²`).
- Feature: Use non-breaking hyphen for connecting one-letter-words and when an elision is followed by a comma.
- Feature: All special unicode characters can now be remapped using the `Settings` constructor or the new `Settings::remap_character()` method. By default, `U::APOSTROPHE` is remapped to `U::SINGLE_QUOTE_CLOSE` and `U::NO_BREAK_NARROW_SPACE` to `U::NO_BREAK_SPACE`, keeping compatibility with previous versions.
- Feature: A new dash style (`Dash_Styles::INTERNATIONAL_NO_HAIR_SPACES`) has been added, following the Duden convention of not having hair spaces around interval dashes.
- Feature: Metric area and volume units can now be prettified (e.g. `m2` to `m²`).
- Change: All settings keys now have named constants. Going forward, please only use those.
- Bugfix: Decades in the English (`'60s`) and German (`'80er`) styles are now rendered with an apostrophe.
- Bugfix: To prevent common false positives for single-letter Roman ordinals (especially in French and Dutch), Roman numeral matching is now only enabled when `Settings::set_smart_ordinal_suffix_match_roman_numerals` is set to `true`. In addition, only `I`, `V`, and `X` are accepted as single-letter Roman numbers.
- Bugfix: The Unicode hyphen character (`‐`) is recognized as a valid word combiner.
- Bugfix: Parts of hyphenated words should not be detected as Roman numerals anymore.
- Feature: French (1ère) and "Latin" (1o) ordinal numbers are now supported by the smart ordinals feature (also with Roman numerals, e.g. XIXème).
- Bugfix: Unit spacing was not applied to monetary symbols ($, €, etc.).
- Bugfix: Certain entities (e.g. `&`) were not encoded correctly when modifying a node.
- Feature: The comma can now be used as a decimal separator (e.g. `1,5`, in addition to `1.5`).
- Change: PHP-Typography now uses the Unicode hyphen character (`‐`) instead of the hyphen-minus (`-`).
- Change: Smart dashes has been refactored into a separate token fix.
- Bugfix: Smart maths properly handles 2-digit years in dates.
- Bugfix: Smart diacritics won't try to "correct" the spelling of `Uber` anymore.
- Bugfix: French punctuation is now correctly applied to quotes preceded or followed by round and square brackets.
- Feature: A narrow no-break space is now inserted between adjacent primary and secondary quotes.
- Feature: The list of "apostrophe exceptions" (like `'tain't`, `'til`) can now be adjusted.
- Change: Significantly updated hyphenation patterns for:
  - Bulgarian
  - German
  - German (Traditional)
  - German (Swiss Traditional)
  - Latin (Liturgical)
  - Thai
- Bugfix: Smart quotes replacement could result in invalid unicode sequences in rare cases.
- Bugfix: The French spacing rules were not applied to closing guillemets followed by a comma.
- Bugfix: 50/50 (and x/x except 1/1) are not treated as fractions anymore.
- Bugfix: Smart fractions were not matched correctly if they were followed by a comma (e.g. `1/4,`).
- Bugfix: In rare cases, UTF-8 characters were broken by a missing 'u' flag in a regular expression.
- Bugfix: The `Quotes` class was missing from the signature of `Settings::set_smart_quotes_*`.
- Bugfix: < and > were silently dropped when replacing nodes due to HTML injection.
- Feature: New hyphenation languages:
  - Assamese,
  - Belarusian,
  - Bengali,
  - Church Slavonic,
  - Esperanto,
  - Friulan,
  - Gujarati,
  - Kannada,
  - Kurmanji,
  - Malayalam,
  - Norwegian (Bokmål),
  - Norwegian (Nynorsk),
  - Piedmontese,
  - Romansh,
  - Upper Sorbian.
- Bugfix: `Default_Registry::get_default_node_fixes` and `Default_Registry::get_default_token_fixes` were missing the `static` keyword.
- Feature: New `Registry` class introduced to allow for custom fixes collections.
- Feature: CSS classes for the virtual `<body>` node can now be set when processing strings.
- Change: "French punctuation spacing" is now off by default.
- Change: The deprecated class `Hyphenator_Cache` has been removed.
- Change: The deprecated properties `Settings::inappropriate_tags` and `Settings::self_closing_tags` have been removed.
- Bugfix: Numbers are treated like characters for the purpose of wrapping emails.
- Bugfix: Sometimes, the French double quotes style generated spurious ».
- Bugfix: Remove some ambiguous diacritics replacements from the German language file.
- Bugfix: Dewidow didn't honor narrow spaces.
- Change: `Hyphenator_Cache` has been moved to `Hyphenator\Cache`.
- Change: New method `has_changed` for `Hyphenator\Cache`.
- Change: Properties `Settings::inappropriate_tags` and `Settings::self_closing_tags` have been deprecated.
- Bugfix: Settings hash omitted some properties (props @shimikano).
- Feature: New hyphenation language "German (Swiss Traditional)" added.
- Feature: Dewidowing can now be applied to the final 1–3 words.
- Change: Started adding some benchmarks.
- Change: Updated HTML5 parser (html5-php) to 2.3.0:
  - Tokenizer performance improved by 20 percent.
  - Various small bugfixes.
- Bugfix: Fatal error on PHP 5.6.x (caused by using `__METHOD__` as a variable function) fixed.
- Bugfix: Hyphenator caching was not really working.
- Feature: Use Composer for dependencies.
- Change: API refactoring:
  - Core API refactored and minimum PHP version increased to 5.6.0.
  - PHP_Typography broken into smaller classes (one for each "fix").
  - Default Settings removed.
- Bugfix: French punctuation spacing after links (and other inline tags) fixed.
- Bugfix: Lone ampersands are treated as single-character words.
- Bugfix: Hyphenated words are properly de-widowed.
- Bugfix: Prevent crash on PHP 5.x when building the hyphenation trie.
- Feature: Prevent line-breaks in numbered abbreviations (e.g. `ISO 9001`).
- Change: Core API refactored and minimum PHP version increased to 5.4.0.
- Change: Updated hyphenation patterns:
  - German
  - German (Traditional)
  - Latin
  - Latin (Liturgical)
- Change: Updated list of valid top-level domains.
- Bugfix: Hyphenation patterns at the end of a word were accidentally ignored.
- Bugfix: Diacritics replacement does not count soft hyphens as word boundaries anymore.
- Bugfix: Performance issue accidentally introduced in 4.1.0 fixed.
- Feature: Hyphenator instance has been made cacheable.
- Bugfix: Incorrect replacement of initial hyphens fixed.
- Bugfix: French spacing rules improved.
- Bugfix: Proper dashes for German date intervals.
- Bugfix: Workaround for PHP 5.3 issue in `dewidow` callback.
- Feature: New Settings API added.
- Feature: New hyphenation languages:
  - Hindi,
  - Marathi,
  - Occitan,
  - Oriya,
  - Panjabi,
  - Tamil,
  - Telugu.
- Change: Updated list of valid top-level domains.
- Bugfix: Remove ambiguous entries from German diacritics replacement file.
Skipped.
- Bugfix: Quotes ending in numbers were sometimes interpreted as primes.
- Feature: Added "Latin (Liturgical)" as a new hyphenation language.
- Change: Updated list of valid top-level domains.
- Change: Updated HTML5 parser (html5-php) to 2.2.2.
- Bugfix: Custom hyphenations with more than one hyphenation point were not working properly.
- Bugfix: The `min_after` hyphenation setting was off by one.
- Bugfix: Fractions did not play nice with prime symbols.
- Store hyphenation patterns as JSON files instead of PHP.
- Updated list of valid top-level domains.
- Updated HTML parser (html5-php) to 2.2.1.
- Updated list of valid top-level domains.
- Prevent references to US non-profit organizations like `501(c)(3)` being replaced with the copyright symbol (props @randybruder).
- Added CSS classes for smart fractions ("numerator", "denominator") and ordinal suffixes ("ordinal").
- Fixed « and » spacing when French punctuation style is enabled.
- "Duplicate ID" warnings should be gone now, regardless of the installed libXML version.
Skipped.
- Added support for the French punctuation style (thin non-breakable space before `;:?!`).
- Added proper hyphenation of hyphenated compound words (e.g. `editor-in-chief`).
- Added partial support for styling hanging punctuation.
- Prevent incorrect replacement of straight quotes with primes (e.g. `"number 6"` is not replaced with `“number 6″` but with `“number 6”`).
- Fixed a bug that prevented header tags (`<h1>` … `<h6>`) that were set as “tags to ignore” from actually being left alone by the plugin.
Skipped.
- Fixed fatal error when running on PHP 5.3 (use of $this in anonymous function).
- Minimum PHP version updated to 5.3.4 (from 5.3.0) to ensure consistent handling of UTF-8 regular expressions.
- Fixed diacritics replacement for UTF-8 strings
- Date-like values (e.g. "during the fiscal year 2015/2016") are not converted to smart fractions anymore.
- Added ability to switch between dash styles: both traditional US (em dash without spacing) and international usage (en dash with spaces) can be selected.
- Various white-space fixes related to dash styling.
- Fixed a bug where block-level tags were not detected correctly.
- Added workaround for duplicate ID warnings generated by some versions of libXML.
- Updated all hyphenation files and added the following new languages:
  - Afrikaans,
  - Armenian,
  - Dutch,
  - Georgian,
  - German (Traditional),
  - Latin (Classical),
  - Latvian,
  - Thai, and
  - Turkmen.
- Prevent accidentally invalid XPath queries from being fatal on the frontend.
- Fixed a bug in the XPath expression for ignoring tags by CSS ID.
- A typo prevented custom quote styles from working.
Skipped.
- DOM-based HTML parsing with HTML5-PHP
- Added German as a diacritics language (mainly for French words).
- Various optimizations (hyphenation is still slow, though)
- Fixed custom hyphenation patterns.
- Adopted semantic versioning for the project.
- Simplified acronym identification to not include some obscure uppercase characters. This will reduce support for some non-English languages, but it resolves an issue of catastrophic failure (where the entire page fails to load) with certain server configurations.
- Security Fix: Prevented comments with exceptionally long strings from causing a fatal PHP error.
- Fixed bug that caused occasional hyphenation errors for non-English languages.
- Fixed bug in custom diacritic handling
- Resolved uninitialized variable
- Added HTML5 elements to parsing algorithm for greater contextual awareness
- Fixed bug where dewidow functionality would add broken no-break spaces to the end of texts, and smart_exponents would drop some of the resulting text.
- Declared encoding in all instances of mb_substr to avoid conflicts
- Corrected a few instances of undeclared variables.
- Added Norwegian Hyphenation Patterns
- Fixed bug in diacritic handling.
- Added automated diacritic replacements (i.e. "creme brulee" becomes "crème brûlée").
- Improved smart quotes and smart dashes with sensitivity to adjacent diacritic characters.
- Replaced quotation language styles with individual selection of primary and secondary quotation styles.
- Improved space collapse functionality.
- Corrected bug in smart quote and single character word handling where the "0" character may be improperly duplicated
- Added option to collapse adjacent space characters to a single character
- Corrected multibyte character handling error that could cause some text to not display properly
- Added language specific quote handling (for single quotes, not just double) for English, German and French quotation styles
- Added language specific quote handling for English, German and French quotation styles
- Corrected multibyte character handling error that could cause some text to not display properly
- Expanded the multibyte character set recognized as valid word characters for improved hyphenation
- Added option to force single character words to wrap to new line (unless they are widows).
- Fixed bug where hyphenation pattern settings were not initialized with multiple phpTypography class instances.
- Corrected math and dash handling of dates
- Styling of uppercase words now plays nicely with soft-hyphens
- Reformatted language files for increased stability and to bypass a false positive from Avira's free antivirus software
- Efficiency Optimizations (approximately 25% speed increase). Thanks Jenny!
- Added the ability to exclude hyphenation of capitalized (title case) words to help protect proper nouns
- Added Hungarian hyphenation patterns
- Fixed an instance where pre-hyphenated words were hyphenated again
- Removed two uses of create_function() for improved performance
- Corrected many uninitialized variables
- Corrected two variables that were called out of scope
- Moved the processing of widow handling after hyphenation so that max-pull would not be compared to the length of the adjacent word, but rather the length of the adjacent word segment (i.e. that after a soft hyphen)
- By default, when class phpTypography is constructed, set_defaults is called. However, if you are going to manually set all settings, you can now bypass the set_defaults call for slightly improved performance. Just call `$typo = new phpTypography(FALSE)`.
- Decoded special HTML characters (for feeds only) to avoid invalid character injection (according to XML's specs)
- Reverted use of the hyphen character to the basic hyphen-minus in words like "mother-in-law" because of poor support in IE6
- Fixed smart math handling so it can be turned off.
- Corrected smart math handling to not convert slashes in URLs to division signs
- Corrected label in admin interface that indicated pretty fractions were part of basic math handling.
- Added test to phpTypography methods `process()` and `process_feed()` to skip processing if `$isTitle` parameter is `TRUE` and `h1` or `h2` is an excluded HTML tag
- Added catch-all quote handling, now any quotes that escape previous filters will be assumed to be closing quotes
- Changed thin space injection behavior so that for text such as "...often-always?-judging...", the second dash will be wrapped in thin spaces
- Corrected error where fractions were not being styled because of a zero-space insertion with the wrap hard hyphens functionality
- Added default class to exclude: `noTypo`
- Added "/" as a valid word character so we could capture "this/that" as a word for processing (similar to "mother-in-law")
- Corrected error where characters from the Latin 1 Supplement Block were not recognized as word characters
- Corrected smart quote handling for strings of numbers
- Added smart guillemet conversion: `<<` and `>>` to `«` and `»`
- Added smart Single Low 9 Quote conversion as part of smart quotes: comma followed by non-space becomes Single Low 9 Quote
- Added Single Low 9 Quote, Double Low 9 Quote and » to style_initial_character functionality
- Added a new phpTypography method smart_math that assigns proper characters to minus, multiplication and division characters
- Deprecated the phpTypography method smart_multiplication in favor of smart_math
- Cleaned up some smart quote functionality
- Added ability to wrap after "/" if set_wrap_hard_hyphen is TRUE (like "this/that")
- Critical bug fix: RSS feeds were being disabled by previous versions. This has been corrected.
- Corrected error where requiring Em/En dash thin spacing "word-" would become "word –" instead of "word–"
- Added default encoding value to smart_quote handling to avoid PHP warning messages
- Corrected curling quotes at the end of block level elements
- Corrected multibyte character conflict in smart-quote handling that caused infrequent dropping of text
- Thin space injection included for en-dashes
- Initial release