Replaces busted characters carried over from legacy text encodings with the proper UTF-8 characters.
yarn add utfu || npm install utfu
Say you've got a string that looks like this:
There�s no way I�m paying �30 for that!
Pass it to any of the three methods, hex, htx, or txt, and you'll hopefully get back:
There’s no way I’m paying €30 for that!
- hex substitutes Unicode hex escapes (e.g., \u20ac)
- htx substitutes the HTML escape sequence (e.g., the HTML entity for €)
- txt substitutes the actual character (e.g., €)
- See the substitution chart for the mappings, more or less
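To make the three output styles concrete, here is a minimal sketch that writes a single character in each form. The helper names (toHexEscape, toHtmlEntity, toText) are hypothetical and are not part of utfu; the numeric hex entity is only one possible HTML escape, and the exact form utfu emits may differ. The \u form shown here assumes a character in the Basic Multilingual Plane.

```javascript
// Hypothetical helpers for illustration only; these are NOT utfu's API.

// Unicode hex escape, e.g. \u20ac (assumes a BMP character).
const toHexEscape = (ch) =>
  '\\u' + ch.codePointAt(0).toString(16).padStart(4, '0');

// Numeric hexadecimal HTML entity, e.g. &#x20AC; (one possible escape form).
const toHtmlEntity = (ch) =>
  '&#x' + ch.codePointAt(0).toString(16).toUpperCase() + ';';

// The literal character itself.
const toText = (ch) => ch;

console.log(toHexEscape('€'));  // \u20ac
console.log(toHtmlEntity('€')); // &#x20AC;
console.log(toText('€'));       // €
```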
import { hex, htx, txt } from 'utfu'
const dirty = 'On a certain level, it�s like shouting �fire� in a crowded theater.'
const cleanHex = hex(dirty)
// --> 'On a certain level, it\u2019s like shouting \u201Cfire\u201D in a crowded theater.'
const cleanHTML = htx(dirty)
// --> 'On a certain level, it’s like shouting “fire” in a crowded theater.'
const cleanTxt = txt(dirty)
// --> 'On a certain level, it’s like shouting “fire” in a crowded theater.'
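For a rough idea of how this kind of repair can work, the sketch below assumes the damaged characters are C1 control code points (U+0080 to U+009F), which is what windows-1252 bytes become when they are decoded as Latin-1. The partial table and the repairC1 function are hypothetical and cover only a handful of characters; this is not utfu's source. If the damage has already been collapsed into U+FFFD, the original byte is gone and this approach cannot recover it.

```javascript
// Illustrative subset of the windows-1252 mapping for the 0x80-0x9F range.
const CP1252_SUBSET = {
  0x80: '\u20ac', // euro sign
  0x91: '\u2018', // left single quotation mark
  0x92: '\u2019', // right single quotation mark
  0x93: '\u201c', // left double quotation mark
  0x94: '\u201d', // right double quotation mark
};

// Hypothetical function: swap C1 control characters for the characters
// they stand for in windows-1252, leaving unmapped ones untouched.
function repairC1(input) {
  return input.replace(
    /[\u0080-\u009f]/g,
    (ch) => CP1252_SUBSET[ch.charCodeAt(0)] ?? ch
  );
}

const sample = 'it\u0092s like shouting \u0093fire\u0094';
console.log(repairC1(sample)); // it’s like shouting “fire”
```

A lookup table keyed by code point keeps the replacement a single pass over the string, and the same table could feed any of the three output styles.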
yarn test
👤 Daniel Sieradski hello@self.agency
- Website: self.agency
- Twitter: @selfagency_llc
- GitLab: @selfagency
Gracious thanks to Mathias Bynens, upon whose he and windows-1252 packages this project depends.
Contributions, issues and feature requests are welcome!
Feel free to check the issues page.
Give a ⭐️ if this project helped you!