Skip to content
/ sbd Public
forked from Tessmore/sbd

Split text into sentences using vanilla sentence boundary detection (SBD) strategy

License

Notifications You must be signed in to change notification settings

chirrapp/sbd

 
 

Repository files navigation

Sentence boundary detection

The library is a fork of @Tessmore's sbd. Unlike the original version, the fork's focused on a single use case and removes extra options.

Split text into sentences with the vanilla strategy (i.e working ~95% of the time).

  • Split a text based on period, question- and exclamation marks.
  • Skips (most) abbreviations (Mr., Mrs., PhD.)
  • Skips numbers/currency.
  • Skips urls, websites, email addresses, phone nr.
  • Counts ellipsis and ?! as single punctuation.

Installation

The library is available as an npm package published at the GitHub Registry. To install @chirrapp/sbd run:

npm install @chirrapp/sbd --save
# Or using Yarn:
yarn add @chirrapp/sbd

Using

import { sentences } from "@chirrapp/sbd";

sentences(
  "On Jan. 20, former Sen. Barack Obama became the 44th President of the U.S. Millions attended the Inauguration."
);
//=> [
//=>   "On Jan. 20, former Sen. Barack Obama became the 44th President of the U.S.",
//=>   "Millions attended the Inauguration.",
//=> ]

License

MIT © Fabiën Tesselaar

About

Split text into sentences using vanilla sentence boundary detection (SBD) strategy

Resources

License

Stars

Watchers

Forks

Packages

 
 
 

Languages

  • TypeScript 99.7%
  • JavaScript 0.3%