Tim Down edited this page May 14, 2015 · 17 revisions

TextRange Module

Rangy's TextRange module provides methods for navigating and manipulating the visible text on a page (defined under "Visible text" below) by character or word.

This module also provides an implementation of innerText (see this article by kangax for some background). It isn't strictly Rangy's area of concern but it comes virtually free from this module's functionality: an element's innerText can be considered the visible text of a range that encompasses the contents of the element, so that's what is provided.

Demo page

Visible text

The algorithm used to determine visible text is based on Aryeh Gregor's aborted innerText specification from 2011. The specification is no longer available at its original URL, so here is a copy from the Rangy repository:

https://rawgit.com/timdown/rangy/master/fiddlings/spec/innerText.htm

Summary:

  • Text inside a <script> or <style> element is not included
  • Text inside any element hidden via CSS display: none or visibility: hidden is not included
  • Collapsed white space (for example in HTML such as <span>One   two</span>) is treated as a single space character
  • White space implied by block elements and <br> elements is included

Factors that are not taken into consideration:

  • CSS text-transform. For example, the text inside <span id="foo" style="text-transform: uppercase">hello</span> will be rendered on the page as "HELLO" but will be considered as "hello" by Rangy. rangy.innerText( document.getElementById("foo") ) will return "hello".
  • Text hidden via overflow or clip CSS rules.
  • CSS generated content. This content is generated by the CSS content property in conjunction with the :before and :after pseudo-elements. For example, with the style rule #foo:after { content: "Two"; }, the text inside <span id="foo">One</span>Three will be rendered on the page as "OneTwoThree" but will be considered as "OneThree" by Rangy.

API

wordOptions

Optional object that can be supplied as the wordOptions property of the options parameter of several methods below. It governs how words are identified and may have any combination of the following properties:

  • includeTrailingSpace: Boolean specifying whether to include trailing space after a word within the word. Default is false.
  • wordRegex: regular expression object used to identify words. Default is /[a-z0-9]+('[a-z0-9]+)*/gi.
  • tokenizer: Function used to tokenize text when identifying word boundaries (to be documented).
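
To make the default wordRegex concrete, the sketch below applies it to a plain string with String.prototype.match. It only illustrates what the regular expression matches; Rangy itself works on the visible text of the DOM. The helper name matchWords and the hyphen-aware alternative regex are illustrative assumptions, not part of Rangy.

```javascript
// Default word regex quoted in the documentation above.
const defaultWordRegex = /[a-z0-9]+('[a-z0-9]+)*/gi;

// Hypothetical alternative that also keeps hyphenated words together.
const hyphenAwareWordRegex = /[a-z0-9]+(['-][a-z0-9]+)*/gi;

// Illustrative helper (not a Rangy function): list the words a regex finds.
function matchWords(text, wordRegex) {
  // match() with a global regex returns every full match, or null if none.
  return text.match(wordRegex) || [];
}

console.log(matchWords("Don't split well-known words", defaultWordRegex));
// → [ "Don't", 'split', 'well', 'known', 'words' ]
console.log(matchWords("Don't split well-known words", hyphenAwareWordRegex));
// → [ "Don't", 'split', 'well-known', 'words' ]
```

A custom regex like this would be passed as { wordOptions: { wordRegex: hyphenAwareWordRegex } } in the options of the methods below.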

characterOptions

Optional object that can be supplied as the characterOptions property of the options parameter of several methods below. It governs treatment of space characters and may have any combination of the following properties:

  • includeBlockContentTrailingSpace: Boolean specifying whether to include a trailing space within a block.
  • includeSpaceBeforeBr: Boolean specifying whether to include an inline space immediately preceding a <br> element. Default is true.
  • includePreLineTrailingSpace: Boolean specifying whether to include a trailing space immediately preceding a line break within an element whose white-space CSS property is set to pre-line. Default is true.
  • ignoreCharacters: String containing characters that should be ignored. This can be used to ignore zero-width space characters, for example. Default is "".

Extensions to Range

The TextRange module adds the following to all Rangy Range objects. The API is based on methods from Internet Explorer's TextRange object.


moveStart(String unit, Number count[, Object options])

moveEnd(String unit, Number count[, Object options])

Moves the start or end of the range by the number of units specified by count. unit must be either "word" or "character".

options is an optional object parameter that governs how the range boundary move handles particular cases. It may have any combination of the following properties:

  • wordOptions: See above.
  • characterOptions: See above.
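
As a minimal sketch of how these calls fit together, the hypothetical helper below widens a range by one word at each end. It assumes range is a Rangy Range with the TextRange module loaded, and that a negative count moves a boundary backwards (by analogy with move() below). The helper name and the choice to ignore zero-width spaces are illustrative only.

```javascript
// Hypothetical helper, not part of Rangy: widen a range by one word at each end.
// `range` is assumed to be a Rangy Range with the TextRange module loaded.
function widenByOneWord(range) {
  const options = {
    wordOptions: { includeTrailingSpace: false },
    // Assumption for this example: zero-width spaces (U+200B) should be ignored.
    characterOptions: { ignoreCharacters: "\u200B" }
  };
  range.moveStart("word", -1, options); // assumed: negative count moves backwards
  range.moveEnd("word", 1, options);    // positive count moves forwards
  return range;
}
```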

move(String unit, Number count[, Object options])

Collapses the range to a single point and moves it by the number of units specified by count. unit must be either "word" or "character".

If count is negative, the range is collapsed to the start; otherwise it is collapsed to the end.

options is as described for moveStart() and moveEnd() above.


expand(String unit[, Object options])

Expands the range to completely encompass all units that it currently contains or partially contains. unit must be either "word" or "character", although this method is only really useful for words. If "character" is specified, this method is identical to calling moveEnd("character", 1). If "word" is specified, the range is expanded to encompass all partially selected word or non-word units, as defined by the wordRegex option.

options is an optional object parameter that governs how the expansion handles particular cases. It may have any combination of the following properties:

  • wordOptions: See above.
  • characterOptions: See above.
  • trim: Boolean specifying whether to trim trailing and leading spaces from the final range. Default is false.
  • trimStart: Boolean specifying whether to trim leading spaces from the expanded range. Only takes effect if the trim property is true. Default is true.
  • trimEnd: Boolean specifying whether to trim trailing spaces from the expanded range. Only takes effect if the trim property is true. Default is true.
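
A common use of expand() is reading the word that a range touches. The sketch below is one way to do that, assuming range is a Rangy Range with the TextRange module loaded; wordAt is a hypothetical helper name, and cloneRange() is the standard Range method that Rangy ranges also provide. The clone keeps the caller's range unchanged.

```javascript
// Hypothetical helper, not part of Rangy: return the word(s) a range touches.
function wordAt(range) {
  // Work on a copy so the caller's range keeps its original boundaries.
  const wordRange = range.cloneRange();
  // trim: true strips leading and trailing spaces from the expanded range.
  wordRange.expand("word", { trim: true });
  return wordRange.text();
}
```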

text()

Returns the visible text contained in the range.


selectCharacters(Node containerNode, Number startIndex, Number endIndex)

Moves the range to contain text within containerNode specified by character indices startIndex and endIndex within the visible text of containerNode.


toCharacterRange(Node containerNode)

Returns the range as a pair of character indices relative to the start of the visible text of containerNode. The returned value is an object with properties start and end.
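
Because toCharacterRange() returns plain numbers, its result can be stored as JSON and later turned back into a range with selectCharacters(). The following is a rough sketch under that assumption. serializeRange and deserializeRange are hypothetical names, and rangy.createRange() is Rangy's core factory for new ranges.

```javascript
// Hypothetical helper: store a range as character indices relative to containerNode.
function serializeRange(range, containerNode) {
  const { start, end } = range.toCharacterRange(containerNode);
  return JSON.stringify({ start, end });
}

// Hypothetical helper: rebuild a range from indices produced by serializeRange().
// `rangy` is assumed to be the initialized Rangy object with TextRange loaded.
function deserializeRange(rangy, containerNode, json) {
  const { start, end } = JSON.parse(json);
  const range = rangy.createRange();
  range.selectCharacters(containerNode, start, end);
  return range;
}
```

The indices are only meaningful while the visible text of containerNode stays the same between saving and restoring.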


findText(mixed searchTerm[, Object options])

Provides a means of searching text on a page, including using regular expressions. In conjunction with the Class Applier module, this can be used to create a custom page search facility (as demonstrated on the demo page).

This method searches the visible text of the document for the text specified by searchTerm, which may be either a string or a regular expression object. The search starts from the start or end of the range (depending on search direction). The range moves to encompass the first match.

options is an optional object parameter that allows flexible searching. Any combination of the following properties may be supplied:

  • caseSensitive: Whether the search is case-sensitive. Default is false.
  • withinRange: Specifies the scope of the search. If supplied, only the text within this range is searched. Default is null.
  • wholeWordsOnly: Whether to match only whole words. Default is false.
  • wrap: Whether the search should wrap around if the start or end of the search scope is reached without finding a match. Default is false.
  • direction: String that specifies the direction of the search. Set it to "backward" to perform a backward search. Default is "forward".
  • wordOptions: See above.
  • characterOptions: See above.
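
The usual pattern for finding every match is to call findText() in a loop and collapse the range to its end after each hit. The sketch below assumes that findText() returns a truthy value when a match is found and a falsy value otherwise, which the description above does not state explicitly. forEachMatch and its onMatch callback are hypothetical names, while createRange(), selectNodeContents() and collapse() are core Range API.

```javascript
// Hypothetical helper, not part of Rangy: visit every match inside containerNode.
// `rangy` is assumed to be the initialized Rangy object with TextRange loaded.
function forEachMatch(rangy, searchTerm, containerNode, onMatch) {
  // Restrict the search to the contents of containerNode.
  const searchScope = rangy.createRange();
  searchScope.selectNodeContents(containerNode);

  const options = {
    caseSensitive: false,
    wholeWordsOnly: true,
    withinRange: searchScope
  };

  // A fresh Rangy range starts collapsed at the start of the document.
  const range = rangy.createRange();
  let count = 0;
  // Assumption: findText() returns true while matches remain.
  while (range.findText(searchTerm, options)) {
    onMatch(range);
    range.collapse(false); // collapse to the end so the next search continues after this match
    count++;
  }
  return count;
}
```

In a page search feature, onMatch would typically apply a highlight to the range, for example with the Class Applier module.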

pasteHtml(String html)

Replaces the contents of the range with HTML specified by html.


Extensions to Selection


move(String unit, Number count[, Object options])

Collapses the selection to a single point and moves it by the number of units specified by count. unit must be either "word" or "character".

If count is negative, the selection is collapsed to the start; otherwise it is collapsed to the end.

See Range move() documentation for details.


expand(String unit[, Object options])

Expands the selection to completely encompass all units that it currently contains or partially contains. See Range expand() documentation for details.

If the selection was originally backward, the expanded selection will also be backward, provided the browser supports programmatic creation of backward selections (in practice, all major browsers except IE).


selectCharacters(Node containerNode, Number startIndex, Number endIndex[, String direction])

Selects a range of characters within the visible text of containerNode specified by startIndex and endIndex. The selection direction is governed by direction (although in IE, the selection will always be created forwards).

direction may be any of the strings "forward", "forwards", "backward" or "backwards" or a Boolean (in which case true corresponds to "backwards").


saveCharacterRanges(Node containerNode)

Returns an object specifying the selection as character indices within the visible text of containerNode. This object can later be used to restore the selection by passing it into restoreCharacterRanges().


restoreCharacterRanges(Node containerNode, Object characterRanges)

Restores a selection previously saved using saveCharacterRanges().

These two methods provide a character index-based selection save and restore which is not vulnerable to formatting changes (unlike Rangy's existing selection save/restore module).
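
One way to use this pair is to wrap a DOM update so that the user's selection survives it. The sketch below assumes rangy is the initialized Rangy object with the TextRange module loaded and that the update leaves the visible text of containerNode unchanged (for example, it only adds or removes formatting elements). withPreservedSelection and mutate are hypothetical names.

```javascript
// Hypothetical helper, not part of Rangy: run `mutate` while keeping the selection.
function withPreservedSelection(rangy, containerNode, mutate) {
  // Record the selection as character indices within containerNode's visible text.
  const saved = rangy.getSelection().saveCharacterRanges(containerNode);
  try {
    return mutate(containerNode);
  } finally {
    // Fetch the selection again in case the DOM changes invalidated the old object.
    rangy.getSelection().restoreCharacterRanges(containerNode, saved);
  }
}
```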


Extensions to the main rangy object


innerText(Element el)

Returns the visible text for the element el.
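
To see how visible text differs from the raw DOM text, the sketch below returns both side by side. compareText is a hypothetical helper; it assumes rangy is the initialized Rangy object with the TextRange module loaded and el is a DOM element. textContent is the standard DOM property, which includes script and style text and does not collapse white space.

```javascript
// Hypothetical helper, not part of Rangy: compare visible text with raw DOM text.
function compareText(rangy, el) {
  return {
    visible: rangy.innerText(el), // visible text as defined under "Visible text"
    raw: el.textContent           // every text node, including hidden content
  };
}
```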
