Ferret::Analysis::AsciiLetterTokenizer

Summary

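An AsciiLetterTokenizer is a tokenizer that divides text at every character that is not an ASCII letter. That is to say, it defines tokens as maximal strings of adjacent ASCII letters, as defined by the regular expression _/[A-Za-z]+/_. Accented letters such as "é" are not ASCII, so they act as token boundaries.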

Example

  "Dave's résumé, at http://www.davebalmain.com/ 1234"
    => ["Dave", "s", "r", "sum", "at", "http", "www", "davebalmain", "com"]
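
The splitting rule can be reproduced in plain Ruby without loading Ferret. The following is only a sketch of the documented behaviour, not Ferret's implementation: the helper name ascii_letter_tokens is hypothetical, and it relies solely on String#scan with the regular expression quoted in the summary.

```ruby
# Hypothetical helper (not part of Ferret): returns the maximal runs of
# ASCII letters in +text+, mirroring the rule /[A-Za-z]+/ from the summary.
def ascii_letter_tokens(text)
  text.scan(/[A-Za-z]+/)
end

tokens = ascii_letter_tokens("Dave's résumé, at http://www.davebalmain.com/ 1234")
p tokens
# => ["Dave", "s", "r", "sum", "at", "http", "www", "davebalmain", "com"]
```

Note that digits, punctuation and the non-ASCII "é" all act as separators, which is why "résumé" yields "r" and "sum" and why "1234" produces no token at all.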

Public Class Methods

new() → tokenizer

Create a new AsciiLetterTokenizer.

static VALUE
frb_a_letter_tokenizer_init(VALUE self, VALUE rstr) 
{
    return get_wrapped_ts(self, rstr, letter_tokenizer_new());
}

