require 'facets/string/cmp'
require 'facets/blank'
require 'facets/string/natcmp'
Conveniently turn a string into a tuple.
Taken from O’Reilly’s Perl Cookbook 6.23. Regular Expression Grabbag.
Interpolate. Provides a means of externally using Ruby's string interpolation mechanism.
  try = "hello"
  str = "\#{try}!!!"
  String.interpolate{ str } #=> "hello!!!"
NOTE: The block is necessary in order to get the binding of the caller.
CREDIT: Trans
  # File lib/core/facets/string/interpolate.rb, line 15
  def self.interpolate(&str)
    eval "%{#{str.call}}", str.binding
  end
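The same idea can be sketched as a standalone method. The helper name `interpolate_later` is hypothetical and not part of Facets; the sketch assumes the template is trusted, since it is passed to `eval`.

```ruby
# Hypothetical standalone helper (not the Facets API): evaluates the string
# returned by the block as if it were a %{...} literal, using the block's
# binding so that the caller's local variables are visible.
def interpolate_later(&blk)
  eval("%{#{blk.call}}", blk.binding)
end

try = "hello"
template = '#{try}!!!' # single quotes: nothing is interpolated yet
puts interpolate_later { template }
```

Because the template is evaluated as Ruby code, a string containing an unbalanced `}` or untrusted input would need extra care.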
Removes occurrences of a string or regexp.
"HELLO HELLO" - "LL" #=> "HEO HEO"
CREDIT: Benjamin David Oakes
  # File lib/core/facets/string/op_sub.rb, line 9
  def -(pattern)
    self.gsub(pattern, '')
  end
Treats self and path as representations of pathnames, joining them together as a single path.
  'home'/'trans' #=> 'home/trans'
  # File lib/core/facets/string/op_div.rb, line 9
  def /(path)
    File.join(self, path)
  end
Binary XOR of two strings.
  puts "\000\000\001\001" ^ "\000\001\000\001"
  puts "\003\003\003" ^ "\000\001\002"
produces
  "\000\001\001\000"
  "\003\002\001"
  # File lib/core/facets/string/xor.rb, line 13
  def ^(aString)
    a = self.unpack('C'*(self.length))
    b = aString.unpack('C'*(aString.length))
    if (b.length < a.length)
      (a.length - b.length).times { b << 0 }
    end
    xor = ""
    0.upto(a.length-1) { |pos|
      x = a[pos] ^ b[pos]
      xor << x.chr()
    }
    return(xor)
  end
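A minimal sketch of the same byte-wise XOR as a standalone method, assuming a hypothetical name `xor_bytes`. As in the listing above, the result has the length of the first string and a shorter second string is treated as if padded with zero bytes.

```ruby
# Hypothetical helper (not the Facets API): XOR two strings byte by byte.
# Missing bytes in the second string count as 0; extra bytes are ignored.
def xor_bytes(a, b)
  other = b.bytes
  a.bytes.each_with_index.map { |byte, i| byte ^ (other[i] || 0) }.pack('C*')
end

p xor_bytes("\000\000\001\001", "\000\001\000\001").bytes
p xor_bytes("\003\003\003", "\000\001\002").bytes
```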
  # File lib/core/facets/string/align.rb, line 3
  def align(direction, n, sep="\n", c=' ')
    # Pass sep and c through rather than resetting them to their defaults.
    case direction
    when :right
      align_right(n, sep, c)
    when :left
      align_left(n, sep, c)
    when :center
      align_center(n, sep, c)
    else
      raise ArgumentError
    end
  end
Centers each line of a string.
The default alignment separation is a new line ("\n"). This can be changed, as can the padding string, which defaults to a single space (' ').
  s = <<-EOS
This is a test
and
so on
  EOS
  puts s.align_center(14)
produces
  This is a test
       and
      so on
CREDIT: Trans
  # File lib/core/facets/string/align.rb, line 98
  def align_center(n, sep="\n", c=' ')
    return center(n.to_i,c.to_s) if sep==nil
    q = split(sep.to_s).collect { |line|
      line.center(n.to_i,c.to_s)
    }
    q.join(sep.to_s)
  end
Align a string to the left.
The default alignment separation is a new line ("\n"). This can be changed, as can the padding string, which defaults to a single space (' ').
  s = <<-EOS
This is a test
and
so on
  EOS
  puts s.align_left(20, "\n", '.')
produces
  This is a test......
  and.................
  so on...............
CREDIT: Trans
  # File lib/core/facets/string/align.rb, line 68
  def align_left(n, sep="\n", c=' ')
    return ljust(n.to_i,c.to_s) if sep==nil
    q = split(sep.to_s).map do |line|
      line.strip.ljust(n.to_i,c.to_s)
    end
    q.join(sep.to_s)
  end
Align a string to the right.
The default alignment separation is a new line ("\n"). This can be changed, as can the padding string, which defaults to a single space (' ').
  s = <<-EOS
This is a test
and
so on
  EOS
  puts s.align_right(14)
produces
  This is a test
             and
           so on
CREDIT: Trans
  # File lib/core/facets/string/align.rb, line 38
  def align_right(n, sep="\n", c=' ')
    return rjust(n.to_i,c.to_s) if sep==nil
    q = split(sep.to_s).map do |line|
      line.rjust(n.to_i,c.to_s)
    end
    q.join(sep.to_s)
  end
Is this string just whitespace?
  "abc".blank? #=> false
  " ".blank? #=> true
  # File lib/core/facets/blank.rb, line 50
  def blank?
    self !~ /\S/
  end
Return a new string embraced by the given brackets. If only one bracket char is given it will be placed on either side.
  "wrap me".bracket('{') #=> "{wrap me}"
  "wrap me".bracket('--','!') #=> "--wrap me!"
CREDIT: Trans
  # File lib/core/facets/string/bracket.rb, line 14
  def bracket(bra, ket=nil)
    #ket = String.bra2ket[$&] if ! ket && /^[\[({<]$/ =~ bra
    ket = BRA2KET[bra] unless ket
    "#{bra}#{self}#{ket ? ket : bra}"
  end
In-place version of #bracket.
CREDIT: Trans
  # File lib/core/facets/string/bracket.rb, line 24
  def bracket!(bra, ket=nil)
    self.replace(bracket(bra, ket))
  end
Unpacks string into bytes.
Note, this is not 100% compatible with 1.8.7+ which returns an enumerator instead of an array.
  # File lib/core/facets/string/bytes.rb, line 10
  def bytes(&blk)
    if block_given?
      self.unpack('C*').each(&blk)
    else
      self.unpack('C*')
    end
  end
  # File lib/core/facets/string/camelcase.rb, line 20
  def camelcase(first_letter=nil)
    case first_letter
    when :upper, true
      upper_camelcase
    when :lower, false
      lower_camelcase
    else
      str = dup
      str.gsub!(/\/(.?)/){ "::#{$1.upcase}" } # NOT SO SURE ABOUT THIS
      str.gsub!(/(?:_+|-+)([a-z])/){ $1.upcase }
      #str.gsub!(/(\A|\s)([a-z])/){ $1 + $2.upcase }
      str
    end
  end
Return true if the string is capitalized, otherwise false.
  "THIS".capitalized? #=> true
  "This".capitalized? #=> true
  "this".capitalized? #=> false
CREDIT: Phil Tomson
  # File lib/core/facets/string/capitalized.rb, line 11
  def capitalized?
    self =~ /^[A-Z]/
  end
Returns an array of characters.
"abc".chars #=> ["a","b","c"]
  # File lib/core/facets/string/chars.rb, line 13
  def chars
    split(//)
  end
Returns an Enumerator for iterating over each line of the string, stripped of whitespace on either side.
  # File lib/core/facets/string/cleanlines.rb, line 9
  def cleanlines(&block)
    if block
      scan(/^.*?$/) do |line|
        block.call(line.strip)
      end
    else
      Enumerator.new(self) do |output|
        scan(/^.*?$/) do |line|
          output.yield(line.strip)
        end
      end
    end
  end
Cleave a string. Break a string in two parts at the nearest whitespace.
CREDIT: Trans
  # File lib/core/facets/string/cleave.rb, line 8
  def cleave(threshold=nil, len=nil)
    l = (len || size / 2)
    t = threshold || size

    h1 = self[0...l]
    h2 = self[l..-1]

    i1 = h1.rindex(/\s/) || 0
    d1 = (i1 - l).abs

    d2 = h2.index(/\s/) || l
    i2 = d2 + l

    d1 = (i1-l).abs
    d2 = (i2-l).abs

    if [d1, d2].min > t
      i = t
    elsif d1 < d2
      i = i1
    else
      i = i2
    end

    #dup.insert(l, "\n").gsub(/^\s+|\s+$/, '')
    return self[0..i].to_s.strip, self[i+1..-1].to_s.strip
  end
Compare method that takes length into account. Unlike #<=>, this is compatible with #succ.
  "abc".cmp("abc") #=> 0
  "abcd".cmp("abc") #=> 1
  "abc".cmp("abcd") #=> -1
  "xyz".cmp("abc") #=> 1
CREDIT: Peter Vanbroekhoven
  # File lib/core/facets/comparable/cmp.rb, line 31
  def cmp(other)
    return -1 if length < other.length
    return 1 if length > other.length
    self <=> other # alphabetic compare
  end
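As a rough illustration of why a length-first comparison differs from plain #<=>, here is a standalone sketch. The name `length_cmp` is hypothetical and not part of Facets.

```ruby
# Hypothetical helper (not the Facets API): shorter strings sort first;
# strings of equal length fall back to the ordinary alphabetic comparison.
def length_cmp(a, b)
  return -1 if a.length < b.length
  return 1 if a.length > b.length
  a <=> b
end

words = ["b", "aa", "z", "a"]
p words.sort                           # plain alphabetic order
p words.sort { |x, y| length_cmp(x, y) } # length first, then alphabetic
```

With the plain sort, "aa" lands between "a" and "b"; with the length-first comparison it comes after every one-character string, which matches the order in which repeated #succ calls would produce them.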
Matches any whitespace (including newline) and replaces it with a single space.
@example
  <<-QUERY.compress_lines
    SELECT name
    FROM users
  QUERY
  => "SELECT name FROM users"
  # File lib/core/facets/string/compress_lines.rb, line 12
  def compress_lines(spaced = true)
    split($/).map { |line| line.strip }.join(spaced ? ' ' : '')
  end
Remove quotes from string.
  "'hi'".dequote #=> "hi"
CREDIT: Trans
  # File lib/core/facets/string/bracket.rb, line 88
  def dequote
    s = self.dup

    case self[0,1]
    when "'", '"', '`'
      s[0] = ''
    end

    case self[-1,1]
    when "'", '"', '`'
      s[-1] = ''
    end

    return s
  end
Breaks a string up into an array based on a regular expression. Similar to scan, but includes the matches.
  s = "<p>This<b>is</b>a test.</p>"
  s.divide( /\<.*?\>/ )
produces
  ["<p>This", "<b>is", "</b>a test.", "</p>"]
CREDIT: Trans
  # File lib/core/facets/string/divide.rb, line 15
  def divide( re )
    re2 = /#{re}.*?(?=#{re}|\Z)/
    scan(re2) #{re}(?=#{re})/)
  end
Return true if the string is lowercase (downcase), otherwise false.
  "THIS".downcase? #=> false
  "This".downcase? #=> false
  "this".downcase? #=> true
CREDIT: Phil Tomson
  # File lib/core/facets/string/capitalized.rb, line 23
  def downcase?
    downcase == self
  end
Yields a single-character string for each character in the string. When $KCODE = 'UTF8', multi-byte characters are yielded appropriately.
  # File lib/core/facets/string/each_char.rb, line 22
  def each_char
    scanner, char = StringScanner.new(self), /./u
    loop { yield(scanner.scan(char) || break) }
  end
Iterate through each word of a string.
"a string".each_word { |word| ... }
  # File lib/core/facets/string/each_word.rb, line 9
  def each_word(&block)
    words.each(&block)
  end
Levenshtein distance algorithm implementation for Ruby, with UTF-8 support.
The Levenshtein distance is a measure of how similar two strings s and t are, calculated as the number of deletions/insertions/substitutions needed to transform s into t. The greater the distance, the more the strings differ.
The Levenshtein distance is also sometimes referred to as the easier-to-pronounce-and-spell ‘edit distance’.
Calculate the Levenshtein distance between two strings self and str2. self and str2 should be ASCII, UTF-8, or a one-byte-per character encoding such as ISO-8859-*.
The strings will be treated as UTF-8 if $KCODE is set appropriately (i.e. ‘u’). Otherwise, the comparison will be performed byte-by-byte. There is no specific support for Shift-JIS or EUC strings.
When using Unicode text, be aware that this algorithm does not perform normalisation. If there is a possibility of different normalised forms being used, normalisation should be performed beforehand.
CREDIT: Paul Battley
  # File lib/core/facets/string/edit_distance.rb, line 26
  def edit_distance(str2)
    str1 = self
    if $KCODE =~ /^U/
      unpack_rule = 'U*'
    else
      unpack_rule = 'C*'
    end
    s = str1.unpack(unpack_rule)
    t = str2.unpack(unpack_rule)
    n = s.length
    m = t.length
    return m if (0 == n)
    return n if (0 == m)

    d = (0..m).to_a
    x = nil

    (0...n).each do |i|
      e = i+1
      (0...m).each do |j|
        cost = (s[i] == t[j]) ? 0 : 1
        x = [
          d[j+1] + 1, # insertion
          e + 1,      # deletion
          d[j] + cost # substitution
        ].min
        d[j] = e
        e = x
      end
      d[m] = x
    end

    return x
  end
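The listing above depends on $KCODE, which only exists on older Rubies. The following is a minimal sketch of the same dynamic-programming idea written against String#chars, assuming a hypothetical standalone name `levenshtein`; it keeps two rows of the distance table rather than the single row used above.

```ruby
# Hypothetical helper (not the Facets API): Levenshtein distance computed
# over characters, keeping only the previous and current table rows.
def levenshtein(s, t)
  a = s.chars
  b = t.chars
  return b.length if a.empty?
  return a.length if b.empty?

  prev = (0..b.length).to_a
  a.each_with_index do |ca, i|
    curr = [i + 1]
    b.each_with_index do |cb, j|
      cost = ca == cb ? 0 : 1
      curr << [
        prev[j + 1] + 1, # drop a character of s
        curr[j] + 1,     # add a character of t
        prev[j] + cost   # substitution (free when the characters match)
      ].min
    end
    prev = curr
  end
  prev.last
end

p levenshtein("kitten", "sitting")
```

Like the original, this sketch performs no Unicode normalisation.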
Does a string end with the given suffix?
  "hello".end_with?("lo") #=> true
  "hello".end_with?("to") #=> false
CREDIT: Lucas Carlson, Blaine Cook
  # File lib/core/facets/string/start_with.rb, line 27
  def end_with?(suffix)
    self.rindex(suffix) == size - suffix.size
  end
The inverse of include?.
  # File lib/core/facets/string/exclude.rb, line 5
  def exclude?(str)
    !include?(str)
  end
Expands tabs to n spaces. Non-destructive. If n is 0, then tabs are simply removed. Raises an exception if n is negative.
  "\t\tHey".expand_tabs(2) #=> "    Hey"
Thanks to GGaramuno for a more efficient algorithm. Very nice.
CREDIT: Gavin Sinclair, Noah Gibbs, GGaramuno
TODO: Don’t much care for the name String#expand_tabs. What about a more concise name like #
  # File lib/core/facets/string/expand_tab.rb, line 16
  def expand_tabs(n=8)
    n = n.to_int
    raise ArgumentError, "n must be >= 0" if n < 0
    return gsub(/\t/, "") if n == 0
    return gsub(/\t/, " ") if n == 1
    str = self.dup
    while
      str.gsub!(/^([^\t\n]*)(\t+)/) { |f|
        val = ( n * $2.size - ($1.size % n) )
        $1 << (' ' * val)
      }
    end
    str
  end
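Tab expansion is column-aware: each tab advances to the next multiple of n rather than always adding n spaces. A character-by-character sketch of that rule follows, using a hypothetical standalone name `expand_tabs_sketch`. It swaps the regexp loop above for a plain scan over each line, so it is an illustration rather than the Facets implementation.

```ruby
# Hypothetical helper (not the Facets API): expand tabs to the next
# multiple of n columns, restarting the column count on every line.
def expand_tabs_sketch(str, n = 8)
  raise ArgumentError, "n must be >= 0" if n < 0
  return str.delete("\t") if n == 0

  str.each_line.map do |line|
    out = +""
    line.each_char do |ch|
      if ch == "\t"
        out << " " * (n - out.length % n) # pad up to the next tab stop
      else
        out << ch
      end
    end
    out
  end.join
end

p expand_tabs_sketch("\t\tHey", 2)
p expand_tabs_sketch("a\tb", 4)
```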
Use fluent notation for making file directives.
'~/trans/Desktop/notes.txt'.file.mtime
  # File lib/core/facets/string/file.rb, line 9
  def file
    f = self
    Functor.new do |op, *a|
      File.send(op, f, *a)
    end
  end
Returns a new string with all new lines removed from adjacent lines of text.
  s = "This is\na test.\n\nIt clumps\nlines of text."
  s.fold
produces
"This is a test.\n\nIt clumps lines of text. "
One arguable flaw with this, that might need a fix: if the given string ends in a newline, it is replaced with a single space.
CREDIT: Trans
  # File lib/core/facets/string/fold.rb, line 19
  def fold(ignore_indented=false)
    ns = ''
    i = 0
    br = self.scan(/(\n\s*\n|\Z)/) do |m|
      b = $~.begin(1)
      e = $~.end(1)
      nl = $&
      tx = slice(i...b)
      if ignore_indented and slice(i...b) =~ /^[ ]+/
        ns << tx
      else
        ns << tx.gsub(/[ ]*\n+/,' ')
      end
      ns << nl
      i = e
    end
    ns
  end
Like index but returns an array of all index locations. The reuse flag allows the trailing portion of a match to be reused for subsequent matches.
  "abcabcabc".index_all('a') #=> [0,3,6]
  "bbb".index_all('bb', false) #=> [0]
  "bbb".index_all('bb', true) #=> [0,1]
TODO: Could probably be defined for Indexable in general too.
  # File lib/core/facets/string/index_all.rb, line 14
  def index_all(s, reuse=false)
    s = Regexp.new(Regexp.escape(s)) unless Regexp===s
    ia = []; i = 0
    while (i = index(s,i))
      ia << i
      i += (reuse ? 1 : $~[0].size)
    end
    ia
  end
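The effect of the reuse flag is easiest to see on overlapping matches. Below is a standalone sketch under the hypothetical name `all_indexes`; unlike the listing above it also steps past empty matches, which is an added assumption to keep the loop finite.

```ruby
# Hypothetical helper (not the Facets API): collect the start index of
# every match. With reuse, the search resumes one character after the
# previous match start, so overlapping matches are found as well.
def all_indexes(str, pattern, reuse = false)
  re = pattern.is_a?(Regexp) ? pattern : Regexp.new(Regexp.escape(pattern))
  found = []
  pos = 0
  while (m = re.match(str, pos))
    found << m.begin(0)
    step = reuse ? 1 : [m[0].size, 1].max # never step by zero
    pos = m.begin(0) + step
  end
  found
end

p all_indexes("bbb", "bb")       # non-overlapping matches only
p all_indexes("bbb", "bb", true) # overlapping matches allowed
```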
Left chomp.
  "help".lchomp("h") #=> "elp"
  "help".lchomp("k") #=> "help"
CREDIT: Trans
  # File lib/core/facets/string/chomp.rb, line 10
  def lchomp(match)
    if index(match) == 0
      self[match.size..-1]
    else
      self.dup
    end
  end
In-place left chomp. Returns nil when the string does not start with the match.
  "help".lchomp!("h") #=> "elp"
  "help".lchomp!("k") #=> nil
CREDIT: Trans
  # File lib/core/facets/string/chomp.rb, line 25
  def lchomp!(match)
    if index(match) == 0
      self[0...match.size] = ''
      self
    end
  end
Line wrap at width.
puts "1234567890".line_wrap(5)
produces
  12345
  67890
CREDIT: Trans
  # File lib/core/facets/string/line_wrap.rb, line 14
  def line_wrap(width, tabs=4)
    s = gsub(/\t/,' ' * tabs) # tabs default to 4 spaces
    s = s.gsub(/\n/,' ')
    r = s.scan( /.{1,#{width}}/ )
    r.join("\n") << "\n"
  end
Returns the lines of a string, each retaining its line terminator.
  "abc\n123".lines #=> ["abc\n","123"]
  # File lib/core/facets/string/lines.rb, line 9
  def lines(&blk)
    if block_given?
      each_line(&blk) #scan(/$.*?\n/).each(&blk)
    else
      Enumerator.new(self, :lines) #.split(/\n/)
    end
  end
  # File lib/core/facets/string/camelcase.rb, line 44
  def lower_camelcase
    str = dup
    str.gsub!(/\/(.?)/){ "::#{$1.upcase}" } # NOT SO SURE ABOUT THIS
    str.gsub!(/(?:_+|-+)([a-z])/){ $1.upcase }
    str.gsub!(/(\A|\s)([A-Z])/){ $1 + $2.downcase }
    str
  end
Downcase first letter.
  # File lib/core/facets/string/uppercase.rb, line 17
  def lowercase
    str = to_s
    str[0,1].downcase + str[1..-1]
  end
Provides a margin controlled string.
  x = %Q{
        | This
        |   is
        |     margin controlled!
        }.margin
NOTE: This may still need a bit of tweaking.
TODO: describe its limits and caveats and edge cases
CREDIT: Trans
  # File lib/core/facets/string/margin.rb, line 17
  def margin(n=0)
    #d = /\A.*\n\s*(.)/.match( self )[1]
    #d = /\A\s*(.)/.match( self)[1] unless d
    d = ((/\A.*\n\s*(.)/.match(self)) ||
        (/\A\s*(.)/.match(self)))[1]
    return '' unless d
    if n == 0
      gsub(/\n\s*\Z/,'').gsub(/^\s*[#{d}]/, '')
    else
      gsub(/\n\s*\Z/,'').gsub(/^\s*[#{d}]/, ' ' * n)
    end
  end
Translate a (class or module) name to a suitable method name.
  My::CoolClass.name.methodize #=> "my__cool_class"
  # File lib/core/facets/string/methodize.rb, line 17
  def methodize
    gsub(/([A-Z]+)([A-Z])/,'\1_\2').
    gsub(/([a-z])([A-Z])/,'\1_\2').
    gsub('/' ,'__').
    gsub('::','__').
    downcase
  end
Converts a string to module name representation.
This is essentially #camelcase. It also converts '/' to '::', which is useful for converting paths to namespaces.
Examples
  "method_name".modulize #=> "MethodName"
  "method/name".modulize #=> "Method::Name"
  # File lib/core/facets/string/modulize.rb, line 21
  def modulize
    gsub('__','/').
    gsub(/\/(.?)/){ "::#{$1.upcase}" }.
    gsub(/(?:_+|-+)([a-z])/){ $1.upcase }.
    gsub(/(\A|\s)([a-z])/){ $1 + $2.upcase }
  end
Like #scan but returns MatchData ($~) rather than the matched string ($&).
CREDIT: Trans
  # File lib/core/facets/string/mscan.rb, line 8
  def mscan(re) #:yield:
    if block_given?
      scan(re) { yield($~) }
    else
      m = []
      scan(re) { m << $~ }
      m
    end
  end
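Collecting MatchData objects instead of matched strings gives access to offsets and captures for every match. A minimal standalone sketch, assuming the hypothetical name `match_all`:

```ruby
# Hypothetical helper (not the Facets API): like String#scan, but collects
# the MatchData for each match so offsets and groups remain available.
def match_all(str, re)
  matches = []
  str.scan(re) { matches << Regexp.last_match }
  matches
end

match_all("a1b22", /\d+/).each do |m|
  puts "#{m[0]} at #{m.begin(0)}"
end
```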
‘Natural order’ comparison of strings, e.g.
"my_prog_v1.1.0" < "my_prog_v1.2.0" < "my_prog_v1.10.0"
which does not follow alphabetically. A secondary parameter, if set to true, makes the comparison case insensitive.
  "Hello.10".natcmp("Hello.1") #=> 1
TODO: Invert case flag?
CREDIT: Alan Davies, Martin Pool
  # File lib/core/facets/string/natcmp.rb, line 46
  def natcmp(str2, caseInsensitive=false)
    str1 = self.dup
    str2 = str2.dup
    compareExpression = /^(\D*)(\d*)(.*)$/

    if caseInsensitive
      str1.downcase!
      str2.downcase!
    end

    # remove all whitespace
    str1.gsub!(/\s*/, '')
    str2.gsub!(/\s*/, '')

    while (str1.length > 0) or (str2.length > 0) do
      # Extract non-digits, digits and rest of string
      str1 =~ compareExpression
      chars1, num1, str1 = $1.dup, $2.dup, $3.dup
      str2 =~ compareExpression
      chars2, num2, str2 = $1.dup, $2.dup, $3.dup
      # Compare the non-digits
      case (chars1 <=> chars2)
      when 0 # Non-digits are the same, compare the digits...
        # If either number begins with a zero, then compare alphabetically,
        # otherwise compare numerically
        if (num1[0] != 48) and (num2[0] != 48)
          num1, num2 = num1.to_i, num2.to_i
        end
        case (num1 <=> num2)
        when -1 then return -1
        when 1 then return 1
        end
      when -1 then return -1
      when 1 then return 1
      end # case
    end # while

    # strings are naturally equal.
    return 0
  end
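For sorting, a natural ordering can also be approximated with a sort key instead of a pairwise comparison. The sketch below uses a different technique from natcmp: it splits each string into digit and non-digit runs and compares digit runs numerically. The name `natural_key` is hypothetical, and the sketch skips natcmp's whitespace removal and leading-zero rule.

```ruby
# Hypothetical helper (not the Facets API): build a sort key in which runs
# of digits compare as integers and everything else compares as text.
# Each token is tagged so that numbers and text are never compared directly.
def natural_key(str)
  str.scan(/\d+|\D+/).map do |tok|
    tok.match?(/\A\d+\z/) ? [0, tok.to_i] : [1, tok]
  end
end

versions = ["my_prog_v1.10.0", "my_prog_v1.2.0", "my_prog_v1.1.0"]
p versions.sort                           # alphabetic: 1.10.0 sorts before 1.2.0
p versions.sort_by { |v| natural_key(v) } # natural: 1.1.0, 1.2.0, 1.10.0
```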
Returns n characters of the string. If n is positive the characters are from the beginning of the string. If n is negative from the end of the string.
Alternatively a replacement string can be given, which will replace the n characters.
  str = "this is text"
  str.nchar(4) #=> "this"
  str.nchar(4, 'that') #=> "that is text"
  str #=> "this is text"
  # File lib/core/facets/string/nchar.rb, line 15
  def nchar(n, replacement=nil)
    if replacement
      s = self.dup
      n > 0 ? (s[0...n] = replacement) : (s[n..-1] = replacement)
      return s
    else
      n > 0 ? self[0...n] : self[n..-1]
    end
  end
Returns an Enumerator for iterating over each line of the string, void of the terminating newline character, in contrast to #lines which retains it.
  # File lib/core/facets/string/newlines.rb, line 9
  def newlines(&block)
    if block
      scan(/^.*?$/) do |line|
        block.call(line.chomp)
      end
    else
      Enumerator.new(self) do |output|
        scan(/^.*?$/) do |line|
          output.yield(line.chomp)
        end
      end
    end
  end
  # File lib/core/facets/kernel/object_state.rb, line 37
  def object_state(data=nil)
    data ? replace(data) : dup
  end
Outdent just indents a negative number of spaces.
CREDIT: Noah Gibbs
  # File lib/core/facets/string/indent.rb, line 20
  def outdent(n)
    indent(-n)
  end
Converts a (class or module) name to a unix path.
My::CoolClass.name.pathize #=> "my/cool_class"
  # File lib/core/facets/string/pathize.rb, line 17
  def pathize
    gsub(/([A-Z]+)([A-Z])/,'\1_\2').
    gsub(/([a-z])([A-Z])/,'\1_\2').
    gsub('__','/').
    gsub('::','/').
    downcase
  end
Return a new string embraced by given quotes. If no quotes are specified, then assumes single quotes.
  "quote me".quote #=> "'quote me'"
  "quote me".quote(:d) #=> "\"quote me\""
CREDIT: Trans
  # File lib/core/facets/string/bracket.rb, line 69
  def quote(type=:s)
    case type.to_s.downcase
    when 's', 'single'
      bracket("'")
    when 'd', 'double'
      bracket('"')
    when 'b', 'back'
      bracket('`')
    else
      bracket("'")
    end
  end
Like #index but returns a Range.
  "This is a test!".range('test') #=> 10..13
CREDIT: Trans
  # File lib/core/facets/string/range.rb, line 9
  def range(s, offset=0)
    if index(s, offset)
      return ($~.begin(0))..($~.end(0)-1)
    end
    nil
  end
Like #index_all but returns an array of Ranges.
  "abc123abc123".range_all('abc') #=> [0..2, 6..8]
TODO: Add offset, perhaps?
CREDIT: Trans
  # File lib/core/facets/string/range.rb, line 24
  def range_all(s, reuse=false)
    r = []; i = 0
    while i < self.length
      rng = range(s, i)
      if rng
        r << rng
        i += reuse ? 1 : rng.end + 1
      else
        break
      end
    end
    r.uniq
  end
Returns an array of ranges mapping the characters per line.
"this\nis\na\ntest".range_of_line #=> [0..4, 5..7, 8..9, 10..13]
CREDIT: Trans
  # File lib/core/facets/string/range.rb, line 46
  def range_of_line
    offset=0; charmap = []
    each_line do |line|
      charmap << (offset..(offset + line.length - 1))
      offset += line.length
    end
    charmap
  end
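A standalone sketch of the same bookkeeping, assuming the hypothetical name `line_ranges`. Each range includes the line's trailing newline, so slicing the string with a range returns the whole line.

```ruby
# Hypothetical helper (not the Facets API): map each line of a string to
# the range of character indexes it occupies, newline included.
def line_ranges(str)
  offset = 0
  str.each_line.map do |line|
    range = offset..(offset + line.length - 1)
    offset += line.length
    range
  end
end

text = "this\nis\na\ntest"
line_ranges(text).each { |r| p [r, text[r]] }
```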
Apply a set of rules (regular expression matches) to the string.
The rules must be applied in order, so a hash cannot be used because its ordering is not guaranteed. An array is used instead.
rules: The array containing rule-pairs (match, write).
Returns: The rewritten string.
CREDIT: George Moschovitis
  # File lib/core/facets/string/rewrite.rb, line 18
  def rewrite(rules)
    raise ArgumentError.new('The rules parameter is nil') unless rules
    rewritten_string = dup
    rules.each do |match,write|
      rewritten_string.gsub!(match,write)
    end
    return rewritten_string
  end
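A small sketch showing why rule order matters, written as a standalone method under the hypothetical name `apply_rules`. Each rule sees the output of the previous one.

```ruby
# Hypothetical helper (not the Facets API): apply [match, write] pairs in
# the order given, feeding each result into the next rule.
def apply_rules(str, rules)
  rules.inject(str.dup) { |text, (match, write)| text.gsub(match, write) }
end

rules = [[/a/, "b"], [/b/, "c"]]
p apply_rules("a-b", rules)         # the "b" produced by rule 1 is rewritten by rule 2
p apply_rules("a-b", rules.reverse) # reversed order gives a different result
```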
Considers the string a Roman numeral, and converts it to the corresponding integer.
  # File lib/more/facets/roman.rb, line 47
  def roman
    roman = self
    raise unless roman?
    last = roman[-1,1]
    roman.reverse.split('').inject(0) do |result, c|
      if ROMAN_VALUES[c] < ROMAN_VALUES[last]
        result -= ROMAN_VALUES[c]
      else
        last = c
        result += ROMAN_VALUES[c]
      end
    end
  end
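The conversion reads the numeral right to left, subtracting any digit that is smaller than the largest digit seen so far. A self-contained sketch of that rule follows; the constant `ROMAN_DIGITS` and the method `roman_to_i` are hypothetical names, and the sketch does not validate the numeral beyond rejecting unknown characters.

```ruby
# Hypothetical names (not the Facets API).
ROMAN_DIGITS = {
  "I" => 1, "V" => 5, "X" => 10, "L" => 50,
  "C" => 100, "D" => 500, "M" => 1000
}.freeze

# Walk the numeral from right to left. A digit smaller than the largest
# digit seen so far is subtractive (the I in IV); otherwise it is added.
def roman_to_i(str)
  total = 0
  highest = 0
  str.upcase.reverse.each_char do |c|
    value = ROMAN_DIGITS.fetch(c) # raises KeyError on an unknown character
    if value < highest
      total -= value
    else
      total += value
      highest = value
    end
  end
  total
end

p roman_to_i("XIV")
p roman_to_i("MCMXCIV")
```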
Returns true if the subject is a valid Roman numeral.
  # File lib/more/facets/roman.rb, line 62
  def roman?
    ROMAN =~ self
  end
Breaks a string up into an array based on a regular expression. Similar to scan, but includes the matches.
  s = "<p>This<b>is</b>a test.</p>"
  s.shatter( /\<.*?\>/ )
produces
  ["<p>", "This", "<b>", "is", "</b>", "a test.", "</p>"]
CREDIT: Trans
  # File lib/core/facets/string/shatter.rb, line 15
  def shatter( re )
    r = self.gsub( re ){ |s| "\1" + s + "\1" }
    while r[0,1] == "\1" ; r[0] = '' ; end
    while r[-1,1] == "\1" ; r[-1] = '' ; end
    r.split("\1")
  end
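A similar result can be sketched with String#split and a capture group, which keeps the delimiters in the output. This is a different technique from the marker-character approach above; the name `shatter_sketch` is hypothetical, and the sketch assumes the pattern itself contains no capture groups.

```ruby
# Hypothetical helper (not the Facets API): split on the pattern but keep
# the matches, by wrapping the pattern in a capture group.
def shatter_sketch(str, re)
  wrapped = Regexp.new("(#{re.source})", re.options)
  str.split(wrapped).reject(&:empty?)
end

p shatter_sketch("<p>This<b>is</b>a test.</p>", /<.*?>/)
```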
A fuzzy matching mechanism. Returns a score from 0-1, based on the number of shared edges. To be effective, the strings must be of length 2 or greater.
  "Alexsander".similarity( "Aleksander" ) #=> 0.9
The way it works:
Converts each string into a “graph like” object, with edges
  "alexsander" -> [ alexsander, alexsand, alexsan ... lexsand ... san ... an, etc ]
  "aleksander" -> [ aleksander, aleksand ... etc. ]
Perform match, then remove any subsets from this matched set (i.e. a hit on “san” is a subset of a hit on “sander”)
Above example, once reduced -> [ ale, sander ]
Sees how many of the matches remain, and calculates a score based on the number of matches and their length, compared to the length of the larger of the two words.
Still a bit rough. Any suggestions for improvement are welcome.
CREDIT: Derek Lewis.
  # File lib/core/facets/string/similarity.rb, line 29
  def similarity(str_in)
    return 0 if str_in == nil
    return 1 if self == str_in

    # Make a graph of each word (okay, so its not a true graph, but is similar)
    graph_A = Array.new
    graph_B = Array.new

    # "graph" self
    last = self.length
    (0..last).each do |ff|
      loc = self.length
      break if ff == last - 1
      wordB = (1..(last-1)).to_a.reverse!
      if (wordB != nil)
        wordB.each do |ss|
          break if ss == ff
          graph_A.push( "#{self[ff..ss]}" )
        end
      end
    end

    # "graph" input string
    last = str_in.length
    (0..last).each{ |ff|
      loc = str_in.length
      break if ff == last - 1
      wordB = (1..(last-1)).to_a.reverse!
      wordB.each do |ss|
        break if ss == ff
        graph_B.push( "#{str_in[ff..ss]}" )
      end
    }

    # count how many of these "graph edges" we have that are the same
    matches = graph_A & graph_B
    #matches = Array.new
    #graph_A.each do |aa|
    #  matches.push( aa ) if( graph_B.include?( aa ) )
    #end

    # For eliminating subsets, we want to start with the smallest hits.
    matches.sort!{|x,y| x.length <=> y.length}

    # eliminate any subsets
    mclone = matches.dup
    mclone.each_index do |ii|
      reg = Regexp.compile( Regexp.escape(mclone[ii]) )
      count = 0.0
      matches.each{|xx| count += 1 if xx =~ reg}
      matches.delete(mclone[ii]) if count > 1
    end

    score = 0.0
    matches.each{ |mm| score += mm.length }
    self.length > str_in.length ? largest = self.length : largest = str_in.length
    return score/largest
  end
The reverse of camelcase. Makes an underscored version of a camelcase string.
Changes '::' to '/' to convert namespaces to paths.
Examples
  "SnakeCase".snakecase #=> "snake_case"
  "Snake-Case".snakecase #=> "snake_case"
  "SnakeCase::Errors".snakecase #=> "snake_case/errors"
  # File lib/core/facets/string/snakecase.rb, line 12
  def snakecase
    gsub(/::/, '/'). # NOT SO SURE ABOUT THIS
    gsub(/([A-Z]+)([A-Z][a-z])/,'\1_\2').
    gsub(/([a-z\d])([A-Z])/,'\1_\2').
    tr("-", "_").
    downcase
  end
This is basically the same as #store, but it acts like slice! when given only one argument.
Essentially #slice, but writes rather than reads.
  a = "HELLO"
  a.splice(1, "X")
  a #=> "HXLLO"
  a = "HELLO"
  a.splice(1) #=> "E"
  a #=> "HLLO"
CREDIT: Trans
  # File lib/core/facets/string/splice.rb, line 20
  def splice(idx, sub=nil)
    if sub
      store(idx, sub)
    else
      case idx
      when Range
        slice!(idx)
      else
        slice!(idx,1)
      end
    end
  end
Does a string start with the given prefix?
  "hello".start_with?("he") #=> true
  "hello".start_with?("to") #=> false
CREDIT: Lucas Carlson, Blaine Cook
  # File lib/core/facets/string/start_with.rb, line 12
  def start_with?(prefix)
    self.index(prefix) == 0
  end
Allows #succ to take n step increments.
  "abc".succ #=> "abd"
  "abc".succ(4) #=> "abg"
  "abc".succ(24) #=> "aca"
CREDIT: Trans
  # File lib/more/facets/succ.rb, line 37
  def succ(n=1)
    s = self
    n.times { s = s.succ1 }
    s
  end
Aligns each line n spaces.
CREDIT: Gavin Sinclair
  # File lib/core/facets/string/tab.rb, line 9
  def tab(n)
    gsub(/^ */, ' ' * n)
  end
Preserves relative tabbing. The first non-empty line ends up with n spaces before nonspace.
CREDIT: Gavin Sinclair
  # File lib/core/facets/string/tabto.rb, line 10
  def tabto(n)
    if self =~ /^( *)\S/
      indent(n - $1.length)
    else
      self
    end
  end
Title case.
  "this is a string".titlecase #=> "This Is A String"
CREDIT: Eliazar Parra
  # File lib/core/facets/string/titlecase.rb, line 10
  def titlecase
    gsub(/\b\w/){ $`[-1,1] == "'" ? $& : $&.upcase }
  end
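The check on the preceding character keeps letters after an apostrophe in lower case, so contractions are not capitalized mid-word. A standalone sketch of that rule, assuming the hypothetical name `titlecase_sketch`:

```ruby
# Hypothetical helper (not the Facets API): upcase the first letter of
# each word, except when the letter directly follows an apostrophe.
def titlecase_sketch(str)
  str.gsub(/\b\w/) do |ch|
    Regexp.last_match.pre_match.end_with?("'") ? ch : ch.upcase
  end
end

puts titlecase_sketch("this is a string")
puts titlecase_sketch("don't stop")
```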
Interpret common affirmative string meanings as true, otherwise nil or false. Blank space and case are ignored. The following strings will return true:
  true
  yes
  on
  t
  1
  y
  ==
The following strings will return nil:
  nil
  null
All other strings return false.
Examples:
  "true".to_b #=> true
  "yes".to_b #=> true
  "no".to_b #=> false
  "123".to_b #=> false
  # File lib/core/facets/boolean.rb, line 115
  def to_b
    case self.downcase.strip
    when 'true', 'yes', 'on', 't', '1', 'y', '=='
      return true
    when 'nil', 'null'
      return nil
    else
      return false
    end
  end
Parse date from string.
  # File lib/more/facets/date.rb, line 425
  def to_date
    #::Date::civil(*ParseDate.parsedate(self)[0..2])
    ::Date.new(*::Date._parse(self, false).values_at(:year, :mon, :mday))
  end
Convert string to DateTime.
  # File lib/more/facets/date.rb, line 419
  def to_datetime
    date = ::Date._parse(self, false).values_at(:year, :mon, :mday, :hour, :min, :sec).map { |arg| arg || 0 }
    ::DateTime.civil(*date)
  end
Turns a string into a regular expression.
"a?".to_re #=> /a?/
CREDIT: Trans
  # File lib/core/facets/string/to_re.rb, line 9
  def to_re(esc=false)
    Regexp.new((esc ? Regexp.escape(self) : self))
  end
Turns a string into a regular expression. By default it will escape all characters. Use false argument to turn off escaping.
"[".to_rx #=> /\[/
CREDIT: Trans
  # File lib/core/facets/string/to_re.rb, line 21
  def to_rx(esc=true)
    Regexp.new((esc ? Regexp.escape(self) : self))
  end
Translates a string in the form of a set of numerical and/or alphanumerical characters separated by non-word characters (e.g. \W+) into a Tuple. The values of the tuple will be converted to integers if they are purely numerical.
  '1.2.3a'.to_t #=> [1,2,"3a"]
If you would like to control the interpretation of each value as it is added to the tuple, you can supply a block.
  '1.2.3a'.to_t { |v| v.upcase } #=> ["1","2","3A"]
This method calls Tuple.cast_from_string.
  # File lib/more/facets/tuple.rb, line 309
  def to_t( &yld )
    Tuple.cast_from_string( self, &yld )
  end
  # File lib/more/facets/date.rb, line 414
  def to_time(form = :utc)
    ::Time.__send__("#{form}_time", *::Date._parse(self, false).values_at(:year, :mon, :mday, :hour, :min, :sec).map{|arg| arg || 0 })
  end
Return a new string with the given brackets removed. If only one bracket char is given it will be removed from either side.
  "{unwrap me}".unbracket('{') #=> "unwrap me"
  "--unwrap me!".unbracket('--','!') #=> "unwrap me"
CREDIT: Trans
  # File lib/core/facets/string/bracket.rb, line 37
  def unbracket(bra=nil, ket=nil)
    if bra
      ket = BRA2KET[bra] unless ket
      ket = ket ? ket : bra
      s = self.dup
      s.gsub!(%[^#{Regexp.escape(bra)}], '')
      s.gsub!(%[#{Regexp.escape(ket)}$], '')
      return s
    else
      if m = BRA2KET[ self[0,1] ]
        return self.slice(1...-1) if self[-1,1] == m
      end
    end
    return self.dup # if nothing else
  end
In-place version of #unbracket.
CREDIT: Trans
  # File lib/core/facets/string/bracket.rb, line 57
  def unbracket!(bra=nil, ket=nil)
    self.replace( unbracket(bra, ket) )
  end
The reverse of camelcase. Makes an underscored version of a camelcase string.
Changes '::' to '/' to convert namespaces to paths.
Examples
  "SnakeCase".underscore #=> "snake_case"
  "Snake-Case".underscore #=> "snake_case"
  "SnakeCase::Errors".underscore #=> "snake_case/errors"
  # File lib/core/facets/string/underscore.rb, line 12
  def underscore
    gsub(/::/, '/').
    gsub(/([A-Z]+)([A-Z][a-z])/,'\1_\2').
    gsub(/([a-z\d])([A-Z])/,'\1_\2').
    tr("-", "_").
    downcase
  end
Unfold paragraphs.
FIXME: Sometimes adds one too many blank lines. TEST!!!
  # File lib/core/facets/string/unfold.rb, line 7
  def unfold
    blank = false
    text = ''
    split(/\n/).each do |line|
      if /\S/ !~ line
        text << "\n\n"
        blank = true
      else
        if /^(\s+|[*])/ =~ line
          text << (line.rstrip + "\n")
        else
          text << (line.rstrip + " ")
        end
        blank = false
      end
    end
    text = text.gsub(/(\n){3,}/,"\n\n")
    text.rstrip
  end
Is the string upcase/uppercase?
  "THIS".upcase? #=> true
  "This".upcase? #=> false
  "this".upcase? #=> false
CREDIT: Phil Tomson
  # File lib/core/facets/string/capitalized.rb, line 38
  def upcase?
    upcase == self
  end
  # File lib/core/facets/string/camelcase.rb, line 36
  def upper_camelcase
    str = dup
    str.gsub!(/\/(.?)/){ "::#{$1.upcase}" } # NOT SO SURE ABOUT THIS
    str.gsub!(/(?:_+|-+)([a-z])/){ $1.upcase }
    str.gsub!(/(\A|\s)([a-z])/){ $1 + $2.upcase }
    str
  end
Upcase first letter.
NOTE: One might argue that this method should behave the same as #upcase and rather this behavior should be in place of #capitalize. Probably so, but since Matz has already defined #capitalize the way it is, this name seems most fitting to the missing behavior.
  # File lib/core/facets/string/uppercase.rb, line 10
  def uppercase
    str = to_s
    str[0,1].upcase + str[1..-1]
  end
Prepend an "@" to the beginning of a string to make an instance variable name. This also replaces non-valid characters with underscores.
  # File lib/core/facets/string/variablize.rb, line 7
  def variablize
    v = gsub(/\W/, '_')
    "@#{v}"
  end
Word wrap a string not exceeding max width.
puts "this is a test".word_wrap(4)
produces
  this
  is a
  test
This is a basic implementation of word wrap, but smart enough to suffice for most use cases.
CREDIT: Gavin Kistner, Dayne Broderson
  # File lib/core/facets/string/word_wrap.rb, line 18
  def word_wrap( col_width=80 )
    self.dup.word_wrap!( col_width )
  end
As with #word_wrap, but modifies the string in place.
CREDIT: Gavin Kistner, Dayne Broderson
  # File lib/core/facets/string/word_wrap.rb, line 26
  def word_wrap!( col_width=80 )
    self.gsub!( /(\S{#{col_width}})(?=\S)/, '\1 ' )
    self.gsub!( /(.{1,#{col_width}})(?:\s+|$)/, "\\1\n" )
    self
  end