Array
Once you’ve matched a list of elements, you will often need to handle them as a group, or you may want to perform the same action on each of them. Hpricot::Elements is an extension of Ruby’s Array class, with some methods added for altering the elements contained in the array.
If you need to create an element array from regular elements:
Hpricot::Elements[ele1, ele2, ele3]
Assuming that ele1, ele2 and ele3 contain element objects (Hpricot::Elem, Hpricot::Doc, etc.)
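The bracket constructor comes for free from subclassing Array. The sketch below shows the idea in plain Ruby, under the assumption that a stand-in class is enough to illustrate it: ElementList and its shout method are hypothetical and not part of Hpricot.

```ruby
# ElementList is a hypothetical stand-in for Hpricot::Elements:
# an Array subclass with one extra group-level method.
class ElementList < Array
  # Apply the same action to every member and keep the result in the subclass.
  def shout
    ElementList[*map(&:upcase)]
  end
end

# Array.[] called on the subclass builds an instance of the subclass.
list = ElementList["ele1", "ele2", "ele3"]

# map hands back a plain Array, so the result is re-wrapped to stay an ElementList.
mapped    = list.map { |s| s.capitalize }
rewrapped = ElementList[*mapped]
```

This is why the listings on this page keep wrapping their results in Elements[*...] before returning them.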
Usually the Hpricot::Elements you’re working on comes from a search you’ve done. Well, you can continue searching the list by using the same at and search methods you can use on plain elements.
    elements = doc.search("/div/p")
    elements = elements.search("/a[@href='http://hoodwink.d/']")
    elements = elements.at("img")
When you’re altering elements in the list, your changes will be reflected in the document you started searching from.
    doc = Hpricot("That's my <b>spoon</b>, Tyler.")
    doc.at("b").swap("<i>fork</i>")
    doc.to_html
      #=> "That's my <i>fork</i>, Tyler."
If you can’t find a method here that does what you need, you may need to loop through the elements and find a method in Hpricot::Container::Trav which can do what you need.
For example, you may want to search for all the H3 header tags in a document and grab all the tags underneath the header, but not inside the header. A good method for this is next_sibling:
    ary = []
    doc.search("h3").each do |h3|
      ele = h3
      # move the cursor forward on every pass so the loop terminates
      while ele = ele.next_sibling
        ary << ele   # stuff away all the elements under the h3
      end
    end
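To see the sibling walk in isolation, here is a minimal sketch that runs without Hpricot. Node is a hypothetical stand-in that models only a name, a parent and children; the extra stop condition (halt at the next h3) is an assumption about what "underneath the header" should mean, not something the library does for you.

```ruby
# Node is a hypothetical stand-in for an Hpricot element; only the parts
# needed for sibling traversal are modelled.
class Node
  attr_reader :name, :children
  attr_accessor :parent

  def initialize(name, *children)
    @name = name
    @children = children
    children.each { |c| c.parent = self }
  end

  # The node that follows this one under the same parent, or nil at the end.
  def next_sibling
    return nil unless parent
    siblings = parent.children
    siblings[siblings.index { |s| s.equal?(self) } + 1]
  end
end

body = Node.new("body",
                Node.new("h3"), Node.new("p"), Node.new("ul"),
                Node.new("h3"), Node.new("p"))

sections = body.children.select { |n| n.name == "h3" }.map do |h3|
  under = []
  ele = h3
  # Advance the cursor each time; stop at the next header or the last sibling.
  while (ele = ele.next_sibling) && ele.name != "h3"
    under << ele.name
  end
  under
end
```

Here sections ends up holding one list of tag names per header.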
Most of the useful element methods are in the mixins Hpricot::Traverse and Hpricot::Container::Trav.
Given two elements, attempt to gather an Elements array of everything between (and including) those two elements.
    # File lib/hpricot/elements.rb, line 315
    def self.expand(ele1, ele2, excl=false)
      ary = []
      offset = excl ? -1 : 0

      if ele1 and ele2
        # let's quickly take care of siblings
        if ele1.parent == ele2.parent
          ary = ele1.parent.children[ele1.node_position..(ele2.node_position+offset)]
        else
          # find common parent
          p, ele1_p = ele1, [ele1]
          ele1_p.unshift p while p.respond_to?(:parent) and p = p.parent
          p, ele2_p = ele2, [ele2]
          ele2_p.unshift p while p.respond_to?(:parent) and p = p.parent
          common_parent = ele1_p.zip(ele2_p).select { |p1, p2| p1 == p2 }.flatten.last

          child = nil
          if ele1 == common_parent
            child = ele2
          elsif ele2 == common_parent
            child = ele1
          end

          if child
            ary = common_parent.children[0..(child.node_position+offset)]
          end
        end
      end

      return Elements[*ary]
    end
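The "find common parent" step is the least obvious part of expand, so here is a rough standalone sketch of the same idea. TreeNode, ancestor_chain and common_parent are hypothetical names for illustration. Note one deliberate difference: this version uses take_while rather than select, so it stops at the first position where the two ancestor chains diverge.

```ruby
# TreeNode is a hypothetical stand-in for an Hpricot node with a parent link.
TreeNode = Struct.new(:name, :parent)

# Build the chain root -> ... -> node by walking parent links upward.
def ancestor_chain(node)
  chain = [node]
  chain.unshift(node) while (node = node.parent)
  chain
end

# Pair the two chains position by position and keep the last pair that agrees.
def common_parent(a, b)
  ancestor_chain(a).zip(ancestor_chain(b))
                   .take_while { |x, y| x.equal?(y) }
                   .map(&:first)
                   .last
end

root = TreeNode.new("html", nil)
body = TreeNode.new("body", root)
div  = TreeNode.new("div", body)
para = TreeNode.new("p", div)
span = TreeNode.new("span", body)

shared    = common_parent(para, span)  # nodes in different branches
contained = common_parent(div, para)   # one node is an ancestor of the other
```

When one node contains the other, the common parent is the containing node itself, which is the case the child branch of expand handles.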
    # File lib/hpricot/elements.rb, line 270
    def self.filter(nodes, expr, truth = true)
      until expr.empty?
        _, *m = *expr.match(/^(?:#{ATTR_RE}|#{BRACK_RE}|#{FUNC_RE}|#{CUST_RE}|#{CATCH_RE})/)
        break unless _

        expr = $'
        m.compact!
        if m[0] == '@'
          m[0] = "@#{m.slice!(2,1).join}"
        end

        if m[0] == '[' && m[1] =~ /^\d+$/
          m = [":", "nth", m[1].to_i-1]
        end

        if m[0] == ":" && m[1] == "not"
          nodes, = Elements.filter(nodes, m[2], false)
        elsif "#{m[0]}#{m[1]}" =~ /^(:even|:odd)$/
          new_nodes = []
          nodes.each_with_index {|n,i| new_nodes.push(n) if (i % 2 == (m[1] == "even" ? 0 : 1)) }
          nodes = new_nodes
        elsif "#{m[0]}#{m[1]}" =~ /^(:first|:last)$/
          nodes = [nodes.send(m[1])]
        else
          meth = "filter[#{m[0]}#{m[1]}]" unless m[0].empty?
          if meth and Traverse.method_defined? meth
            args = m[2..-1]
          else
            meth = "filter[#{m[0]}]"
            if Traverse.method_defined? meth
              args = m[1..-1]
            end
          end
          args << -1
          nodes = Elements[*nodes.find_all do |x|
                             args[-1] += 1
                             x.send(meth, *args) ? truth : !truth
                           end]
        end
      end
      [nodes, expr]
    end
Adds the class to all matched elements.
(doc/"p").add_class("bacon")
Now all paragraphs will have class="bacon".
    # File lib/hpricot/elements.rb, line 222
    def add_class class_name
      each do |el|
        next unless el.respond_to? :get_attribute
        classes = el.get_attribute('class').to_s.split(" ")
        el.set_attribute('class', classes.push(class_name).uniq.join(" "))
      end
      self
    end
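The string handling inside add_class can be tried on a bare attribute value. In this small sketch, merge_class is a hypothetical helper name (not an Hpricot method) that repeats the split, push, uniq and join steps from the listing above.

```ruby
# Mirrors the string handling in add_class on a bare attribute value.
# merge_class is a hypothetical helper name, not an Hpricot method.
def merge_class(current, class_name)
  current.to_s.split(" ").push(class_name).uniq.join(" ")
end

fresh    = merge_class(nil, "bacon")            # element had no class attribute
appended = merge_class("intro lead", "bacon")   # new class goes on the end
repeated = merge_class("intro bacon", "bacon")  # already present, so unchanged
```

Because of the uniq call, adding the same class twice leaves the attribute as it was.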
Just after each element in this list, add some HTML. Pass in an HTML str, which is turned into Hpricot elements.
    # File lib/hpricot/elements.rb, line 150
    def after(str = nil, &blk)
      each { |x| x.parent.insert_after x.make(str, &blk), x }
    end
Add to the end of the contents inside each element in this list. Pass in an HTML str, which is turned into Hpricot elements.
    # File lib/hpricot/elements.rb, line 132
    def append(str = nil, &blk)
      each { |x| x.html(x.children + x.make(str, &blk)) }
    end
Searches this list for the first element (or child of these elements) matching the CSS or XPath expression expr. Root is assumed to be the element scanned.
See Hpricot::Container::Trav.at for more.
    # File lib/hpricot/elements.rb, line 67
    def at(expr, &blk)
      search(expr, &blk).first
    end
Gets and sets attributes on all matched elements.
Pass in a key on its own and this method will return the string value assigned to that attribute for the first element, or nil if the attribute isn’t found.
doc.search("a").attr("href") #=> "http://hacketyhack.net/"
Or, pass in a key and value. This will set an attribute for all matched elements.
doc.search("p").attr("class", "basic")
You may also use a Hash to set a series of attributes:
(doc/"a").attr(:class => "basic", :href => "http://hackety.org/")
Lastly, a block can be used to rewrite an attribute based on the element it belongs to. The block will pass in an element. Return from the block the new value of the attribute.
records.attr("href") { |e| e['href'] + "#top" }
This example adds a # anchor to each link.
    # File lib/hpricot/elements.rb, line 201
    def attr key, value = nil, &blk
      if value or blk
        each do |el|
          el.set_attribute(key, value || blk[el])
        end
        return self
      end
      if key.is_a? Hash
        key.each { |k,v| self.attr(k,v) }
        return self
      else
        return self[0].get_attribute(key)
      end
    end
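The four calling styles all go through one method, and the order of the checks decides which one runs. The sketch below replays that dispatch on stand-in classes so it can be run without Hpricot. FakeElem and FakeElements are hypothetical, and storing attribute names as strings is an assumption of the sketch.

```ruby
# FakeElem and FakeElements are hypothetical stand-ins used only to exercise
# the dispatch logic; they are not Hpricot classes.
class FakeElem
  def initialize(attrs = {})
    @attrs = attrs
  end

  def get_attribute(key)
    @attrs[key.to_s]
  end
  alias_method :[], :get_attribute

  def set_attribute(key, value)
    @attrs[key.to_s] = value
  end
end

class FakeElements < Array
  def attr(key, value = nil, &blk)
    if value || blk           # setter: a fixed value or a per-element block
      each { |el| el.set_attribute(key, value || blk[el]) }
      self
    elsif key.is_a?(Hash)     # several attributes at once
      key.each { |k, v| attr(k, v) }
      self
    else                      # getter: reads the first element only
      self[0].get_attribute(key)
    end
  end
end

links = FakeElements[FakeElem.new({ "href" => "/a" }), FakeElem.new({ "href" => "/b" })]

first_href = links.attr("href")                   # read from the first element
links.attr("href") { |e| e["href"] + "#top" }     # rewrite every element
links.attr({ class: "basic", rel: "nofollow" })   # set two attributes on all
```

One consequence of checking value first: a nil value with no block falls through to the getter branch rather than setting anything.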
Add some HTML just previous to each element in this list. Pass in an HTML str, which is turned into Hpricot elements.
    # File lib/hpricot/elements.rb, line 144
    def before(str = nil, &blk)
      each { |x| x.parent.insert_before x.make(str, &blk), x }
    end
Empty the elements in this list, by removing their insides.
    doc = Hpricot("<p> We have <i>so much</i> to say.</p>")
    doc.search("i").empty
    doc.to_html
      #=> "<p> We have <i></i> to say.</p>"

    # File lib/hpricot/elements.rb, line 126
    def empty
      each { |x| x.inner_html = nil }
    end
    # File lib/hpricot/elements.rb, line 347
    def filter(expr)
      nodes, = Elements.filter(self, expr)
      nodes
    end
Returns an HTML fragment built of the contents of each element in this list.
If an HTML string is supplied, this method acts like inner_html=.

    # File lib/hpricot/elements.rb, line 82
    def inner_html(*string)
      if string.empty?
        map { |x| x.inner_html }.join
      else
        x = self.inner_html = string.pop || x
      end
    end
Replaces the contents of each element in this list. Supply an HTML string, which is loaded into Hpricot objects and inserted into every element in this list.
    # File lib/hpricot/elements.rb, line 95
    def inner_html=(string)
      each { |x| x.inner_html = string }
    end

Returns a string containing the text contents of each element in this list. All HTML tags are removed.

    # File lib/hpricot/elements.rb, line 103
    def inner_text
      map { |x| x.inner_text }.join
    end
    # File lib/hpricot/elements.rb, line 352
    def not(expr)
      if expr.is_a? Traverse
        nodes = self - [expr]
      else
        nodes, = Elements.filter(self, expr, false)
      end
      nodes
    end
Add to the start of the contents inside each element in this list. Pass in an HTML str, which is turned into Hpricot elements.
    # File lib/hpricot/elements.rb, line 138
    def prepend(str = nil, &blk)
      each { |x| x.html(x.make(str, &blk) + x.children) }
    end
Remove all elements in this list from the document which contains them.
    doc = Hpricot("<html>Remove this: <b>here</b></html>")
    doc.search("b").remove
    doc.to_html
      #=> "<html>Remove this: </html>"

    # File lib/hpricot/elements.rb, line 115
    def remove
      each { |x| x.parent.children.delete(x) }
    end
Remove an attribute from each of the matched elements.
(doc/"input").remove_attr("disabled")
    # File lib/hpricot/elements.rb, line 235
    def remove_attr name
      each do |el|
        next unless el.respond_to? :remove_attribute
        el.remove_attribute(name)
      end
      self
    end
Removes a class from all matched elements.
(doc/"span").remove_class("lightgrey")
Or, to remove all classes:
(doc/"span").remove_class
    # File lib/hpricot/elements.rb, line 251
    def remove_class name = nil
      each do |el|
        next unless el.respond_to? :get_attribute
        if name
          classes = el.get_attribute('class').to_s.split(" ")
          el.set_attribute('class', (classes - [name]).uniq.join(" "))
        else
          el.remove_attribute("class")
        end
      end
      self
    end
Searches this list for any elements (or children of these elements) matching the CSS or XPath expression expr. Root is assumed to be the element scanned.
See Hpricot::Container::Trav.search for more.
    # File lib/hpricot/elements.rb, line 58
    def search(*expr,&blk)
      Elements[*map { |x| x.search(*expr,&blk) }.flatten.uniq]
    end
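The flatten.uniq at the end matters when the list holds both an element and one of its ancestors, since each would report the same match. A minimal sketch of that situation follows. Box and BoxList are hypothetical stand-ins, and matching by tag name only is a simplification of real CSS or XPath searching.

```ruby
# Box is a hypothetical stand-in for an element that can search its subtree.
class Box
  attr_reader :name, :children

  def initialize(name, *children)
    @name = name
    @children = children
  end

  # Return every descendant with the given name, in document order.
  def search(wanted)
    children.flat_map { |c| (c.name == wanted ? [c] : []) + c.search(wanted) }
  end
end

# BoxList plays the role of the element list: search each member, then merge.
class BoxList < Array
  def search(wanted)
    BoxList[*map { |x| x.search(wanted) }.flatten.uniq]
  end
end

link  = Box.new("a")
para  = Box.new("p", link)
outer = Box.new("div", para)

# Both outer and para contain the same link, so each reports it once.
raw   = BoxList[outer, para].map { |x| x.search("a") }.flatten
found = BoxList[outer, para].search("a")
```

Here raw contains the link twice, while found contains it once and is still a BoxList.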
Convert this group of elements into a complete HTML fragment, returned as a string.
    # File lib/hpricot/elements.rb, line 74
    def to_html
      map { |x| x.output("") }.join
    end
Wraps each element in the list inside the element created by HTML str. If more than one element is found in the string, Hpricot locates the deepest spot inside the first element.
    doc.search("a[@href]").
      wrap(%{<div class="link"><div class="link_inner"></div></div>})
This code wraps every link on the page inside a div.link and a div.link_inner nest.
    # File lib/hpricot/elements.rb, line 162
    def wrap(str = nil, &blk)
      each do |x|
        wrap = x.make(str, &blk)
        nest = wrap.detect { |w| w.respond_to? :children }
        unless nest
          raise "No wrapping element found."
        end
        x.parent.replace_child(x, wrap)
        nest = nest.children.first until nest.empty?
        nest.html([x])
      end
    end
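"The deepest spot inside the first element" comes down to following first children until there are none left. Here is a small sketch of just that descent. Tag and deepest_first are hypothetical names, and testing children.empty? stands in for the empty? call on the real element.

```ruby
# Tag is a hypothetical stand-in for a parsed wrapper element.
Tag = Struct.new(:name, :children)

# Follow first children downward until a node with no children is reached.
def deepest_first(node)
  node = node.children.first until node.children.empty?
  node
end

# <div class="link"><div class="link_inner"></div></div>
wrapper = Tag.new("div.link", [Tag.new("div.link_inner", [])])
slot = deepest_first(wrapper)

# Later siblings are ignored: only the first child is followed at each level.
branched = Tag.new("div", [Tag.new("span", [Tag.new("b", [])]), Tag.new("i", [])])
branched_slot = deepest_first(branched)
```

The wrapped element is then placed inside whatever node this descent ends on.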