Supported XPath Expressions

DenisKnauf edited this page Sep 13, 2010 · 2 revisions

Here are some samples of XPath use in Hpricot:

 require 'hpricot'
 require 'open-uri'
 doc = Hpricot(URI.parse("http://google.com/").read)

 doc.search("/html/body//p")
 doc.search("//p")
 doc.search("//p/a")
 doc.search("//a[@src]")
 doc.search("//a[@src='google.com']")

Location Paths

Absolute Paths

 doc.search("/html/body//p")
 doc.search("/*/body//p")
 doc.search("//p/../div")

Relative Paths

 elem = doc.at("//div")   # any element already found in doc
 elem.search("a")
 elem.search("p/a")

Supported Axes

descendant

Selects elements nested at any depth below the context element.

 doc.search("//div/descendant::p")

Identical to doc.search("//div//p").

child

Selects direct children of the context element.

 doc.search("//div/child::p")

Identical to doc.search("//div/p").

preceding-sibling

Selects siblings that appear before the context element under the same parent.

 doc.search("//div/preceding-sibling::form")

parent

Selects the parent of the context element.

 doc.search("//div/parent::div")

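In standard XPath this is doc.search("//div/..") restricted to parents that are themselves div elements. Note that doc.search("//div/../div") selects the div children of that parent instead.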

self

Selects the element itself.
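
The axis forms in this section follow standard XPath, so their meaning can be checked without a live page. The sketch below is not Hpricot code: it uses REXML, the XML library distributed with Ruby, as a stand-in, and assumes the rexml gem is available. The sample markup and the helper names sample_doc and ids_for are made up for illustration, and Hpricot's results on real HTML may differ.

```ruby
require 'rexml/document'

# Hypothetical sample markup, kept well-formed so an XML parser accepts it.
def sample_doc
  REXML::Document.new(<<~XML)
    <body>
      <section>
        <form id="f"/>
        <div id="d">
          <p id="p1">one</p>
          <span id="s"><p id="p2">two</p></span>
        </div>
      </section>
    </body>
  XML
end

# Hypothetical helper: evaluate an XPath and return the ids of the matches.
def ids_for(path)
  REXML::XPath.match(sample_doc, path).map { |el| el.attributes['id'] }
end

# descendant:: and its abbreviation // reach both p elements under the div.
puts ids_for("//div/descendant::p").inspect
puts ids_for("//div//p").inspect

# child:: and its abbreviation / reach only the direct child p.
puts ids_for("//div/child::p").inspect
puts ids_for("//div/p").inspect

# preceding-sibling:: finds the form that comes before the div.
puts ids_for("//div/preceding-sibling::form").inspect

# parent::div keeps a parent only if it is a div; .. keeps any parent.
puts ids_for("//p/parent::div").inspect
puts ids_for("//p/..").inspect

# self:: keeps the context element when its name matches.
puts ids_for("//div/self::div").inspect
```

Each long-form query prints the same ids as its abbreviated form, which is the equivalence this section describes.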

Supported Predicates

  • [@*] Has at least one attribute
    doc.search("//div[@*]")
  • [@foo] Has an attribute named foo
    doc.search("//input[@checked]")
  • [@foo='test'] Attribute foo is equal to test
    doc.search("//a[@rel='nofollow']")
  • [Nodelist] Element contains a node list, for example:
    doc.search("//div[p]")
    doc.search("//div[p/a]")
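
The predicate forms above are standard XPath, so what each one selects can be sketched on a small fixed document. As a hedged illustration, the example below uses REXML (distributed with Ruby, assumed to be available as the rexml gem) rather than Hpricot. The markup and the helper names predicate_doc and matching_ids are hypothetical, and Hpricot's behaviour on real HTML is not verified here.

```ruby
require 'rexml/document'

# Hypothetical sample markup; the second div deliberately has no attributes.
def predicate_doc
  REXML::Document.new(<<~XML)
    <body>
      <div id="d1" class="box"><p><a id="a1" rel="nofollow" href="/x">x</a></p></div>
      <div><p id="plain">no link</p></div>
      <div id="d3"><input id="i1" checked="checked"/><input id="i2"/></div>
    </body>
  XML
end

# Hypothetical helper: ids of the matched elements, with a marker for
# elements that carry no id attribute.
def matching_ids(path)
  REXML::XPath.match(predicate_doc, path).map do |el|
    el.attributes['id'] || '(no id)'
  end
end

# [@*]: divs carrying at least one attribute (the bare div is skipped).
puts matching_ids("//div[@*]").inspect

# [@checked]: inputs that have a checked attribute, whatever its value.
puts matching_ids("//input[@checked]").inspect

# [@rel='nofollow']: links whose rel attribute equals nofollow.
puts matching_ids("//a[@rel='nofollow']").inspect

# [p]: divs with a p child. [p/a]: divs with a p child that has an a child.
puts matching_ids("//div[p]").inspect
puts matching_ids("//div[p/a]").inspect
```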

Predicates Supported with a Different Syntax

  • [last()] or [position()=last()] becomes :last
    doc.search("p:last")
  • [0] or [position()=0] becomes :eq(0) or :first
    doc.search("p:first")
    doc.search("p:eq(0)")
  • [position() < 5] becomes :lt(5)
    doc.search("p:lt(5)")
  • [position() > 2] becomes :gt(2)
    doc.search("p:gt(2)")