Attributes

XPattern: Attributes

HTML nodes can have attributes. XPattern can collect their values into result variables, require that a given attribute exists, or require that an attribute contains a given value.

Attributes are enclosed in square brackets. They come after the cardinality and assignment.

Attribute names start with ‘@’, as in XPath.

Collecting values

Often we want to extract the attribute value, for example the URL from an A tag.

  A [ @href $url ]
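
To make the effect of the pattern above concrete, here is a minimal Python sketch of the same idea using only the standard library. It is not how XPattern is implemented; the names `HrefCollector` and `collect_urls` are hypothetical, and the sketch assumes that an A tag without an href contributes no value.

```python
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Collect the href value of every A tag, roughly like `A [ @href $url ]`."""

    def __init__(self):
        super().__init__()
        self.urls = []  # plays the role of the $url result variable

    def handle_starttag(self, tag, attrs):
        # html.parser reports tag and attribute names in lower case.
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.urls.append(value)


def collect_urls(html):
    """Return the href values of all A tags in the given HTML string."""
    parser = HrefCollector()
    parser.feed(html)
    parser.close()
    return parser.urls
```

Calling `collect_urls` on a page fragment returns the collected URLs in document order, which is the kind of result the `$url` variable is meant to hold.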

Matching attribute names

Sometimes we want to match only nodes that have a given attribute.

  SPAN [ @highlight ]

will match any SPAN that has a highlight attribute, no matter what value it has.

Matching attribute values

Often we want to match only nodes that have a given value in an attribute:

  SPAN [ @class="title" ]

or with a regular expression:

  A [ @href ~ "author" ]
  A [ @href ~ "indexdata.com/search.cgi\?title=[A-Z]" ]
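
The three kinds of attribute test described on this page (existence, exact value, regular expression) can be sketched in Python as follows. This is an illustration under stated assumptions, not XPattern's actual code: the helper names are hypothetical, `=` is assumed to mean exact string equality, and `~` is assumed to be an unanchored search, as the "author" example suggests.

```python
import re


def has_attr(attrs, name):
    """Like `[ @highlight ]`: true if the attribute exists, whatever its value."""
    return name in attrs


def attr_equals(attrs, name, expected):
    """Like `[ @class="title" ]`: assumed here to be exact string equality."""
    return attrs.get(name) == expected


def attr_matches(attrs, name, pattern):
    """Like `[ @href ~ "author" ]`: assumed to match anywhere in the value."""
    value = attrs.get(name)
    return value is not None and re.search(pattern, value) is not None
```

Here `attrs` is a plain dict of attribute names to values for one node. Note that in Python's `re` syntax the unescaped dots in `indexdata.com/search.cgi` match any character, while `\?` matches a literal question mark; XPattern's regular expression dialect may differ in detail.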