XPattern: Cardinality
XPattern can handle nodes that are optional, repeating, or both.
Optional: ?
A question mark indicates the node is optional.
<a href="...">First title</a>
by <i> First author </i>
<b>1999</b> <p/>
<a href="...">Second title</a>
<b>1999</b> <p/>
<a href="...">Third title</a>
by <i> Author </i>
and <i> Another Author </i>
<b>1999</b> <p/>
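The question mark has the same meaning in ordinary regular expressions, so the idea can be sketched by analogy in Python. This is not XPattern syntax. The sample text, the regular expression, and the names ENTRY and parse_entries are assumptions made for illustration only.

```python
import re

# Hypothetical sample modelled on the listing above; not real XPattern input.
SAMPLE = """\
<a href="...">Second title</a>
<b>1999</b> <p/>
<a href="...">Third title</a>
by <i> Author </i>
<b>1999</b> <p/>
"""

# The group holding the "by <i>...</i>" line is followed by "?",
# so an entry matches whether or not an author line is present.
ENTRY = re.compile(
    r'<a href="[^"]*">(?P<title>[^<]*)</a>\n'
    r'(?:by <i>\s*(?P<author>[^<]*?)\s*</i>\n)?'
    r'<b>(?P<year>\d{4})</b>'
)


def parse_entries(text):
    """Return (title, author or None, year) for every entry found."""
    return [(m["title"], m["author"], m["year"]) for m in ENTRY.finditer(text)]


print(parse_entries(SAMPLE))
```

An entry without an author line still matches; its author field is simply None.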
Repeating: +
A plus indicates a repeating node. There has to be at least one of them.
Optional repeating: *
An asterisk indicates that a node is both optional and repeating. That is, there can be zero or more of them.
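The difference between plus and asterisk can be seen by analogy with regular expressions in Python. Again, this is only a sketch and not XPattern syntax. The patterns and the helper name accepts are hypothetical.

```python
import re

# "+" requires at least one <i>...</i> element; "*" also accepts none.
AT_LEAST_ONE = re.compile(r'(?:<i>[^<]*</i>\s*)+')
ZERO_OR_MORE = re.compile(r'(?:<i>[^<]*</i>\s*)*')


def accepts(pattern, text):
    """True when the pattern covers the whole text."""
    return pattern.fullmatch(text) is not None


for text in ["", "<i>Author</i>", "<i>Author</i> <i>Another Author</i>"]:
    print(repr(text), accepts(AT_LEAST_ONE, text), accepts(ZERO_OR_MORE, text))
```

Only the empty input separates the two: the plus form rejects it, the asterisk form accepts it.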
Greediness: +? and *?
By default, all repeated patterns are greedy, meaning that they match as much as possible. Sometimes it is desirable to match as little as possible instead; appending a question mark (+? or *?) makes the repetition non-greedy. This can also be much more efficient, especially with ANY: a greedy ANY may first try to match the rest of the document and only then backtrack to just a few nodes.
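Greedy and non-greedy repetition behave the same way in Python regular expressions, which makes for a compact illustration. This is an analogy under stated assumptions, not XPattern syntax: the dot stands in loosely for ANY, and the string LINE is a hypothetical input.

```python
import re

# Hypothetical input with two <i>...</i> elements on one line.
LINE = "by <i> Author </i> and <i> Another Author </i>"

# Greedy: ".*" first runs to the end of the line, then backtracks
# until the last "</i>" fits, so it swallows both elements.
greedy = re.search(r"<i>(.*)</i>", LINE).group(1)

# Non-greedy: ".*?" grows one character at a time and stops at the
# first "</i>", so it captures only the first element.
lazy = re.search(r"<i>(.*?)</i>", LINE).group(1)

print(repr(greedy))
print(repr(lazy))
```

The non-greedy form stops at the nearest closing tag and never scans the remainder of the input, which is the source of the efficiency gain described above.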
As an example
