XPath (XML Path Language) is a query language used to navigate and extract data from XML documents. It allows developers to select elements, attributes, and values within an XML structure using path expressions.

Example XML:

<books>
    <book>
        <title>Learning XPath</title>
        <author>John Doe</author>
        <price>29.99</price>
    </book>
    <book>
        <title>Mastering XML</title>
        <author>Jane Smith</author>
        <price>39.99</price>
    </book>
</books>

Example XPath Query:

(: Find all book titles :)
/books/book/title

Result:

<title>Learning XPath</title>
<title>Mastering XML</title>
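
To try the query from Python, one option is the standard library's xml.etree.ElementTree, which supports only a subset of XPath. The following is a minimal sketch, assuming the sample document is held in a string (the variable names are illustrative). ElementTree evaluates paths relative to an element and rejects absolute ones, so /books/book/title is written as ./book/title starting from the <books> root:

```python
import xml.etree.ElementTree as ET

# The sample document from above, stored in a string for illustration.
xml_data = """
<books>
    <book>
        <title>Learning XPath</title>
        <author>John Doe</author>
        <price>29.99</price>
    </book>
    <book>
        <title>Mastering XML</title>
        <author>Jane Smith</author>
        <price>39.99</price>
    </book>
</books>
"""

root = ET.fromstring(xml_data)  # root is the <books> element

# Equivalent of /books/book/title, expressed relative to the root element.
titles = [t.text for t in root.findall("./book/title")]
print(titles)
```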

Why Use It?

  • Extract specific data from XML documents.
  • Navigate XML trees with path expressions instead of hand-written traversal code.
  • Reuse the same syntax in XSLT and XQuery, and as an element locator in Selenium for web automation.
  • Supports complex queries with conditions and functions.

Syntax & Expressions

Basic Syntax

Expression   Description                           Example
/            Starts at the root node               /books
//           Selects matching nodes at any depth   //title (all titles)
.            Current node                          . (self-reference)
..           Parent of the current node            ../author (author child of the parent)
@            Selects an attribute                  //@id (all id attributes)
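
As a rough illustration of these expressions, the sketch below uses Python's standard xml.etree.ElementTree, which implements only part of XPath. The id attributes are an assumption added for the example (the sample XML has none), and because ElementTree returns elements rather than attribute nodes, //@id is approximated by selecting the elements that carry an id and reading the value:

```python
import xml.etree.ElementTree as ET

# Hypothetical variant of the sample XML: the id attributes are added here
# only so that the @ example has something to match.
xml_data = """
<books>
    <book id="b1"><title>Learning XPath</title><author>John Doe</author></book>
    <book id="b2"><title>Mastering XML</title><author>Jane Smith</author></book>
</books>
"""
root = ET.fromstring(xml_data)

# //title : every <title> at any depth (written .//title in ElementTree).
all_titles = [t.text for t in root.findall(".//title")]

# ../author : from a <title>, step up to its parent, then down to <author>.
authors = [a.text for a in root.findall(".//title/../author")]

# //@id : select the elements that have an id attribute, then read it.
ids = [e.get("id") for e in root.findall(".//*[@id]")]

print(all_titles, authors, ids)
```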

Predicates (Conditions in [])

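Expression        Description                                Example
[index]           Selects an element by position (1-based)   /books/book[1] (first book)
[child='value']   Filters by the text of a child element     //book[title='Learning XPath']
[@attr='value']   Filters by an attribute value              //book[@price='29.99']

Note: the sample XML stores price as a child element, not an attribute, so //book[@price='29.99'] matches nothing in that document. Use //book[price='29.99'] there instead.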
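
The three predicate forms can be tried with Python's standard xml.etree.ElementTree, which supports positional, child-text, and attribute predicates. This is a sketch under one assumption: the id attributes are hypothetical and exist only to give the attribute predicate something to match.

```python
import xml.etree.ElementTree as ET

# The id attributes are hypothetical; the sample XML has none.
xml_data = """
<books>
    <book id="b1"><title>Learning XPath</title><price>29.99</price></book>
    <book id="b2"><title>Mastering XML</title><price>39.99</price></book>
</books>
"""
root = ET.fromstring(xml_data)

# /books/book[1] : positions are 1-based, so [1] is the first <book>.
first = root.find("./book[1]")

# //book[title='Learning XPath'] : filter on a child element's text.
by_title = root.findall(".//book[title='Learning XPath']")

# //book[@id='b2'] : filter on an attribute value.
by_id = root.findall(".//book[@id='b2']")

print(first.get("id"), [b.get("id") for b in by_title], [b.get("id") for b in by_id])
```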

Advanced XPath Functions

String and Math Functions

Function            Description                                                      Example
contains()          Checks whether a string contains a substring                     //book[contains(title, 'XML')]
starts-with()       Checks whether a string starts with a prefix                     //book[starts-with(title, 'Learning')]
normalize-space()   Trims leading and trailing whitespace and collapses inner runs   normalize-space(//book[1]/title)
sum()               Sums the numeric values of the selected nodes                    sum(//book/price)
count()             Counts the selected nodes                                        count(//book)
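
Python's standard xml.etree.ElementTree does not implement XPath functions, so the sketch below reproduces in plain Python what the expressions in the table compute; a full XPath 1.0 engine, such as the third-party lxml library, evaluates them directly. The helper normalize_space is a hypothetical name, and it only approximates the XPath function, since Python's split() recognizes a few more whitespace characters.

```python
import xml.etree.ElementTree as ET

xml_data = """
<books>
    <book><title>Learning XPath</title><price>29.99</price></book>
    <book><title>Mastering XML</title><price>39.99</price></book>
</books>
"""
root = ET.fromstring(xml_data)
books = root.findall("./book")


def normalize_space(text):
    # Approximates XPath normalize-space(): trim both ends and collapse
    # internal runs of whitespace to a single space.
    return " ".join(text.split())


# //book[contains(title, 'XML')]
with_xml = [b.findtext("title") for b in books if "XML" in b.findtext("title")]

# //book[starts-with(title, 'Learning')]
learning = [b.findtext("title") for b in books
            if b.findtext("title").startswith("Learning")]

# sum(//book/price) and count(//book)
total = sum(float(p.text) for p in root.findall("./book/price"))
count = len(books)

print(with_xml, learning, normalize_space("  Learning   XPath "), round(total, 2), count)
```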

Logical Operators

Operator   Description                           Example
and        Both conditions must be true          //book[price>30 and price<40]
or         At least one condition must be true   //book[price=29.99 or price=39.99]
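
To check what these two filters select on the sample data, the sketch below applies the same conditions in plain Python, since the standard xml.etree.ElementTree cannot evaluate numeric comparisons or and/or inside a predicate. The price helper is a hypothetical name introduced for the example.

```python
import xml.etree.ElementTree as ET

xml_data = """
<books>
    <book><title>Learning XPath</title><price>29.99</price></book>
    <book><title>Mastering XML</title><price>39.99</price></book>
</books>
"""
root = ET.fromstring(xml_data)
books = root.findall("./book")


def price(book):
    # XPath converts the element text to a number before comparing;
    # do the same conversion explicitly here.
    return float(book.findtext("price"))


# //book[price>30 and price<40]
mid_range = [b.findtext("title") for b in books if 30 < price(b) < 40]

# //book[price=29.99 or price=39.99]
either = [b.findtext("title") for b in books if price(b) in (29.99, 39.99)]

print(mid_range, either)
```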

Real-World Application Examples

Web Scraping (Using Selenium in Python)

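from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
try:
    driver.get("https://example.com")

    # Find the first <h1> element using XPath
    element = driver.find_element(By.XPATH, "//h1")
    print(element.text)
finally:
    # Close the browser even if the lookup fails
    driver.quit()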

XSLT (Transforming XML with XPath)

<xsl:for-each select="//book">
    <xsl:value-of select="title"/>
</xsl:for-each>

API Responses (Parsing XML in Python)

import xml.etree.ElementTree as ET

xml_data = '''<books><book><title>Learning XPath</title></book></books>'''
root = ET.fromstring(xml_data)  # fromstring() returns the root element

# ElementTree supports only a limited subset of XPath
titles = root.findall(".//title")
for title in titles:
    print(title.text)