# GET PDF

Category: Malware Analysis

## Get PDF

Notes about PDF files:

1. Every PDF file has a header.

2. A PDF file can carry metadata, which can also be removed.

3. Data is stored in objects and streams; these are the main points to investigate.
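As a rough illustration of point 1, the header can be checked by reading the first bytes of the file. This is only a sketch: `read_pdf_version` is a hypothetical helper name, and the choice to search the first 1024 bytes (instead of requiring the marker at offset 0) is an assumption based on how lenient PDF readers are commonly described.

```python
import re


def read_pdf_version(data: bytes):
    """Return the version from a %PDF-x.y header, or None if no header is found."""
    # Assumption: accept the marker anywhere in the first 1024 bytes.
    match = re.search(rb"%PDF-(\d+\.\d+)", data[:1024])
    return match.group(1).decode("ascii") if match else None


# Tiny in-memory sample standing in for a real file.
sample = b"%PDF-1.3\n1 0 obj\n<< /Type /Catalog >>\nendobj\n"
print(read_pdf_version(sample))
```

For a file on disk, pass `open(path, "rb").read(1024)` to the function.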

Use the `pdfid` tool to get an overview of the objects in the PDF.
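To get a feel for the kind of overview this gives, the sketch below counts a few structural and script-related keywords in raw PDF bytes. It is a simplified approximation, not how `pdfid` itself works: the keyword list and the `count_keywords` name are assumptions, and obfuscated names (for example hex-escaped ones) are not handled.

```python
import re

# Assumed short list of keywords worth a first look.
KEYWORDS = [b"obj", b"endobj", b"stream", b"endstream",
            b"/JS", b"/JavaScript", b"/OpenAction", b"/AA"]


def count_keywords(data: bytes) -> dict:
    """Count whole-word occurrences of each keyword in the raw bytes."""
    counts = {}
    for kw in KEYWORDS:
        # Letter boundaries keep "obj" from also matching inside "endobj".
        pattern = rb"(?<![A-Za-z])" + re.escape(kw) + rb"(?![A-Za-z])"
        counts[kw.decode("ascii")] = len(re.findall(pattern, data))
    return counts


sample = (b"%PDF-1.3\n"
          b"1 0 obj\n<< /Type /Catalog /OpenAction 2 0 R >>\nendobj\n"
          b"2 0 obj\n<< /S /JavaScript /JS 3 0 R >>\nendobj\n"
          b"3 0 obj\n<< /Length 5 >>\nstream\nalert\nendstream\nendobj\n")
for name, total in count_keywords(sample).items():
    print(f"{name:12} {total}")
```

Non-zero counts for `/JS`, `/JavaScript` or `/OpenAction` are a hint that the objects carrying them deserve a closer look.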

Export the file from Wireshark. Look for the packet that shows the successful transfer: status 200, application/file\_name.

Use `pdf-parser` to analyze the objects and streams.

If you do not know where a tool is installed (for example, to configure it), use the `which` command, e.g. `which pdf-parser`.
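The same lookup can be done from Python with `shutil.which`, which is handy when a script needs to find the tools itself. A minimal sketch; `locate_tool` is a hypothetical wrapper, and the tool names assume an installation where the scripts are on `PATH` without a `.py` suffix.

```python
import shutil


def locate_tool(name: str):
    """Return the absolute path of an executable on PATH, or None if missing."""
    return shutil.which(name)


for tool in ("pdfid", "pdf-parser"):
    print(tool, "->", locate_tool(tool) or "not found on PATH")
```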

Dump an object, passing its stream through the filters:

`pdf-parser --raw -o 7 -f fcexploit.pdf -d obj7`

Do the same for objects 3, 9, 5 and 10.
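Repeating the command for each object is easy to script. The sketch below only builds and prints the command lines, reusing the exact flags from the command above; `build_dump_commands` is a hypothetical helper, and the output names follow the `obj<N>` pattern used here.

```python
import shlex


def build_dump_commands(pdf_path, object_ids):
    """Build one pdf-parser command (as an argument list) per object id."""
    commands = []
    for obj_id in object_ids:
        commands.append([
            "pdf-parser", "--raw", "-o", str(obj_id), "-f",
            pdf_path, "-d", f"obj{obj_id}",
        ])
    return commands


for cmd in build_dump_commands("fcexploit.pdf", [3, 5, 7, 9, 10]):
    print(shlex.join(cmd))
```

To actually run them, each list can be passed to `subprocess.run(cmd, check=True)` on a machine where `pdf-parser` is installed.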

<figure><img src="/files/n5foJMCycLw0ZPkjieFI" alt=""><figcaption><p>Dump of obj3; note the references to other objects</p></figcaption></figure>

Dump obj 8 and 6:

<figure><img src="/files/MRjc2CqGad1BgDNn9nTn" alt=""><figcaption></figcaption></figure>

Some obfuscated code in obj5:

<figure><img src="/files/0RhUyfmi2lewRNB91hIz" alt=""><figcaption></figcaption></figure>

Beautify the code with a JavaScript beautifier or CyberChef :thumbsup:

<figure><img src="/files/ouVtQ8sj0QAAqq9zwyCc" alt=""><figcaption></figcaption></figure>

Data processing in obj10:

<figure><img src="/files/Pa67C6NwV8AHD8OTHxSO" alt=""><figcaption></figcaption></figure>

After processing the code, obj 7 and 9 turn out to contain shellcode:

<figure><img src="/files/LY46ljTMEjNKK0vvpZSv" alt=""><figcaption></figcaption></figure>

<figure><img src="/files/RVFXk007ASAI9eYSg095" alt=""><figcaption></figcaption></figure>

Set up libemu so that the pylibemu library can be installed in Python:

git clone <https://github.com/buffer/libemu.git>

Check the README.md file for build and installation instructions.

The ShellCodeExtract script:

```
import argparse
import os
from binascii import unhexlify

import pylibemu

MAX_STEPS = 10000000


def get_payloads(in_file):
    """Return [name, shellcode] pairs for every long unescape() assignment."""
    result = []
    with open(in_file, 'r') as f:
        for line in f:
            if 'unescape' in line:
                result.append(line.split("=", 1))
    # Keep only assignments whose right-hand side is long enough to be a payload.
    big = [i for i in result if len(i) > 1 and len(i[1]) >= 80]
    for entry in big:
        # Variable name: drop tabs, surrounding whitespace and a leading "var".
        name = entry[0].replace('\t', '').strip()
        if name.startswith('var '):
            name = name[len('var '):].strip()
        entry[0] = name
        # Payload: the text between the parentheses of unescape(...), unquoted.
        arg = entry[1][entry[1].find("(") + 1:entry[1].find(")")]
        arg = arg.replace('\t', '').replace('"', '').replace("'", '').strip()
        raw = unhexlify(arg.replace('%u', ''))
        # Each %uXXXX unit is little-endian, so swap every pair of bytes.
        entry[1] = bytes(c for t in zip(raw[1::2], raw[::2]) for c in t)
    return big


def sctest_save(payloads):
    """Emulate every payload and save each profile to a text file."""
    for entry in payloads:
        do_sctest_item(entry)


def do_files(payloads):
    """Write each payload to <name><n>.sc, using the first unused n."""
    for name, shellcode in payloads:
        i = 0
        while os.path.exists(f"{name}{i}.sc"):
            i += 1
        with open(f"{name}{i}.sc", "wb") as outfile:
            outfile.write(shellcode)


def do_sctest(file):
    """Emulate shellcode stored in a file and print the profile."""
    with open(file, 'rb') as f:
        item = f.read()
    emu = pylibemu.Emulator(2048)
    shellcode_offset = 8
    emu.prepare(item, shellcode_offset)
    emu.test(MAX_STEPS)
    print('sctest ', emu.emu_profile_output)


def do_sctest_item(entry):
    """Emulate one [name, shellcode] pair and write the profile to <name><n>.txt."""
    name, item = entry
    emu = pylibemu.Emulator(2048)
    shellcode_offset = 0
    emu.prepare(item, shellcode_offset)
    emu.test(MAX_STEPS)
    output = emu.emu_profile_output
    i = 0
    while os.path.exists(f"{name}{i}.txt"):
        i += 1
    with open(f"{name}{i}.txt", "wb") as outfile:
        outfile.write(output)


def main():
    parser = argparse.ArgumentParser(description='Extract Shellcode')
    parser.add_argument('-f', '--file', nargs=1, required=True,
                        help='path to the file containing the unescape() payloads')
    parser.add_argument('-x', '--export', action='store_true', required=False,
                        help='export shellcodes')
    args = parser.parse_args()

    payloads = get_payloads(args.file[0])
    sctest_save(payloads)

    # store_true yields True/False (never None), so test the flag directly.
    if args.export:
        do_files(payloads)


if __name__ == "__main__":
    main()
```
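The core of `get_payloads` is the conversion of `%uXXXX` escapes into raw bytes. The standalone sketch below isolates that step so it can be tried without pylibemu; `decode_percent_u` is a hypothetical name, and the input is assumed to contain only `%uXXXX` units.

```python
from binascii import unhexlify


def decode_percent_u(escaped: str) -> bytes:
    """Turn a string of %uXXXX units into the bytes they encode in memory."""
    raw = unhexlify(escaped.replace("%u", ""))
    # Each unit is a little-endian 16-bit value: the low byte comes first
    # in memory, so swap the two bytes of every pair.
    return bytes(b for pair in zip(raw[1::2], raw[::2]) for b in pair)


print(decode_percent_u("%u4241%u4443"))  # b'ABCD'
```

`%u4241` holds the value 0x4241, which is stored as the bytes 0x41 0x42, hence `AB`.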

Reference : <https://github.com/forensicskween/CyberDefenders/blob/main/GetPDF/ShellCodeExtract.py>

Combine obj 7 and 9 into one file :thumbsup:

cat obj7 obj9 > file.out
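On a system without `cat`, the same concatenation can be done in Python. A minimal sketch: `concat_files` is a hypothetical helper, and the demo uses throwaway files in a temporary directory in place of the real dumps.

```python
import os
import shutil
import tempfile


def concat_files(sources, destination):
    """Write each source file to destination, in order (like `cat a b > out`)."""
    with open(destination, "wb") as out:
        for path in sources:
            with open(path, "rb") as src:
                shutil.copyfileobj(src, out)


# Demo with placeholder content standing in for the dumped objects.
workdir = tempfile.mkdtemp()
obj7 = os.path.join(workdir, "obj7")
obj9 = os.path.join(workdir, "obj9")
merged = os.path.join(workdir, "file.out")
for path, content in ((obj7, b"first\n"), (obj9, b"second\n")):
    with open(path, "wb") as fh:
        fh.write(content)

concat_files([obj7, obj9], merged)
with open(merged, "rb") as fh:
    print(fh.read())
```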

Run the script on this file, then check the log for the result.

For the last two questions, examine the malware file in the pcap and research the CVEs: the PDF file contains 5 of them.


