# GET PDF

Category: Malware Analysis

## Get PDF

Notes about PDF files:

1. Each PDF file has a header.

2. Each PDF file can carry metadata (which can also be removed).

3. Data is stored in objects and streams (these are the points to investigate).

Use the `pdfid` tool to list the objects in the PDF.
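The notes above can be illustrated with a few lines of Python. This is a minimal sketch and not how `pdfid` itself works: the function names `read_pdf_header` and `count_keywords`, and the keyword list, are assumptions chosen for illustration. It also ignores obfuscated names and headers that do not sit at the very start of the file.

```python
import re


def read_pdf_header(data: bytes):
    """Return the version from a leading %PDF-x.y header, or None if absent."""
    match = re.match(rb"%PDF-(\d\.\d)", data)
    return match.group(1).decode("ascii") if match else None


def count_keywords(data: bytes, keywords=(b"obj", b"endobj", b"stream",
                                          b"endstream", b"/JS", b"/JavaScript")):
    """Count standalone occurrences of each keyword in the raw bytes."""
    counts = {}
    for kw in keywords:
        # Reject neighbouring letters so "obj" is not counted inside "endobj".
        pattern = rb"(?<![A-Za-z])" + re.escape(kw) + rb"(?![A-Za-z])"
        counts[kw.decode("ascii")] = len(re.findall(pattern, data))
    return counts


# Tiny hand-made sample, not taken from the challenge file.
sample = (b"%PDF-1.4\n1 0 obj\n<< /Length 5 >>\n"
          b"stream\nhello\nendstream\nendobj\n%%EOF")
print(read_pdf_header(sample))
print(count_keywords(sample))
```

A non-zero count for `/JS` or `/JavaScript` suggests the file carries script code, which makes those objects good candidates to dump first.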

Export the file in Wireshark: look for the packet showing a successful transfer (status 200, content type `application/file_name`).

Use `pdf-parser` to analyze the objects and streams.

If you do not know where a tool is installed and need its location to configure it, use the `which` command, for example: `which pdf-parser`
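The same lookup can be done from Python with the standard library, which is handy inside an analysis script. A minimal sketch, where the helper name `locate_tool` is an assumption for illustration:

```python
import shutil


def locate_tool(name):
    """Return the full path of an executable found on PATH, or None."""
    return shutil.which(name)


# Prints the path of pdf-parser if it is installed, otherwise None.
print(locate_tool("pdf-parser"))
```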

Dump an object, passing its stream through the filters:

`pdf-parser --raw -o 7 -f fcexploit.pdf -d obj7`. Do the same for obj 3, 9, 5, and 10.
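Repeating the dump for each object can be scripted. The sketch below only rebuilds the command shown above for each object id. It assumes `pdf-parser` is on the PATH and that the sample is named `fcexploit.pdf`; the helper names `build_dump_command` and `dump_objects` are hypothetical.

```python
import subprocess


def build_dump_command(obj_id, pdf_path="fcexploit.pdf"):
    """Build the pdf-parser command that dumps one object to the file obj<N>."""
    # Same flags and order as the manual command: --raw, -o <id>,
    # -f (apply stream filters), the PDF path, then -d <output file>.
    return ["pdf-parser", "--raw", "-o", str(obj_id),
            "-f", pdf_path, "-d", f"obj{obj_id}"]


def dump_objects(obj_ids, pdf_path="fcexploit.pdf"):
    """Run pdf-parser once per object id; raises if a run fails."""
    for obj_id in obj_ids:
        subprocess.run(build_dump_command(obj_id, pdf_path), check=True)


# Preview the commands without running anything.
for obj_id in [7, 3, 9, 5, 10]:
    print(" ".join(build_dump_command(obj_id)))
```

Calling `dump_objects([7, 3, 9, 5, 10])` would then run the dumps in one go.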

<figure><img src="https://1038241181-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fbr7avii8O2bCJtM7fDhm%2Fuploads%2FUG4eghvZIZqNSsNldCfQ%2Fimage.png?alt=media&#x26;token=8884da9a-a3fc-4ed2-8fc9-3b5bd4cf3453" alt=""><figcaption><p>Dump of obj 3; note the references to other objects</p></figcaption></figure>

Dump obj 8 and obj 6:

<figure><img src="https://1038241181-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fbr7avii8O2bCJtM7fDhm%2Fuploads%2F1cjVlonHcFBneg9D5F0N%2Fimage.png?alt=media&#x26;token=2698bc85-2397-403b-99f2-26d90642b746" alt=""><figcaption></figcaption></figure>

Obfuscated code in obj 5:

<figure><img src="https://1038241181-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fbr7avii8O2bCJtM7fDhm%2Fuploads%2FOfcjgWutbAepw1kiNHTU%2Fimage.png?alt=media&#x26;token=c1eccdd8-59c5-473a-8d6d-87a258b88c9e" alt=""><figcaption></figcaption></figure>

Beautify the code with a JavaScript beautifier or CyberChef :thumbsup:

<figure><img src="https://1038241181-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fbr7avii8O2bCJtM7fDhm%2Fuploads%2F3nTPUjWhVmywEGez0pMq%2Fimage.png?alt=media&#x26;token=cc81e629-5e7c-4f3a-9344-aef1dbdead2c" alt=""><figcaption></figcaption></figure>

Data processing in obj 10:

<figure><img src="https://1038241181-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fbr7avii8O2bCJtM7fDhm%2Fuploads%2FFWPTTx4yszoU9o7XuKHg%2Fimage.png?alt=media&#x26;token=f0038ac8-21f1-4e01-8003-a20156f7dc26" alt=""><figcaption></figcaption></figure>

After the code is processed, obj 7 and obj 9 turn out to contain shellcode:

<figure><img src="https://1038241181-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fbr7avii8O2bCJtM7fDhm%2Fuploads%2FvDVgn2Xw4clKN9mnAFJu%2Fimage.png?alt=media&#x26;token=7002477d-2649-4166-8aa1-de579981fd09" alt=""><figcaption></figcaption></figure>

<figure><img src="https://1038241181-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fbr7avii8O2bCJtM7fDhm%2Fuploads%2FX0YyjCfigJqivQ4u5jjP%2Fimage.png?alt=media&#x26;token=459dd696-fba9-4073-86bc-f16fef56fdf2" alt=""><figcaption></figcaption></figure>
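To make the shellcode usable, the escaped text has to be turned back into raw bytes. The sketch below assumes the payloads use the `%uXXXX` escape form passed to `unescape`, where each escape is a 16-bit value whose two bytes end up swapped in memory (little-endian). The function name `decode_percent_u` is hypothetical.

```python
import re


def decode_percent_u(text):
    """Decode %uXXXX escapes into raw bytes, low byte first."""
    words = re.findall(r"%u([0-9a-fA-F]{4})", text)
    return b"".join(int(word, 16).to_bytes(2, "little") for word in words)


# %u4142 holds 0x41 ('A') and 0x42 ('B'); the bytes come out swapped.
print(decode_percent_u("%u4142%u4344"))  # b'BADC'
```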

Set up libemu so that the `pylibemu` Python library can be installed:

`git clone https://github.com/buffer/libemu.git`

Check the README.md file for build and installation instructions.

The `ShellCodeExtract.py` script:

```python
import argparse
import os
from binascii import unhexlify

import pylibemu


def get_payloads(in_file):
    """Return [name, shellcode] pairs for each long unescape() assignment."""
    result = []
    with open(in_file, 'r') as f:
        for line in f:
            if 'unescape' in line:
                result.append(line.split("="))
    # Keep only assignments whose right-hand side is long enough to be shellcode.
    big = [i for i in result if len(i) > 1 and len(i[1]) >= 80]
    payloads = []
    for parts in big:
        # Variable name: drop tabs, surrounding whitespace and a leading "var".
        name = parts[0].replace('\t', '').strip()
        if name.startswith('var '):
            name = name[len('var '):].strip()
        # Payload: the text between the parentheses of unescape(...), unquoted.
        data = parts[1].replace('\t', '')
        data = data[data.find("(") + 1:data.find(")")].replace('"', '')
        # Remove the %u markers, decode the hex, then swap each pair of bytes
        # (each %uXXXX is a 16-bit value stored little-endian in memory).
        data = unhexlify(data.replace('%u', ''))
        data = bytes(c for t in zip(data[1::2], data[::2]) for c in t)
        payloads.append([name, data])
    return payloads


def sctest_save(payloads):
    """Emulate every payload and save its profile log."""
    for item in payloads:
        do_sctest_item(item)


def do_files(payloads):
    """Write each shellcode to <name><i>.sc without overwriting existing files."""
    for name, data in payloads:
        i = 0
        while os.path.exists(f"{name}{i}.sc"):
            i += 1
        with open(f"{name}{i}.sc", "wb") as outfile:
            outfile.write(data)


def do_sctest(file):
    """Emulate shellcode stored in a file and print the profile."""
    with open(file, 'rb') as f:
        item = f.read()
    max_steps = 10000000
    emu = pylibemu.Emulator(2048)
    shellcode_offset = 8
    emu.prepare(item, shellcode_offset)
    emu.test(max_steps)
    print('sctest ', emu.emu_profile_output)


def do_sctest_item(el):
    """Emulate one [name, shellcode] pair and write the profile to <name><i>.txt."""
    name, item = el
    max_steps = 10000000
    emu = pylibemu.Emulator(2048)
    shellcode_offset = 0
    emu.prepare(item, shellcode_offset)
    emu.test(max_steps)
    output = emu.emu_profile_output
    i = 0
    while os.path.exists(f"{name}{i}.txt"):
        i += 1
    with open(f"{name}{i}.txt", "wb") as outfile:
        outfile.write(output)


def main():
    parser = argparse.ArgumentParser(description='Extract Shellcode')
    parser.add_argument('-f', '--file', nargs=1, help='path to the input file', required=True)
    parser.add_argument('-x', '--export', help='export shellcodes', action='store_true')
    args = parser.parse_args()

    payloads = get_payloads(args.file[0])
    sctest_save(payloads)

    # store_true yields False when -x is absent, so test truthiness directly.
    if args.export:
        do_files(payloads)


if __name__ == "__main__":
    main()
```

Reference : <https://github.com/forensicskween/CyberDefenders/blob/main/GetPDF/ShellCodeExtract.py>

Concatenate obj7 and obj9 into a single file :thumbsup:

`cat obj7 obj9 > file.out`
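On a system without `cat`, the same concatenation can be done in Python. A minimal sketch; the helper name `concat_files` is an assumption, and the file names in the commented call are the ones used above.

```python
from pathlib import Path


def concat_files(inputs, output):
    """Write the bytes of each input file, in order, into the output file."""
    with open(output, "wb") as out:
        for path in inputs:
            out.write(Path(path).read_bytes())


# Equivalent of the shell command above:
# concat_files(["obj7", "obj9"], "file.out")
```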

Run the script on this file, then check the generated log to see the result.

For the last two questions: check the malware file in the pcap, and research the CVEs contained in the PDF file (there are 5).
