Analyze Office files
There are two generations of Office file format:
the OLE2 (Compound File Binary) formats (file extensions like DOC, XLS, PPT),
the "Office Open XML" (OOXML) formats (file extensions like DOCX, XLSX, PPTX).
RTF is a separate, text-based format rather than an OLE container, but it can also carry embedded OLE objects.
Both generations are structured container formats that allow linked or embedded content (objects).
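Because file extensions can be renamed, it can help to check the first bytes of a file before picking a tool. The sketch below is a minimal illustration using only the Python standard library; the helper name `guess_container` is hypothetical and the check only looks at well-known leading signatures, so treat the result as a hint rather than a verdict.

```python
# Leading byte signatures of the container types discussed above.
OLE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # OLE2 / Compound File Binary
ZIP_MAGIC = b"PK\x03\x04"                      # ZIP local file header (used by OOXML)
RTF_MAGIC = b"{\\rtf"                          # RTF is text-based


def guess_container(header: bytes) -> str:
    """Classify a file by its leading bytes (hypothetical helper, not part of oletools)."""
    if header.startswith(OLE_MAGIC):
        return "ole"
    if header.startswith(ZIP_MAGIC):
        return "zip"   # possibly OOXML; any zip archive matches
    if header.startswith(RTF_MAGIC):
        return "rtf"
    return "unknown"


def guess_container_from_path(path: str) -> str:
    """Read the first 8 bytes of a file on disk and classify them."""
    with open(path, "rb") as fh:
        return guess_container(fh.read(8))
```

For example, `guess_container_from_path("file.xls")` would be expected to return `"ole"` for a genuine legacy workbook, while a DOCX renamed to `.doc` would come back as `"zip"`.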
OOXML files are zip containers, so one of the easiest ways to check for hidden data is to unzip the document:
unzip file.docx
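The same inspection can be done without extracting anything to disk. The following is a rough sketch using Python's standard `zipfile` module; the function name `list_ooxml_entries` and the `INTERESTING` markers are assumptions chosen for illustration, not an exhaustive rule set.

```python
import zipfile

# Entry-name fragments that usually deserve a closer look in an OOXML package.
# Illustrative assumption only: extend or replace for real triage.
INTERESTING = ("vbaproject.bin", "/embeddings/", "oleobject", "activex")


def list_ooxml_entries(source):
    """Return (name, uncompressed_size, flagged) for every member of the zip.

    `source` may be a filesystem path or a binary file-like object.
    """
    rows = []
    with zipfile.ZipFile(source) as zf:
        for info in zf.infolist():
            lowered = info.filename.lower()
            flagged = any(marker in lowered for marker in INTERESTING)
            rows.append((info.filename, info.file_size, flagged))
    return rows
```

Calling `list_ooxml_entries("file.docx")` and printing the flagged rows gives a quick view of macro projects or embedded objects hiding inside the package.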
# install basic tools
sudo pip3 install -U oletools
# oleid: analyze OLE files to detect characteristics commonly found in malicious files
oleid file.xls
# upload the file to VirusTotal and review the report
# https://www.virustotal.com/gui/home/upload
#
# oledump
#
wget https://raw.githubusercontent.com/DidierStevens/DidierStevensSuite/master/oledump.py
python3 oledump.py -h
# List all OLE2 streams present in file.xls
python3 oledump.py file.xls -i
# Extract VBA source code from stream 3 in file.xls
python3 oledump.py file.xls -s 3 -v
# Find obfuscated URLs in file.xls macros (plugin_http_heuristics.py is a separate file and must sit next to oledump.py)
python3 oledump.py file.xls -p plugin_http_heuristics
#
# olevba
#
# Extract VBA macros in clear text with deobfuscation and analysis
olevba file.doc
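Once macro source has been extracted with the tools above, a first pass often amounts to looking for auto-execution triggers, risky calls, and URLs. The sketch below illustrates that idea on plain text with the Python standard library only. It is not how olevba is implemented; the keyword lists are small illustrative assumptions and the function name `triage_vba` is hypothetical.

```python
import re

# Small, illustrative keyword sets; real tools use far larger lists.
AUTOEXEC = ("AutoOpen", "AutoExec", "Auto_Open", "Document_Open", "Workbook_Open")
SUSPICIOUS = ("Shell", "CreateObject", "URLDownloadToFile", "powershell")
URL_RE = re.compile(r"https?://[^\s\"']+", re.IGNORECASE)


def triage_vba(source: str) -> dict:
    """Return keywords and URLs found in already-extracted VBA source text."""
    def hits(words):
        # Whole-word, case-insensitive match for each keyword.
        return [w for w in words
                if re.search(r"\b" + re.escape(w) + r"\b", source, re.IGNORECASE)]

    return {
        "autoexec": hits(AUTOEXEC),
        "suspicious": hits(SUSPICIOUS),
        "urls": URL_RE.findall(source),
    }
```

A plain regex like this only catches URLs that appear literally in the source; obfuscated strings still need the dedicated deobfuscation that olevba and the oledump plugins provide.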