Origami 1.0 released!

Tue 24 May 2011 by guillaume

I am pleased to announce the first stable release of Origami, the PDF manipulation framework! Many new features have been added since the last beta, and I now consider the framework stable enough. This release introduces support for AES256 encryption/decryption, partial support for CCITTFax streams, and TIFF predictors. Origami now also comes with a Ruby shell for interactive use and a set of command-line tools for PDF document analysis.

What is Origami?

Origami is a framework for PDF document manipulation written in pure Ruby. It can be used to analyze or create malicious PDF documents. Because it is written in Ruby, the core engine of Origami is fully scriptable and can be used for automated tasks on large sets of documents. A GTK graphical interface is also available for manually browsing the inner objects of a PDF document. The philosophy behind Origami is the following:

  • Support for both reading and writing PDF documents. Origami is able to create documents from scratch, read existing documents and modify them. Each new feature must work for both reading and writing.
  • Handling a large subset of the PDF specification. Origami focuses on features from the PDF specification which can be used to obfuscate documents or provide offensive capabilities.
  • Being flexible and extensible. Origami can be used in many ways, even if you are new to the Ruby language.

Origami supports many advanced features of the PDF specification, such as:

  • Compression filters and predictor functions
  • Encryption
  • Digital signatures
  • Object streams
  • File attachments
  • AcroForms and XFA forms
  • PDF actions and annotations (including Flash)

Origami is open-source and distributed under the LGPL license.

New features

Here is the list of new features added in this version:

  • Support for AES256 encryption/decryption of documents.
  • Support for G3 unidimensional encoding/decoding of CCITTFax streams.
  • Support for TIFF predictor functions.
  • Enhanced support for Ruby 1.9.
  • Can now be installed as a gem.
  • Added methods for browsing pages and name trees.
  • Added a Ruby shell for quick document analysis.
  • Added a set of useful tools built upon Origami (pdfdecrypt, pdfencrypt, pdfdecompress, pdfextract, pdfmetadata, pdfcocoon, pdfcop, pdf2graph, pdf2ruby...)
  • Lots of bug fixes.
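
To give an idea of what a TIFF predictor does, here is a minimal sketch of TIFF Predictor 2 (horizontal differencing) in plain Ruby. It assumes 8 bits per component and a single color component per pixel. The method names are made up for illustration and are not part of the Origami API.

```ruby
# Sketch of TIFF Predictor 2 (horizontal differencing).
# Assumptions: 8 bits per component, one color component per pixel.
# These helpers are illustrative and not part of Origami.

# Encode: replace each byte with its difference from the byte to its left.
def tiff_predict_row(row)
  prev = 0
  row.bytes.map { |b|
    delta = (b - prev) & 0xFF
    prev = b
    delta
  }.pack('C*')
end

# Decode: rebuild each byte as a running sum modulo 256.
def tiff_unpredict_row(row)
  prev = 0
  row.bytes.map { |d| prev = (prev + d) & 0xFF }.pack('C*')
end

row = [10, 12, 15, 15, 14].pack('C*')
encoded = tiff_predict_row(row)
puts encoded.bytes.inspect
puts tiff_unpredict_row(encoded) == row
```

Neighbouring samples in an image row tend to be close, so the differences are mostly small values, which a later compression filter usually handles better than the raw samples.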

Installation

Origami can now be installed as a Ruby gem. If you have RubyGems installed, just run:

gem install origami

You can also fetch the latest development version directly from the Mercurial repository at Google Code:

hg clone https://origami-pdf.googlecode.com/hg/ origami

Usage

There are several ways to use Origami, depending on what you intend to do. If you are new to the PDF file format or to Ruby, you can rely on the ready-to-run scripts or the GTK interface to discover PDF file structures. Advanced tasks can also be performed from the Origami shell or by writing custom Origami scripts.

Tools

The bin/ directory contains a set of useful Origami scripts that can be run directly on PDF documents.

Document analysis

  • pdfdecompress: Strips out all compression filters from a document.
  • pdfdecrypt: Allows you to decrypt an existing document.
  • pdfcop: Automated engine for malicious document analysis.
  • pdfmetadata: Extracts the metadata from a document.
  • pdfextract: Extracts scripts, fonts, streams and attached files from a document.
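
As an illustration of what stripping a compression filter involves, the most common PDF filter, FlateDecode, is plain zlib data. The sketch below uses only Ruby's standard zlib library. The helper names and the sample content stream are assumptions for the example; this is not how pdfdecompress is implemented.

```ruby
require 'zlib'

# Illustrative helpers, not Origami API: FlateDecode data is a zlib stream.
def flate_encode(data)
  Zlib::Deflate.deflate(data)
end

def flate_decode(data)
  Zlib::Inflate.inflate(data)
end

# A made-up, repetitive page content stream.
content = "BT /F1 12 Tf (Hello) Tj ET\n" * 20
compressed = flate_encode(content)

puts "#{content.bytesize} bytes -> #{compressed.bytesize} bytes"
puts flate_decode(compressed).bytes == content.bytes
```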

Document shielding

  • pdfencrypt: Allows you to encrypt an existing document.
  • pdfcocoon: Embeds a document into another one and makes it run when the top-level PDF is opened.

Misc

  • pdf2graph: Generates a graph (DOT or GraphML) out of PDF objects.
  • pdf2ruby: Generates a Ruby script from a document. Origami can run that script to rebuild the original document.
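
For readers unfamiliar with DOT, here is a rough sketch of how a set of object references can be written out as a DOT graph. The reference table is hypothetical and the output is not claimed to match what pdf2graph produces.

```ruby
# Hypothetical reference table: object number => numbers of the objects it points to.
# Illustrative only; this is not the pdf2graph implementation or its output format.
def to_dot(refs, name = 'pdf')
  lines = ["digraph #{name} {"]
  refs.each do |from, targets|
    targets.each { |to| lines << "  \"#{from} 0 R\" -> \"#{to} 0 R\";" }
  end
  lines << '}'
  lines.join("\n")
end

refs = { 1 => [2, 11], 2 => [3], 48 => [49] }
puts to_dot(refs)
```

The resulting text can be fed to any tool that understands DOT in order to render the object graph.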

Examples

Decrypting and decompressing a document encrypted with an empty password:

pdfdecrypt encrypted.pdf | pdfdecompress > plain.pdf

Encrypting a document with AES256 and embedding it into a sane document:

pdfencrypt -c aes -s 256 document.pdf | pdfcocoon > cocooned.pdf
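
For background, AES-encrypted PDF streams use CBC mode with a random 16-byte initialization vector stored in front of the ciphertext. The sketch below shows that layout with Ruby's standard openssl library. It is only an illustration: the helper names are made up, the key is random, and the derivation of the key from the document password is left out entirely.

```ruby
require 'openssl'

# Illustrative helpers, not Origami API.
# Assumption: the 32-byte key is already known (password-based key derivation is omitted).

# Encrypt and prepend the random IV to the ciphertext.
def aes256_encrypt(key, plaintext)
  cipher = OpenSSL::Cipher.new('aes-256-cbc')
  cipher.encrypt
  cipher.key = key
  iv = cipher.random_iv            # generates a fresh IV and sets it on the cipher
  iv + cipher.update(plaintext) + cipher.final
end

# Read the IV back from the first 16 bytes, then decrypt the rest.
def aes256_decrypt(key, data)
  cipher = OpenSSL::Cipher.new('aes-256-cbc')
  cipher.decrypt
  cipher.key = key
  cipher.iv = data[0, 16]
  cipher.update(data[16..-1]) + cipher.final
end

key = OpenSSL::Random.random_bytes(32)
secret = aes256_encrypt(key, 'stream contents')
puts secret.bytesize
puts aes256_decrypt(key, secret)
```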

Running an automated analysis on a suspicious document:

$ pdfcop malicious.pdf
PDFcop is running on target `malicious.pdf', policy = `standard'
   File size: 142237 bytes
   MD5: bd936ee3ba0b6dd467a2620f0d8275c7
 > Inspecting document structure...
   . Encryption = YES
 > Inspecting document catalog...
   . OpenAction entry = YES
   >> Inspecting action...
 > Inspecting JavaScript names directory...
 > Inspecting attachment names directory...
 > Inspecting document pages...
   >> Inspecting page...
   >> Inspecting page...
   >> Inspecting page...
     .. Page has an action dictionary.
     >>> Inspecting action...
       ... Found /JavaScript action.
 Document rejected by policy `standard', caused by [:allowJSAtOpening].
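
The last line of that output suggests how a policy decision can be modelled: each rule either allows or forbids a feature, and the document is rejected as soon as a forbidden feature is found. Here is a small sketch of that idea. Only the name :allowJSAtOpening comes from the output above; the second rule and the data structures are invented for the example and do not describe pdfcop's internals.

```ruby
# Hypothetical policy table. Only :allowJSAtOpening appears in the pdfcop output;
# :allowAttachments is a made-up rule for illustration.
POLICY = {
  :allowJSAtOpening => false,
  :allowAttachments => true
}

# Return the rules that are triggered by the document and forbidden by the policy.
def violations(policy, features)
  features.select { |rule| policy[rule] == false }
end

found = [:allowJSAtOpening, :allowAttachments]
broken = violations(POLICY, found)

if broken.empty?
  puts 'Document accepted'
else
  puts "Document rejected, caused by #{broken.inspect}"
end
```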

Quickly extracting metadata:

$ pdfmetadata rapportousacalipari.pdf
[*] Document information dictionary:
Producer            : Acrobat Distiller 6.0 (Windows)
_AuthorEmail        : robert.potter@iraq.centcom.mil
CreationDate        : D:20050430124604+04'00'
Creator             : Acrobat PDFMaker 6.0 for Word
Title               : TABLE OF CONTENTS
Author              : richard.thelin
ModDate             : D:20050430233208+02'00'
_AdHocReviewCycleID : -553148013
_EmailSubject       : Another Redact Job For You
SourceModified      : D:20050430084305
Company             : USCENTCOM
_AuthorEmailDisplayName: Potter Robert A COL MNFI STRATCOM

[*] Metadata stream:
MetadataDate        : 2005-04-30T23:32:08+02:00
Producer            : Acrobat Distiller 6.0 (Windows)
ModifyDate          : 2005-04-30T23:32:08+02:00
CreateDate          : 2005-04-30T12:46:04+04:00
title               : TABLE OF CONTENTS
creator             : richard.thelin
CreatorTool         : Acrobat PDFMaker 6.0 for Word
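
The dates in the information dictionary use the PDF date format (D:YYYYMMDDHHmmSS followed by an optional time zone offset). If you want to compare them in a script, they can be converted to Ruby Time objects. The following is a minimal sketch with a hypothetical helper name. It treats a date without any offset as UTC, which is an assumption made for convenience.

```ruby
# Illustrative helper, not Origami API: convert a PDF date string into a Time.
# Assumption: a date without time zone information is treated as UTC.
PDF_DATE = /\AD:(\d{4})(\d{2})?(\d{2})?(\d{2})?(\d{2})?(\d{2})?(?:([+\-Z])(?:(\d{2})'?(\d{2})?'?)?)?\z/

def parse_pdf_date(str)
  m = PDF_DATE.match(str) or return nil
  year = m[1].to_i
  # Month and day default to 1; hour, minute and second default to 0.
  month = (m[2] || '01').to_i
  day   = (m[3] || '01').to_i
  hour, min, sec = m[4].to_i, m[5].to_i, m[6].to_i
  offset = if m[7].nil? || m[7] == 'Z'
    '+00:00'
  else
    "#{m[7]}#{m[8] || '00'}:#{m[9] || '00'}"
  end
  Time.new(year, month, day, hour, min, sec, offset)
end

created  = parse_pdf_date("D:20050430124604+04'00'")
modified = parse_pdf_date("D:20050430233208+02'00'")
puts created.getutc
puts modified.getutc
puts modified - created   # elapsed seconds between creation and last modification
```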

Graphical interface

You can quickly browse PDF objects using the GTK interface. For this you will need the gtk2 gem installed:

gem install gtk2

Then run:

pdfwalker

A hint for quick browsing: double-clicking an object reference jumps to the referenced object. Press ESC to go back to your previous location.

Shell

The shell is a classic Ruby shell with the Origami namespace automatically included. With some knowledge of the Origami API, it is possible to perform quick tasks on a PDF document. The following session uncovers a document nested inside another.

$ pdfsh
# Welcome to the PDF shell (Origami release 1.0.2)

>>> pdf = PDF.read('malicious.pdf')

>>> pdf.ls_names(Names::Root::EMBEDDEDFILES)
{(metadata.pdf)=>48 0 R}

>>> pdf[48]
48 0 obj
<<
        /EF <<
                /F 49 0 R
        >>
        /F (metadata.pdf)
        /Type /Filespec
>>
endobj

>>> pdf[49].class
Origami::Stream

>>> pdf[49]
0000000000  25 50 44 46 2D 31 2E 35 0A 25 D4 C0 BC 8F 0A 31  %PDF-1.5.%.....1
0000000010  20 30 20 6F 62 6A 20 0A 3C 3C 0A 2F 50 61 67 65   0 obj .<<./Page
0000000020  73 20 32 20 30 20 52 0A 2F 54 79 70 65 20 2F 43  s 2 0 R./Type /C
0000000030  61 74 61 6C 6F 67 0A 2F 4F 70 65 6E 41 63 74 69  atalog./OpenActi
0000000040  6F 6E 20 31 31 20 30 20 52 0A 2F 41 63 72 6F 46  on 11 0 R./AcroF

>>> File.open('nested.pdf','wb') {|f| f.write pdf[49].data }
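
Before writing such a stream to disk, it can be worth checking that it really starts with a PDF header, as the hex dump above shows (%PDF-1.5). Here is a small sketch in plain Ruby. The helper is hypothetical and not part of Origami, and the sample bytes are copied from the first line of the hex dump.

```ruby
# Illustrative helper, not Origami API: return the version found in a PDF header, or nil.
def pdf_header_version(data)
  head = data.byteslice(0, 8).to_s.b   # only the first bytes matter; treat them as binary
  m = /\A%PDF-(\d\.\d)/.match(head)
  m && m[1]
end

# First bytes of the embedded stream, taken from the hex dump.
head = [0x25, 0x50, 0x44, 0x46, 0x2D, 0x31, 0x2E, 0x35, 0x0A, 0x25, 0xD4, 0xC0].pack('C*')
puts pdf_header_version(head)
puts pdf_header_version('not a pdf').inspect
```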

Infecting a document in one line:

>>> PDF.read('sane.pdf').onDocumentOpen(Action::JavaScript.new(gets)).save('infected.pdf')

Scripting

The full potential of the Origami framework can be exploited using custom Ruby scripts. Origami scripts can be used to generate exploits, automate complex tasks over a set of documents, etc. Details about the Origami API can be found in the RDoc documentation, in the Origami cheatsheet, or in the sample scripts located in the samples/ directory. The following little script simply adds an action to each page of a given document. Each time a page is opened, the PDF reader is instructed to jump to the next page, so the document scrolls endlessly.

#!/usr/bin/env ruby

require 'origami'
include Origami

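# Parse the document given on the command line, silencing parser output.
pdf = PDF.read(ARGV[0], :verbosity => Parser::VERBOSE_QUIET )

pages = pdf.pages

# Every page but the last jumps to the next page when it is opened.
pages.each do |page|
  page.onOpen(Action::Named.new(Action::Named::NEXTPAGE)) unless page == pages.last
end
# The last page jumps back to the first one, closing the loop.
pages.last.onOpen(Action::Named.new(Action::Named::FIRSTPAGE))

pdf.save("looping.pdf")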

This other example reads a given document, adds JavaScript that runs when the first page is opened, and saves the result encrypted (with an empty password).

#!/usr/bin/env ruby

require 'origami'
include Origami

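# JavaScript payload (placeholder).
EXPLOIT = <<JS
app.alert("Your malicious exploit here")
JS

pdf = PDF.read(ARGV[0])

# Store the script in a Flate-compressed stream and run it when the first page opens.
stream = Stream.new(EXPLOIT, :Filter => :FlateDecode)
pdf.pages.first.onOpen Action::JavaScript.new(stream)

# Encrypt the document and write the result.
pdf.encrypt.save('malicious.pdf')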
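
When automating a task over a whole set of documents, a single malformed file should not abort the run. A possible batch driver is sketched below in plain Ruby. The method name and the report layout are assumptions for this example, and the demo block only measures file sizes so that it runs without Origami.

```ruby
# Illustrative batch driver, not Origami API.
# Runs the given block on every *.pdf file of a directory and collects results and errors.
def scan_documents(dir)
  report = { :ok => {}, :failed => {} }
  Dir.glob(File.join(dir, '*.pdf')).sort.each do |path|
    begin
      report[:ok][path] = yield(path)
    rescue StandardError => e
      # Record the failure and keep going with the next document.
      report[:failed][path] = e.message
    end
  end
  report
end

# Demo without Origami: collect the size of each PDF file in a directory.
report = scan_documents(ARGV[0] || '.') { |path| File.size(path) }
puts "#{report[:ok].size} analyzed, #{report[:failed].size} failed"
```

With Origami loaded, the block could instead call PDF.read(path) as in the scripts above and return whatever the analysis needs.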

Suggestions / bug reports

You can contact me at origami(at)security-labs.org with any questions or remarks. Please report issues on the Google Code project page.