Playing with origami in PDF

Fri 19 June 2009 by fred

The PDF file format is now ubiquitous. It is widely regarded as secure because most people believe it is static. It is not. To prove it, we have developed origami, a Ruby framework designed to play with PDF files.

Since code is usually more helpful than a long explanation, let's learn how to play with origami.

PDF terminology

Here we explain a few terms you will see or hear when dealing with a PDF file.

  • object: in a PDF file, everything is an object. Open any file, and you will see something like:

       42 0 obj <<
       ...
       >>
    
    What goes between the << ... >> depends on the kind of object. The two leading numbers are the object number and its generation number.
    
  • dictionary: the most common kind of object. You can think of it as an associative array made of key/value pairs. For instance, the code below describes a font.

    42 0 obj
    <<
    /Encoding /MacRomanEncoding
    /Subtype /Type1
    /BaseFont /Helvetica
    /Type /Font
    /Name /F1
    >>
    endobj
    

    Note the Type key, which is almost always present in a dictionary.

  • stream: a dictionary followed by raw data enclosed between the stream and endstream keywords. The example below is a content stream holding text that will be displayed on a page:

    5 0 obj
    <<
    /Length 101
    >>stream
    BT
    1 Tr /F1 30 Tf 350 750 Td (base.pdf) Tj
    ET
    BT
    0 Tr /F1 15 Tf 186 690 Td (Empty test file) Tj
    ET
    endstream
    endobj
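
To make the three terms above concrete, here is a small sketch in plain Ruby that serializes Ruby values into PDF object syntax. It does not use origami at all; the helper names (to_pdf, indirect_object, stream_object) are ours and purely illustrative, and only a handful of value types are handled.

```ruby
# Minimal sketch: turn Ruby values into PDF object syntax.
# Symbols become PDF names, Hashes become dictionaries.
# These helpers are illustrative only, not part of origami.
def to_pdf(value)
  case value
  when Symbol then "/#{value}"
  when String then "(" + value.gsub(/([\\()])/) { "\\#{$1}" } + ")"
  when Integer, Float then value.to_s
  when Array then "[" + value.map { |v| to_pdf(v) }.join(" ") + "]"
  when Hash
    body = value.map { |k, v| "#{to_pdf(k)} #{to_pdf(v)}" }.join("\n")
    "<<\n#{body}\n>>"
  else
    raise ArgumentError, "unsupported value: #{value.inspect}"
  end
end

# Wrap a value as an indirect object: "num gen obj ... endobj".
def indirect_object(num, value, gen = 0)
  "#{num} #{gen} obj\n#{to_pdf(value)}\nendobj\n"
end

# Wrap raw data as a stream object. /Length is the byte size of the
# data itself, not counting the line break before "endstream".
def stream_object(num, data, dict = {}, gen = 0)
  header = to_pdf(dict.merge(Length: data.bytesize))
  "#{num} #{gen} obj\n#{header}\nstream\n#{data}\nendstream\nendobj\n"
end

font = { Type: :Font, Subtype: :Type1, BaseFont: :Helvetica,
         Encoding: :MacRomanEncoding, Name: :F1 }
puts indirect_object(42, font)
puts stream_object(5, "BT\n/F1 30 Tf 350 750 Td (base.pdf) Tj\nET")
```

The point of the sketch is that a stream is nothing more than a dictionary plus a blob of bytes, and that /Length must match the size of that blob.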
    

Creating a PDF from scratch

Easy! We start by creating a content stream containing the text that will be printed in the file. Then we attach it to a page, append the page to the document, and save it.

# Create a simple PDF document.
# Assumes the origami library is on the load path.
require 'origami'
include Origami

contents = ContentStream.new
contents.write 'I AM EMPTY',
  :x => 350, :y => 750,
  :rendering => PS::Text::Rendering::STROKE,
  :size => 15

pdf = PDF.new
pdf.append_page(Page.new.setContents(contents))
pdf.saveas('out.pdf')
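
For the curious, here is roughly what ends up in such a file. The sketch below assembles a minimal one-page PDF by hand in plain Ruby, without origami: a header, five objects, a cross-reference table and a trailer. The function name build_minimal_pdf is ours, the page size and font are arbitrary choices, and the text is assumed to contain no parentheses or backslashes.

```ruby
# Sketch: assemble a minimal one-page PDF by hand (no origami involved).
# Assumption: `text` contains no parentheses or backslashes.
def build_minimal_pdf(text)
  content = "BT\n/F1 15 Tf 350 750 Td (#{text}) Tj\nET"
  bodies = [
    "<< /Type /Catalog /Pages 2 0 R >>",
    "<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
    "<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] " \
      "/Contents 4 0 R /Resources << /Font << /F1 5 0 R >> >> >>",
    "<< /Length #{content.bytesize} >>\nstream\n#{content}\nendstream",
    "<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica >>"
  ]

  out = +"%PDF-1.4\n"
  offsets = []
  bodies.each_with_index do |body, i|
    offsets << out.bytesize # byte offset of "N 0 obj"
    out << "#{i + 1} 0 obj\n#{body}\nendobj\n"
  end

  # Cross-reference table: one 20-byte entry per object.
  xref_pos = out.bytesize
  out << "xref\n0 #{bodies.size + 1}\n"
  out << "0000000000 65535 f \n" # entry 0 heads the free list
  offsets.each { |off| out << format("%010d 00000 n \n", off) }
  out << "trailer\n<< /Size #{bodies.size + 1} /Root 1 0 R >>\n"
  out << "startxref\n#{xref_pos}\n%%EOF\n"
  out
end

pdf_bytes = build_minimal_pdf("I AM EMPTY")
puts pdf_bytes.lines.first
```

Writing the result with File.binwrite("handmade.pdf", pdf_bytes) should give a file that a PDF reader can open. A framework such as origami spares you the offset bookkeeping.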

Adding JavaScript to a PDF

Adding JavaScript is really easy: you read the PDF you want to modify, and add an OpenAction which executes the script when the document is opened.

pdf = PDF.read(ARGV[0])
# File.read closes the file once its content has been read.
jscript = File.read("fooscript.js")
jsaction = Action::JavaScript.new(
  Stream.new(jscript, :Filter => :FlateDecode))
pdf.onDocumentOpen(jsaction)
pdf.saveas("out.pdf")

The example above is deliberately simplistic: in practice, we should check whether an OpenAction is already present before adding ours.
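
One possible way to handle an existing OpenAction is to chain it behind ours through the /Next key of the action dictionary. The sketch below models the catalog and the actions as plain Ruby hashes; it is not origami's API, and the helper name chain_open_action is hypothetical. It also assumes our new action has no /Next entry of its own.

```ruby
# Sketch: the document catalog and actions modelled as plain hashes.
# This is NOT origami's API; chain_open_action is a hypothetical helper.
# Assumption: new_action does not already carry a :Next entry.
def chain_open_action(catalog, new_action)
  existing = catalog[:OpenAction]
  case existing
  when nil
    # Nothing to preserve.
  when Array
    # /OpenAction may be a destination array rather than an action:
    # wrap it in a GoTo action so the original view is still applied.
    new_action = new_action.merge(Next: { S: :GoTo, D: existing })
  when Hash
    # An action dictionary: run it right after ours.
    new_action = new_action.merge(Next: existing)
  end
  catalog.merge(OpenAction: new_action)
end

js = { S: :JavaScript, JS: "app.alert('hello');" }
catalog = { Type: :Catalog, OpenAction: [0, :Fit] }
p chain_open_action(catalog, js)
```

Whether to run our script before or after the original action is a design choice; here it runs first, and the original behaviour follows.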

Hiding a file in a PDF

PDF files can carry attachments. We provide the attach_file method to add one:

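pdf = PDF.read(ARGV[0])
# Assumption: the second argument is the path of the file to attach,
# and it is embedded under its base name.
path = ARGV[1]
pdf.attach_file(path, :EmbeddedName => File.basename(path))
pdf.saveas("out.pdf")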
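
Behind the scenes, an attachment boils down to three pieces: an embedded file stream, a file specification pointing at it, and an entry in the EmbeddedFiles name tree. The sketch below models them as plain Ruby hashes and compresses the data with the standard zlib library, which matches the FlateDecode filter. It is not what attach_file does internally (we have not shown that here); the helper name is hypothetical, and the objects are nested directly where a real file would use indirect references.

```ruby
require "zlib"

# Sketch: the objects behind an attachment, as plain Ruby hashes.
# embedded_file_objects is a hypothetical helper, not origami's API.
# In a real file, the stream and the filespec would be indirect objects.
def embedded_file_objects(name, data)
  compressed = Zlib::Deflate.deflate(data) # zlib format, as FlateDecode expects
  stream = {
    dict: { Type: :EmbeddedFile, Filter: :FlateDecode,
            Length: compressed.bytesize,       # size of the stored bytes
            Params: { Size: data.bytesize } }, # size once decompressed
    data: compressed
  }
  filespec = { Type: :Filespec, F: name, EF: { F: stream } }
  # Name tree leaf: alternating names and file specifications.
  names = { EmbeddedFiles: { Names: [name, filespec] } }
  { stream: stream, filespec: filespec, names: names }
end

objs = embedded_file_objects("secret.txt", "hidden payload " * 10)
puts objs[:stream][:dict][:Length]
```

A reader that supports attachments walks the name tree, follows the file specification, and inflates the stream to recover the original bytes.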

More to come ...