PDF Succinctly®
by Ryan Hodson

CHAPTER 5

Navigation and Annotations

We’ve seen how PDFs can accurately represent a physical document in a digital file, but they also provide powerful features that take advantage of their medium. Whereas interactive navigation and editable comments are not possible with a physical book, PDFs make it easy to take notes, share them with others, and bookmark important locations.

This chapter explores the three most important types of user interaction: the document outline, hyperlinks, and text annotations.

Preparations

Before exploring the internal navigation scheme of a PDF, we need a document long enough to demonstrate these interactive features. For our example, all we need to do is add another page. This will also serve as a relevant review of the core PDF objects.

Let’s start by adding the page to the root of the page tree (1 0 obj). The changes here are adding 6 0 R to the /Kids entry and raising /Count to 2.

1 0 obj
<< /Type /Pages
   /Kids [2 0 R 6 0 R]
   /Count 2
>>
endobj

Next, we need to create the page object and give it an ID of 6 0. Objects can occur in any order, so you can put this anywhere in the document body.

6 0 obj
<< /Type /Page
   /MediaBox [0 0 612 792]
   /Resources 3 0 R
   /Parent 1 0 R
   /Contents [7 0 R]
>>
endobj

This looks exactly like our other page (2 0 obj), but it points to a different content stream (7 0 R). This page will contain a little bit of textual data.

7 0 obj
<< >>
stream
1 0 0 1 50 706 cm
BT
    24 TL
    /F0 36 Tf
    (Page Two) Tj T*
    /F0 12 Tf
    (This is the second page of our document.) Tj
ET
endstream
endobj

And that’s all we have to do to create another page.
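The bookkeeping above (every entry in /Kids must exist, point back through /Parent, and be reflected in /Count) is easy to get wrong when editing by hand. The following is a minimal sketch of that check, assuming a hypothetical in-memory model in which each object is a Python dictionary and a reference such as 6 0 R is written as the bare number 6. It only handles a flat, one-level page tree like ours, where /Count equals the number of /Kids.

```python
# Hypothetical model of our objects: object number -> dictionary entries.
# A reference such as "6 0 R" is represented by the bare object number 6.
objects = {
    1: {"Type": "Pages", "Kids": [2, 6], "Count": 2},
    2: {"Type": "Page", "Parent": 1},
    6: {"Type": "Page", "Parent": 1},
}


def check_page_tree(objects, root):
    """Return a list of problems found in a flat (one-level) page tree."""
    problems = []
    node = objects[root]
    kids = node["Kids"]
    if node["Count"] != len(kids):
        problems.append(
            f"/Count is {node['Count']} but /Kids has {len(kids)} entries"
        )
    for kid in kids:
        if kid not in objects:
            problems.append(f"{kid} 0 R is listed in /Kids but not defined")
        elif objects[kid].get("Parent") != root:
            problems.append(f"{kid} 0 obj does not point back to {root} 0 R")
    return problems


print(check_page_tree(objects, 1) or "page tree is consistent")
```

Forgetting to update /Count, or leaving /Parent off the new page, would show up as an entry in the returned list.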

The Document Outline

Complex PDFs usually come with an interactive table of contents for user-friendly navigation. Internally, this is called a document outline. PDF readers typically present this outline as a nested tree that the user can open and close.

Figure 22: Screenshot of a document outline in Adobe Acrobat Pro

The structure of such a tree is maintained separately from the page objects and content streams of the document. But, like these components, a document outline begins in the catalog object. Add an /Outlines entry to our existing catalog.

5 0 obj
<< /Type /Catalog
   /Pages 1 0 R
   /Outlines 8 0 R
>>
endobj

This points to the root of the document outline. We’re going to create a very simple outline that looks exactly like the one shown in the previous figure. It contains a single root node.

8 0 obj
<< /First 9 0 R
   /Last 9 0 R
>>
endobj

The /First and /Last entries both reference the only top-level node in the outline. In the real world, a PDF would probably have more than one top-level node, but you get the idea. Next, we need to create that node.

9 0 obj
<< /Parent 8 0 R
   /Title (Part I)
   /First 10 0 R
   /Last 11 0 R
   /Dest [2 0 R /Fit]
>>
endobj

/Parent points back to the outline root (8 0 obj). /Title is a string literal containing the section title displayed by the PDF reader. /First and /Last work the same way as in 8 0 obj: they point to this node’s first and last children. Since this node will have two children, /First and /Last are different.

Finally, the /Dest entry defines the destination of the navigation item. A destination is a specific location in the document, specified as a page (given as a reference to the page object), a position on the page, and a magnification. In this case, we want to display the first page (2 0 R) and zoom to fit the entire page in the reader’s window (no position can be specified when a page is zoomed to fit). There are several keywords besides /Fit that can be used for fine-grained control over a user’s interaction with the document. A few of these will be covered shortly.

Next, we need to add the two child nodes to “Part I”. The first one will navigate to the top of the second page.

10 0 obj
<< /Parent 9 0 R
   /Title (Chapter 1)
   /Next 11 0 R
   /Dest [6 0 R /FitH 792]
>>
endobj

This looks very similar to its parent node, but it has no sub-nodes, so /First and /Last can be omitted. Instead, it needs a /Next entry to point to its sibling. The /FitH keyword instructs the PDF reader to zoom just enough to make the width of the page fill the width of the window. After /FitH is the vertical coordinate to display at the top of the window. Since we wanted to navigate to the top of the page, we specified the height of the page; however, passing a lower value would let you scroll partway down the page. There is a corresponding /FitV keyword that fills vertically and offsets from the left of the page.

Finally, we arrive at the last navigation item. This one will point to a destination halfway down the second page.

11 0 obj
<< /Parent 9 0 R
   /Title (Chapter 2)
   /Prev 10 0 R
   /Dest [6 0 R /XYZ 0 396 2]
>>
endobj

Again, this is just like the previous node, except it has a /Prev pointing back to its previous sibling. And, instead of zooming to fit, we used the /XYZ keyword to manually specify the point to place at the upper-left corner of the window (0, 396) and a magnification factor (2, which means 200%).
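The three destination forms used so far differ only in the keyword and the numbers that follow it. As a rough illustration, the sketch below builds each array as a string from a page's object number and its /MediaBox. The helper names (fit, fit_h, xyz) are hypothetical and exist only for this example.

```python
# /MediaBox of our pages: [left bottom right top] in points.
MEDIA_BOX = [0, 0, 612, 792]


def fit(page):
    """Show the whole page: no position or zoom is given."""
    return f"[{page} 0 R /Fit]"


def fit_h(page, top):
    """Fit the page width; `top` is the y coordinate shown at the window's top."""
    return f"[{page} 0 R /FitH {top}]"


def xyz(page, left, top, zoom):
    """Place (left, top) at the window's upper-left corner at the given zoom."""
    return f"[{page} 0 R /XYZ {left} {top} {zoom}]"


page_top = MEDIA_BOX[3]                        # 792: top edge of the page
halfway = (MEDIA_BOX[1] + MEDIA_BOX[3]) // 2   # 396: vertical midpoint

print(fit(2))                  # [2 0 R /Fit]
print(fit_h(6, page_top))      # [6 0 R /FitH 792]
print(xyz(6, 0, halfway, 2))   # [6 0 R /XYZ 0 396 2]
```

Deriving the coordinates from the /MediaBox rather than typing 792 and 396 directly keeps the destinations correct if the page size changes.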

You should now be able to compile your PDF with pdftk and see the document outline (you may need to open the bookmarks panel to see it). You’ll notice that the “Part I” node is always closed by default. If you’d like to open it, add a /Count 2 entry to the top-level node (9 0 obj). The /Count entry contains the number of visible child nodes. Omitting it hides all child nodes.

To summarize, the document outline consists of a series of navigation items. The /First, /Last, /Next, /Prev, and /Parent dictionary entries relate items to each other and define the structure of the outline as a whole. Each item also contains a destination to navigate to, which is defined as a page, location, and magnification.
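To see how these links fit together, here is a minimal sketch that walks the outline the way a reader might: start at a node's /First child, visit that child's own children, then follow /Next to the sibling. It assumes a hypothetical in-memory model in which each outline object is a Python dictionary and a reference such as 9 0 R is written as the bare number 9. It is not how any particular PDF reader is implemented.

```python
# Hypothetical model of objects 8 through 11 from this chapter.
outline = {
    8: {"First": 9, "Last": 9},
    9: {"Parent": 8, "Title": "Part I", "First": 10, "Last": 11},
    10: {"Parent": 9, "Title": "Chapter 1", "Next": 11},
    11: {"Parent": 9, "Title": "Chapter 2", "Prev": 10},
}


def walk(objects, parent, depth=0):
    """Yield (depth, title) for every item below `parent`, in display order."""
    current = objects[parent].get("First")
    while current is not None:
        item = objects[current]
        yield depth, item["Title"]
        # Descend into this item's children before moving to its sibling.
        yield from walk(objects, current, depth + 1)
        current = item.get("Next")


for depth, title in walk(outline, 8):
    print("  " * depth + title)
```

The traversal never needs /Last, /Prev, or /Parent. Those entries let a reader move backward or upward through the tree without starting over from the root.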

The Initial Destination

In addition to defining a user-controlled navigation tree, the catalog object can control the initial page to display. This can be accomplished by passing a destination to the /OpenAction entry in the catalog object.

5 0 obj
<< /Type /Catalog
   /Pages 1 0 R
   /Outlines 8 0 R
   /OpenAction [6 0 R /Fit]
>>
endobj

Now, when you open the document, the second page (6 0 obj) will be displayed and the viewer will zoom to fit the entire page.

Hyperlinks

It’s also possible to create hyperlinks within the document to jump to another destination. PDF hyperlinks aren’t like HTML links where the link is directly connected with the text—they are merely rectangular areas placed on top of the page, much like a graphic. They work more like buttons than true hyperlinks.

Hyperlinks are one of many types of annotations. Annotations are extra information associated with a particular page. Pages cannot share annotations. The second most common type of annotation is a comment, which we’ll look at in a moment.

Annotations are stored in an array under the /Annots entry in a page object. Our link will be on the second page (6 0 obj):

6 0 obj
<< /Type /Page
   /MediaBox [0 0 612 792]
   /Resources 3 0 R
   /Parent 1 0 R
   /Contents [7 0 R]
   /Annots [12 0 R]
>>
endobj

Next we need to create the annotation.

12 0 obj
<< /Type /Annot
   /Subtype /Link
   /Dest [2 0 R /Fit]
   /Rect [195 695 248 677]
>>
endobj

The /Subtype entry tells the PDF reader that this is a hyperlink and not a comment or one of the other kinds of annotations. As with navigation items, /Dest is the destination to jump to when the user clicks the link. And finally, /Rect is a rectangle defining the area of the hyperlink, given as the coordinates of two diagonally opposite corners. Again, links are not directly associated with the text; they are just an area on the page.
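Because a link is only a rectangle, deciding whether a click activates it comes down to a point-in-rectangle test. The sketch below illustrates the idea with two hypothetical helpers, normalize and contains. Our /Rect lists its upper-left corner first, so the corners are sorted before the comparison.

```python
def normalize(rect):
    """Reorder two opposite corners into (left, bottom, right, top)."""
    x1, y1, x2, y2 = rect
    return (min(x1, x2), min(y1, y2), max(x1, x2), max(y1, y2))


def contains(rect, x, y):
    """Return True if the point (x, y) lies inside the rectangle."""
    left, bottom, right, top = normalize(rect)
    return left <= x <= right and bottom <= y <= top


link = [195, 695, 248, 677]  # the /Rect of 12 0 obj

print(normalize(link))           # (195, 677, 248, 695)
print(contains(link, 200, 680))  # True: inside the link area
print(contains(link, 50, 706))   # False: the text origin set by the cm operator
```

Note that nothing in this test looks at the page's text. Moving the text in the content stream without also updating /Rect would leave the clickable area behind.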

If you don’t like the visible border around the hyperlink rectangle, you can get rid of it with: /Border [0 0 0].

Text Annotations

Text annotations are user-defined comments associated with a location on a page. They are commonly displayed as “sticky notes” that the user can open and close.

Like hyperlinks, text annotations reside in the /Annots array of the page object to which they belong. First, add another object to the /Annots array of the second page:

6 0 obj
<< /Type /Page
   /MediaBox [0 0 612 792]
   /Resources 3 0 R
   /Parent 1 0 R
   /Contents [7 0 R]
   /Annots [12 0 R 13 0 R]
>>
endobj

Then, create the annotation.

13 0 obj
<< /Type /Annot
   /Subtype /Text
   /Contents (Hey look! A comment!)
   /Rect [570 0 0 700]
>>
endobj

Again, /Subtype defines the type of annotation. /Contents is the textual content of the annotation, and /Rect is the location. This rectangle should place the comment in the upper-right margin of the second page.

Text annotations have a few additional properties that give you more control over their appearance. For example, you can add an /Open entry with the value of true to the annotation object to make it open by default. You can also change the icon displayed with /Name /Help. Other supported icons are: /Comment, /Insert, /Key, /NewParagraph, /Note, and /Paragraph.
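If you generate annotations from a script instead of typing them, the optional entries map naturally onto optional arguments. The following is a minimal sketch with two hypothetical helpers, pdf_string and text_annotation. The escaping matters because an unbalanced parenthesis in the comment text would otherwise end the string literal early.

```python
def pdf_string(text):
    """Write `text` as a PDF literal string, escaping backslashes and parentheses."""
    escaped = (
        text.replace("\\", "\\\\")
        .replace("(", "\\(")
        .replace(")", "\\)")
    )
    return f"({escaped})"


def text_annotation(obj_num, contents, rect, is_open=False, icon=None):
    """Return the source text of a /Text annotation object."""
    lines = [
        f"{obj_num} 0 obj",
        "<< /Type /Annot",
        "   /Subtype /Text",
        f"   /Contents {pdf_string(contents)}",
        "   /Rect [" + " ".join(str(n) for n in rect) + "]",
    ]
    if is_open:
        lines.append("   /Open true")    # show the note expanded by default
    if icon is not None:
        lines.append(f"   /Name /{icon}")  # for example Help or Key
    lines += [">>", "endobj"]
    return "\n".join(lines)


print(text_annotation(13, "Hey look! A comment!", [570, 0, 0, 700],
                      is_open=True, icon="Help"))
```

The output is the same 13 0 obj shown earlier with /Open true and /Name /Help added, ready to paste into the document body.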

Aside from /Link and /Text, there are many other forms of annotations. Some, like /Line annotations, are simply more advanced versions of text annotations. But others, like /Movie annotations, can associate arbitrary media with a page.

Summary

This chapter presented document outlines, hyperlinks, and text annotations, but this is only a small fraction of the interactive features available in a PDF document. The specification includes more than 20 types of annotations, including everything from printer’s marks to file attachments. The complete list of annotations can be found in chapter 8 of Adobe’s PDF Reference.
