[BUG] Outline destinations appear to be `null` for some types of PDFs #482

jonkgrimes · 2022-02-25T16:52:08Z

Description

We're parsing the outlines of several PDF documents, and for some of them the page numbers are lost when calling the GetOutlines method on the PdfReader object.

For the attached BadOutline.pdf, the Dest field on each OutlineItem seems to be set to null, so the page numbers are lost. The attached GoodOutline.pdf does not have that problem and is parsed correctly. Additionally, the PyPDF2 Python package parses the correct page numbers from BadOutline.pdf and can display them (happy to provide that code as well).

Expected Behavior

$ go run main.go BadOutline.pdf
Input file: BadOutline.pdf
{
    "entries": [
        {
            "title": "Basic networkx instructions",
            "dest": {
                "page": 1,
                "mode": "XYZ",
                "x": 125.798,
                "y": 434.577,
                "zoom": 0
            }
        },
        {
            "title": "Assignment",
            "dest": {
                "page": 4,
                "mode": "XYZ",
                "x": 125.798,
                "y": 226.939,
                "zoom": 0
            }
        }
    ]
}

Actual Behavior

$ go run main.go BadOutline.pdf
Input file: BadOutline.pdf
{
    "entries": [
        {
            "title": "Basic networkx instructions",
            "dest": {
                "page": 0,
                "mode": "",
                "x": 0,
                "y": 0,
                "zoom": 0
            }
        },
        {
            "title": "Assignment",
            "dest": {
                "page": 0,
                "mode": "",
                "x": 0,
                "y": 0,
                "zoom": 0
            }
        }
    ]
}

Attachments

I can reproduce the issue with the following code:

// main.go

package main

import (
        "encoding/json"
        "fmt"
        "os"

        "github.com/unidoc/unipdf/v3/common/license"
        "github.com/unidoc/unipdf/v3/model"
)

func init() {
        err := license.SetMeteredKey(os.Getenv(`UNIDOC_LICENSE_API_KEY`))
        if err != nil {
                panic(err)
        }
}

func main() {
        if len(os.Args) < 2 {
                fmt.Printf("Usage:  go run main.go input.pdf\n")
                os.Exit(1)
        }

        inputPath := os.Args[1]

        fmt.Printf("Input file: %s\n", inputPath)

        pdfReader, f, err := model.NewPdfReaderFromFile(inputPath, nil)
        if err != nil {
                fmt.Printf("Error: %v\n", err)
                os.Exit(1)
        }
        defer f.Close()

        outlines, err := pdfReader.GetOutlines()
        if err != nil {
                fmt.Printf("Error: %v\n", err)
                os.Exit(1)
        }

        data, err := json.MarshalIndent(outlines, "", "    ")
        if err != nil {
                fmt.Printf("Error: %v\n", err)
                os.Exit(1)
        }
        fmt.Printf("%s\n", data)
}

BadOutline.pdf
GoodOutline.pdf


github-actions · 2022-02-25T16:52:45Z

Welcome! Thanks for posting your first issue. The way things work here is that while customer issues are prioritized, other issues go into our backlog where they are assessed and fitted into the roadmap when suitable. If you need to get this done, consider buying a license which also enables you to use it in your commercial products. More information can be found on https://unidoc.io/

sampila · 2022-05-18T10:21:09Z

Hi @jonkgrimes,
Thank you for reporting the issue and providing us with the details.
This issue should be fixed in UniPDF v3.33.0: https://github.com/unidoc/unipdf-src/releases/tag/v3.33.0

sampila closed this as completed May 18, 2022
