[BUG] Outline destinations appear to be null for some types of PDFs #482

Closed
jonkgrimes opened this issue Feb 25, 2022 · 2 comments

Comments

@jonkgrimes

Description

We're attempting to parse the outlines of a few PDF documents. For some of them, the page number is lost when calling the GetOutlines method on the PdfReader object.

In the attached BadOutline.pdf, the Dest field on each OutlineItem seems to be set to null, so the page numbers are lost. The attached GoodOutline.pdf does not have that problem and is parsed correctly. Additionally, the PyPDF2 Python package parses the correct page numbers from BadOutline.pdf and can display them (happy to provide that code as well).

Expected Behavior

$ go run main.go BadOutline.pdf
Input file: BadOutline.pdf
{
    "entries": [
        {
            "title": "Basic networkx instructions",
            "dest": {
                "page": 1,
                "mode": "XYZ",
                "x": 125.798,
                "y": 434.577,
                "zoom": 0
            }
        },
        {
            "title": "Assignment",
            "dest": {
                "page": 4,
                "mode": "XYZ",
                "x": 125.798,
                "y": 226.939,
                "zoom": 0
            }
        }
    ]
}

Actual Behavior

$ go run main.go BadOutline.pdf
Input file: BadOutline.pdf
{
    "entries": [
        {
            "title": "Basic networkx instructions",
            "dest": {
                "page": 0,
                "mode": "",
                "x": 0,
                "y": 0,
                "zoom": 0
            }
        },
        {
            "title": "Assignment",
            "dest": {
                "page": 0,
                "mode": "",
                "x": 0,
                "y": 0,
                "zoom": 0
            }
        }
    ]
}
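
The difference between the two outputs above can be checked mechanically. The following is a minimal sketch, not part of the original report, that decodes the JSON printed by main.go and lists every entry whose destination is entirely zero-valued. The Dest, Entry, and Outline types are hypothetical mirrors of the JSON shown here, not UniPDF's own types, and the nested "entries" field is an assumption about how child items would be serialized. Treating an all-zero destination as "missing" is a heuristic.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Dest mirrors the "dest" object in the JSON output shown in this issue.
// Hypothetical type for illustration; it is not UniPDF's own struct.
type Dest struct {
	Page int64   `json:"page"`
	Mode string  `json:"mode"`
	X    float64 `json:"x"`
	Y    float64 `json:"y"`
	Zoom float64 `json:"zoom"`
}

// Entry mirrors one outline entry. The nested "entries" field is an
// assumption about how child items appear in the JSON.
type Entry struct {
	Title   string  `json:"title"`
	Dest    Dest    `json:"dest"`
	Entries []Entry `json:"entries,omitempty"`
}

// Outline mirrors the top-level JSON object.
type Outline struct {
	Entries []Entry `json:"entries"`
}

// parseOutline decodes the JSON printed by main.go.
func parseOutline(data []byte) (Outline, error) {
	var o Outline
	err := json.Unmarshal(data, &o)
	return o, err
}

// missingDest returns the titles of all entries, at any depth, whose
// destination is entirely zero-valued (heuristic for a lost destination).
func missingDest(entries []Entry) []string {
	var titles []string
	for _, e := range entries {
		if e.Dest == (Dest{}) {
			titles = append(titles, e.Title)
		}
		titles = append(titles, missingDest(e.Entries)...)
	}
	return titles
}

func main() {
	// Condensed copy of the "Actual Behavior" output above.
	const actual = `{"entries":[
		{"title":"Basic networkx instructions","dest":{"page":0,"mode":"","x":0,"y":0,"zoom":0}},
		{"title":"Assignment","dest":{"page":0,"mode":"","x":0,"y":0,"zoom":0}}]}`

	o, err := parseOutline([]byte(actual))
	if err != nil {
		fmt.Printf("Error: %v\n", err)
		return
	}
	for _, title := range missingDest(o.Entries) {
		fmt.Printf("missing destination: %s\n", title)
	}
}
```

Run against the "Expected Behavior" JSON instead, the same check should report nothing, since both entries there carry a non-empty mode and coordinates.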

Attachments

I can reproduce the issue by copying the code from here:

// main.go

package main

import (
        "encoding/json"
        "fmt"
        "os"

        "github.com/unidoc/unipdf/v3/common/license"
        "github.com/unidoc/unipdf/v3/model"
)

func init() {
        err := license.SetMeteredKey(os.Getenv(`UNIDOC_LICENSE_API_KEY`))
        if err != nil {
                panic(err)
        }
}

func main() {
        if len(os.Args) < 2 {
                fmt.Printf("Usage:  go run main.go input.pdf\n")
                os.Exit(1)
        }

        inputPath := os.Args[1]

        fmt.Printf("Input file: %s\n", inputPath)

        pdfReader, f, err := model.NewPdfReaderFromFile(inputPath, nil)
        if err != nil {
                fmt.Printf("Error: %v\n", err)
                os.Exit(1)
        }
        defer f.Close()

        outlines, err := pdfReader.GetOutlines()
        if err != nil {
                fmt.Printf("Error: %v\n", err)
                os.Exit(1)
        }

        data, err := json.MarshalIndent(outlines, "", "    ")
        if err != nil {
                fmt.Printf("Error: %v\n", err)
                os.Exit(1)
        }
        fmt.Printf("%s\n", data)
}

BadOutline.pdf
GoodOutline.pdf

@github-actions

Welcome! Thanks for posting your first issue. The way things work here is that while customer issues are prioritized, other issues go into our backlog where they are assessed and fitted into the roadmap when suitable. If you need to get this done, consider buying a license which also enables you to use it in your commercial products. More information can be found on https://unidoc.io/

@sampila
Collaborator

sampila commented May 18, 2022

Hi @jonkgrimes,
Thank you for reporting the issue and providing us with the details.
This issue should be fixed in UniPDF v3.33.0: https://github.com/unidoc/unipdf-src/releases/tag/v3.33.0

@sampila closed this as completed May 18, 2022