Introduce the `Buf` type #47

LaurenzV · 2024-10-30T20:34:44Z

So, the problem I'm trying to solve is that for PDF 1.4 (and by extension PDF/A-1), there are a few more architectural limits that need to be upheld, and doing so on the caller side is very hard.

The idea is simple: Right now, most writers (i.e. mainly the Chunk type, but also the writers for cmaps and PostScript code) use a simple vector of bytes to write to. This PR introduces a Buf type, which is a very thin wrapper around a vector of bytes, but in addition to that it holds a struct which tracks the limits of the data types that were used in the buffer. The caller can then collect those and update them for each chunk it creates, using the merge method. At the very end, we can check whether, at any location in the PDF, we've used a limit that is higher than the allowed one. I've already integrated this in krilla, and it seems to work pretty nicely.
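To make the idea above concrete, here is a minimal sketch of a byte buffer that records limits as it is written to, plus a merge step and a final check. It is not the code from this PR: the fields `max_int` and `max_name_len`, the `register_*` helpers, and the `exceeds` check are all hypothetical names chosen for illustration, and the sketch ignores separators between objects.

```rust
/// Largest values written so far (hypothetical fields, for illustration only).
#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
struct Limits {
    max_int: i32,
    max_name_len: usize,
}

impl Limits {
    /// Record the magnitude of an integer that was written.
    fn register_int(&mut self, value: i32) {
        self.max_int = self.max_int.max(value.saturating_abs());
    }

    /// Record the byte length of a name that was written.
    fn register_name_len(&mut self, len: usize) {
        self.max_name_len = self.max_name_len.max(len);
    }

    /// Fold another tracker into this one by taking the maximum per field.
    fn merge(&mut self, other: &Limits) {
        self.max_int = self.max_int.max(other.max_int);
        self.max_name_len = self.max_name_len.max(other.max_name_len);
    }

    /// Returns true if any tracked value is above the allowed bound.
    fn exceeds(&self, allowed: &Limits) -> bool {
        self.max_int > allowed.max_int || self.max_name_len > allowed.max_name_len
    }
}

/// Thin wrapper around a byte vector that updates `Limits` on every write.
#[derive(Debug, Clone, Default)]
struct Buf {
    bytes: Vec<u8>,
    limits: Limits,
}

impl Buf {
    fn new() -> Self {
        Self::default()
    }

    /// Write an integer in decimal and track its magnitude.
    fn push_int(&mut self, value: i32) {
        self.limits.register_int(value);
        self.bytes.extend_from_slice(value.to_string().as_bytes());
    }

    /// Write a name with its leading slash and track its length.
    fn push_name(&mut self, name: &[u8]) {
        self.limits.register_name_len(name.len());
        self.bytes.push(b'/');
        self.bytes.extend_from_slice(name);
    }

    fn limits(&self) -> &Limits {
        &self.limits
    }

    fn as_bytes(&self) -> &[u8] {
        &self.bytes
    }
}
```

A caller would keep one `Limits` value for the whole document, call `merge` with the limits of each finished buffer, and run `exceeds` once at the end against the bounds of the targeted PDF version.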

It would be nice if the merging happened in pdf-writer by forcing the user to pass a Buf instead of Vec<u8> wherever possible (for example, when writing a content stream), but the problem is that the user has to be able to add arbitrary data to a stream (since any kind of binary data can appear in a stream, not just content streams), and the user currently has no way of constructing a Buf manually from a vector of bytes. I could change that (and if I do, I might as well implement DerefMut for Buf so that I don't need to wrap reserve, for example), but I think it's fine if, for now, the user has to merge the limits manually. Let me know if you think changing that would be better, though.
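The trade-off described above can be sketched as follows. This is only an illustration under assumptions, not the API of this PR: `TrackedBuf`, `from_untracked`, and the `longest_name` field are hypothetical. The wrapper exposes read access through `Deref` and forwards `reserve` by hand, so every write still goes through a method that updates the tracked value; a `DerefMut` implementation would remove the need for such forwarding methods, but it would also let callers mutate the bytes without any tracking.

```rust
use std::ops::Deref;

/// Hypothetical wrapper: bytes plus one tracked value.
#[derive(Debug, Default)]
struct TrackedBuf {
    bytes: Vec<u8>,
    longest_name: usize,
}

impl TrackedBuf {
    /// Hypothetical constructor for arbitrary stream data. Nothing is known
    /// about the contents, so the tracked value starts at zero.
    fn from_untracked(bytes: Vec<u8>) -> Self {
        Self { bytes, longest_name: 0 }
    }

    /// Forwarded by hand because the wrapper does not expose `&mut Vec<u8>`.
    fn reserve(&mut self, additional: usize) {
        self.bytes.reserve(additional);
    }

    /// Tracked write: updates `longest_name` before appending the bytes.
    fn push_name(&mut self, name: &[u8]) {
        self.longest_name = self.longest_name.max(name.len());
        self.bytes.push(b'/');
        self.bytes.extend_from_slice(name);
    }
}

/// Read-only access to the bytes; there is deliberately no `DerefMut`.
impl Deref for TrackedBuf {
    type Target = [u8];

    fn deref(&self) -> &[u8] {
        &self.bytes
    }
}
```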

If you are happy with the overall approach, I'll try to add some test cases for limits merging, too.

# Conflicts: # src/functions.rs

src/buf.rs

src/chunk.rs

src/renumber.rs

src/content.rs

laurmaedje · 2024-11-27T15:38:03Z

examples/limits.rs

+    assert_eq!(
+        chunk.as_bytes(),
+        b"1 0 obj
+<<


Asserting the PDF in the example is a bit unconventional. Do you think we need this?

Print it instead? Or just remove it completely?

The other examples write it to a file

Hmm yeah but the other examples also create an actually valid PDF file. :p

I can change it so it’s a valid PDF too, but maybe a bit unnecessary if it’s just about showing how to use the Limits API.

I see. Maybe we can just completely ignore it?

So remove all asserts? Or just the one from the PDF buffer?

LaurenzV and others added 21 commits October 26, 2024 18:54

Introduce Buf type

26cb3ed

Return content by default

ba1c728

First version

2d3cf45

re=exprt

18b779a

more fixes

8974609

make to_bytes public

0345d52

Add limits getter

9e99c77

Add limits to chunk

2202232

add getters

1f70211

add getters

664c270

integrate postscript changes

8294635

Change separators

31565d1

revert push float change

ea782ea

revert changes

cf042eb

Merge branch 'main' into buf-tests

ccb54f4

# Conflicts: # src/functions.rs

tidy up a bit

71b7593

do not implement DerefMut

803ef27

format

8af675d

merge limits in renumber as well

ddae285

Fix test

2c80827

Fix clippy

f95a19c

laurmaedje reviewed Nov 12, 2024

LaurenzV and others added 5 commits November 12, 2024 13:39

Apply some code review

e32816a

Add some documentation

749fe13

Add two test cases and fix two bugs

34f6b8c

Reformat

9dcf56b

Add an exmaple for tracking limits

33052e2

laurmaedje reviewed Nov 27, 2024

LaurenzV commented Oct 30, 2024

laurmaedje Nov 27, 2024

LaurenzV Nov 27, 2024

laurmaedje Nov 27, 2024

LaurenzV Nov 27, 2024

LaurenzV Nov 27, 2024

laurmaedje Nov 27, 2024

LaurenzV Nov 27, 2024
