Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JBIG2 Generic Encoder #264

Merged
merged 75 commits into from
Mar 27, 2020
Merged

JBIG2 Generic Encoder #264

merged 75 commits into from
Mar 27, 2020

Conversation

kucjac
Copy link
Contributor

@kucjac kucjac commented Feb 28, 2020

This Pull Request provide JBIG2 lossless encoder for the bi-level image compression.


This change is Reviewable

@codecov
Copy link

codecov bot commented Mar 1, 2020

Codecov Report

Merging #264 into development will decrease coverage by 6.69%.
The diff coverage is 61.24%.

Impacted file tree graph

@@              Coverage Diff               @@
##           development     #264     +/-   ##
==============================================
- Coverage        59.09%   52.39%   -6.7%     
==============================================
  Files              192      235     +43     
  Lines            36884    45287   +8403     
==============================================
+ Hits             21795    23727   +1932     
- Misses           12318    18527   +6209     
- Partials          2771     3033    +262
Impacted Files Coverage Δ
internal/jbig2/document/segments/text-region.go 40.83% <ø> (ø)
...ernal/jbig2/document/segments/symbol-dictionary.go 38.25% <ø> (ø)
internal/jbig2/encoder/arithmetic/class.go 0% <ø> (ø)
internal/jbig2/encoder/classer/classer.go 0% <ø> (ø)
...ternal/jbig2/document/segments/halftone-segment.go 62.33% <ø> (ø)
internal/jbig2/document/segments/region.go 63.38% <ø> (ø)
internal/jbig2/encoder/classer/settings.go 0% <ø> (ø)
internal/jbig2/reader/reader.go 79.38% <ø> (+0.76%) ⬆️
core/encoding.go 41.35% <ø> (-9.42%) ⬇️
internal/jbig2/writer/writer.go 50.48% <ø> (-37.24%) ⬇️
... and 161 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e056c0e...d939608. Read the comment docs.

@gunnsth gunnsth requested a review from adrg March 2, 2020 00:00
Copy link
Collaborator

@adrg adrg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code looks very good for the most part. I noticed two breaking changes in the JBIG2Encoder struct which should be sorted out.

}

// NewJBIG2Encoder returns a new instance of JBIG2Encoder.
func NewJBIG2Encoder() *JBIG2Encoder {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can only remove public functions in major releases of the package. So we have to keep it for now. Maybe adjust it to reflect the newly introduced functionality.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're definitely right. I'm reverting that change.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't forget to bring back the NewJBIG2Encoder function.


/**

JBIG2Encoder/Decoder
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use the same style for section separators which are already used in the library, at least in the non-internal packages.

//
// JBIG2Encoder/Decoder
//

}

// UpdateParams updates the parameter values of the encoder.
func (enc *JBIG2Encoder) UpdateParams(params *PdfObjectDictionary) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the body of this method be empty? It wasn't before:

func (enc *JBIG2Encoder) UpdateParams(params *PdfObjectDictionary) {
	if decode := params.Get("Decode"); decode != nil {
		enc.setChocolateData(decode)
	}
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

check? empty or not?

}
}

func (d *JBIG2Document) updateParameters() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method is only used once and has only a line of code. Is it expected to grow in the future or be called from multiple places? If not, maybe it can be removed.
The same applies for the initializeIfNeeded method of the document.


*/

func autoThresholdTriangle(histogram [256]int) uint8 {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these util functions could be moved somewhere in the internal jbig2 package.

core/encoding.go Outdated
@@ -1903,8 +1902,8 @@ func (enc *CCITTFaxEncoder) DecodeBytes(encoded []byte) ([]byte, error) {
EndOfBlock: enc.EndOfBlock,
BlackIs1: enc.BlackIs1,
DamagedRowsBeforeError: enc.DamagedRowsBeforeError,
Rows: enc.Rows,
EncodedByteAlign: enc.EncodedByteAlign,
Rows: enc.Rows,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not appear formatted with gofmt. Please make sure you run gofmt on the file.

@@ -57,6 +59,8 @@ func (DummyLogger) IsLogLevel(level LogLevel) bool {
// LogLevel is the verbosity level for logging.
type LogLevel int

// defines log level enum where the most important logs have the lowest values.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please start comments with capital letter and end them with . where possible, especially in non-internal package source files.

type JBIG2Encoder struct {
JBIG2Document
// Globals are the JBIG2 global segments.
Globals *document.Globals
Copy link
Collaborator

@adrg adrg Mar 18, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to be a breaking change as the the field was Globals jbig2.Globals previously. Breaking changes are only possible when releasing major versions of the package.

Before:

type Globals map[int]*segments.Header 

Now:

type Globals struct {
	Segments []*segments.Header
}

// output writes `format`, `args` log message prefixed by the source file name, line and `prefix`
func (l ConsoleLogger) output(f *os.File, prefix string, format string, args ...interface{}) {
//noinspection GoUnhandledErrorResult
func (l WriterLogger) output(f io.Writer, prefix string, format string, args ...interface{}) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method seems to be identical to the one in the ConsoleLogger struct. It may be better to make it a standalone function and call it from both loggers.


*/

// JBIG2Image is the image structure used by the jbig2 encoder. It's Data must be in a
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor typo: It's instead of Its. Might be good to go through the encoding_jbig2.go file and check the method documentation comments.

Copy link
Contributor Author

@kucjac kucjac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 126 files reviewed, 8 unresolved discussions (waiting on @adrg)


common/logging.go, line 62 at r6 (raw file):

Previously, adrg (Adrian-George Bostan) wrote…

Please start comments with capital letter and end them with . where possible, especially in non-internal package source files.

Done.


common/logging.go, line 230 at r6 (raw file):

Previously, adrg (Adrian-George Bostan) wrote…

This method seems to be identical to the one in the ConsoleLogger struct. It may be better to make it a standalone function and call it from both loggers.

Done.


core/encoding.go, line 1905 at r6 (raw file):

Previously, adrg (Adrian-George Bostan) wrote…

This does not appear formatted with gofmt. Please make sure you run gofmt on the file.

Done.


core/encoding_jbig2.go, line 42 at r6 (raw file):

Previously, adrg (Adrian-George Bostan) wrote…

Please use the same style for section separators which are already used in the library, at least in the non-internal packages.

//
// JBIG2Encoder/Decoder
//

Done.


core/encoding_jbig2.go, line 58 at r6 (raw file):

Previously, adrg (Adrian-George Bostan) wrote…

This seems to be a breaking change as the the field was Globals jbig2.Globals previously. Breaking changes are only possible when releasing major versions of the package.

Before:

type Globals map[int]*segments.Header 

Now:

type Globals struct {
	Segments []*segments.Header
}

Hmm. Field Globals in the *JBIG2Encoder are obtainable only as a result from the internal functions - and were created on the run of Decode method. This little change is very important in the Encoder context. I will make some workaround that converts jbig2.Globals into *document.Globals internally.


core/encoding_jbig2.go, line 165 at r6 (raw file):

Previously, adrg (Adrian-George Bostan) wrote…

Should the body of this method be empty? It wasn't before:

func (enc *JBIG2Encoder) UpdateParams(params *PdfObjectDictionary) {
	if decode := params.Get("Decode"); decode != nil {
		enc.setChocolateData(decode)
	}
}

This Decode Parameter - Chocolate Data concept - was solved by the #159. This results of these functions were no longer used by the *JBIG2Encoder. Thus the content of this method could be empty now.


core/encoding_jbig2.go, line 304 at r6 (raw file):

Previously, adrg (Adrian-George Bostan) wrote…

This method is only used once and has only a line of code. Is it expected to grow in the future or be called from multiple places? If not, maybe it can be removed.
The same applies for the initializeIfNeeded method of the document.

It was expected at the beginning. Right now it doesn't look like it would be needed anytime again. It can be removed.


core/encoding_jbig2.go, line 464 at r6 (raw file):

Previously, adrg (Adrian-George Bostan) wrote…

I think these util functions could be moved somewhere in the internal jbig2 package.

Good Idea. Done.

Copy link
Contributor Author

@kucjac kucjac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 3 of 14 files at r7.
Reviewable status: 3 of 126 files reviewed, 8 unresolved discussions (waiting on @adrg)

Copy link
Contributor Author

@kucjac kucjac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 112 of 127 files at r5, 11 of 14 files at r7.
Reviewable status: all files reviewed, 8 unresolved discussions (waiting on @adrg)

output(f, prefix, format, args)
}

func output(f io.Writer, prefix string, format string, args ...interface{}) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the name of this function should be changed to something a bit more suggestive. As part of a struct, it did provide some context. Maybe logToWriter or logOutput or something like that.

Copy link
Contributor Author

@kucjac kucjac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 123 of 126 files reviewed, 9 unresolved discussions (waiting on @adrg and @kucjac)


common/logging.go, line 225 at r8 (raw file):

Previously, adrg (Adrian-George Bostan) wrote…

I think the name of this function should be changed to something a bit more suggestive. As part of a struct, it did provide some context. Maybe logToWriter or logOutput or something like that.

Done. logToWriter looks good to me and in fact it is more suggestive.

Copy link
Collaborator

@adrg adrg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code looks good. Thank you for addressing the review requested changes.

Copy link
Collaborator

@adrg adrg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 112 of 127 files at r5, 11 of 14 files at r7, 1 of 2 files at r8, 2 of 2 files at r9.
Reviewable status: all files reviewed, 9 unresolved discussions (waiting on @adrg)

Copy link
Contributor

@gunnsth gunnsth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent. I added a few comments. Can you please go through.
Can you also update our ACKNOWLEDGEMENTS.md file?

func (l ConsoleLogger) Trace(format string, args ...interface{}) {
if l.LogLevel >= LogLevelTrace {
prefix := "[TRACE] "
l.output(os.Stdout, prefix, format, args...)
}
}

// output writes `format`, `args` log message prefixed by the source file name, line and `prefix`
//noinspection GoUnhandledErrorResult
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove the "noinspection" comment, seems to be for some linter that we are not using.

func (l ConsoleLogger) Warning(format string, args ...interface{}) {
if l.LogLevel >= LogLevelWarning {
prefix := "[WARNING] "
l.output(os.Stdout, prefix, format, args...)
}
}

// Notice logs 'Notice' message.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be // Note logs notice messages.
No need to put 'Notice' into quotes. Same for other comment similar.
(Error, Warning, Debug, Trace)
Sometimes variable names are put into ticks
https://github.com/unidoc/unipdf/wiki/UniPDF-Developer-Guide#12-doc-comments
but not usually function names

}

// UpdateParams updates the parameter values of the encoder.
func (enc *JBIG2Encoder) UpdateParams(params *PdfObjectDictionary) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

check? empty or not?

globalsStream, ok := globals.(*PdfObjectStream)
if !ok {
err = errors.Error(processName, "jbig2.Globals stream should be an Object Stream")
//noinspection GoNilness
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you remove those noinspection comments?
At the moment we just try to minimize golint output but not commenting on each single deviation.

go.mod Outdated
@@ -12,3 +12,5 @@ require (
golang.org/x/text v0.3.2
golang.org/x/tools v0.0.0-20190606174628-0139d5756a7d // indirect
)

go 1.13
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you remove this? Or check what is the oldest version that works?
I think go 1.11 should work.

}

// func (d *Document) uniteTemplatesWithIndexes(firstTempIndex, secondTempIndex int) error {
// if len(d.Classer.Pixat) < firstTempIndex || len(d.Classer.Pixat) < secondTempIndex {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can remove commented code

// t.Run("PDFMode", func(t *testing.T) {
// scales := []int{2, 4}
// for _, scale := range scales {
// t.Run(fmt.Sprintf("SinglePageX%d", scale), func(t *testing.T) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove commented code or add explanation if there is reason to keep

// Then the function writes it's encoded data into 'w' writer.
// func (p *Page) encodeSegment(w writer.BinaryWriter, segmentNumber int) (n int, err error) {
// const processName = "encodeSegment"
// // get the segment for given 'segmentNumber'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

commented code, remove or add reason to keep


// firstSegmentNumber gets the number of the first segment in the page.
// func (p *Page) firstSegmentNumber() (first uint32, err error) {
// const processName = "firstSegmentNumber"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

commented code, remove or document reason to keep


// // iterate over all symbols in the list
// for i := 0; i < len(s.symbolList); {
// height, err := s.bmHeight(i)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

commented code: remove or document reason to keep

Copy link
Contributor Author

@kucjac kucjac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 13 unresolved discussions (waiting on @gunnsth)


go.mod, line 16 at r9 (raw file):

Previously, gunnsth (Gunnsteinn Hall) wrote…

Can you remove this? Or check what is the oldest version that works?
I think go 1.11 should work.

go 1.11 should be fine as it was the first version that supports modules. Done


common/logging.go, line 105 at r9 (raw file):

Previously, gunnsth (Gunnsteinn Hall) wrote…

Should be // Note logs notice messages.
No need to put 'Notice' into quotes. Same for other comment similar.
(Error, Warning, Debug, Trace)
Sometimes variable names are put into ticks
https://github.com/unidoc/unipdf/wiki/UniPDF-Developer-Guide#12-doc-comments
but not usually function names

Done.


common/logging.go, line 138 at r9 (raw file):

Previously, gunnsth (Gunnsteinn Hall) wrote…

Remove the "noinspection" comment, seems to be for some linter that we are not using.

Done.


core/encoding_jbig2.go, line 165 at r6 (raw file):

Previously, gunnsth (Gunnsteinn Hall) wrote…

check? empty or not?

It should be empty now.


core/encoding_jbig2.go, line 273 at r9 (raw file):

Previously, gunnsth (Gunnsteinn Hall) wrote…

Can you remove those noinspection comments?
At the moment we just try to minimize golint output but not commenting on each single deviation.

Done. Sure. I will remove all noinsepction comments.


internal/jbig2/bitmap/bitmap_test.go, line 992 at r9 (raw file):

Previously, gunnsth (Gunnsteinn Hall) wrote…

Add TODO for adding this test?

Done. Removed that test.


internal/jbig2/bitmap/combine.go, line 80 at r9 (raw file):

Previously, gunnsth (Gunnsteinn Hall) wrote…

with the ...?

Done.


internal/jbig2/bitmap/components.go, line 270 at r9 (raw file):

Previously, gunnsth (Gunnsteinn Hall) wrote…

should be empty newline above the function.
Need to run goimports ?

Done.


internal/jbig2/bitmap/components_test.go, line 1291 at r9 (raw file):

Previously, gunnsth (Gunnsteinn Hall) wrote…

Can we remove the commented code?
Typically we don't keep commented code unless for some reason, which should be indicated.

Done. This part of the code was the development part of the Classified Encoder that didn't pass the tests or compile well. Moved to separate branch for future development and remove from this PR.


internal/jbig2/document/document.go, line 786 at r9 (raw file):

Previously, gunnsth (Gunnsteinn Hall) wrote…

can remove commented code

Done. This part of the code was the development part of the Classified Encoder that didn't pass the tests or compile well. Moved to separate branch for future development and remove from this PR.


internal/jbig2/document/page.go, line 403 at r9 (raw file):

Previously, gunnsth (Gunnsteinn Hall) wrote…

commented code, remove or add reason to keep

Done. This part of the code was the development part of the Classified Encoder that didn't pass the tests or compile well. Moved to separate branch for future development and remove from this PR.


internal/jbig2/document/page.go, line 418 at r9 (raw file):

Previously, gunnsth (Gunnsteinn Hall) wrote…

commented code, remove or document reason to keep

Done. This part of the code was the development part of the Classified Encoder that didn't pass the tests or compile well. Moved to separate branch for future development and remove from this PR.


internal/jbig2/document/document_test.go, line 687 at r9 (raw file):

Previously, gunnsth (Gunnsteinn Hall) wrote…

remove commented code or add explanation if there is reason to keep

Done. This part of the code was the development part of the Classified Encoder that didn't pass the tests or compile well. Moved to separate branch for future development and remove from this PR.


internal/jbig2/document/segments/symbol-dictionary.go, line 293 at r9 (raw file):

Previously, gunnsth (Gunnsteinn Hall) wrote…

commented code: remove or document reason to keep

Done.

Copy link
Contributor

@gunnsth gunnsth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 59 files at r2, 2 of 57 files at r4, 1 of 127 files at r5, 15 of 15 files at r10.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @kucjac)


common/logging.go, line 178 at r10 (raw file):

}

// Warning logs 'warning' message.

Check comments for this set of log functions too Error, Warning, Notice etc


core/encoding_jbig2.go, line 165 at r6 (raw file):

Previously, kucjac (Jacek Kucharczyk) wrote…

It should be empty now.

Can you add a comment like // Empty but necessary to implement interface X or something like that?

Copy link
Contributor Author

@kucjac kucjac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 124 of 127 files reviewed, 1 unresolved discussion (waiting on @gunnsth)


common/logging.go, line 178 at r10 (raw file):

Previously, gunnsth (Gunnsteinn Hall) wrote…

Check comments for this set of log functions too Error, Warning, Notice etc

Done.


core/encoding_jbig2.go, line 165 at r6 (raw file):

Previously, gunnsth (Gunnsteinn Hall) wrote…

Can you add a comment like // Empty but necessary to implement interface X or something like that?

Done. Good idea.

Copy link
Contributor

@gunnsth gunnsth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 127 files at r5, 3 of 3 files at r11.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved

@gunnsth gunnsth merged commit c582323 into unidoc:development Mar 27, 2020
@kucjac kucjac deleted the jbig2-encoder branch March 30, 2020 20:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants