-
-
Notifications
You must be signed in to change notification settings - Fork 678
Conversation
bimg is MIT licensed. It depends on the C library libvips which is LGPL v2.1+ licensed. libvips must be installed separately.
From: https://github.com/h2non/bimg#prerequisites
Travis is configured to use The libvips wiki notes the ppa I've added to get a newer |
@kegsay previously expressed a preference for pure go libraries instead of depending on libraries that have to be installed externally and bridged into go. However, the performance of the available golang image processing libraries (https://github.com/nfnt/resize) is grossly worse than something based on libvips. The pure golang solution with nfnt/resize does not use any smart use of box filters to downscale large JPEGs (as in resolutions used in all digital cameras today) to within some percentage of the desired output resolution, and then use some traditional interpolation algorithm like bicubic or lanczos3. nfnt/resize can only use interpolation directly which is much much slower. Also, libvips has a smart multi-threaded design which splits the image up and processes those parts in parallel. In summary, I think trying to do any media processing in pure golang instead of using mature and appropriately-optimised libraries written in other languages like C and using ASM optimisations is currently not a good choice. Perhaps the situation will change in the future but I strongly vote for bimg + libvips today and accept the cost of additional installation complexity. (I think we should ship a docker container anyway and then it's basically just pulling down binary data to execute. 😄) |
Looks like nfnt/resize already does this based on the number of max procs (defaults to # cores) https://github.com/nfnt/resize/blob/master/resize.go#L612 - so it's a shame that there no obvious room for improvement for the golang version. |
i don't see the point in jumping through hoops to avoid native deps - surely we just use the best tool for the job? |
I think that it's a bit of a double standard to say "use the best tool for the job" on the one hand, but strive to eradicate Kafka/Zookeeper, to the point of emphasising this on all the Dendrite literature, on the other hand. |
I'm really not happy with adding a system dependency, especially one that requires adding a random ppa. We can always ship a libvips version of the media repository in a separate go project if we need the additional performance. |
Alternatively go supports tags that you could use to make libvips conditional.
to the stop of the libvips specific source files. |
Logger: util.GetLogger(req.Context()).WithFields(log.Fields{ | ||
"Origin": origin, | ||
"MediaID": mediaID, | ||
}), | ||
} | ||
|
||
if r.IsThumbnailRequest { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a slight bias towards using separate functions for downloading and thumbnailing instead of using a condition to wedge both into a single function. Sharing the common code is nice but usually the same effect can be better handled by lifting the common code into a separate function.
However the reason I dislike having the code for the two cases wedged together is that it makes it harder to follow what is going on in a single case. That doesn't seem to be too bad in this case since it's a single bool condition and is always tested in the same way.
So while this is usually something I'd object to, I think it is acceptable here for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can refactor. It's fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, I changed my mind. I think this is better. 😄
if err != nil { | ||
r.Logger.WithError(err).Warn("Error generating thumbnails") | ||
} | ||
}() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we always generate the thumbnails in the background?
What happens if this is a request for thumbnails? Will this request race with the background thumbnail generation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thumbnails being pre-generated according to the configuration of the server are generated in the background. There is a race in time, but it's done safely using the activeThumbnailGeneration
synchronised map of thumbnail file name to a sync.Cond
and the result of the thumbnail attempt, in a very similar way to the download of remote files.
For example (and I have tested this and specifically made sure that this works), say that we are running our server configured to pre-generate a bunch of thumbnail formats including 800x600 using the scale method. If a thumbnail request is received for an 800x600 scaled version of some image that is hosted on a remote server and we don't yet have that file, the server downloads the file from the remote server and then fires off a separate go routine (the code commented on here) to do the pre-generated thumbnailing. Each thumbnail configuration in turn will lock activeThumbnailGeneration
, check if there's an active thumbnail generation job ongoing for that thumbnail, etc and if not, add a sync.Cond to the map and unlock. If there is an active thumbnail generation job ongoing, it waits on the sync.Cond. So it doesn't matter if the pre-generation job gets to generating that thumbnail first, or if the code that continues after this explicitly and synchronously requests to generate that specific thumbnail. First come, first served and the rest will wait or just find the file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if dynamic thumbnail generation is turned off?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's basically all covered here - pay attention to the control flow and hopefully it'll be clear:
dendrite/src/github.com/matrix-org/dendrite/mediaapi/writers/download.go
Lines 292 to 327 in a1e62a2
if dynamicThumbnails { | |
thumbnail, resErr = r.generateThumbnail(filePath, r.ThumbnailSize, activeThumbnailGeneration, db) | |
if resErr != nil { | |
return nil, nil, resErr | |
} | |
} else { | |
thumbnails, err := db.GetThumbnails(r.MediaMetadata.MediaID, r.MediaMetadata.Origin) | |
if err != nil { | |
r.Logger.WithError(err).WithFields(log.Fields{ | |
"Width": r.ThumbnailSize.Width, | |
"Height": r.ThumbnailSize.Height, | |
"ResizeMethod": r.ThumbnailSize.ResizeMethod, | |
}).Error("Error looking up thumbnails") | |
resErr := jsonerror.InternalServerError() | |
return nil, nil, &resErr | |
} | |
// If we get a thumbnailSize, a pre-generated thumbnail would be best but it is not yet generated. | |
// If we get a thumbnail, we're done. | |
var thumbnailSize *types.ThumbnailSize | |
thumbnail, thumbnailSize = thumbnailer.SelectThumbnail(r.ThumbnailSize, thumbnails, thumbnailSizes) | |
if thumbnailSize != nil { | |
r.Logger.WithFields(log.Fields{ | |
"Width": thumbnailSize.Width, | |
"Height": thumbnailSize.Height, | |
"ResizeMethod": thumbnailSize.ResizeMethod, | |
}).Info("Pre-generating thumbnail for immediate response.") | |
thumbnail, resErr = r.generateThumbnail(filePath, *thumbnailSize, activeThumbnailGeneration, db) | |
if resErr != nil { | |
return nil, nil, resErr | |
} | |
} | |
} | |
if thumbnail == nil { | |
return nil, nil, nil | |
} |
return 0 | ||
} | ||
|
||
func compareThumbnailMetrics(a thumbnailMetrics, b thumbnailMetrics, desiredCrop bool) int { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe write this as:
// betterThan checks whether the thumbnail metrics are a better fit than the other metrics
func (a thumbnailMetrics) betterThan(b thumbnailMetrics, desiredCrop bool) bool {
...
}
I don't think we are distinguishing between `1` and `0` at the callsites so returning a bool might be neater.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
func newThumbnailMetrics() thumbnailMetrics { | ||
return thumbnailMetrics{ | ||
isSmaller: 1, | ||
aspect: float64(16384 * 16384), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe use https://golang.org/pkg/math/#Inf for the worst value?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was actually looking at MaxFloat64
and previously I was doing some b.aspect - a.aspect
and was concerned that the precision with such a large number would be such that the subtraction would be inconsequential.
As such, I took something more reasonably within the realm of an acceptable max image resolution. However, now I am comparing the numbers directly I don't have the same precision problem.
I knew this would provoke comment but I forgot to comment it to explain. How do you want me to proceed - comment, Inf()
or MaxFloat64
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think Inf() makes the most sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
logger.WithFields(log.Fields{ | ||
"Width": config.Width, | ||
"Height": config.Height, | ||
"ResizeMethod": config.ResizeMethod, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe do the WithFields
once and cache the result in a local variable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did consider that but had no strong argument either way - I'll do it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Both in a few places in download.go
and in thumbnailer.go
.
} | ||
|
||
// init with worst values | ||
func newThumbnailMetrics() thumbnailMetrics { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we have a different name than metrics
here? I'm going to confuse this with prometheus metrics otherwise.
Maybe thumbnailFitness
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
Metrics is used in the context of monitoring with Prometheus so renaming to avoid confusion.
Fall back to selecting from already-/pre-generated thumbnails or serving the original.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
This looks awesome 🥇
nfnt is slightly sharper but uses more cpu and takes longer (~845ms for the matrix.org config) The sharpness and performance is partially because nfnt uses Lanczos3 the whole way where bimg/libvips uses a box filter to shrink most of the way for large scale factors and then Lanczos3 for the last bit. |
I added a comment and a couple of small sample images to the README. Thanks for reviewing and glad you like it. 😄 |
* Verify that the user ID for registration matches the spec, and the auth data (#10) * Blacklist some sytest tests that are failing in our environment * Commenting out test that isn't reliably passing or failing, probably a race * refresh latest dendrite main * pull latest from dendrite-fork subtree * refresh latest dendrite main * pull dendrite subtree and resolve merge conflicts * check that userID matches the signed message * verify that the user ID for registration is CAIP-10 compliant and MXID compliant * removed space Co-authored-by: Brian Meek <brian@hntlabs.com> Co-authored-by: Tak Wai Wong <takwaiw@gmail.com> * Fix nats.go commit (matrix-org#2540) Signed-off-by: Jean Lucas <jean@4ray.co> * Don't return `end` if there are not more messages (matrix-org#2542) * Be more spec compliant * Move lazyLoadMembers to own method * Return an error if trying to invite a malformed user ID (matrix-org#2543) * Add `evacuateUser` endpoint, use it when deactivating accounts (matrix-org#2545) * Add `evacuateUser` endpoint, use it when deactivating accounts * Populate the API * Clean up user devices when deactivating * Include invites, delete pushers * Silence presence logs (matrix-org#2547) * Takwaiw/fix concurrent registration bug (#12) * fix concurrent registration bug. Rename decentralizedid * remove unused module * add regressed test to blacklist Co-authored-by: Tak Wai Wong <takwaiw@gmail.com> Co-authored-by: Brian Meek <brian@hntlabs.com> Co-authored-by: Tak Wai Wong <takwaiw@gmail.com> Co-authored-by: Jean Lucas <jean@4ray.co> Co-authored-by: Till <2353100+S7evinK@users.noreply.github.com> Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
* Verify that the user ID for registration matches the spec, and the auth data (#10) * Blacklist some sytest tests that are failing in our environment * Commenting out test that isn't reliably passing or failing, probably a race * refresh latest dendrite main * pull latest from dendrite-fork subtree * refresh latest dendrite main * pull dendrite subtree and resolve merge conflicts * check that userID matches the signed message * verify that the user ID for registration is CAIP-10 compliant and MXID compliant * removed space Co-authored-by: Brian Meek <brian@hntlabs.com> Co-authored-by: Tak Wai Wong <takwaiw@gmail.com> * Fix nats.go commit (matrix-org#2540) Signed-off-by: Jean Lucas <jean@4ray.co> * Don't return `end` if there are not more messages (matrix-org#2542) * Be more spec compliant * Move lazyLoadMembers to own method * Return an error if trying to invite a malformed user ID (matrix-org#2543) * Add `evacuateUser` endpoint, use it when deactivating accounts (matrix-org#2545) * Add `evacuateUser` endpoint, use it when deactivating accounts * Populate the API * Clean up user devices when deactivating * Include invites, delete pushers * Silence presence logs (matrix-org#2547) * Takwaiw/fix concurrent registration bug (#12) * fix concurrent registration bug. Rename decentralizedid * remove unused module * add regressed test to blacklist Co-authored-by: Tak Wai Wong <takwaiw@gmail.com> Co-authored-by: Brian Meek <brian@hntlabs.com> Co-authored-by: Tak Wai Wong <takwaiw@gmail.com> Co-authored-by: Jean Lucas <jean@4ray.co> Co-authored-by: Till <2353100+S7evinK@users.noreply.github.com> Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
No description provided.