-
Notifications
You must be signed in to change notification settings - Fork 15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Labelbot - Backend #1012
base: master
Are you sure you want to change the base?
Labelbot - Backend #1012
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please don't implement all the logic as helper methods. Since the logic is used only in the single controller (for now), implement it as (protected) controller methods there.
Also please us camelCase for variable names.
Finally, please fix the failing checks (except the test-current check which fails because biigle/largo is not there). The lint checks can be run locally with composer lint
and composer fix
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just minor comments left now. We have to make sure that we get the exact queries here. Maybe you can dump the two queries as I explained below and post them here for comparison?
@@ -236,6 +257,11 @@ public function store(StoreImageAnnotation $request) | |||
|
|||
$annotation->load('labels.label', 'labels.user'); | |||
|
|||
// Attach the other two labels if they exist. | |||
for ($i = 1; $i < count($topNLabels); $i++) { | |||
$annotation->labelBOTLabels[] = $topNLabels[$i]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The UI will need the full label models (with name and color), not only the IDs.
*/ | ||
protected function performAnnSearch($featureVector, $trees) | ||
{ | ||
$featureVector = new Vector($featureVector); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you should also be able to pass a simple array here. No need ti instantiate a Vector
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not really. It throws the error: SQLSTATE[HY093]: Invalid parameter number: parameter was not defined
.
$subquery = ImageAnnotationLabelFeatureVector::select('label_id', 'label_tree_id') | ||
->selectRaw('(vector <=> ?) AS distance', [$featureVector]) | ||
->orderBy('distance') | ||
// K = 100 | ||
->limit(config('labelbot.K')); | ||
|
||
return DB::table(DB::raw("({$subquery->toSql()}) as subquery")) | ||
->setBindings([$featureVector]) | ||
->whereIn('label_tree_id', $trees) | ||
->select('label_id') | ||
->groupBy('label_id') | ||
->orderByRaw('MIN(distance)') | ||
->limit(config('labelbot.N')) // N = 3 | ||
->pluck('label_id') | ||
->toArray(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this produce the same SQL query than you developed in your thesis? You can check by placing this at the beginning of the method:
DB::listen(fn ($q) => dd($q->sql));
This will dump the SQL when you run a test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it does.
ANN:
"select "label_id"
from (
select "label_id", "label_tree_id", (vector <=> ?) AS distance
from "image_annotation_label_feature_vectors"
order by "distance" asc
limit 100
) as "subquery"
where "label_tree_id" in (?)
group by "label_id"
order by MIN(distance)
limit 3"
KNN:
"select "label_id"
from (
select "label_id", "label_tree_id", (vector <=> ?) AS distance
from "image_annotation_label_feature_vectors"
where "label_tree_id" in (?)
order by "distance" asc
limit 100
) as "subquery"
group by "label_id"
order by MIN(distance)
limit 3"
return DB::table(DB::raw("({$subquery->toSql()}) as subquery")) | ||
->setBindings([$featureVector]) | ||
->whereIn('label_tree_id', $trees) | ||
->select('label_id') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need for select()
if you use pluck()
.
->limit(config('labelbot.K')); | ||
|
||
return DB::table(DB::raw("({$subquery->toSql()}) as subquery")) | ||
->setBindings([$featureVector]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this required again although you already added the binding to the subquery? Maybe there is a better way than to add the subquery as a string.
* | ||
* @return array | ||
*/ | ||
protected function performKnnSearch($featureVector, $trees) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Basically the same comments than above.
} | ||
|
||
/** | ||
* Perform ANN search (HNSW + Post-Subquery-Filtering). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please expand the comment with a brief explanation why we need both ANN and KNN search here.
@@ -5,7 +5,7 @@ | |||
use Biigle\Image; | |||
use Biigle\Shape; | |||
|
|||
class StoreImageAnnotation extends StoreImageAnnotationLabel | |||
class StoreImageAnnotation extends StoreImageAnnotationLabelFeatureVector |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need for inheritance here, just implement the stuff of StoreImageAnnotationLabelFeatureVector
directly in this class.
public function authorize() | ||
{ | ||
$this->annotation = ImageAnnotation::findOrFail($this->route('id')); | ||
$this->label = Label::findOrFail($this->input('label_id')); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Won't this fail if no label_id
is provided?
config/labelbot.php
Outdated
/* | ||
| K for KNN | ||
*/ | ||
'K' => 100, | ||
|
||
/* | ||
| N for top N labels | ||
*/ | ||
'N' => 3, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please elaborate these comments so you still understand what the variables mean if you come back in 5 years 😉
TODO
store
method inImageAnnotationController
requires alabel_id
in the request, but when LabelBOT is active, a request with a feature vector should be accepted without label_id.