DOC-4199: add TCEs to the combined query page #2838

191 changes: 191 additions & 0 deletions doctests/query-combined.js
@@ -0,0 +1,191 @@
// EXAMPLE: query_combined
// HIDE_START
import assert from 'assert';
import fs from 'fs';
import { createClient } from 'redis';
import { SchemaFieldTypes, VectorAlgorithms } from '@redis/search';
import { pipeline } from '@xenova/transformers';

const float32Buffer = (arr) => {
  const floatArray = new Float32Array(arr);
  const float32Buffer = Buffer.from(floatArray.buffer);
  return float32Buffer;
};

async function embedText(sentence) {
  let modelName = 'Xenova/all-MiniLM-L6-v2';
  let pipe = await pipeline('feature-extraction', modelName);

  let vectorOutput = await pipe(sentence, {
    pooling: 'mean',
    normalize: true,
  });

  const embedding = Object.values(vectorOutput?.data);
Collaborator

This will throw an error if vectorOutput is undefined (which seems to be possible, since you are using ?.).

Author

I don't understand the change you want me to make here.

Collaborator

I have never used the @xenova/transformers package, but because you've used ?., I guess the function returns undefined in some cases (?)

Author

I dunno. I stole that code from another set of examples.

Contributor

I think @leibale's point is that the pipeline can seemingly return undefined, and you would then be passing undefined into Object.values(), which will result in an exception.

Now, if this were just a test, that probably wouldn't be a huge problem (as the exception would fail the test), but if this is also supposed to be documentation on how to use it, perhaps a check for undefined that throws an error would help (and then you wouldn't need the ?, as the undefined check would narrow it away).

just a thought, could be wrong :)

Author

I added a test around vectorOutput and removed the '?' from the assignment as suggested.
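
For reference, the kind of guard discussed in this thread might look like the sketch below. It is not the code that was committed: the helper name toEmbedding and the error message are hypothetical, and the sketch only assumes that the pipeline output exposes a data property holding a typed array, as the original line implies.

```javascript
// Hypothetical helper (not part of the PR): validate the pipeline output
// before converting it to a plain array of numbers.
function toEmbedding(vectorOutput) {
  // Fail with a clear message instead of letting Object.values(undefined)
  // raise a less descriptive TypeError.
  if (vectorOutput == null || vectorOutput.data == null) {
    throw new Error('Embedding pipeline returned no data');
  }
  // Same conversion as the original line, without the optional chaining.
  return Object.values(vectorOutput.data);
}

// Usage with a stand-in for the pipeline output (assumed shape).
const embedding = toEmbedding({ data: new Float32Array([0.5, 0.25]) });
console.log(embedding); // [ 0.5, 0.25 ]
```

With an explicit check like this, the ?. operator is no longer needed, because the undefined case is handled before data is read.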


  return embedding;
}

let query = "Bike for small kids";
let vector_query = float32Buffer(await embedText("That is a very happy person"));

const client = createClient();
await client.connect();

// create index
await client.ft.create('idx:bicycle', {
  '$.description': {
    type: SchemaFieldTypes.TEXT,
    AS: 'description',
    sortable: false
  },
  '$.condition': {
    type: SchemaFieldTypes.TAG,
    AS: 'condition',
    sortable: false
  },
  '$.price': {
    type: SchemaFieldTypes.NUMERIC,
    AS: 'price',
    sortable: false
  },
  '$.description_embeddings': {
    type: SchemaFieldTypes.VECTOR,
    TYPE: 'FLOAT32',
    ALGORITHM: VectorAlgorithms.FLAT,
    DIM: 384,
    DISTANCE_METRIC: 'COSINE',
    AS: 'vector',
  }
}, {
  ON: 'JSON',
  PREFIX: 'bicycle:'
});

// load data
const bicycles = JSON.parse(fs.readFileSync('data/query_vector.json', 'utf8'));

await Promise.all(
  bicycles.map((bicycle, bid) => {
    return client.json.set(`bicycle:${bid}`, '$', bicycle);
  })
);
// HIDE_END

// STEP_START combined1
let res = await client.ft.search('idx:bicycle', '@price:[500 1000] @condition:{new}');
console.log(res.total); // >>> 1
console.log(res); // >>>
//{
// total: 1,
// documents: [ { id: 'bicycle:5', value: [Object: null prototype] } ]
//}
// REMOVE_START
assert.strictEqual(res.total, 1);
// REMOVE_END
// STEP_END

// STEP_START combined2
res = await client.ft.search('idx:bicycle', 'kids @price:[500 1000] @condition:{used}');
console.log(res.total); // >>> 1
console.log(res); // >>>
// {
// total: 1,
// documents: [ { id: 'bicycle:2', value: [Object: null prototype] } ]
// }
// REMOVE_START
assert.strictEqual(res.total, 1);
// REMOVE_END
// STEP_END

// STEP_START combined3
res = await client.ft.search('idx:bicycle', '(kids | small) @condition:{used}');
console.log(res.total); // >>> 2
console.log(res); // >>>
//{
// total: 2,
// documents: [
// { id: 'bicycle:2', value: [Object: null prototype] },
// { id: 'bicycle:1', value: [Object: null prototype] }
// ]
//}
// REMOVE_START
assert.strictEqual(res.total, 2);
// REMOVE_END
// STEP_END

// STEP_START combined4
res = await client.ft.search('idx:bicycle', '@description:(kids | small) @condition:{used}');
console.log(res.total); // >>> 2
console.log(res); // >>>
//{
// total: 2,
// documents: [
// { id: 'bicycle:2', value: [Object: null prototype] },
// { id: 'bicycle:1', value: [Object: null prototype] }
// ]
//}
// REMOVE_START
assert.strictEqual(res.total, 2);
// REMOVE_END
// STEP_END

// STEP_START combined5
res = await client.ft.search('idx:bicycle', '@description:(kids | small) @condition:{new | used}');
console.log(res.total); // >>> 3
console.log(res); // >>>
//{
// total: 3,
// documents: [
// { id: 'bicycle:1', value: [Object: null prototype] },
// { id: 'bicycle:0', value: [Object: null prototype] },
// { id: 'bicycle:2', value: [Object: null prototype] }
// ]
//}
// REMOVE_START
assert.strictEqual(res.total, 3);
// REMOVE_END
// STEP_END

// STEP_START combined6
res = await client.ft.search('idx:bicycle', '@price:[500 1000] -@condition:{new}');
console.log(res.total); // >>> 2
console.log(res); // >>>
//{
// total: 2,
// documents: [
// { id: 'bicycle:2', value: [Object: null prototype] },
// { id: 'bicycle:9', value: [Object: null prototype] }
// ]
//}
// REMOVE_START
assert.strictEqual(res.total, 2);
// REMOVE_END
// STEP_END

// STEP_START combined7
res = await client.ft.search('idx:bicycle',
  '(@price:[500 1000] -@condition:{new})=>[KNN 3 @vector $query_vector]', {
    PARAMS: { query_vector: vector_query },
    DIALECT: 2
  }
);
console.log(res.total); // >>> 2
console.log(res); // >>>
//{
// total: 2,
// documents: [
// { id: 'bicycle:2', value: [Object: null prototype] },
// { id: 'bicycle:9', value: [Object: null prototype] }
// ]
//}
// REMOVE_START
assert.strictEqual(res.total, 2);
// REMOVE_END
// STEP_END

// REMOVE_START
// destroy index and data
await client.ft.dropIndex('idx:bicycle', { DD: true });
await client.disconnect();
// REMOVE_END
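
A side note on the KNN step: the index declares the vector field as FLOAT32 with DIM: 384, so the buffer passed as query_vector should be 384 * 4 = 1536 bytes long. The sketch below shows one way to check that before querying. The helper name assertVectorSize is hypothetical and not part of the PR, and a zero-filled array stands in for a real embedding.

```javascript
// Same conversion as float32Buffer in the doctest.
const float32Buffer = (arr) => Buffer.from(new Float32Array(arr).buffer);

// Hypothetical check (not in the PR): FLOAT32 uses 4 bytes per dimension,
// so a DIM-dimensional vector must serialize to exactly DIM * 4 bytes.
function assertVectorSize(buf, dim) {
  const expected = dim * Float32Array.BYTES_PER_ELEMENT;
  if (buf.length !== expected) {
    throw new Error(`Expected ${expected} bytes for DIM=${dim}, got ${buf.length}`);
  }
  return buf;
}

const DIM = 384; // matches DIM in the index schema
const blob = assertVectorSize(float32Buffer(new Array(DIM).fill(0)), DIM);
console.log(blob.length); // 1536
```

Checking the size on the client side is a design choice rather than a requirement; it only makes a dimension mismatch between the embedding model and the index easier to spot.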