
Segmentation fault #540

Closed
godkillok opened this issue Jul 22, 2018 · 10 comments

Comments

@godkillok

godkillok commented Jul 22, 2018

Segmentation fault

Running on:

  • [x] CPU

Interface:

  • [x] Python

training_vectors.shape is (2357720, 100). Training completes, but when I go on to search with index.search(training_vectors[0:10000], 100), it always reports "Segmentation fault". The code is as follows:

    import time
    import logging

    import faiss
    from faiss import normalize_L2

    (num, d) = training_vectors.shape
    t1 = time.time()

    nlist = max(5, int(num / 500))
    normalize_L2(training_vectors)  # normalize so inner product acts as cosine similarity
    quantizer = faiss.IndexFlatIP(d)

    index = faiss.IndexIVFFlat(quantizer, d, nlist, faiss.METRIC_INNER_PRODUCT)
    index.train(training_vectors)
    index.nprobe = max(1, int(nlist * 0.6))  # default nprobe is 1, try a few more
    index.add(training_vectors)
    t2 = time.time()
    logging.info('{} time is {}'.format('add and train', t2 - t1))

    t1 = time.time()

    score, sim_id = index.search(training_vectors[0:10000], 100)  # this line segfaults

    t2 = time.time()

@godkillok godkillok changed the title seg Segmentation fault Jul 22, 2018
@mdouze
Contributor

mdouze commented Jul 23, 2018

I don't see an obvious reason why this could go wrong.
Does the segfault still occur when searching 1000 vectors instead of 10000?
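One way to narrow a crash like this down is to issue the query in smaller chunks, so that a failure points at a specific batch of rows. A minimal sketch; `batched_search` and the `batch_size` default are made up here, and `index` stands for any object with a faiss-style `search(x, k)` method:

```python
import numpy as np

def batched_search(index, queries, k, batch_size=1000):
    """Search `queries` in chunks of `batch_size` rows.

    If one chunk crashes, the offending rows are narrowed down to
    that batch. Results are concatenated back into full
    (n, k) distance and id arrays.
    """
    all_d, all_i = [], []
    for start in range(0, len(queries), batch_size):
        chunk = queries[start:start + batch_size]
        d, i = index.search(chunk, k)
        all_d.append(d)
        all_i.append(i)
    return np.concatenate(all_d), np.concatenate(all_i)
```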

@wuhu

wuhu commented Jul 30, 2018

I also get a segmentation fault (also using cpu).

This happens only if the faiss index is a member of an object, for instance here:

import faiss
import numpy as np

class Products:
    def __init__(self, path):
        self.embeddings = np.load(f'{path}/embeddings.npy')  # this loads a ~ 100000x512 float32 array
        quantizer = faiss.IndexFlatIP(512)
        self.index = faiss.IndexIVFFlat(quantizer, 512, 100, faiss.METRIC_L2)
        self.index.train(self.embeddings)
        self.index.add(self.embeddings)

    def find_nearest(self, index, n):
        return self.index.search(self.embeddings[index].reshape(1, -1), n)

p = Products('path/to/the/npy')
p.find_nearest(100, 10)  # segfault happens here

When implementing the same without a class, there is no segmentation fault:

import faiss
import numpy as np

path = 'path/to/the/npy'
embeddings = np.load(f'{path}/embeddings.npy')  # this loads a ~ 100000x512 float32 array
quantizer = faiss.IndexFlatIP(512)
index = faiss.IndexIVFFlat(quantizer, 512, 100, faiss.METRIC_L2)
index.train(embeddings)
index.add(embeddings)

def find_nearest(i, n):
    return index.search(embeddings[i].reshape(1, -1), n)

find_nearest(100, 10)  # no segfault, works as expected

@beauby
Contributor

beauby commented Jul 30, 2018

@wuhu This is because the quantizer gets garbage-collected by python. You could do self.quantizer = ... instead, so that it is not GCed before your Products instance is destroyed.

@wuhu

wuhu commented Jul 30, 2018

@beauby Thanks for the quick reply! That worked.

@Enet4
Contributor

Enet4 commented Jul 30, 2018

This seems to be another case of crashes in Python code. That makes me wonder: what would be the consequences of having the bindings automatically keep references to nested indexes on construction of new indexes (as in, apply the dont_dealloc_me trick)? I would imagine this to be safer and more predictable than requiring users to keep references by themselves in Python-land (although I probably overlooked something).

@mdouze
Contributor

mdouze commented Jul 30, 2018

It is on our TODO list. It requires a SWIG trick that adds that reference whenever an object is passed by reference to any of a few dozen functions and constructors.
The reasons we have not done it yet:

  • I don't know exactly where in SWIG this has to be done.
  • This problem hits only mid-level users. Low-level users know the library and how the references work well enough, and high-level users only use the index_factory, which does not have this problem.

@godkillok
Author

Sorry for the late reply. I double-checked my code and found the same problem as @wuhu; it was not caused by the line I mentioned.

@beauby
Contributor

beauby commented Jul 31, 2018

@godkillok Great – can we close this issue then?

@asanakoy
Member

asanakoy commented Jul 24, 2019

As long as the automatic reference counting is not implemented, this should at least be documented somewhere.

@beauby
Contributor

beauby commented Jul 24, 2019

@asanakoy It is actually done now. If you are encountering the same behavior, it is a bug, in which case please open a separate issue.
