This repository has been archived by the owner on Jun 14, 2018. It is now read-only.

pyLDAvis ValidationError: Not all rows (distributions) in doc_topic_dists sum to 1 #80

Open
imranshaikmuma opened this issue Jun 7, 2017 · 2 comments


I am getting the error below when trying to visualize an HDP model trained with gensim:

---------------------------------------------------------------------------
ValidationError Traceback (most recent call last)
in ()
----> 1 vis_data_hdp = gensimvis.prepare(hdpmodel, corpus, dictionary)
2 #pyLDAvis.display(vis_data_hdp)

C:\Anaconda2\lib\site-packages\pyLDAvis\gensim.pyc in prepare(topic_model, corpus, dictionary, doc_topic_dist, **kwargs)
110 """
111 opts = fp.merge(_extract_data(topic_model, corpus, dictionary, doc_topic_dist), kwargs)
--> 112 return vis_prepare(**opts)

C:\Anaconda2\lib\site-packages\pyLDAvis\_prepare.pyc in prepare(topic_term_dists, doc_topic_dists, doc_lengths, vocab, term_frequency, R, lambda_step, mds, n_jobs, plot_opts, sort_topics)
372 doc_lengths = _series_with_name(doc_lengths, 'doc_length')
373 vocab = _series_with_name(vocab, 'vocab')
--> 374 _input_validate(topic_term_dists, doc_topic_dists, doc_lengths, vocab, term_frequency)
375 R = min(R, len(vocab))
376

C:\Anaconda2\lib\site-packages\pyLDAvis\_prepare.pyc in _input_validate(*args)
63 res = _input_check(*args)
64 if res:
---> 65 raise ValidationError('\n' + '\n'.join([' * ' + s for s in res]))
66
67
ValidationError:

 * Not all rows (distributions) in doc_topic_dists sum to 1.
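
For anyone hitting the same message, it can help to look at which rows of the document-topic matrix are off before calling prepare. The snippet below is only a rough diagnostic sketch in plain Python, not pyLDAvis's own validation code: the helper names `bad_rows` and `normalize_rows`, the tolerance, and the sample matrix are all made up for illustration, and it assumes the matrix is available as a list of rows (one row of topic probabilities per document).

```python
import math


def bad_rows(doc_topic_dists, tol=1e-3):
    """Return (row_index, row_sum) for each row whose sum is not within tol of 1."""
    return [
        (i, sum(row))
        for i, row in enumerate(doc_topic_dists)
        if not math.isclose(sum(row), 1.0, abs_tol=tol)
    ]


def normalize_rows(doc_topic_dists):
    """Rescale each row so it sums to 1; rows that sum to 0 are copied unchanged."""
    out = []
    for row in doc_topic_dists:
        total = sum(row)
        out.append([p / total for p in row] if total > 0 else list(row))
    return out


# Hypothetical document-topic matrix, not taken from the model in this issue.
dists = [
    [0.7, 0.3],  # a valid distribution
    [0.5, 0.2],  # probability mass is missing
    [0.0, 0.0],  # no topic weight at all
]

for index, total in bad_rows(dists):
    print("row", index, "sums to", total)
```

Renormalizing makes the second kind of row pass a sum-to-1 check, but it cannot repair an all-zero row, so those documents would need to be inspected or dropped separately.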

To train the HDP model, I used the following call:
hdpmodel = models.hdpmodel.HdpModel(corpus, dictionary)

corpus looks like this:
[[(0, 1), (1, 1), (2, 1), (3, 1), (4, 1), (5, 1), (6, 1), (7, 1), (8, 1), (9, 1), (10, 1), (11, 2), (12, 1), (13, 2), (14, 1), (15, 1), (16, 1), (17, 1), (18, 1), (19, 1), (20, 1), (21, 1), (22, 1), (23, 1), (24, 1), (25, 1), (26, 1), (27, 1), (28, 1), (29, 1), (30, 1), (31, 1), (32, 1), (33, 4), (34, 1), (35, 1), (36, 1), (37, 1), (38, 1), (39, 2), (40, 1), (41, 2), (42, 1), (43, 2), (44, 1), (45, 1), (46, 1), (47, 3), (48, 1), (49, 1), (50, 2), (51, 1), (52, 1), (53, 1), (54, 1), (55, 1), (56, 1), (57, 1), (58, 1), (59, 1), (60, 1), (61, 1), (62, 1), (63, 1), (64, 1), (65, 1)]
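
One thing that may be worth ruling out (this is a guess, nothing in this thread confirms it is the cause) is documents with no tokens, since a document with no words cannot be given a meaningful topic distribution. The sketch below assumes the corpus is a list of bag-of-words documents in the `(token_id, count)` format shown above; `empty_doc_indices` is a hypothetical helper name and the sample corpus is invented.

```python
def empty_doc_indices(corpus):
    """Return the indices of documents whose total token count is zero."""
    return [
        i
        for i, doc in enumerate(corpus)
        if sum(count for _token_id, count in doc) == 0
    ]


# Hypothetical corpus in (token_id, count) form, not the one from this issue.
sample_corpus = [
    [(0, 1), (1, 1), (11, 2)],  # 4 tokens
    [],                         # empty document
    [(33, 4)],                  # 4 tokens
]

print(empty_doc_indices(sample_corpus))
```

If this returns any indices, filtering those documents out before training and before calling prepare is a cheap experiment to try.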

dictionary looks like this:
[u'', u'dacteur', u'reallocations', u'advcompliance', u'resolveboth............


ned2 commented Jun 12, 2018

I can confirm this also.


a087861 commented Jun 13, 2018

double confirmed
