Support FeatureUnion for InvertableHashingVectorizer #16

Closed
lopuhin opened this issue Oct 11, 2016 · 4 comments · Fixed by #176

Comments

@lopuhin
Contributor

lopuhin commented Oct 11, 2016

Just adding features from .transformer_list, possibly with prefixes, should be enough

@kmike
Contributor

kmike commented Oct 11, 2016

Hm, do you mean it should be possible to pass a FeatureUnion to InvertableHashingVectorizer, and it will find all HashingVectorizers and apply itself to them? If so, it would also need to handle FeatureUnion recursively, and maybe support Pipeline as well, because a FeatureUnion can contain other FeatureUnions.

I wouldn't put it directly into the InvertableHashingVectorizer class; what about a helper function which returns a new FeatureUnion (or maybe a Pipeline)? It may be convenient for simple pipelines as well.

@lopuhin
Contributor Author

lopuhin commented Oct 12, 2016

Yeah, it looks hard to do in general, and we also need some sensible names for the different vectorizers (parts of union) to show them in the feature report. Right now I just construct feature_names by hand, including the vectorizer name in the feature name.
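
For illustration, building the names by hand could look roughly like the sketch below. The function name `prefixed_feature_names`, the `__` separator and the sample vectorizer names are all made up for this example. It only relies on a FeatureUnion concatenating its transformers' outputs in `transformer_list` order, so the names are concatenated in the same order.

```python
from typing import List, Sequence, Tuple


def prefixed_feature_names(
    named_features: Sequence[Tuple[str, Sequence[str]]],
    sep: str = "__",  # hypothetical separator, pick whatever reads well in the report
) -> List[str]:
    """Concatenate per-vectorizer feature names, prefixing each one
    with the name of the vectorizer it came from."""
    names: List[str] = []
    # Keep the same order as the union's transformer_list, so that
    # names[i] describes column i of the stacked feature matrix.
    for vec_name, features in named_features:
        names.extend(f"{vec_name}{sep}{feature}" for feature in features)
    return names


# Hypothetical union with two text vectorizers, "title" and "body".
names = prefixed_feature_names([
    ("title", ["hello", "world"]),
    ("body", ["hello"]),
])
print(names)
```

The prefix keeps identical tokens from different vectorizers apart in the feature report.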

I don't quite understand what the helper would do, sorry :)

@lopuhin
Contributor Author

lopuhin commented Nov 16, 2016

Aha, I finally understand what that helper could do: we already have some rudimentary support for FeatureUnion (meaning it works out of the box in simple cases), so that helper would just build a new feature union with each hashing vectorizer replaced by an inverting hashing vectorizer.
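
A rough sketch of such a helper, under stated assumptions: `replace_hashing`, `is_hashing` and `wrap` are hypothetical names, and the small classes below are stand-ins that only mimic the `transformer_list` attribute of a FeatureUnion and the `steps` attribute of a Pipeline, so that the snippet runs without scikit-learn. It is not the implementation from any branch or PR.

```python
from typing import Any, Callable


class Union:
    """Stand-in for a FeatureUnion: holds (name, transformer) pairs."""
    def __init__(self, transformer_list):
        self.transformer_list = transformer_list


class Pipe:
    """Stand-in for a Pipeline: holds (name, step) pairs."""
    def __init__(self, steps):
        self.steps = steps


class HashingVec:
    """Stand-in for a hashing vectorizer."""


class InvertingVec:
    """Stand-in for an inverting wrapper around a hashing vectorizer."""
    def __init__(self, vec):
        self.vec = vec


def replace_hashing(
    est: Any,
    is_hashing: Callable[[Any], bool],
    wrap: Callable[[Any], Any],
) -> Any:
    """Return a copy of the union/pipeline tree in which every estimator
    matched by is_hashing is replaced with wrap(estimator)."""
    if hasattr(est, "transformer_list"):
        # Union: rebuild it, recursing into each named transformer.
        # Note: with real estimators, other constructor parameters
        # would have to be copied too; this sketch ignores them.
        return type(est)([
            (name, replace_hashing(t, is_hashing, wrap))
            for name, t in est.transformer_list
        ])
    if hasattr(est, "steps"):
        # Pipeline: same idea, recursing into each named step.
        return type(est)([
            (name, replace_hashing(s, is_hashing, wrap))
            for name, s in est.steps
        ])
    # Leaf: wrap hashing vectorizers, pass everything else through.
    return wrap(est) if is_hashing(est) else est


# A union that contains a pipeline which contains another union.
inner = Union([("chars", HashingVec()), ("other", object())])
union = Union([("words", HashingVec()), ("nested", Pipe([("u", inner)]))])

new_union = replace_hashing(
    union,
    is_hashing=lambda e: isinstance(e, HashingVec),
    wrap=InvertingVec,
)
```

Returning a new object instead of mutating the input leaves the original union untouched, and the recursion covers unions nested inside unions or pipelines.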

@lopuhin
Contributor Author

lopuhin commented Jan 23, 2017

FeatureUnion support for text was implemented in #96, but it still lacks unhashing support. Some work on that is in the union-hashing-vec-3 branch.
