[FEATURE REQUEST]: Implement ML Features #381

Open
10 of 39 tasks
GoEddie opened this issue Dec 30, 2019 · 22 comments
Labels
enhancement New feature or request good first issue Good for newcomers help wanted Extra attention is needed

Comments

@GoEddie
Contributor

GoEddie commented Dec 30, 2019

** These should all be implemented with #1031 **

=========================================

This is to track implementation of the ML-Features: https://spark.apache.org/docs/latest/ml-features

Bucketizer has been implemented in #378 but there are more features that should be implemented.

If anyone else is going to implement one of these, it's probably best to put a comment here and I'll keep the list up to date.

@GoEddie GoEddie added the enhancement New feature or request label Dec 30, 2019
@imback82
Contributor

Thanks @GoEddie for compiling the list.

> If anyone else is going to implement probably best to put a comment here or something??

This sounds good to me. We will review #378 in the next few days (some of us are out on vacation).

@GoEddie
Contributor Author

GoEddie commented Dec 30, 2019

Great, thanks @imback82, I didn’t expect anyone to reply until the new year :)

@Niharikadutta
Collaborator

@GoEddie Thanks so much for your work on this! Would you be willing to write a blog post on these ML-features currently supported on .NET for Apache Spark? Please let us know if that is something that could interest you, and if you might require any help from us. Thanks! :)

@GoEddie
Contributor Author

GoEddie commented Apr 13, 2020

Hi @Niharikadutta, sure, no problem. I would be happy to do this.

Where is best to publish it? On my site, or is there an MS one I could guest-write it for?

@Niharikadutta
Collaborator

Hi @GoEddie Awesome, thanks so much! I will get back to you with the details about where to publish and how to get you guest access for it. Could you provide me with a preferred email account you would wish to have this associated with, and I will work on getting that access. Thanks again! :)

@GoEddie
Contributor Author

GoEddie commented Apr 13, 2020

Great, ed.elliott AT outlook.com is good :)

@Niharikadutta
Collaborator

Perfect, thank you! Will get back to you soon with the details. Also, is this email good for the monthly office-hours Teams meeting as well, as discussed in a separate thread?

@GoEddie
Contributor Author

GoEddie commented Apr 13, 2020

Yes great (sorry forgot about that issue!)

@Niharikadutta
Collaborator

Awesome, thanks for confirming! Will set that one up starting sometime next month if that works for you, or please let me know when you would like to have it start, along with your day of the week/time preference.

@GoEddie
Contributor Author

GoEddie commented Apr 13, 2020

Just whatever suits you really, I didn’t want to make work for anyone

@Niharikadutta
Collaborator

Oh no, we are happy to do this :) I just want to make sure I have your timezone and working-hour constraints in mind before setting a meeting up.

@GoEddie
Contributor Author

GoEddie commented Apr 13, 2020

Great, I’m in the UK so morning US time works better for me

@Niharikadutta
Collaborator

Sounds good, will set up a cadence accordingly, thanks!

@Niharikadutta
Collaborator

Hi @GoEddie, we had a discussion with the .NET blog folks, and it looks like the current process is for the guest to publish on their own blog if they have one, and we then amplify the post through social media. Do you have a blog you can publish this on? If not, we can work on co-authoring one on the .NET blog.

@GoEddie
Contributor Author

GoEddie commented May 12, 2020

Hi @Niharikadutta, yes, no problem. I have a blog, https://the.agilesql.club/ - what are you looking for in the post?

@SARAVANA1501
Contributor

@Niharikadutta @GoEddie I saw the list of features; it is long and impressive. I would like to contribute to this. Can you help me get started and gain some context?

@Niharikadutta
Collaborator

Hi @SARAVANA1501! Thank you so much for your interest in contributing to this project. You can go through the following docs to familiarize yourself with the project and its coding guidelines:

  1. Coding guidelines
  2. Getting started guides
  3. Contributing guide
  4. Developer guide

You can also check out a few PRs related to ML features, as implemented by @GoEddie:

  1. FeatureHasher
  2. ML/CountVectorizer and ML/CountVectorizerModel
  3. ML Features: Word2Vec

Please feel free to reach out if you have any questions, and let us know if you have any feedback about the onboarding docs or process. Thanks again!

@SARAVANA1501
Contributor

@GoEddie please update the list: NGram is in progress.

@ramanathanv
Contributor

@GoEddie As part of #381, SQLTransformer class is in progress

@GoEddie
Contributor Author

GoEddie commented Nov 20, 2020

> @GoEddie As part of #381, SQLTransformer class is in progress

Awesome - have updated list :)

@ramanathanv
Contributor

Hi @GoEddie. I have started implementing the StringIndexer class.

@GoEddie
Contributor Author

GoEddie commented Dec 13, 2020

@ramanathanv awesome, have updated the list

6 participants