This repository has been archived by the owner on Jun 27, 2022. It is now read-only.

Need advice about how to evaluate your proficiency #11

Closed
floer32 opened this issue Nov 4, 2015 · 9 comments


floer32 commented Nov 4, 2015

Please don't sell yourself as a Machine Learning expert while you're still in the Danger Zone. Don't build bad products or publish junk science. This guide can't tell you how you'll know you've "made it" into Machine Learning competence ... let alone expertise. It's hard to evaluate proficiency without schools or other institutions. This is a common problem for self-taught people. Your best bet may be: expert peers.

If you know a good way to evaluate Machine Learning proficiency, please submit a Pull Request to share it with us.

We need to tell people how they'll know they're out of the Danger Zone, or how they'll know they're hireable.


floer32 changed the title from "Need info about how to evaluate your proficiency" to "Need advice about how to evaluate your proficiency" on Nov 5, 2015
floer32 pushed a commit that referenced this issue Nov 5, 2015
Right now just quoting the idea from Hacker News user,
olympus -- go compete!

May paraphrase, first want to submit for review.

addresses #11

floer32 commented Nov 5, 2015

I need some review!

I added a section warning people about the "Danger Zone" (when you know enough to throw some algorithms at a problem, but you don't have enough science or stats knowledge to be an expert). The Danger Zone is familiar to anyone who's taught themselves something really big. [Here's the section.]

What's missing: some advice! It would be nice if the guide could say something besides, basically, "It's hard."

Well, a user on Hacker News was kind enough to give a suggestion today. I put a quote from them on this branch because I'm hoping for some review. If it looks like a good thing to include, maybe I'll paraphrase.

I'm wary of making false promises or saying "well just do this and YOU'RE ALL SET," but honestly it does seem like sound advice. So maybe after paraphrasing it into a more cautious, conservative tone, it will be good.


floer32 commented Nov 5, 2015

@rhiever, have any thoughts? No worries if you don't want to touch this with a ten-foot pole :P I'd understand.

@davidlowjw ?

rhiever commented Nov 5, 2015

Kaggle competitions are a so-so way to practice ML. I have ethical issues with Kaggle because I think they're exploiting researchers to build products for companies, but that's a conversation for another day.

One good way to have your work double-checked is to post it on Cross-Validated: http://stats.stackexchange.com/

There are some really smart people on there who will give you great advice.

There are also some great online communities, like Hacker News, reddit.com/r/DataIsBeautiful, /r/DataScience, and /r/MachineLearning, where you can post your work and ask for feedback. I've learned a ton this way, and it really helps you practice dealing with feedback on your work (which is an often-underpracticed skill).

I think the best advice is to tell people to always present their methods clearly and to avoid over-interpreting their results. Part of being an expert is knowing that there's rarely a clear answer, especially when you're working with real data.


floer32 commented Nov 5, 2015

Thanks for the thoughtful response @rhiever.

I hadn't thought about Kaggle that way, as an outsider looking in ... I work in InfoSec, and I'm so used to bug bounties and the like that the premise didn't shock me. But this is a good perspective to hear.

So your suggestion is rather to:

  1. practice a lot with real data
  2. when you have a novel finding, reach out for review (on one of the communities you mentioned)
  3. fix issues and learn

And repeat, of course. This makes a lot of sense. I'll mull this over a bit and try to add a clear, succinct section to the guide.
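To make step 1 concrete, here's a hypothetical sketch (the dataset and model are just placeholders I picked, not anything from the guide): fit a simple baseline on real data and report a cross-validated score with its spread, so you don't over-interpret one lucky number.

```python
# Hypothetical sketch: dataset and model here are placeholders.
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# "Real data" stand-in; swap in whatever dataset you're practicing on.
X, y = load_breast_cancer(return_X_y=True)

# A simple, honest baseline you can explain beats a fancy model you can't.
model = LogisticRegression(max_iter=1000)

# Report the spread across folds, not just a single score.
scores = cross_val_score(model, X, y, cv=5)
print("accuracy: %.3f +/- %.3f" % (scores.mean(), scores.std()))
```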

> I think the best advice is to tell people to always present their methods clearly and to avoid over-interpreting their results. Part of being an expert is knowing that there's rarely a clear answer, especially when you're working with real data.

I don't think I can paraphrase this better than you've said it. Can I quote you? Alternatively, I could do this first PR (about the practice-review-fix approach), then you could use your own words and submit a PR. Or just quote. LMK. 😄

rhiever commented Nov 5, 2015

Sure, feel free to quote.

floer32 pushed a commit that referenced this issue Nov 7, 2015
floer32 pushed a commit that referenced this issue Nov 7, 2015
floer32 pushed a commit that referenced this issue Nov 7, 2015

floer32 commented Nov 7, 2015

Changes now in master. I think this really helps the guide and adds something that was missing. Thanks for your help on this, @rhiever!

@floer32 floer32 closed this as completed Nov 7, 2015
floer32 pushed a commit that referenced this issue Jan 13, 2016
floer32 pushed a commit that referenced this issue Jan 13, 2016
floer32 pushed a commit that referenced this issue Jan 13, 2016
floer32 pushed a commit that referenced this issue Jan 13, 2016

floer32 commented May 10, 2016

Saw your new library TPOT, @rhiever ... looks awesome! I'm going to try it out and figure out where to link to it in the guide. (issue)
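For anyone following along, here's roughly what trying it out might look like, going by TPOT's scikit-learn-style API (the dataset and parameters here are placeholders I chose, not a recommendation):

```python
# Rough sketch based on TPOT's scikit-learn-style API;
# dataset and parameters are placeholders.
from tpot import TPOTClassifier
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split

digits = load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, train_size=0.75, test_size=0.25)

# TPOT searches over scikit-learn pipelines via genetic programming.
tpot = TPOTClassifier(generations=5, population_size=20, verbosity=2)
tpot.fit(X_train, y_train)
print(tpot.score(X_test, y_test))

# Export the best pipeline it found as plain scikit-learn code.
tpot.export('tpot_digits_pipeline.py')
```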

rhiever commented May 10, 2016

👍 Let me know if you have any questions! Thanks for adding it.
