Add Hacker News Challenge #164

Merged
merged 4 commits into from
Jul 20, 2020
raven300
Contributor

Signed-off-by: radha rads_venkatesan@yahoo.com

Signed-off-by: radha <rads_venkatesan@yahoo.com>
Contributor

@zeibura zeibura left a comment

Looks good, a few inline suggestions added

Signed-off-by: radha <rads_venkatesan@yahoo.com>
@jellypuno jellypuno self-requested a review July 15, 2020 07:53
@jellypuno jellypuno requested review from zeibura and MikeBauerCA July 16, 2020 12:02
@jellypuno
Contributor

Hi @MikeBauerCA @zeibura, can you please review this PR again? I would really like this to be merged. Thank you so much!

We will be working on a Hacker News 2015-2016 dataset from Kaggle with a full year’s worth of stories. Our goal is to extract only the Mainframe/COBOL-related stories and assign ranking scores to them based on a simplified version of the published Hacker News ranking algorithm. We will create a front page report that reflects this ranking order. The algorithm ensures that nothing stays on the front page for too long, so a story’s score will eventually drop to zero over time (the gravity effect). Since our posts are spread out over a year, and older posts will always have a lower (or zero) ranking, we will distort the data so that all our stories have the same date and consider only the times in the ranking score calculation. This will give all our posts a fair chance of landing on the front page. Our front page report is published at 11:59pm. [Here's some additional information on the ranking.](http://www.righto.com/2013/11/how-hacker-news-ranking-really-works.html)
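
For readers who want to check the arithmetic before writing the COBOL, here is a minimal Python sketch of the score described above. It assumes the commonly cited form of the published formula, `(votes - 1)^0.8 / (age_hours + 2)^1.8`, with the age measured from the posting time of day to the 11:59pm report time. The function name `ranking_score` and the `gravity` parameter are hypothetical, and the challenge's own simplified formula may differ.

```python
from datetime import datetime

# Assumption: the report time is 11:59pm, as stated in the description.
REPORT_MINUTES = 23 * 60 + 59


def ranking_score(votes: int, posted_at: datetime, gravity: float = 1.8) -> float:
    """Hypothetical helper: score a story using only its time of day.

    The date part of posted_at is ignored, which mirrors the idea of
    treating every story as if it had been posted on the same day.
    """
    posted_minutes = posted_at.hour * 60 + posted_at.minute
    age_hours = (REPORT_MINUTES - posted_minutes) / 60
    # A story's first vote does not count; clamp so the base is never negative.
    points = max(votes - 1, 0)
    return points ** 0.8 / (age_hours + 2) ** gravity


if __name__ == "__main__":
    early = ranking_score(50, datetime(2016, 3, 1, 8, 0))
    late = ranking_score(50, datetime(2015, 11, 9, 20, 0))
    # With equal votes, the story posted later in the day ranks higher.
    print(early < late)
```

Sorting the filtered stories by this score in descending order would give the front page order.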

### The Plan
[] There are different creative ways of accomplishing this but here’s our plan: We will have a COBOL program that reads the input CSV file and retrieves only the ***Mainframe/COBOL*** stories. It then calculates the ranking score for the stories by factoring in the time they were posted and the number of votes they received. Each of the records is then written to an output dataset along with the ranking score.
Contributor

Should [] be removed?

Contributor

@MikeBauerCA MikeBauerCA left a comment

I saw a couple instances of [ ] that should probably be removed. Other than that, this looks good to me. Thanks!

@MikeBauerCA
Contributor

@raven300 I'll mark as approved. Thanks for making the change. The DCO signoff is failing because it wants you to put raven300 as your name in the signoff. You will need to head to your local branch on your machine to resolve the DCO signoff issue. Be sure to pull first.

Signed-off-by: Radha Venkatesan <rads_venkatesan@yahoo.com>
Signed-off-by: radha <rads_venkatesan@yahoo.com>
@raven300
Contributor Author

Done. Thank you!

@MikeBauerCA
Contributor

@zeibura once you are satisfied with your requested changes, please approve and merge :)

@zeibura zeibura merged commit 1b72401 into openmainframeproject:master Jul 20, 2020
4 participants