data_science

Seeing is believing. A witty saying proves nothing.

"When solving a problem of interest, do not solve a more general problem as an intermediate step." (Vladimir Vapnik)

Must read

foundation of dl: https://www.youtube.com/watch?time_continue=157&v=zl99IZvW7rE
(Bradley Efron) Bayesians, Frequentists, and Scientists: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.179.1454&rep=rep1&type=pdf
(Breiman) 2 cultures http://www2.math.uu.se/~thulin/mm/breiman.pdf
https://gluebenchmark.com/leaderboard

My implementations

Chatbot

RecSys

https://github.com/maciejkula/spotlight
session based: https://arxiv.org/pdf/1511.06939.pdf
pool next item: https://www.semanticscholar.org/paper/Deep-Neural-Networks-for-YouTube-Recommendations-Covington-Adams
tune nlp: http://ruder.io/deep-learning-nlp-best-practices/index.html#classification

Winning solutions

http://ndres.me/kaggle-past-solutions/
Rossmann Sales Forecasting, 1st solution: https://kaggle2.blob.core.windows.net/forum-message-attachments/102102/3454/Rossmann_nr1_doc.pdf

Stats

Good, Hardin. Common Errors in Statistics (and How to Avoid Them) (2003)
Kanji. 100 statistical tests (2006)
Doing Data Science: Straight Talk from the Frontline

Game Industry:

Case studies:

DS Coursera

Heroes of DL

Geoffrey Hinton: https://www.youtube.com/watch?v=-eyhCTvrEtE
Andrej Karpathy: https://www.youtube.com/watch?v=_au3yw46lcg

Top conferences:

KDD 2018 London, UK: http://www3.imperial.ac.uk/newsandeventspggrp/imperialcollege/engineering/datascienceinstitute/newssummary/news_22-8-2017-11-17-28
WSDM 2018, US: http://www.wsdm-conference.org/2018/
NIPS 2017, Long Beach, US: https://nips.cc/
DepLing 2017: http://www.depling.org/depling2017/program.html
CIKM 2017: http://cikm2017.org/
https://webdocs.cs.ualberta.ca/~zaiane/htmldocs/ConfRanking.html
http://www.guide2research.com/topconf/machine-learning
http://portal.core.edu.au/conf-ranks/?search=&by=all&source=CORE2017&sort=atitle&page=1

Deep Learning

http://www.deeplearningbook.org/contents/applications.html

Events: I will add a word cloud for these.

EMNLP 2017: http://noisy-text.github.io/2017/

NLPStan reading

http://nlp.stanford.edu/read/
NLP dataset: https://github.com/niderhoff/nlp-datasets

LXMLS16:

ACL2017

keynote: linguistics is back, it reduces the search space: https://drive.google.com/file/d/0B2cCJQ2_aOwjMlg5MnFjTEpBNG8/view

VietAI

Quoc Le (Google Brain): http://cs.stanford.edu/~quocle/
Thang Luong (Google Brain): http://t.co/3zNHouUn
Dustin (Columbia) http://dustintran.com/
Thien (NYU) http://www.cs.nyu.edu/~thien/
Hieu Pham (CMU) https://www.quora.com/profile/Hieu-Pham-20
Ken Tran (Microsoft) http://www.kentran.net/
Laurent Dinh (MILA): https://laurent-dinh.github.io/about/
Luong Hoang, Harvard: https://github.com/lhoang29/recurrent-entity-networks
Vu Pham

My SOTA

My ATIS: sequence tagging, nb of params: 324335, bi-LSTM
Quora question duplicate detection: Accuracy 85% on Wang's test

 - best F1 score: 94.92/94.64
 - train scores: 97.5446666667/96.17
 - val scores: 93.664/92.94

Game industry

TCCP PU learning https://arxiv.org/pdf/1802.09788.pdf
By last time login: https://mpra.ub.uni-muenchen.de/82871/1/paper8.pdf
https://www.slideshare.net/aistconf/webgames-61437118

Yandex

ICLR 2017 Review

if you want to tune an LSTM, this is worth reading (from Socher): https://arxiv.org/pdf/1611.05104v2.pdf

LearningNewThingIn2017

Torch/Lua (Facebook/HarvardNLP): http://nlp.seas.harvard.edu/code/, http://cs287.fas.harvard.edu/
TF/Python (Google/Stanford): https://github.com/BinRoot/TensorFlow-Book
cs287: https://github.com/CS287/Lectures

Conf events

Coling 2016, Osaka Japan: http://coling2016.anlp.jp/
ICLR 2017, Apr in France: http://www.iclr.cc/doku.php?id=ICLR2017:main&redirect=1
open review: http://openreview.net/group?id=ICLR.cc/2017/conference

NIPS 2016 slides

https://github.com/hindupuravinash/nips2016
Ian GAN tut: http://www.iangoodfellow.com/slides/2016-12-9-gans.pdf
Ng nuts and bolts: https://www.dropbox.com/s/dyjdq1prjbs8pmc/NIPS2016%20-%20Pages%202-6%20(1).pdf
variational inference: http://www.cs.columbia.edu/~blei/talks/2016_NIPS_VI_tutorial.pdf

Theano based DL applications

https://news.ycombinator.com/item?id=9283105

learn to learn: algos optimization

sgd and friends: http://cs231n.github.io/neural-networks-3/#update
overview of gd: http://sebastianruder.com/optimizing-gradient-descent/
keras-team/keras#898
I usually choose Adam or RMSprop and tune the learning rate and batch size.
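
To make the learning-rate knob concrete, here is a minimal pure-Python sketch of the Adam update rule on a one-dimensional quadratic. The objective, the starting point, the step count and the hyperparameter values are assumptions picked for the demo, not settings taken from these notes.

```python
import math

def adam_minimize(grad, x0, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=500):
    """Minimize a 1-D function from its gradient using the Adam update rule."""
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g        # running mean of gradients
        v = beta2 * v + (1 - beta2) * g * g    # running mean of squared gradients
        m_hat = m / (1 - beta1 ** t)           # bias correction for the zero init
        v_hat = v / (1 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x

# Hypothetical objective f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
x_min = adam_minimize(lambda x: 2 * (x - 3), x0=0.0)
```

Because the step is normalized by the running gradient magnitude, `lr` roughly sets the size of each move, which is why it is usually the first thing to tune.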

People

Pin:

semantic scholar: https://www.semanticscholar.org/
grow a mind: http://web.mit.edu/cocosci/Papers/tkgg-science11-reprint.pdf
trendingarxiv: http://trendingarxiv.smerity.com/
https://github.com/andrewt3000/DL4NLP
Natural language inference NLI: https://github.com/Smerity/keras_snli
ACL: http://www.aclweb.org/anthology/P/P16/

Data type: NOQ

Nominal (N): cat, dog --> x, o | vis: shape, color
Ordinal (O): Jan - Feb - Mar - Apr | vis: area, density
Quantitative (Q): numerical 0.42, 0.58 | vis: length, position

People:

Graham CMU: http://www.phontron.com/teaching.php, https://github.com/neubig/

Fin data:

Reuters 8M (2007-2016): https://github.com/philipperemy/Reuters-full-data-set.git
Bloomberg https://github.com/philipperemy/financial-news-dataset
stocktwits: https://github.com/goodwillyoga/E107project/tree/master/pooja/data

Projects:

https://github.com/THEdavehogue/glassdoor-analysis

Wikidata:

Cartoons & Quotes:

"cause you know sometimes words have two meanings" led zeppelin
http://stats.stackexchange.com/questions/423/what-is-your-favorite-data-analysis-cartoon?newsletter=1&nlcode=231076%7C1179

Books:

http://neuralnetworksanddeeplearning.com/index.html
u.cs.biu.ac.il/~yogo/nnlp.pdf

Done:

EMNLP 2016, Austin, 2-4 Nov: http://www.emnlp2016.net/tutorials.html#practical

Dynet (CMU): https://t.co/nSCkBt0i0F
lifelong ML (Google): http://www.emnlp2016.net/tutorials/chen-liu-t3.pdf
Markov logic for scalable joint inference: http://www.emnlp2016.net/tutorials/venugopal-gogate-ng-t2.pdf
good summary of sentiment analysis with NN (Singapore): http://www.emnlp2016.net/tutorials/zhang-vo-t4.pdf
structure prediction (POS, NER)(Singapore): http://www.emnlp2016.net/tutorials/sun-feng-t6.pdf
BADLS: 2-day conference at Stanford University

day 1:

Hugo(Twitter): Feed forward NN
Karpathy (OpenAI): Convnet
Socher(MetaMind): NLP = word2vec/glove + GRU + MemNet
Tensorflow tut: from 5:55:49
Ruslan: Deep Unsup Learning: from 7:10:39
Andrew Ng: Nuts and bolts in applied DL from 9:09:46

day 2:

Schulman: RL from 06:40
Pascal(MILA): theano, from 1:52:03
ASR from 4:01:11
NN with Torch from 5:49:32, https://github.com/alexbw/bayarea-dl-summerschool
seq2seq learning, Quoc Le: from 7:03:44
Bengio: Foundations and challenges in DL, from 9:01:14
data fest: https://alexanderdyakonov.wordpress.com/
8,9,12,13 Sept: data science week: http://dsw2016.datascienceweek.com/
KDD 2016: http://www.kdd.org/kdd2016/
ACL 2016, Berlin, 7-12 Aug: http://acl2016.org/index.php?article_id=60

AI mistakes:

napalm girl: https://techcrunch.com/2016/09/12/facebook-employees-say-deleting-napalm-girl-photo-was-a-mistake/
fine for his car shadow: http://www.independent.co.uk/news/world/europe/russian-driver-fined-car-shadow-moscow-a7225146.html
human on motorcycle: http://cs.stanford.edu/people/karpathy/deepimagesent/generationdemo/

Keras:

image classification with vgg16: http://www.pyimagesearch.com/2016/08/10/imagenet-classification-with-python-and-keras/
hualos, keras viz: https://github.com/fchollet/hualos
https://github.com/dylandrover/keras_tutorial/blob/master/keras_tutorial/keras_deck.pdf
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/learn/wide_n_deep_tutorial.py
model zoo: https://github.com/tensorflow/models
music auto tag: https://github.com/keunwoochoi/music-auto_tagging-keras
expose API: https://github.com/samjabrahams/inception-resnet-flask-demo

NLP:

Apps:

https://github.com/fginter/w2v_demo
http://bionlp-www.utu.fi/wv_demo/
3top: https://github.com/3Top/word2vec-api
next wave of nn: http://www.nextplatform.com/2016/09/14/next-wave-deep-learning-applications/
labeling tools: http://cs.stanford.edu/people/karpathy/ilsvrc/
deep art: https://deepart.io/hire/kzXhuUPf/
text sum: http://esapi.intellexer.com/Summarizer
http://www.deeplearningpatterns.com/doku.php/applications
mt: http://104.131.78.120/
rnn: http://www.cs.toronto.edu/~ilya/fourth.cgi?prefix=I+have+a+dream.+&numChars=150
chatbot: http://sumve.com/firesidechat/
text vis: http://slanglab.cs.umass.edu/topic-animator/
deep image sent: http://cs.stanford.edu/people/karpathy/deepimagesent/rankingdemo/

German word embedding:

pretrained: http://devmount.github.io/GermanWordEmbeddings/
vis: pca, tsne: https://github.com/devmount/GermanWordEmbeddings/blob/master/code/pca.ipynb

PyGotham:

textacy: http://michelleful.github.io/code-blog/2016/07/23/nlp-at-pygotham-2016/
nlp with keras, rnn, cnn
https://github.com/drincruz/PyGotham-2016
skipthought: https://libraries.io/github/LeavesBreathe/Sequence-To-Sequence-Generation-Skip-Thoughts-
https://github.com/ryankiros/skip-thoughts
doc sum: http://mike.place/talks/pygotham/#p1

Journalist LDA and ML:

Europython:

Scipy 2016:

http://scipy2016.scipy.org/ehome/146062/332963/

Performance Evaluation(PE):

book ELA: http://www.cambridge.org/us/academic/subjects/computer-science/pattern-recognition-and-machine-learning/evaluating-learning-algorithms-classification-perspective
slides: http://www.icmla-conference.org/icmla11/PE_Tutorial.pdf
bayesian hypothesis testing: http://ipg.idsia.ch/preprints/corani2015c.pdf

Hypothesis testing

http://bebi103.caltech.edu/2015/tutorials/t6b_frequentist_hypothesis_testing.html
central limit theorem: http://nbviewer.jupyter.org/github/mbakker7/exploratory_computing_with_python/blob/master/notebook_s3/py_exp_comp_s3_sol.ipynb
hypothesis testing and p value: http://vietsciences.free.fr/khaocuu/nguyenvantuan/bieudoR/ch7-kiemdinhgiathiet.htm

Metrics:

http://users.dsic.upv.es/~dpinto/duc/RougeLin.pdf

Rock, Metal and NLP:

Financial:

https://github.com/johnymontana/NewzTrader_AI_project

Twitter:

http://nlp.stanford.edu/projects/glove/preprocess-twitter.rb
GATE NER dataset: https://gate.ac.uk/wiki/broad-twitter-corpus.html

Deep Learning Frameworks/Toolkits:

Tensorflow
Torch
Theano
Keras
Dynet
CNTK

ElasticSearch + Kibana:

install ES 2.4 + Kibana: Sense console, Kibana's default port is 5601
http://ghostweather.slides.com/lynncherny/deck

Attention based:

code RWA in TF: https://github.com/jostmey/rwa
decomposable attention: https://github.com/explosion/spaCy/tree/master/examples/keras_parikh_entailment
customized lstm with attention: http://benjaminbolte.com/blog/2016/keras-language-modeling.html
vis + cnn + lstm: https://blog.heuritech.com/2016/01/20/attention-mechanism/

ResNet: Residual Networks

Sentiment

NER

https://github.com/aleju/ner-crf
2017 conference: http://noisy-text.github.io/2017/
demo: http://nlp.stanford.edu:8080/ner/process
ritter: https://www.cise.ufl.edu/class/cis6930fa11lad/cis6930fa11_NEROverTweets.pdf
cmu tweetnlp: http://www.cs.cmu.edu/~ark/TweetNLP/
opencalais: http://www.opencalais.com/opencalais-demo/
https://www.quora.com/How-can-I-find-city-country-company-name-from-a-tweet-text-using-Java
no broad domain, average accuracy 80-85% is quite good: https://www.quora.com/How-accurate-are-entity-extraction-tools
http://blog.districtdatalabs.com/named-entity-recognition-and-classification-for-entity-extraction
http://noisy-text.github.io/2016/ner-shared-task.html
https://noisy-text.github.io/2016/pdf/WNUT26.pdf
dataset: https://www.dropbox.com/s/yaoy7zi9vz71nki/wnut_ner_evaluation.tgz?dl=0
wnut solution: https://github.com/napsternxg/TwitterNER
dataset wnut16: https://github.com/aritter/twitter_nlp/tree/master/data/annotated/wnut16/data

ML Stacking

brew: https://github.com/viisar/brew
heamy: https://github.com/rushter/heamy

Tensorflow tutorials

Covariate shift

#PydataLondon2017

NLP course

https://www.cs.bgu.ac.il/~elhadad/nlp17.html

Dataset

Tricks of DL

Pointer network

Attention

https://arxiv.org/abs/1707.00110

Log likelihood test

tool http://ucrel.lancs.ac.uk/llwizard.html
significance testing of word frequency in corpora: https://users.ics.aalto.fi/lijffijt/articles/lijffijt2015a.pdf
TA and TM for social: https://de.dariah.eu/tatom/
http://sappingattention.blogspot.com/2011/10/comparing-corpuses-by-word-use.html#comments
http://sappingattention.blogspot.com/2011/11/dunning-amok.html
https://tedunderwood.com/2011/11/09/identifying-the-terms-that-characterize-an-author-or-genre-why-dunnings-may-not-be-the-best-method/
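
As a rough sketch of what the log-likelihood test computes when comparing one word across two corpora, the function below uses the common two-term G2 form (observed vs expected frequency in each corpus). The function name and the example counts are made up for illustration; whether a given tool uses exactly this form is an assumption.

```python
import math

def log_likelihood_g2(freq1, size1, freq2, size2):
    """G2 statistic for one word: freq = word count, size = corpus size in tokens."""
    total_freq = freq1 + freq2
    total_size = size1 + size2
    # Expected counts if the word were equally frequent in both corpora.
    expected1 = size1 * total_freq / total_size
    expected2 = size2 * total_freq / total_size
    g2 = 0.0
    for observed, expected in ((freq1, expected1), (freq2, expected2)):
        if observed > 0:  # the term 0 * ln(0) is taken as 0
            g2 += observed * math.log(observed / expected)
    return 2 * g2

# Hypothetical counts: 100 hits in 1000 tokens vs 10 hits in 1000 tokens.
g2_example = log_likelihood_g2(100, 1000, 10, 1000)
```

With one degree of freedom, a G2 above 3.84 corresponds to p < 0.05 under the chi-square approximation. The linked posts discuss why that alone may not be the best way to pick characteristic terms.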

MLtrainings.ru

quora presentation: https://gh.mltrainings.ru/presentations/Skornyakov_KaggleQuora_2017.pdf
hearthstone: https://gh.mltrainings.ru/presentations/Patekha_Hearthstone_2017.pdf

GCloud

Current conference

https://github.com/aymericdamien/TensorFlow-Examples

Timeline

WSDM 2019

Computer Vision

ICCV 2019

07.10

https://stackoverflow.com/questions/42307949/color-theme-for-vs-code-integrated-terminal/46166487
https://github.com/zhulingchen/tfp-tutorial
tf2 keras for researcher: https://colab.research.google.com/drive/1UCJt8EYjlzCs1H1d1X0iDGYJsHKwu-NO
visualizing outliers in big data: https://www.cs.uic.edu/~wilkinson/Publications/outliers.pdf

13.06

04.06

https://storage.googleapis.com/openimages/web/challenge.html

18.05

https://www.slideshare.net/HITCONGIRLS/ithome-2019-ai-turkeymelodypdf-138370023

17.05

14.05

13.05

08.05

https://github.com/cydonia999/Tiny_Faces_in_Tensorflow/blob/master/README.md

07.05

03.05

28.04

https://assessingpsyche.wordpress.com/2014/06/04/using-the-truncated-normal-distribution/

24.04

19.04

10.04

09.04

08.04

https://hurenjun.github.io/
beam search: https://www.coursera.org/lecture/nlp-sequence-models/beam-search-4EtHZ
joint embedding for transportation: https://hurenjun.github.io/pubs/aaai2019-slides.pdf
embedding for anomaly detection: https://hurenjun.github.io/pubs/icde2016-slides.pdf

05.04

03.04

01.04

31.03

30.03

29.03

https://github.com/yenchenlin/awesome-adversarial-machine-learning

28.03

21.03

https://www.nguyenvantuan.info/research-blog/the-blind-faith-in-the-p-values-should-be-stopped

20.03

14.03

11.03

07.03

06.03

https://www.youtube.com/channel/UCZ_qlZbg9EzwRnLq_hFQumQ/featured?app=desktop
https://www.slideshare.net/albedan/kaggle-days-paris-alberto-danese-ml-interpretability
xgboost from 0: https://www.youtube.com/watch?v=0hxX4XAf2DA
kdd2016 recsys ctr, field-aware: https://www.youtube.com/watch?v=1cRGpDXTJC8

01.03

21.02

https://boosters.pro/championships
machine learns the laws of physics: https://arxiv.org/abs/1807.10300
https://istina.msu.ru/media/publications/article/972/9eb/7537819/sw-factors-dyakonov.pdf

20.02

19.02

13.02

12.02

11.02

09.02

03.02

24.01

https://causalinference.gitlab.io/kdd-tutorial/

21.01

18.01

kids learn and acquire language using statistical learning. Chomsky school. https://www.youtube.com/watch?v=uSFPgDuyv6E
bootstrap with pitfalls: https://arxiv.org/pdf/1411.5279.pdf
categorical data analysis: https://www.youtube.com/watch?v=FCrYGuO8CmU
humbio: https://www.ted.com/talks/robert_sapolsky_the_biology_of_our_best_and_worst_selves?language=en

16.01

https://machinelearningforkids.co.uk/
www.quantamagazine.org/been-kim-is-building-a-translator-for-artificial-intelligence-20190110
https://ai.googleblog.com/2019/01/looking-back-at-googles-research.html
hbr.org/2019/01/data-science-and-the-art-of-persuasion

14.01

03.01

02.01

===== GOODBYE 2018

29.12

25.12

https://github.com/mwburke/population-stability-index/blob/master/walkthrough-example.ipynb
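
For reference, a minimal sketch of the population stability index over pre-binned counts, using the usual formula PSI = sum over bins of (actual% - expected%) * ln(actual% / expected%). The function name, the `eps` floor for empty bins and the example counts are assumptions for illustration and are not taken from the linked notebook.

```python
import math

def population_stability_index(expected_counts, actual_counts, eps=1e-6):
    """PSI between two binned distributions that share the same bin edges."""
    expected_total = sum(expected_counts)
    actual_total = sum(actual_counts)
    psi = 0.0
    for expected, actual in zip(expected_counts, actual_counts):
        # Floor the proportions so an empty bin does not produce log(0).
        expected_pct = max(expected / expected_total, eps)
        actual_pct = max(actual / actual_total, eps)
        psi += (actual_pct - expected_pct) * math.log(actual_pct / expected_pct)
    return psi

# Hypothetical score distribution at training time vs in production.
psi_same = population_stability_index([100, 200, 300], [10, 20, 30])
psi_shifted = population_stability_index([100, 200, 300], [300, 200, 100])
```

Every term is non-negative, so PSI is zero only when the two binned distributions match and grows as they drift apart.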

22.12

20.12

19.12

18.12

17.12

12.12

10.12

09.12

https://hai.stanford.edu/news/the_intertwined_quest_for_understanding_biological_intelligence_and_creating_artificial_intelligence/

07.12

06.12

https://github.com/tensorflow/ranking

04.12

02.12

01.12

https://github.com/zhpmatrix/zhpmatrix.github.io/blob/master/cellar/Dive_into_XGBoost.pdf

29.11

26.11

transform net for target sentiment analysis: https://ai.tencent.com/ailab/media/publications/acl/Transformation_Networks_for_Target-Oriented_Sentiment_Classification.pdf
https://lixin4ever.github.io/paper/ACL2018/slides/acl18_lixin_slides.pdf

BERT with <3

20.11

15.11

14.11

vietnamese ner: https://github.com/duongna21/VNsequencelabeling
pzad data preprocessing: https://github.com/Dyakonov/PZAD/blob/master/PZAD2018_09_datapreprocessing_15.pdf
https://medium.com/acing-ai/what-is-hidden-in-the-hidden-markov-models-eee7bab45ac3

13.11

http://ruder.io/optimizing-gradient-descent/
don't decay the lr, increase the batch size instead: https://arxiv.org/abs/1711.00489
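
The idea can be sketched as a schedule: at each epoch where a step decay would shrink the learning rate by some factor, grow the batch size by that factor and leave the learning rate alone. The milestone epochs, the factor of 2 and the cap below are hypothetical values for illustration, not the paper's settings.

```python
def batch_size_schedule(base_batch_size, epoch, milestones, factor=2, max_batch_size=None):
    """Batch size for an epoch: multiplied by `factor` at every milestone passed."""
    batch_size = base_batch_size
    for milestone in milestones:
        if epoch >= milestone:
            batch_size *= factor
    if max_batch_size is not None:
        # Memory usually limits how far the batch can grow.
        batch_size = min(batch_size, max_batch_size)
    return batch_size

# Hypothetical run: start at 128 and double at epochs 30 and 60.
sizes = [batch_size_schedule(128, epoch, milestones=[30, 60]) for epoch in (0, 30, 60)]
```

Once the cap is reached, a practical fallback is to go back to decaying the learning rate for the remaining milestones.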

12.11

10.11

deep learning in airbnb search: https://arxiv.org/pdf/1810.09591.pdf
https://www.youtube.com/watch?v=FmKU-1LZGoE

08.11

07.11

06.11

https://github.com/Hvass-Labs/TensorFlow-Tutorials

04.11

01.11

29.10

https://towardsdatascience.com/understanding-feature-engineering-part-1-continuous-numeric-data-da4e47099a7b

25.10

https://ingoscholtes.github.io/kdd2018-tutorial/

23.10

18.10

16.10

10.10

elmo at apple: https://machinelearning.apple.com/2018/09/27/can-global-semantic-context-improve-neural-language-models.html
https://github.com/MorvanZhou/Tensorflow-Tutorial
expose blackbox: https://github.com/tsterbak/pydata2018-amsterdam/blob/master/presentation.ipynb
elmo with keras: https://github.com/UKPLab/elmo-bilstm-cnn-crf

09.10

08.10

03.10

02.10

Cross-View Training (CVT) better than ELMo? https://arxiv.org/abs/1809.08370
https://machinelearning.apple.com/2018/09/27/can-global-semantic-context-improve-neural-language-models.html

29.09

27.09

26.09

25.09

24.09

21.09

20.09

19.09

https://towardsdatascience.com/elmo-embeddings-in-keras-with-tensorflow-hub-7eb6f0145440

18.09

16.09

13.09

11.09

08.09

hyperbolic RS: https://arxiv.org/pdf/1809.01703.pdf

07.09

https://medium.com/@matsutton/repurchase-rate-the-most-overlooked-ecommerce-kpi-337bccde184b

04.09

28.08

kdd wrap up: https://habr.com/company/mailru/blog/421041/
bayesian reasoning: https://github.com/bayesgroup/deepbayes-2018/blob/master/day1_bayesian-reasoning/presentation.pdf

27.08

23.08

22.08

stats and sport https://statsbylopez.com/276labs/
cs229 https://stanford.edu/~shervine/teaching/cs-229.html

21.08

ncsoft blade & soul churn prediction https://arxiv.org/pdf/1802.02301.pdf
bayesian intro: https://www.datascience.com/blog/introduction-to-bayesian-inference-learn-data-science-tutorials

20.08

churn data science game https://arxiv.org/pdf/1802.02301.pdf
https://speakerdeck.com/teoliphant/ml-in-python?slide=46
Murphy's law: anything that can go wrong will go wrong https://en.wikipedia.org/wiki/Murphy%27s_law
https://alexanderdyakonov.wordpress.com/2018/07/30/%D0%B1%D0%B0%D0%B9%D0%B5%D1%81%D0%BE%D0%B2%D1%81%D0%BA%D0%B8%D0%B9-%D0%BF%D0%BE%D0%B4%D1%85%D0%BE%D0%B4/
https://github.com/springcoil/PyDataLondonTutorial/blob/master/notebooks/LogisticRegScikitlearn.ipynb

18.08

17.08

16.08

15.08

14.08

large to small better than small to large: http://koaning.io/variable-selection-in-machine-learning.html
bayesian is good https://blog.datank.ai/how-i-learned-to-stop-worrying-and-love-uncertainty-fd13c23442b6
think bayesian: http://www.greenteapress.com/thinkbayes/thinkbayes.pdf
bayesian for hackers: https://github.com/CamDavidsonPilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers

13.08

http://cbl.eng.cam.ac.uk/pub/Intranet/MLG/ReadingGroup/Bronskill_infer_NET.pdf
Tim is good: http://timvieira.github.io/
https://imaddabbura.github.io/blog/machine%20learning/data%20science/2018/03/15/predicting-loan-repayment.html
gumbel max trick: https://arxiv.org/abs/1611.01144
love uncertainty: https://github.com/arinarmo/love_uncertainty/blob/master/slides.pdf
Vincent talk: https://www.youtube.com/watch?v=dE5j6NW-Kzg
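
A minimal sketch of the Gumbel-max trick from the list above: adding independent Gumbel(0, 1) noise to the log-probabilities and taking the argmax yields a sample from the categorical distribution. The function name, the example probabilities and the seed are assumptions for illustration.

```python
import math
import random

def gumbel_max_sample(probs, rng=random):
    """Draw one category index from `probs` via argmax(log p_i + Gumbel noise)."""
    best_index, best_score = None, -math.inf
    for index, p in enumerate(probs):
        if p <= 0:
            continue  # zero-probability categories can never win
        u = rng.random()
        while u == 0.0:  # avoid log(0); random() lies in [0, 1)
            u = rng.random()
        gumbel_noise = -math.log(-math.log(u))
        score = math.log(p) + gumbel_noise
        if score > best_score:
            best_index, best_score = index, score
    return best_index

# Empirical check with a fixed seed: frequencies should approach the probabilities.
rng = random.Random(0)
probs = [0.2, 0.5, 0.3]
counts = [0, 0, 0]
for _ in range(20000):
    counts[gumbel_max_sample(probs, rng)] += 1
frequencies = [c / 20000 for c in counts]
```

The linked paper replaces the argmax with a temperature-controlled softmax so that the sample becomes differentiable.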

10.08

08.08

07.08

06.08

03.08

01.08

30.07

27.07

26.07

24.07

20.07

17.07

15.07

14.07

11.07

10.07

05.07

http://eric.univ-lyon2.fr/~ricco/cours/slides/PJ%20-%20en%20-%20machine%20learning%20avec%20scikit-learn.pdf

04.07

29.06

28.06

26.06

25.06

22.06

21.06

https://einstein.ai/static/images/pages/research/decaNLP/decaNLP.pdf

20.06

19.06

18.06

15.06

toxic in russian https://www.youtube.com/watch?v=aMlpeDOjib8
multitask learning https://arxiv.org/pdf/1806.03713.pdf

14.06

12.06

trieutrinh, google brain: https://github.com/tensorflow/models/tree/master/research/lm_commonsense
finetune transformer: https://github.com/openai/finetune-transformer-lm
https://blog.openai.com/language-unsupervised/

11.06

09.06

http://dylan-chen.com/model/lightgbm-tutorial/

08.06

07.06

06.06

05.06

https://www.tensorflow.org/hub/modules/google/universal-sentence-encoder/1
okcupid, basic stats: https://ww2.amstat.org/publications/jse/v23n2/kim.pdf

04.06

02.06

01.06

29.05

https://elitedatascience.com/python-seaborn-tutorial

28.05

26.05

25.05

24.05

23.05

22.05

21.05

18.05

17.05

sentence piece, sub word: https://github.com/google/sentencepiece
fastai nlp with transfer learning: http://forums.fast.ai/t/part-2-lesson-10-wiki/14364
https://xcitech.github.io/tutorials/heroku_tutorial/
lime: https://homes.cs.washington.edu/~marcotcr/blog/lime/
http://nlp.fast.ai/
https://medium.com/activewizards-machine-learning-company/top-7-data-science-use-cases-in-finance-303c05a3cb58
https://www.oreilly.com/learning/introduction-to-local-interpretable-model-agnostic-explanations-lime

15.05

https://www.youtube.com/watch?v=mmLukrKMSnw

14.05

13.05

10.05

09.05

08.05

07.05

02.05

make nnet uncool again: http://www.fast.ai/2018/04/29/categorical-embeddings/
https://pdfs.semanticscholar.org/8004/cd728305c9abb203cc09885c64fcc5e45f43.pdf

01.05

30.04

29.04

28.04

24.04

https://www.slideshare.net/amr_qura/neural-network-based-player-retention-prediction-in-free-to-play-games

23.04

20.04

19.04

https://medium.com/@keeper6928/how-to-unit-test-machine-learning-code-57cf6fd81765

18.04

15.04

10.04

09.04

06.04

05.04

04.04

https://academy.microsoft.com/en-us/professional-program/tracks/artificial-intelligence/

02.04

01.04

https://funmatu.wordpress.com/2017/11/02/hyperopt/

churn:

repeat purchase:

31.03

30.03

28.03

27.03

26.03

24.03

23.03

22.03

21.03

20.03

19.03

18.03

16.03

12.03

08.03

07.03

05.03

04.03

01.03

28.02

https://github.com/slundberg/shap

27.02

26.02

21.02

20.02

13.02

https://gist.github.com/iskandr/a874e4cf358697037d14a17020304535

09.02

07.02

https://github.com/the-deep-learners/TensorFlow-LiveLessons/
https://github.com/jfloff/pywFM
Uber experiment design: https://www.youtube.com/watch?v=9bl7SPSqbX0

06.02

05.02

02.02

01.02

31.01

30.01

29.01

26.01

25.01

22.01

20.01

19.01

fashion relevant is not enough: https://arxiv.org/pdf/1406.3561.pdf
Yahoo portrait user: https://arxiv.org/pdf/1512.04912.pdf
predict buying intention: https://arxiv.org/pdf/1511.06247.pdf
realtime community detection: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0188702

18.01

17.01

15.01

12.01

11.01

https://github.com/emilwallner/Screenshot-to-code-in-Keras

10.01

08.01

04.01

03.01

02.01

22.12

http://www.real-statistics.com/chi-square-and-f-distributions/one-sample-hypothesis-testing-variance/

21.12

20.12

18.12

17.12

16.12

https://donjayamanne.github.io/pythonVSCodeDocs/docs/jupyter_getting-started/

15.12

14.12

13.12

12.12

https://rare-technologies.com/mummy-effect-bridging-gap-between-academia-industry/
http://ruder.io/deep-learning-optimization-2017/
don't decay the learning rate, increase the batch size: https://pdfs.semanticscholar.org/3299/aee7a354877e43339d06abb967af2be8b872.pdf
https://medium.com/@Synced/nips-2017-day-1-2-highlights-67ab464086c

11.12

10.12

https://www.datascience.com/resources/notebooks/overview-churn-modeling-techniques

07.12

bayesian variable explanation: https://www.kdnuggets.com/2017/11/bayesian-networks-understanding-effects-variables.html
end2end ML/DL https://aws.amazon.com/sagemaker/ (colab?)
test of time https://www.youtube.com/watch?time_continue=2&v=Qi1Yry33TQE

06.12

05.12

04.12

02.12

online marketing applications

01.12

https://github.com/latuannetnam/kaggle

30.11

29.11

28.11

27.11

24.11

23.11

22.11

21.11

17.11

16.11

15.11

14.11

https://github.com/PipelineAI/pipeline

13.11

10.11

what's wrong with CNNs: https://www.youtube.com/watch?v=rTawFwUvnLE
https://medium.com/@culurciello/deep-neural-network-capsules-137be2877d44

09.11

08.11

pearson correlation: https://en.wikipedia.org/wiki/Pearson_correlation_coefficient
jensen inequality: https://en.wikipedia.org/wiki/Jensen%27s_inequality
ui2code: https://uizard.io/
https://pypi.python.org/pypi/textstat/
mse vs pearson correlation: http://www.bwgriffin.com/gsu/courses/edur8132/notes/Notes8c2_RegressionModelFit.pdf
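
One way to see why MSE and Pearson correlation measure different things: a prediction that is off by a constant keeps a perfect correlation while its squared error is large. The sketch below uses made-up numbers; the function names are illustrative.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(var_x * var_y)

def mse(y_true, y_pred):
    """Mean squared error between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical predictions shifted by a constant offset of 10.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [t + 10.0 for t in y_true]
r = pearson_r(y_true, y_pred)   # correlation ignores the shift
err = mse(y_true, y_pred)       # MSE is dominated by the shift
```

Correlation is invariant to shifting and positive rescaling of the predictions, so it rewards getting the ordering and linear trend right, while MSE penalizes any miscalibration.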

03.11

https://hackernoon.com/latest-deep-learning-ocr-with-keras-and-supervisely-in-15-minutes-34aecd630ed8

02.11

01.11

two sample test, mean: https://www.isixsigma.com/tools-templates/hypothesis-testing/making-sense-two-sample-t-test/
two sample test, ratio: https://github.com/maoting1223/pycon_sg_2016
Welch's t-test vs Student's t-test: http://daniellakens.blogspot.com/2015/01/always-use-welchs-t-test-instead-of.html
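
For the two-sample comparison of means, a minimal sketch of Welch's t statistic and its Welch-Satterthwaite degrees of freedom, using only the standard library. The sample values are made up, and the p-value is left out because the t distribution is not available in the standard library.

```python
import math
from statistics import mean, variance

def welch_t_test(sample_a, sample_b):
    """Return (t statistic, degrees of freedom) without assuming equal variances."""
    n_a, n_b = len(sample_a), len(sample_b)
    var_a, var_b = variance(sample_a), variance(sample_b)  # sample variances (n - 1)
    se_squared = var_a / n_a + var_b / n_b
    t_stat = (mean(sample_a) - mean(sample_b)) / math.sqrt(se_squared)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = se_squared ** 2 / (
        (var_a / n_a) ** 2 / (n_a - 1) + (var_b / n_b) ** 2 / (n_b - 1)
    )
    return t_stat, df

# Hypothetical samples with clearly different spreads.
t_stat, df = welch_t_test([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
```

Student's version pools the two variances and uses n_a + n_b - 2 degrees of freedom. Welch's version drops the equal-variance assumption, which is the argument made in the linked post.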

31.10

structure data: https://github.com/random-forests/tensorflow-workshop/blob/master/examples/07_structured_data.ipynb
https://www.pyimagesearch.com/2017/10/30/how-to-multi-gpu-training-with-keras-python-and-deep-learning/
kaggle survey: LR first, tree second: https://www.kaggle.com/surveys/2017
fe best practice: https://www.quora.com/What-are-some-best-practices-in-Feature-Engineering
ppmi vs svd: https://github.com/piskvorky/word_embeddings/blob/master/run_embed.py
class imbalance in cnn: https://arxiv.org/pdf/1710.05381.pdf
rnnvis: https://arxiv.org/pdf/1710.10777.pdf
task detection from email: https://medium.com/@rodrigo_23805/extracting-tasks-from-emails-first-challenges-86e7fbbf4672
interactive cm: https://rare-technologies.com/interactive-confusion-matrix-python/

30.10

29.10

linguistic structure is back, acl 2017: http://www.abigailsee.com/2017/08/30/four-deep-learning-trends-from-acl-2017-part-1.html

28.10

https://www.bloomberg.com/graphics/2017-wall-street-robots/

27.10

https://www.kaggle.com/knowledgegrappler/magic-embeddings-keras-a-toy-example

26.10

Coursera kaggle: https://www.coursera.org/learn/competitive-data-science

25.10

24.10

23.10

https://github.com/leondz/entity_recognition

20.10

19.10

18.10

swish(x) = x * sigmoid(x): https://arxiv.org/pdf/1710.05941.pdf
DrQA: document retriever, document reader: https://github.com/facebookresearch/DrQA
https://gist.github.com/GaelVaroquaux/ead9898bd3c973c40429
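
The swish activation noted above is small enough to write out. This is a scalar pure-Python sketch; the `beta` argument follows the x * sigmoid(beta * x) form, and the numerically stable sigmoid is an implementation choice made here, not something taken from the paper.

```python
import math

def sigmoid(x):
    """Logistic function, written to avoid overflow for large negative x."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def swish(x, beta=1.0):
    """Swish activation: x * sigmoid(beta * x)."""
    return x * sigmoid(beta * x)

# Behaves like the identity for large positive x and goes to 0 for large negative x.
values = [swish(x) for x in (-10.0, 0.0, 1.0, 10.0)]
```

Unlike ReLU, the function is smooth and dips slightly below zero for moderately negative inputs.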

17.10

outlier detection: https://storage.googleapis.com/supplemental_media/udacityu/3104648634/Hodge+Austin_OutlierDetection_AIRE381.pdf
https://lilianweng.github.io/lil-log/2017/09/28/anatomize-deep-learning-with-information-theory.html
opening the black box of DNN: https://arxiv.org/pdf/1703.00810.pdf
information plane for DL: https://www.youtube.com/watch?v=bLqJHjXihK8
information theory with C.Olah: http://colah.github.io/posts/2015-09-Visual-Information/

16.10

https://developers.soundcloud.com/blog/soundclouds-data-science-process

15.10

13.10

Information theory of DL https://www.youtube.com/watch?v=RKvS958AqGY
https://arxiv.org/pdf/1709.03856.pdf

12.10

11.10

10.10

07.10

https://arxiv.org/pdf/1710.00027.pdf

05.10

04.10

03.10

02.10

30.09

29.09

feature selection multiple hypothesis testing: http://kelvinguu.com/posts/feature-selection-and-multiple-hypothesis-testing/
how to do feature selection correctly: http://kelvinguu.com/posts/why-naive-cross-validation-fails-at-feature-selection/
https://habrahabr.ru/post/326122/
http://soloro.ru
http://kelvinguu.com/
http://jakob.uszkoreit.net/
coarse to fine QA for long document: https://arxiv.org/pdf/1611.01839.pdf
generating sentences by editing prototypes: https://arxiv.org/pdf/1709.08878.pdf

28.09

27.09

25.09

22.09

http://blog.kaggle.com/2017/09/21/instacart-market-basket-analysis-winners-interview-2nd-place-kazuki-onodera/

21.09

19.09

memory augmented nnet for nlp: https://drive.google.com/file/d/0B9dqzboiV5u-UmxJQlJqcUl6anM/view
kaggle quora blog: https://indatalabs.com/blog/data-science/how-to-win-kaggle-competition

18.09

17.09

http://xrds.acm.org/blog/2017/07/power-wordnet-use-python/
https://simons.berkeley.edu/sites/default/files/docs/5950/2017.02.01-21.15.12-simons-nlp-tutorial.pdf
talking to machine: http://cs.stanford.edu/~pliang/papers/talking-xrds2014.pdf
zero learning talk: https://www.youtube.com/watch?v=6O5sttckalE

16.09

https://github.com/philipperemy/tensorflow-class-activation-mapping

15.09

http://www.cs.tut.fi/kurssit/SGN-2556/slides/Lecture6.pdf

14.09

https://cloud.google.com/blog/big-data/2017/01/learn-tensorflow-and-deep-learning-without-a-phd

13.09

strong algos: GBT, RF, SVM for classification: https://arxiv.org/pdf/1708.05070.pdf
https://medium.com/slalom-engineering/detecting-malicious-requests-with-keras-tensorflow-5d5db06b4f28
https://github.com/tensorflow/workshops
https://github.com/chuckyee/cardiac-segmentation
real time CNN: https://github.com/lampts/face_classification/blob/master/technical_report.pdf

12.09

https://en.wikipedia.org/wiki/White_Noise_(novel)
Hitchhiker's Guide to the Galaxy:
https://www.cs.bgu.ac.il/~yoavg/uni/bloglike/baboons.html
http://u.cs.biu.ac.il/~yogo/courses/sem2017/

11.09

word embedding Komninos https://www.cs.york.ac.uk/nlp/extvec/
https://ku.cloud.panopto.eu/Panopto/Pages/Viewer.aspx?id=0954a17c-2702-4d8e-9412-12ae958a2790
score distribution is better: https://arxiv.org/abs/1707.09861
make a stable architecture: https://arxiv.org/abs/1707.06799, pretrained embedding, last layer of lstm is crucial.
https://github.com/lanwuwei/paraphrase-dataset
why non convex: https://github.com/lanwuwei/paraphrase-dataset
https://www.reddit.com/r/dataisbeautiful/comments/6ykfvl/average_word_length_for_nytimes_crossword_answers/

10.09

dilated convnet https://medium.com/@TalPerry/convolutional-methods-for-text-d5260fd5675f
quora view: https://www.quora.com/challenges#views

09.09

08.09

07.09

06.09

05.09

ds interview: http://www.thedsinterview.com/
4 trends: structure is back, re embedding, blackbox transparency, attention: http://www.abigailsee.com/2017/08/30/four-deep-learning-trends-from-acl-2017-part-2.html
https://github.com/UKPLab/emnlp2017-relation-extraction
interpret rnn: https://github.com/philipperemy/tensorflow-isan-rnn

04.09

03.09

02.09

01.09

31.08

30.08

effective tf: https://github.com/vahidk/EffectiveTensorflow
knn and bilstm https://arxiv.org/pdf/1708.07863.pdf
https://nlp.stanford.edu/pubs/jia2017adversarial.pdf
https://github.com/dformoso/machine-learning-mindmap

29.08

https://medium.com/@burgalon/deploying-your-keras-model-using-keras-js-2e5a29589ad8

28.08

26.08

25.08

24.08

22.08

21.08

18.08

http://evexdb.org/pmresources/vec-space-models/

17.08

https://medium.com/the-mission/a-genius-explains-how-to-be-creative-claude-shannons-long-lost-1952-speech-fbbcb2ebe07f

16.08

15.08

http://bayes.wustl.edu/etj/prob/book.pdf

14.08

13.08

https://github.com/vahidk/EffectiveTensorflow

11.08

10.08

https://github.com/cpury/lstm-math

09.08

dl course: https://www.coursera.org/specializations/deep-learning

08.08

07.08

06.08

04.08

emoji transfer learning: https://arxiv.org/pdf/1708.00524.pdf
http://deepmoji.mit.edu/
importance sampling https://arxiv.org/pdf/1706.00043.pdf
larochelle https://drive.google.com/file/d/0ByUKRdiCDK7-LXZkM3hVSzFGTkE/view
bengio https://drive.google.com/file/d/0ByUKRdiCDK7-UXB1R1ZpX082MEk/view

01.08

31.07

25.07

24.07

data readiness: https://arxiv.org/pdf/1705.02245.pdf
trophy data scientist: https://peadarcoyle.wordpress.com/2017/07/23/avoiding-being-a-trophy-data-scientist/
best paper cvpr 17: https://arxiv.org/pdf/1608.06993.pdf, https://github.com/liuzhuang13/DenseNet
https://github.com/titu1994/DenseNet
https://github.com/UKPLab/emnlp2017-bilstm-cnn-crf

23.07

https://github.com/greydanus/excitation_bp

22.07

21.07

20.07

https://github.com/hollance/YOLO-CoreML-MPSNNGraph

19.07

18.07

17.07

15.07

14.07

13.07

12.07

10.07

06.07

https://nlp.stanford.edu/software/crf-faq.shtml
Redcatlabs: http://www.redcatlabs.com/2015-11-24_IES-2015_NER-from-Experts/
embedding compression http://sei.pku.edu.cn/~moull12/paper/cikm16.pdf
https://github.com/facebookresearch/InferSent

Maxout:

05.07

04.07

03.07

02.07

30.06

29.06

scorecard application: https://www.linkedin.com/pulse/credit-risk-scorecard-monitoring-tracking-shailendra
http://cds.nyu.edu/wp-content/uploads/2014/04/bertini_datascience_showcase_May12_2014.pdf
annotation tool: https://github.com/RicardoUsbeck/QRTool
ned dataset: https://datahub.io/dataset/reuters-128-nif-ner-corpus

28.06

27.06

26.06

24.06

https://github.com/klarsen1/Information

23.06

22.06

21.06

19.06

attention is all you need: https://github.com/Kyubyong/transformer
http://damiano.github.io/learning-similarity-functions-ORM/
https://github.com/abhishekkrthakur/clickbaits_revisited
entity filtering and topic detection: thesis-DamianoSpina.pdf
https://alexanderdyakonov.files.wordpress.com/2017/06/book_boosting_pdf.pdf
https://github.com/ejmeij/entity-linking-and-retrieval-tutorial

14.06

automating FE, OneBM: https://arxiv.org/pdf/1706.00327.pdf
imbalance sklearn: https://glemaitre.github.io/talks/2017_PyParis/#1
feature selection: http://www.kdnuggets.com/2017/06/practical-importance-feature-selection.html
https://groups.google.com/a/tensorflow.org/forum/#!msg/discuss/Dhy9MseSXQI/naoy_EElBAAJ
https://github.com/curiousily
EL and ER: https://www.dropbox.com/sh/h7fr4yfrih6tisr/Q9BU8Qshcq?lst=

13.06

12.06

09.06

07.06

05.06

https://github.com/Franck-Dernoncourt/NeuroNER

02.06

01.06

31.05

30.05

29.05

why PReLU, maxout: http://cs231n.stanford.edu/slides/2017/cs231n_2017_lecture6.pdf

26.05

25.05

21.05

20.05

https://github.com/nelson-liu/paraphrase-id-tensorflow

19.05

18.05

17.05

http://www.jaist.ac.jp/~bao/DS2017/BigData-I-Dinh-v4-4perPage.pdf

16.05

15.05

13.05

12.05

11.05

10.05

A/B test common pitfalls: https://www.youtube.com/watch?v=NkQ51iyFgs0

09.05

http://www.aclweb.org/anthology/N10-1021

08.05

05.05

04.05

03.05

02.05

https://github.com/tjpalanca/facebook-news-analysis

30.04

https://snap.stanford.edu/data/web-Amazon.html

27.04

https://howchoo.com/g/ytkwotvkztq/using-the-iterm-2-and-tmux-integration

26.04

25.04

24.04

21.04

https://www.kaggle.com/arthurtok/titanic/introduction-to-ensembling-stacking-in-python/notebook

20.04

19.04

18.04

17.04

16.04

15.04

14.04

13.04

12.04

10.04

08.04

07.04

https://gist.github.com/udibr
tf sequence tagging: https://guillaumegenthial.github.io/sequence-tagging-with-tensorflow.html
tweet2vec cluster: https://github.com/vendi12/tweet2vec_clustering
learning to generate reviews and discover sentiment: https://github.com/openai/generating-reviews-discovering-sentiment
https://aclweb.org/anthology/K15-1013
https://github.com/brmson/dataset-sts
https://drive.google.com/drive/folders/0B-btHzfJjPnobXZ0MndjSkxkRkk

06.04

05.04

04.04

03.04

01.04

https://github.com/tuanavu/coursera-university-of-washington/tree/master/machine_learning/3_classification

31.03

30.03

deepnl: https://github.com/attardi/deepnl
https://gist.github.com/jeremystan/c236000a4159f9d47c28784fa6693c45#file-initial_architecture-py
Relationship Modeling network: https://pbs.twimg.com/media/C7dvymYVQAAut9_.jpg:large
https://tech.instacart.com/deep-learning-with-emojis-not-math-660ba1ad6cdc
Rethink RNN: https://docs.google.com/document/d/1X9f-wst8QhrCCFTWiJIz6vq1qAOlpyYAUo_kaFf0J8M/edit
crfasrnn: https://github.com/torrvision/crfasrnn

29.03

28.03

27.03

26.03

25.03

https://github.com/seatgeek/fuzzywuzzy
10 misunderstandings of the p-value: http://tuanvannguyen.blogspot.com/2017/03/10-hieu-lam-ve-tri-so-p-trong-khoa-hoc.html

23.03

21.03

20.03

I haven't gone back to check what they are suggesting in their original paper, but I can guarantee that recent code written by Christian applies relu before BN. It is still occasionally a topic of debate, though.

17.03

install keras on gpu: please use the --no-deps flag: https://github.com/fchollet/keras/wiki/Keras-2.0-release-notes
quora again: https://github.com/abhishekkrthakur/is_that_a_duplicate_quora_question
clickbait: https://github.com/abhishekkrthakur/clickbaits_revisited

16.03

15.03

14.03

seq2seq on tf(general) https://github.com/google/seq2seq
sentencepiece tokenizer https://github.com/google/sentencepiece

13.03

visual search in es: https://github.com/tuan3w/visual_search
9-15% of active twitter users are bots: https://arxiv.org/pdf/1703.03107.pdf
http://www.springer.com/gp/book/9783319472409
https://arxiv.org/pdf/1602.04427.pdf
Socher at LxMLS: http://lxmls.it.pt/2014/socher-lxmls.pdf
use vgg to classify cat/dog: https://gist.github.com/embanner/6149bba89c174af3bfd69537b72bca74
https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html

10.03

09.03

08.03

CMU RL and control course: https://katefvision.github.io/
https://www.slideshare.net/JasonKessler/turning-unstructured-content-into-kernels-of-ideas/52
norvig ngram: http://norvig.com/ngrams/

07.03

06.03

05.03

fchollet: xception https://arxiv.org/pdf/1610.02357.pdf

04.03

02.03

01.03

http://smerity.com/articles/2017/deepcoder_and_ai_hype.html
Twitter NER annotation: https://docs.google.com/document/d/12hI-2A3vATMWRdsKkzDPHu5oT74_tG0-PPQ7VN0IRaw/edit
WNUT 19, Japan, result: https://noisy-text.github.io/2016/pdf/WNUT19.pdf
pytorch vs keras/tf: https://www.reddit.com/r/MachineLearning/comments/5w3q74/d_so_pytorch_vs_tensorflow_whats_the_verdict_on/
quora duplicate question detection: accuracy 1% higher (84.8) but with 100x the params of my model: https://github.com/abhishekkrthakur/is_that_a_duplicate_quora_question/blob/master/deepnet.py
https://github.com/chiphuyen/tf-stanford-tutorials?files=1
pretrained fasttext on wikipedia: https://github.com/facebookresearch/fastText

28.02

27.02

random walk -> graph -> node2vec: http://www.kdd.org/kdd2016/subtopic/view/node2vec-scalable-feature-learning-for-networks
URL2VEC: http://www.newfoundland.nl/wp/?p=112
5 diseases of doing science: http://www.sciencedirect.com/science/article/pii/S104898431730070X
recommended book: https://www.amazon.com/Language-Processing-Perl-Prolog-Implementation/
Martin Gorner, DL without a PhD: https://github.com/martin-gorner/tensorflow-mnist-tutorial
https://codelabs.developers.google.com/codelabs/cloud-tensorflow-mnist/#0
https://docs.google.com/presentation/d/18MiZndRCOxB7g-TcCl2EZOElS5udVaCuxnGznLnmOlE/pub?slide=id.p
https://docs.google.com/presentation/d/1TVixw6ItiZ8igjp6U17tcgoFrLSaHWQmMOwjlgQY9co/pub?slide=id.p

26.02

25.02

24.02

how to init weights from uniform(-b, b), marek's summer school notes: http://www.marekrei.com/blog/26-things-i-learned-in-the-deep-learning-summer-school/
Beam preprocessing: https://research.googleblog.com/2017/02/preprocessing-for-machine-learning-with.html
https://github.com/offbit/char-models/blob/master/doc-rnn2.py
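
The uniform(-b, b) initialization noted in this entry can be sketched as follows. This is a minimal sketch, assuming the Glorot/Xavier bound b = sqrt(6 / (fan_in + fan_out)); the note itself does not say how b is chosen, and the function name `uniform_init` is hypothetical.

```python
import math
import random

def uniform_init(fan_in, fan_out, seed=0):
    """Return a fan_in x fan_out weight matrix drawn from uniform(-b, b), plus b."""
    # Assumed bound (Glorot/Xavier); other choices of b are possible.
    b = math.sqrt(6.0 / (fan_in + fan_out))
    rng = random.Random(seed)
    weights = [[rng.uniform(-b, b) for _ in range(fan_out)] for _ in range(fan_in)]
    return weights, b

weights, b = uniform_init(300, 100)
print(len(weights), len(weights[0]), round(b, 4))
```

The bound shrinks as layers get wider, which keeps the scale of activations roughly stable from layer to layer.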

23.02

22.02

https://github.com/offbit/char-models
https://offbit.github.io/how-to-read/
https://hackernoon.com/learning-ai-if-you-suck-at-math-p4-tensors-illustrated-with-cats-27f0002c9b32#.xqpspe69f
Beam search, NN tut from Quoc Le: https://cs.stanford.edu/~quocle/tutorial2.pdf
marek sequence tagger: https://github.com/marekrei/sequence-labeler

21.02

https://github.com/marekrei/sequence-labeler
marekrei word + char attention: http://www.marekrei.com/blog/
datalab: https://github.com/googledatalab/
https://tw.pycon.org/2017/en-us/speaking/cfp/

20.02

19.02

scikit plot: https://github.com/reiinakano/scikit-plot

18.02

really cool Francis: https://github.com/frnsys/
ai notes: http://frnsys.com/ai_notes/ai_notes.pdf
brilliant wrong, ROC explanation: http://arogozhnikov.github.io/2015/10/05/roc-curve.html
yandex MLSchool in London: https://github.com/yandexdataschool/MLatImperial2017/

17.02

RNNs bag of applications: http://www.cs.toronto.edu/~urtasun/courses/CSC2541_Winter17/RNN.pdf
BiMPM https://arxiv.org/pdf/1702.03814.pdf
TextSum step by step: http://www.fastforwardlabs.com/luhn/
https://keon.io/rl/deep-q-learning-with-keras-and-gym/
https://medium.com/startup-grind/fueling-the-ai-gold-rush-7ae438505bc2#.ny8j80fl3
big 5 for DS: https://www.quora.com/How-do-you-judge-a-good-Data-scientist-with-just-5-questions
keon: https://github.com/keon/awesome-nlp
quid: word2vec + wikipedia: https://quid.com/feed/how-quid-improved-its-search-with-word2vec-and-wikipedia?utm_content=42445351&utm_medium=social&utm_source=twitter
https://gist.github.com/asmeurer/5843625

16.02

15.02

sentiment analysis on Super Bowl: http://blog.aylien.com/sentiment-analysis-of-2-2-million-tweets-from-super-bowl-51/
spacy advanced text analysis: https://github.com/JonathanReeve/advanced-text-analysis-workshop-2017/blob/master/advanced-text-analysis.ipynb
pytorch: https://github.com/vinhkhuc/PyTorch-Mini-Tutorials
Quora engineering: https://engineering.quora.com/Semantic-Question-Matching-with-Deep-Learning
spaCy bag of nns: https://explosion.ai/blog/quora-deep-text-pair-classification
AUC 0.875 http://analyzecore.com/2017/02/08/twitter-sentiment-analysis-doc2vec/

14.02

event detection, extraction, triggering, mention: https://github.com/anoperson/jointEE-NN
batch renorm, due to sensitivity to batch size, initialization: https://arxiv.org/pdf/1702.03275.pdf
https://github.com/bmitra-msft/Demos/blob/master/notebooks/DESM.ipynb
nn for document ranking, mitra, ms cntk: https://github.com/bmitra-msft/NDRM
TFDevSummit: https://events.withgoogle.com/tensorflow-dev-summit/watch-the-videos/#content

13.02

Quora siamese: https://github.com/erogol/QuoraDQBaseline

12.02

http://www.slideshare.net/BhaskarMitra3/neural-text-embeddings-for-information-retrieval-wsdm-2017

10.02

kerlym: https://github.com/osh/kerlym
ICLR 17: https://amundtveit.com/2016/11/12/deep-learning-for-natural-language-processing-iclr-2017-discoveries/
https://github.com/spro/practical-pytorch/blob/master/seq2seq-translation/seq2seq-translation.ipynb
all-but-the-top, pca on word2vec: https://arxiv.org/pdf/1702.01417.pdf
https://github.com/peter3125/sentence2vec

08.02

polarised term for document anonymisation: https://ddu1.github.io/Anonymization/
oxford course: https://github.com/oxford-cs-deepnlp-2017/lectures
tf fold: dynamic batching: https://research.googleblog.com/2017/02/announcing-tensorflow-fold-deep.html
https://www.insight-centre.org/sites/default/files/publications/newhorizons_online.pdf
https://github.com/chsasank/Traffic-Sign-Classification.keras/blob/master/Traffic%20Sign%20Classification.ipynb

07.02

openrefine: http://alexpetralia.com/posts/2015/12/14/the-problem-with-openrefine-clean-vs-messy-data
https://www.linkedin.com/pulse/keras-neural-networks-win-nvidia-titan-x-abhishek-thakur
deep q learning with keras and gym: https://keon.io/rl/deep-q-learning-with-keras-and-gym/
structured attention, Yoon Kim and Hoang Luong: https://github.com/harvardnlp/struct-attn
understanding DL requires rethinking generalisation: https://openreview.net/pdf?id=Sy8gdB9xx
GAN: https://github.com/osh/KerasGAN

06.02

http://lxmls.it.pt/2016/LxMLS2016.pdf
http://www.cs.umb.edu/~twang/file/tricks_from_dl.pdf
https://svn.spraakdata.gu.se/repos/richard/pub/ml2016_web/LT2306_2016_example_solution.pdf
https://svn.spraakdata.gu.se/repos/richard/pub/ml2015_web/l7.pdf
https://chsasank.github.io/spoken-language-understanding.html
ML4NLP: http://stp.lingfil.uu.se/~shaooyan/ml/nn.part2.pdf
Topic Modeling for extracting key words: http://bugra.github.io/work/notes/2017-02-05/topic-modeling-for-keyword-extraction/
Google Scraper: https://github.com/NikolaiT/GoogleScraper
Richard Johansson: https://svn.spraakdata.gu.se/repos/richard/pub/ml2015_web/l7.pdf
https://code.facebook.com/posts/457605107772545/under-the-hood-building-accessibility-tools-for-the-visually-impaired-on-facebook/
l2svm outperforms softmax: https://arxiv.org/pdf/1306.0239v4.pdf
xent vs hinge loss: http://cs231n.github.io/linear-classify/
https://github.com/nzw0301/keras-examples/blob/master/Skip-gram-with-NS.ipynb
model zoo pytorch: https://github.com/Cadene/tensorflow-model-zoo.torch
quora question pair: http://www.forbes.com/sites/quora/2017/01/30/data-at-quora-first-quora-dataset-release-question-pairs/#3d052ef475cb
Psychometric, CA and Trump: https://motherboard.vice.com/en_us/article/how-our-likes-helped-trump-win

27.1

26.1

25.1

question duplication of Quora: https://data.quora.com/First-Quora-Dataset-Release-Question-Pairs
stats for hackers code: https://github.com/croach/blog/tree/master/content
http://multithreaded.stitchfix.com/blog/2017/01/23/scaling-ds-at-sf-slides-from-ddtexas/

24.1

23.1

nlp terms for novice: http://www.datasciencecentral.com/profiles/blogs/10-common-nlp-terms-explained-for-the-text-analysis-novice?utm_content=buffer172af&utm_medium=social&utm_source=twitter.com&utm_campaign=buffer
blockchain: https://opendatascience.com/blog/what-is-the-blockchain-and-why-is-it-so-important/
nbgrader: https://github.com/jupyter/nbgrader
Adversarial ML: https://mascherari.press/introduction-to-adversarial-machine-learning/
4 questions for G. Hinton: https://gigaom.com/2017/01/16/four-questions-for-geoff-hinton/
Debug in TF: https://wookayin.github.io/TensorflowKR-2016-talk-debugging/#1

20.1

19.1

Facebook again, pytorch: http://pytorch.org/
https://rare-technologies.com/new-gensim-feature-author-topic-modeling-lda-with-metadata/
pointer network: https://github.com/devsisters/pointer-network-tensorflow

18.1

http://blog.dennybritz.com/2017/01/17/engineering-is-the-bottleneck-in-deep-learning-research/
ml for practitioner: http://martin.zinkevich.org/rules_of_ml/rules_of_ml.pdf
write dl/nn from scratch: https://github.com/dmlc/minpy

17.1

improve headlines with salient words and seo score: http://www-personal.umich.edu/~tdszyman/misc/nlpmj16.pdf
text summarisation: http://www-personal.umich.edu/~tdszyman/misc/summarization15.pdf
word embedding over time: http://www-personal.umich.edu/~tdszyman/misc/InsightSIGNLP16.pdf
victor, DS at polytechnique in France: https://github.com/Vict0rSch/data_science_polytechnique
Thien NYU: http://www.cs.nyu.edu/~thien/
tonymooori: https://github.com/TonyMooori/studying
learning theory: https://web.stanford.edu/class/cs229t/notes.pdf
time series predictions: http://danielhnyk.cz/predicting-sequences-vectors-keras-using-rnn-lstm/

16.1

Edward Dustin Tran in TF already, so cool: https://arxiv.org/pdf/1701.03757v1.pdf
keras is in tensorflow from now on; @fchollet announced it on Twitter.
squeezenet = tiny alexnet (5MB) https://github.com/rcmalli/keras-squeezenet
won $5k: https://medium.freecodecamp.com/recognizing-traffic-lights-with-deep-learning-23dae23287cc#.9yb31nsm4
https://github.com/karoldvl/paper-2015-esc-convnet/blob/master/Code/Results.ipynb

15.1

deep spell code: https://github.com/MajorTal/DeepSpell
draw svg in jupyter: https://github.com/uclmr/egal
sound classification with cnn: https://github.com/karoldvl/paper-2015-esc-convnet

14.1

https://medium.com/@majortal/deep-spelling-9ffef96a24f6
line bot + rnn + tf, vanhuyz: https://github.com/vanhuyz/line-sticker-bot
https://github.com/Vict0rSch/deep_learning/tree/master/keras
https://github.com/openai/pixel-cnn
AWS Lambda: http://blog.matthewdfuller.com/p/aws-lambda-pricing-calculator.html
deep text corrector: http://atpaino.com/2017/01/03/deep-text-correcter.html
https://github.com/dhwajraj/deep-text-classifier-mtl

13.1

convlstm: https://github.com/carlthome/tensorflow-convlstm-cell
GAN and RNN: https://www.reddit.com/r/MachineLearning/comments/40ldq6/generative_adversarial_networks_for_text/
generate sentences from continuous space: https://arxiv.org/pdf/1511.06349v2.pdf
How to train your Gen. model: Sampling, likelihood or adversary

12.1

https://www.raywenderlich.com/126063/react-native-tutorial
ml practitioners: https://news.ycombinator.com/item?id=10954508
spotify word2vec: https://douweosinga.com/projects/marconi?song1_id=45yEy5WJywhJ3sDI28ajTm&song2_id=
https://github.com/DOsinga/marconi/blob/master/train_model.py
True| Good | Kind | Useful | Relevant | Necessary https://www.quora.com/What-is-Triple-Filter-test-of-Socrates
https://www.youtube.com/watch?v=ifYfJdo27_k
student note: https://adeshpande3.github.io/adeshpande3.github.io/Deep-Learning-Research-Review-Week-3-Natural-Language-Processing

11.1

ggplot2 in R: http://sharpsightlabs.com/blog/mapping-vc-investment/
TF 1.0, mature. https://opendatascience.com/blog/rnns-in-tensorflow-a-practical-guide-and-undocumented-features/
NN semantic encoder: https://github.com/pdasigi/neural-semantic-encoders/blob/master/nse.py
DL in NN, overview: https://arxiv.org/pdf/1404.7828v4.pdf
juergen schmidhuber: http://people.idsia.ch/~juergen/

10.1

GDG NL: http://www.slideshare.net/RokeshJankie/introducing-tensorflow-the-game-changer-in-building-intelligent-applications
https://github.com/ToferC/Twitter_graphing_python
http://www.oujago.com/DL_more.html
thiago DS at Yahoo: https://tgmstat.wordpress.com/
deepstack playing poker: https://arxiv.org/pdf/1701.01724v1.pdf
silly DL: https://news.ycombinator.com/item?id=13353941
http://p.migdal.pl/2017/01/06/king-man-woman-queen-why.html
AE for new molecule: http://www.impactjournals.com/oncotarget/index.php?journal=oncotarget&page=article&op=view&path[]=14073&pubmed-linkout=1

9.1

xlingual embedding: https://levyomer.wordpress.com/2017/01/08/a-strong-baseline-for-learning-cross-lingual-word-embeddings-from-sentence-alignments/
greg notebooks: https://github.com/gjreda/gregreda.com/tree/master/content/notebooks
the periodic table of AI: http://ai.xprize.org/news/periodic-table-of-ai
the same table of DL: http://www.deeplearningpatterns.com/doku.php/overview
aylien text mining and analysis: Sebastian Ruder: https://arxiv.org/pdf/1609.02746v1.pdf
DS as a freelancer from Greg Reda: http://www.gregreda.com/2017/01/07/freelance-data-science-experience/

7.1

how bayesian inference works: http://brohrer.github.io/how_bayesian_inference_works.html
best vis projects in 2016: http://flowingdata.com/2016/12/29/best-data-visualization-projects-of-2016/
https://flowingdata.com/2012/12/17/getting-started-with-charts-in-r/

5.1

allenai biattflow: https://github.com/allenai/bi-att-flow
fork guy: https://github.com/BinbinBian
ICLR 17, DCNN: https://arxiv.org/pdf/1611.01604v2.pdf
victor zhong: https://github.com/vzhong/posts-notebooks
BN, if you want gaussian, zero mean: https://kratzert.github.io/2016/02/12/understanding-the-gradient-flow-through-the-batch-normalization-layer.html
statsnlp https://github.com/uclmr/stat-nlp-book
sota of qa: http://metamind.io/research/state-of-the-art-deep-learning-model-for-question-answering/

4.1

dynet: CMU neural networks in C++: https://github.com/clab
systran: https://arxiv.org/pdf/1610.05540v1.pdf
punctuation normalisation: http://www.statmt.org/wmt11/normalize-punctuation.perl
GAN in keras: https://github.com/osh/KerasGAN
reinforcement learning in keras and gym: https://github.com/osh/kerlym
ML 101 for DE: https://drive.google.com/drive/folders/0B3bb7xB2VOUBMW1LQjVYUlJNRFU

3.1

variational for text processing: https://github.com/carpedm20/variational-text-tensorflow
spotify CNN music classification: https://www.dropbox.com/s/22bqmco45179t7z/thesis-FINAL.pdf
kaggle winning solution for whale detection: https://github.com/benanne
https://github.com/zygmuntz?tab=repositories

2.1.17

overfitting in life: http://tuanvannguyen.blogspot.com/2016/12/over-fitting-va-y-nghia-thuc-te-trong.html
optimal stopping problem: https://plus.maths.org/content/solution-optimal-stopping-problem

31.12

visualisation NLP: http://www.aclweb.org/anthology/N16-1082

30.12

zero shot translation: https://techcrunch.com/2016/11/22/googles-ai-translation-tool-seems-to-have-invented-its-own-secret-internal-language/

29.12

Music Tagging, CRNN https://arxiv.org/pdf/1609.04243v3.pdf
Benmusic: http://www.bensound.com/
event detection: http://anthology.aclweb.org/C/C14/C14-1134.pdf

28.12

NIPS 2016, embedding projector: https://arxiv.org/pdf/1611.05469.pdf
stats learning: https://web.stanford.edu/class/cs229t/notes.pdf
http://www.normansoft.com/blog/index.html
Tf projector is really cool: https://github.com/normanheckscher/mnist-tensorboard-embeddings/blob/master/mnist_t-sne.py
Who to follow on Twitter in ML/DL: https://twitter.com/DL_ML_Loop/lists/deep-learning-loop/members
How to learn? BPTT https://medium.com/@karpathy/yes-you-should-understand-backprop-e2f06eab496b#.sunmvqmsx

27.12

deep learning with Torch: https://github.com/soumith/cvpr2015
T7: https://github.com/soumith/cvpr2015/blob/master/cvpr-torch.pdf
GPOD general purpose object detector: https://github.com/EvgenyNekrasov/gpod
mckinseys: http://www.forbes.com/sites/louiscolumbus/2016/12/18/mckinseys-2016-analytics-study-defines-the-future-machine-learning
gumbel add noise to sigmoid: https://github.com/yandexdataschool/gumbel_lstm
fastai wordembedding: https://github.com/fastai/courses/blob/master/deeplearning1/nbs/wordvectors.ipynb

26.12

spotify cnn: http://benanne.github.io/2014/08/05/spotify-cnns.html
Gated RNN https://arxiv.org/pdf/1612.08083v1.pdf
http://www.slideshare.net/SebastianRuder/nips-2016-highlights-sebastian-ruder
monolingual dataset WMT 2014: http://www.statmt.org/wmt14/translation-task.html
neural turing machine: https://github.com/shawntan/neural-turing-machines
yandex ml school HSE: https://github.com/yandexdataschool/HSE_deeplearning

24.12

Laurent Dinh: Density estimation https://docs.google.com/presentation/d/152NyIZYDRlYuml5DbBONchJYA7AAwlti5gTWW1eXlLM/
Swiftkey, LM: https://blog.swiftkey.com/swiftkey-debuts-worlds-first-smartphone-keyboard-powered-by-neural-networks/
porting Theano to TF: https://medium.com/@sentimentron/faceoff-theano-vs-tensorflow-e25648c31800
tractica: DL for retailer: https://www.tractica.com/automation-robotics/leveraging-deep-learning-to-improve-the-retail-experience/
Effect size: are Singaporeans better at math than Vietnamese? If ES = 0.3, the overlap between the two distributions is near 90%, so this PISA ranking says very little.
dracula: twitter POS utilised GATE: https://github.com/Sentimentron/Dracula/
Business process with LSTM: https://arxiv.org/pdf/1612.02130v1.pdf
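
The effect-size note in this entry (ES = 0.3, overlap near 90%) can be checked numerically. This is a minimal sketch, assuming ES means Cohen's d between two normal distributions with equal variance, and "overlap" means the overlapping coefficient (the area shared by both densities), OVL = 2 * Phi(-|d| / 2). The function name `overlap_coefficient` is hypothetical.

```python
import math

def overlap_coefficient(d):
    """Shared area of two unit-variance normal densities whose means differ by d."""
    # Standard normal CDF evaluated at -|d|/2, written with the error function.
    phi = 0.5 * (1.0 + math.erf((-abs(d) / 2.0) / math.sqrt(2.0)))
    return 2.0 * phi

for d in (0.0, 0.3, 0.8):
    print(d, round(overlap_coefficient(d), 3))
```

Under these assumptions d = 0.3 gives an overlap of roughly 0.88, consistent with the "near 90%" in the note.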

23.12

https://bigdatauniversity.com/courses/deep-learning-tensorflow/

22.12

https://quid.com/feed/how-quid-uses-deep-learning-with-small-data
dl for coders: http://course.fast.ai/, notebooks here: https://github.com/fastai/courses
encoder-decoder RNN: http://www.slideshare.net/ssuser77b8c6/reducing-the-dimensionality-of-data-with-neural-networks
https://trello.com/b/rbpEfMld/data-science
http://tuanvannguyen.blogspot.com/2016/12/yeu-to-nao-anh-huong-en-iem-pisa-2015.html

21.12

20.12

http://opennmt.net
neural relation extraction https://www.aclweb.org/anthology/P/P16/P16-1200.pdf
claim classification: https://github.com/UKPLab/coling2016-claim-classification
https://www.ukp.tu-darmstadt.de/fileadmin/user_upload/Group_UKP/publikationen/2016/2016_COLING_CG.pdf

19.12

fasttext.zip https://arxiv.org/abs/1612.03651
bi sequence classification: same SNLI, event detection: https://pdfs.semanticscholar.org/6f42/cb23262066b4034aba99bf674783ed6cac8b.pdf
large scale contextual LSTM and NLP task: https://arxiv.org/pdf/1602.06291.pdf
main advances in ML 2016, Xavier at Quora: https://www.quora.com/What-were-the-main-advances-in-machine-learning-artificial-intelligence-in-2016?

17.12

https://github.com/jwkvam/bowtie

16.12

tensorflow book with code: https://github.com/BinRoot/TensorFlow-Book
trading with ML (Georgia Tech): https://www.udacity.com/course/machine-learning-for-trading--ud501

15.12

14.12

spacy vs nltk: https://gist.github.com/rschroll/61b20c41e984a963df2870cfc9e628ed
psychometrics, precision marketing, privacy no longer: http://www.michalkosinski.com/
300+ ML projects from Stanford: http://cs229.stanford.edu/PosterSessionProgram.pdf
NIPS 2016 code: https://www.reddit.com/r/MachineLearning/comments/5hwqeb/project_all_code_implementations_for_nips_2016/
Phased LSTM: Accelerating Recurrent Network Training for Long or Event-based Sequences: https://github.com/dannyneil/public_plstm

13.12

NIPS summary: http://beamandrew.github.io/deeplearning/2016/12/12/nips-2016.html
how to choose batch size: https://github.com/karpathy/char-rnn, https://svail.github.io/rnn_perf/, http://axon.cs.byu.edu/papers/Wilson.nn03.batch.pdf
https://github.com/lmthang/thesis

12.12

Relation classification (RC) via data augmentation: https://arxiv.org/abs/1601.03651
broader twitter NER: http://www.slideshare.net/leonderczynski/broad-twitter-corpus-a-diverse-named-entity-recognition-resource
sequence classification such as NER, POS: https://github.com/napsternxg/DeepSequenceClassification
arctic captions: https://github.com/kelvinxu/arctic-captions/blob/master/alpha_visualization.ipynb
COLING 2016 from 13 to 16 Dec, Japan: https://github.com/napsternxg/TwitterNER, http://coling2016.anlp.jp/

11.12

9.12

if then learning: https://papers.nips.cc/paper/6284-latent-attention-for-if-then-program-synthesis.pdf
reinforcement learning: https://github.com/DanielTakeshi
NIPS 2016: https://github.com/mphuget/NIPS2016
https://github.com/zelandiya/KiwiPyCon-NLP-tutorial
http://www.wrangleconf.com/apac.html
http://cs231n.github.io/aws-tutorial/
clickbait F1 98, AUC 99, too good to be true: https://arxiv.org/pdf/1612.01340v1.pdf
https://arxiv.org/abs/1606.04474
https://github.com/deepmind/learning-to-learn

8.12

hackermath: https://github.com/amitkaps/hackermath/blob/master/talk.pdf
tensorboard: https://www.tensorflow.org/versions/master/how_tos/embedding_viz/index.html
embedding projector: http://projector.tensorflow.org/
dl4nlp at ukplab, Germany: https://github.com/UKPLab/deeplearning4nlp-tutorial/tree/master/2016-11_Seminar
Filter bubble vs Info cascading, Eli Pariser: https://www.ted.com/talks/eli_pariser_beware_online_filter_bubbles

7.12

tidy data in pandas: http://www.jeannicholashould.com/tidy-data-in-python.html
graph db: https://blog.grakn.ai/adding-semantics-to-graph-databases-with-mindmapsdb-part-1-82022bbb3b1c
https://github.com/mikonapoli
reinforcement learning, open ai: http://people.eecs.berkeley.edu/~pabbeel/nips-tutorial-policy-optimization-Schulman-Abbeel.pdf
meal description and food tagging: https://pdfs.semanticscholar.org/5f55/c5535e80d3e5ed7f1f0b89531e32725faff5.pdf

6.12

rationale cnn [keras] https://github.com/bwallace/rationale-CNN
churn analysis, f1 75%, lr, svm hinge: http://www.aaai.org/ocs/index.php/AAAI/AAAI15/paper/view/9849/9527
thanapon noraset: https://northanapon.github.io/read/
https://github.com/NorThanapon/adaptive_lm
train general AI: https://openai.com/blog/universe/
NIPS 2016 https://nips.cc/Conferences/2016/Schedule
full ds notebook: https://github.com/donnemartin/data-science-ipython-notebooks
Quoc Le, tut2: Autoencoder, CNN, RNN: http://ai.stanford.edu/~quocle/tutorial2.pdf
Quoc Le, tut1: nonlinear classifier and backprop: http://ai.stanford.edu/~quocle/tutorial1.pdf
Quoc Le, ex1: http://ai.stanford.edu/~quocle/exercise1.py
https://alexanderdyakonov.wordpress.com/2016/12/04/сундуки-и-монеты/#more-4401

5.12

semantic role labeling: https://blog.acolyer.org/2016/07/05/end-to-end-learning-of-semantic-role-labeling-using-recurrent-neural-networks/
ml yearning: https://gallery.mailchimp.com/dc3a7ef4d750c0abfc19202a3/files/Machine_Learning_Yearning_V0.5_01.pdf
stock embedding: https://medium.com/@TalPerry/deep-learning-the-stock-market-df853d139e02#.9q1d9hnai
fast weights: https://github.com/ajarai

2.12

https://github.com/cgpotts/cs224u

1.12

https://gist.github.com/honnibal
siamese lstm: https://github.com/aditya1503/Siamese-LSTM
accuracy of lunar chinese calendar to predict baby sex: http://onlinelibrary.wiley.com/doi/10.1111/j.1365-3016.2010.01129.x/abstract
customized keras lambda: https://gist.github.com/keunwoochoi

30.11

rnn tricks: http://www.slideshare.net/indicods/general-sequence-learning-with-recurrent-neural-networks-for-next-ml
data mining in action: Moscow, Russia: https://github.com/vkantor/MIPT_Data_Mining_In_Action_2016
hypo testing, birthday effect: http://www.slideshare.net/SergeyIvanov105/birthday-effect-67829860
LUI: linguistic UI https://medium.com/swlh/a-natural-language-user-interface-is-just-a-user-interface-4a6d898e9721
fake news is 80% accuracy better: http://www.mallikarjunan.com/verytas/how-good-are-you-at-recognizing-satire-quiz
nampi, spain 2017
decode thought vector: http://gabgoh.github.io/ThoughtVectors/
unconstrained fmin: https://github.com/benfred/fmin
neural programmer: https://github.com/tensorflow/models/tree/master/neural_programmer
https://www.tensorflow.org/versions/master/how_tos/embedding_viz/index.html#tensorboard-embedding-visualization

29.11

28.11

event detection and deep learning: http://www.cs.nyu.edu/~thien/
https://github.com/anoperson/NeuralNetworksForRE
ED EE and MD with RNN and CNN: http://www.aclweb.org/anthology/P/P15/P15-2060.pdf

27.11

26.11

slides from mlconf sf 2016: http://www.slideshare.net/SessionsEvents/anjuli-kannan-software-engineer-google-at-mlconf-sf-2016
http://www.slideshare.net/KenjiEsaki/kdd-2016-slide

25.11

vo duy tin: https://github.com/duytinvo
https://spacy.io/docs/usage/entity-recognition

24.11

chinese NLP: https://github.com/taozhijiang/chinese_nlp
not news: http://venturebeat.com/2016/11/23/twitter-cortex-team-loses-some-ai-researchers/
sentihood: http://annotate-neighborhood.com/download/download.html, https://arxiv.org/pdf/1610.03771v1.pdf

23.11

Multithread in Theano:

check your blas: https://raw.githubusercontent.com/Theano/Theano/master/theano/misc/check_blas.py
http://deeplearning.net/software/theano/tutorial/multi_cores.html?highlight=multi%20co
Theano/Theano#3239
set OMP_NUM_THREADS=4 inside the notebook with env: https://www.dataquest.io/blog/jupyter-notebook-tips-tricks-shortcuts/
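
Outside a notebook, the same thread setting can be applied from Python itself. This is a minimal sketch: the value 4 comes from the note above, and it assumes the OpenMP runtime reads the variable when the numerical library is first loaded, so the assignment has to come before `import theano`.

```python
import os

# Must run before importing theano/numpy (assumption: the OpenMP runtime
# reads OMP_NUM_THREADS once, at library load time).
os.environ["OMP_NUM_THREADS"] = "4"

print(os.environ["OMP_NUM_THREADS"])
```

Inside Jupyter, the `%env OMP_NUM_THREADS=4` magic sets the same variable, again before the first import.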

Debug

torch vs theano vs tf: https://www.quora.com/Is-TensorFlow-better-than-other-leading-libraries-such-as-Torch-Theano
debug Deep Learning: https://gab41.lab41.org/some-tips-for-debugging-deep-learning-3f69e56ea134#.1ldbphlav
negative loss: keras-team/keras#1917
CAP: Clustering Association Prediction, stats thinking: https://www.researchgate.net/publication/310597778_Scientific_discovery_through_statistics

22.11

stance detection: favour or against: http://isabelleaugenstein.github.io/papers/SemEval2016-Stance.pdf
Hugo from Twitter to Google Brain, Montreal: https://techcrunch.com/2016/11/21/google-opens-new-ai-lab-and-invests-3-4m-in-montreal-based-ai-research/?sr_share=facebook
train word2vec in gensim in good way: https://github.com/RaRe-Technologies/gensim/blob/develop/docs/notebooks/doc2vec-IMDB.ipynb

21.11

sparql in python: https://joernhees.de/blog/tag/install/
minhash: http://mccormickml.com/2015/06/12/minhash-tutorial-with-python-code/
beating the kaggle easy way: http://www.ke.tu-darmstadt.de/lehre/arbeiten/studien/2015/Dong_Ying.pdf

19.11

10 takeaways writeup MLConf SF: https://tryolabs.com/blog/2016/11/18/10-main-takeaways-from-mlconf/
theano summer school: https://github.com/mila-udem/summerschool2015
gpu card for macbook pro: http://udibr.github.io/using-external-gtx-980-with-macbook-pro.html
transfer learning using pretrained vgg, resnet for your problem: https://github.com/dolaameng/transfer-learning-lab

18.11

17.11

wikidata: http://www.slideshare.net/_Emw/an-ambitious-wikidata-tutorial
wptools: https://github.com/siznax/wptools/wiki
google translate: https://arxiv.org/pdf/1611.04558v1.pdf
https://arxiv.org/pdf/1611.05104v1.pdf
https://arxiv.org/pdf/1611.01587v2.pdf

16.11

dssm deep sem sim models: https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/wsdm2015.v3.pdf
twitter @ Singapore: http://www.straitstimes.com/singapore/twitter-eyes-local-talent-for-singapore-data-science-team
multiple tasks of NLP: https://arxiv.org/pdf/1611.01587v2.pdf
QUASI RNN: https://arxiv.org/pdf/1611.01576v1.pdf

15.11

regex learning: http://dlacombejr.github.io/2016/11/13/deep-learning-for-regex.html
recurrent + cnn for text classification: https://github.com/airalcorn2/Recurrent-Convolutional-Neural-Network-Text-Classifier
quiver, to view convnet layers: https://github.com/jakebian/quiver
hera, to see a training progress board: https://github.com/jakebian/hera
RAISR: Rapid and Accurate Image Super Resolution https://arxiv.org/pdf/1606.01299v3.pdf
why is machine learning hard: http://ai.stanford.edu/~zayd/why-is-machine-learning-hard.html

14.11

event ODSC West: https://www.odsc.com/california
MLconf SF 12 Nov, summary: https://github.com/adarsh0806/ODSCWest/blob/master/MLConf.md
Duy Do talk: https://speakerdeck.com/duydo/elasticsearch-for-data-engineers

13.11

BarCamp Saigon 2016: some good topics on Elasticsearch (Duy Do), Big Data analytics (Trieu Nguyen)
Altair https://speakerdeck.com/jakevdp/visualization-in-python-with-altair

12.11

11.11

https://github.com/wiki-ai/revscoring
Visual OCR attention: https://github.com/da03/Attention-OCR
startup and DL: https://github.com/lipiji/App-DL
embed + encode + attend + predict: https://explosion.ai/blog/deep-learning-formula-nlp
HN (hierarchical attention networks): https://www.cs.cmu.edu/~diyiy/docs/naacl16.pdf

10.11

https://arxiv.org/pdf/1508.06615.pdf

9.11

IBM researcher, LDA Gibbs sampling, doc2vec: https://github.com/jhlau

8.11

quoc le, rnn with reinforcement learning: http://openreview.net/pdf?id=r1Ue8Hcxg

7.11

https://github.com/vinhkhuc/MemN2N-babi-python
similarity proximity: http://www.datasciencecentral.com/profiles/blogs/comparison-between-global-vs-local-normalization-of-tweets-and
PyCon 2015, Elasticsearch: https://github.com/erikrose/elasticsearch-tutorial

6.11

https://github.com/Keats/rodent

04.11

scaling knowledge at Airbnb: https://medium.com/airbnb-engineering/scaling-knowledge-at-airbnb-875d73eff091#.5moos4eki
R notebooks: http://rmarkdown.rstudio.com/r_notebooks.html
dask: https://github.com/dask/dask
dask vs celery: http://matthewrocklin.com/blog/work/2016/09/13/dask-and-celery
dask in JupyterLab: https://learning.acm.org/webinar_pdfs/ChristineDoig_WebinarSlides.pdf

3.11

https://hbr.org/resources/pdfs/hbr-articles/2016/11/the_state_of_machine_intelligence.pdf
shallow learn: gensim + fasttext: https://github.com/giacbrd/ShallowLearn
neural networks for sentiment analysis: http://www.emnlp2016.net/tutorials/zhang-vo-t4.pdf

2.11

mask bilstm: http://dirko.github.io/Bidirectional-LSTM
