This repository has been archived by the owner on Jan 11, 2024. It is now read-only.
ArxivData.txt
If there are any errors,
please abort, run `arxiv_required` to install the required packages, and start again.
Please wait while we parse the requested information from the global arXiv (arxiv.org) servers
------------>
---------------------------->
------------------------------------------------------>
A Comparative Study of Consistent Snapshot Algorithms for Main-Memory Database Systems (Liang Li - 11 October, 2018)
Formally, the in-memory consistent snapshot problem refers to taking an in-memory consistent time-in-point snapshot with the constraints that 1) clients can read the latest data items and 2) any data item in the snapshot should not be overwritten
Link: https://arxiv.org/abs/1810.04915
====================================================
Perfusion parameter estimation using neural networks and data augmentation (David Robben - 11 October, 2018)
A comparison on simulated CT Perfusion data shows that the neural network provides better estimations for both CBF and Tmax than a state of the art deconvolution method, and this over a wide range of noise levels. The proposed data augmentation enables to achieve these results with less than 100 datasets.
Link: https://arxiv.org/abs/1810.04898
====================================================
Applications of PageRank to Function Comparison and Malware Classification (Michael A. Slawinski - 10 October, 2018)
The model was trained on 2.5 million samples of .NET and has an accuracy of 98.3% on test data
Link: https://arxiv.org/abs/1810.04789
====================================================
Deep Inertial Poser: Learning to Reconstruct Human Pose from Sparse Inertial Measurements in Real Time (Yinghao Huang - 10 October, 2018)
To evaluate our method, we recorded DIP-IMU, a dataset consisting of 10 subjects wearing 17 IMUs for validation in 64 sequences with 330,000 time instants; this constitutes the largest IMU dataset publicly available
Link: https://arxiv.org/abs/1810.04703
====================================================
Intrusion Detection Using Mouse Dynamics (Margit Antal - 10 October, 2018)
The Balabit data set was released in 2016 for a data science competition and, despite its few subjects, can be considered the first adequate publicly available one. Evaluation based on sets of actions achieves 0.92 AUC on the test part of the data set. However, the same type of evaluation conducted on the training part of the data set resulted in the maximal AUC (1) using only 13 actions
Link: https://arxiv.org/abs/1810.04668
====================================================
Multimodal Speech Emotion Recognition Using Audio and Text (Seunghyun Yoon - 10 October, 2018)
Our proposed model outperforms previous state-of-the-art methods in assigning data to one of four emotion categories (i.e., angry, happy, sad and neutral) when the model is applied to the IEMOCAP dataset, as reflected by accuracies ranging from 68.8% to 71.8%.
Link: https://arxiv.org/abs/1810.04635
====================================================
Revitalizing Copybacks in Modern SSDs: Why and How (Duwon Hong - 10 October, 2018)
By limiting the number of successive copybacks, it guarantees that no data reliability problem occurs when data is internally migrated using rCopyback. Our evaluation results show that rcFTL can improve the overall I/O throughput by 54% on average over an existing FTL which does not use copybacks.
Link: https://arxiv.org/abs/1810.04603
====================================================
LIRS: Enabling efficient machine learning on NVM-based storage via a lightweight implementation of random shuffling (Zhi-Lin Ke - 10 October, 2018)
With the emerging non-volatile memory-based storage device, such as Intel Optane SSD, which provides fast random accesses, we propose a lightweight implementation of random shuffling (LIRS) to randomly shuffle the indexes of the entire training dataset, and the selected training instances are directly accessed from the storage and packed into batches. Experimental results show that LIRS can reduce the total training time of SVM and DNN by 49.9% and 43.5% on average, and improve the final testing accuracy on DNN by 1.01%.
Link: https://arxiv.org/abs/1810.04509
====================================================
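The index-shuffling idea in the LIRS summary above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the authors' implementation: the function name `shuffled_batches`, the `fetch` callback, and the in-memory list standing in for a storage device with fast random access are all hypothetical.

```python
import random
from typing import Callable, Iterator, List, TypeVar

T = TypeVar("T")


def shuffled_batches(
    num_instances: int,
    batch_size: int,
    fetch: Callable[[int], T],
    seed: int = 0,
) -> Iterator[List[T]]:
    """Yield batches by shuffling instance indexes, not the instances themselves."""
    indexes = list(range(num_instances))
    # Only the lightweight index list is shuffled; the data stays in place.
    random.Random(seed).shuffle(indexes)
    for start in range(0, num_instances, batch_size):
        # Each selected instance is read by random access at its shuffled index.
        yield [fetch(i) for i in indexes[start:start + batch_size]]


# Hypothetical stand-in for storage that supports fast random access.
storage = [f"instance-{i}" for i in range(10)]
batches = list(shuffled_batches(len(storage), 4, storage.__getitem__))
```

Every instance appears exactly once per pass, and the last batch may be smaller than `batch_size`.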
Is your Statement Purposeless? Predicting Computer Science Graduation Admission Acceptance based on Statement Of Purpose (Diptesh Kanojia - 9 October, 2018)
We present a quantitative, data-driven machine learning approach to mitigate the problem of unpredictability of Computer Science Graduate School Admissions. We train a model over fifty manually verified SOPs for which it uses an SVM classifier and achieves the highest accuracy of 92% with 10-fold cross-validation
Link: https://arxiv.org/abs/1810.04502
====================================================
Invariance Analysis of Saliency Models versus Human Gaze During Scene Free Viewing (Zhaohui Che - 10 October, 2018)
In this paper, we first create a large-scale database including eye movements of 10 observers over 1900 images degraded by 19 types of distortions
Link: https://arxiv.org/abs/1810.04456
====================================================
Improving Neural Text Simplification Model with Simplified Corpora (Jipeng Qiang - 10 October, 2018)
We train an encoder-decoder model using synthetic sentence pairs and original sentence pairs, which can obtain substantial improvements on the available WikiLarge data and WikiSmall data compared with the state-of-the-art methods.
Link: https://arxiv.org/abs/1810.04428
====================================================
V3C - a Research Video Collection (Luca Rossetto - 10 October, 2018)
V3C comes with a shot segmentation for each video, together with the resulting keyframes in original as well as reduced resolution and additional metadata. It is intended to be used from 2019 at the International large-scale TREC Video Retrieval Evaluation campaign (TRECVid).
Link: https://arxiv.org/abs/1810.04401
====================================================
Filtration Simplification for Persistent Homology via Edge Contraction (Tamal K. Dey - 10 October, 2018)
Persistent homology is a popular data analysis technique that is used to capture the changing topology of a filtration associated with some simplicial complex $K$. The first assumes that the underlying space of $K$ is a $2$-manifold and ensures that simplices are paired with the same simplices in the contracted complex as they are in the original
Link: https://arxiv.org/abs/1810.04388
====================================================
Task Runtime Prediction in Scientific Workflows Using an Online Incremental Learning Approach (Muhammad H. Hilman - 9 October, 2018)
We compare our solution to a state-of-the-art approach that exploits the resources monitoring data based on regression machine learning technique. From our experiments, the proposed strategy improves the performance, in terms of the error, up to 29.89%, compared to the state-of-the-art solutions.
Link: https://arxiv.org/abs/1810.04329
====================================================
Smtlink 2.0 (Yan Peng - 9 October, 2018)
Smtlink 2.0 provides support for FTY defprod, deflist, defalist, and defoption types by using Z3's arrays and user-defined data types
Link: https://arxiv.org/abs/1810.04317
====================================================
Inter-Scanner Harmonization of High Angular Resolution DW-MRI using Null Space Deep Learning (Vishwesh Nath - 9 October, 2018)
To use these data, we propose a new network architecture, the null space deep network (NSDN), to simultaneously learn on traditional observed/truth pairs (e.g., MRI-histology voxels) along with repeated observations without a known truth (e.g., scan-rescan MRI). NSDN significantly improved absolute performance relative to histology by 3.87% over CSD and 1.42% over a recently proposed deep neural network approach. Moreover, it improved reproducibility on the paired data by 21.19% over CSD and 10.09% over a recently proposed deep approach. Finally, NSDN improved generalizability of the model to a third in vivo human scanner (which was not used in training) by 16.08% over CSD and 10.41% over a recently proposed deep learning approach
Link: https://arxiv.org/abs/1810.04260
====================================================
Doubly Reparameterized Gradient Estimators for Monte Carlo Objectives (George Tucker - 9 October, 2018)
These approaches maximize a variational lower bound on the intractable log likelihood of the observed data. Counterintuitively, the typical inference network gradient estimator for the IWAE bound performs poorly as the number of samples increases (Rainforth et al., 2018; Le et al., 2018)
Link: https://arxiv.org/abs/1810.04152
====================================================
Semi-supervised Deep Reinforcement Learning in Support of IoT and Smart City Services (Mehdi Mohammadi - 9 October, 2018)
In this paper, we propose a semi-supervised deep reinforcement learning model that fits smart city applications as it consumes both labeled and unlabeled data to improve the performance and accuracy of the learning agent. Our model learns the best action policies that lead to a close estimation of the target locations with an improvement of 23% in terms of distance to the target and at least 67% more received rewards compared to the supervised DRL model.
Link: https://arxiv.org/abs/1810.04118
====================================================
Detecting object region and working state of aerator based on computer vision and machine learning (Yeqi Liu - 9 October, 2018)
In the work state detection module, this paper proposes a novel method called reference frame Kanade-Lucas-Tomasi (RF-KLT) algorithm, and constructs a classification procedure for the unlabeled time series data. The results of this study show that the accuracy of detecting object region and working state of aerator in the complex background is 100% and 99.9% respectively, and the detection speed is 77-333 frames per second (FPS) according to the different types of surveillance camera
Link: https://arxiv.org/abs/1810.04108
====================================================
Geometry meets semantics for semi-supervised monocular depth estimation (Pierluigi Zama Ramirez - 9 October, 2018)
In particular, on the KITTI dataset our network outperforms state-of-the-art methods for monocular depth estimation.
Link: https://arxiv.org/abs/1810.04093
====================================================
Deep Geodesic Learning for Segmentation and Anatomical Landmarking (Neslisah Torosdagli - 6 October, 2018)
In step 1, we propose a deep neural network architecture with carefully designed regularization, and network hyper-parameters to perform image segmentation without the need for data augmentation and complex post-processing refinement. In step 2, we formulate the landmark localization problem directly on the geodesic space for sparsely-spaced anatomical landmarks. In step 3, we propose to use a long short-term memory (LSTM) network to identify closely-spaced landmarks, which is rather difficult to obtain using other standard detection networks. We used a very challenging CBCT dataset of 50 patients with a high-degree of craniomaxillofacial (CMF) variability that is realistic in clinical practice. Complementary to the quantitative analysis, the qualitative visual inspection was conducted for distinct CBCT scans from 250 patients with high anatomical variability. We have also shown feasibility of the proposed work in an independent dataset from MICCAI Head-Neck Challenge (2015) achieving the state-of-the-art performance
Link: https://arxiv.org/abs/1810.04021
====================================================
Glioma Segmentation with Cascaded Unet (Dmitry Lachinov - 9 October, 2018)
We evaluate the presented approach on the BraTS 2018 dataset and discuss the results.
Link: https://arxiv.org/abs/1810.04008
====================================================
Image-to-Video Person Re-Identification by Reusing Cross-modal Embeddings (Zhongwei Xie - 4 October, 2018)
Currently, state-of-the-art approaches mainly focus on the task-specific data, neglecting the extra information on the different but related tasks
Link: https://arxiv.org/abs/1810.03989
====================================================
Explicit optimal-length locally repairable codes of distance 5 (Allison Beemer - 9 October, 2018)
Locally repairable codes (LRCs) have received significant recent attention as a method of designing data storage systems robust to server failure. For optimal LRCs with minimum distance greater than or equal to 5, block length is bounded by a polynomial function of alphabet size. In this paper, we give explicit constructions of optimal-length (in terms of alphabet size), optimal LRCs with minimum distance equal to 5.
Link: https://arxiv.org/abs/1810.03980
====================================================
Extended Bit-Plane Compression for Convolutional Neural Network Accelerators (Lukas Cavigelli - 1 October, 2018)
We show that an average compression ratio of 4.4x relative to uncompressed data and a gain of 60% over existing method can be achieved for ResNet-34 with a compression block requiring <300 bit of sequential cells and minimal combinational logic.
Link: https://arxiv.org/abs/1810.03979
====================================================
Improving Myocardium Segmentation in Cardiac CT Angiography using Spectral Information (Steffen Bruns - 27 September, 2018)
We propose augmentation of the training data with virtual mono-energetic reconstructions from a spectral CT scanner which show different attenuation levels of the contrast agent. We train a 3D fully convolutional network (FCN) with 10 conventional CCTA images and corresponding virtual mono-energetic reconstructions acquired on a spectral CT scanner, and evaluate on 40 CCTA scans acquired on a conventional CT scanner. We show that training with data augmentation using virtual mono-energetic images improves upon training with only conventional images (Dice similarity coefficient (DSC) 0.895 $\pm$ 0.039 vs. 0.846 $\pm$ 0.125). In comparison, training with data augmentation using linear scaling improves the DSC to 0.890 $\pm$ 0.039. Moreover, combining the results of both augmentation methods leads to a DSC of 0.901 $\pm$ 0.036, showing that both augmentations lead to different local improvements of the segmentations
Link: https://arxiv.org/abs/1810.03968
====================================================
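The Dice similarity coefficient (DSC) reported in the summary above is, by its standard definition, twice the overlap of two segmentations divided by the sum of their sizes. The sketch below is a minimal illustration of that formula on sets of voxel indexes; the function name and the convention of returning 1.0 for two empty segmentations are assumptions of this sketch, not details from the paper.

```python
from typing import AbstractSet, Hashable


def dice_coefficient(
    predicted: AbstractSet[Hashable], reference: AbstractSet[Hashable]
) -> float:
    """Return 2 * |P & R| / (|P| + |R|) for two sets of voxel indexes."""
    total = len(predicted) + len(reference)
    if total == 0:
        # Assumed convention: two empty segmentations agree perfectly.
        return 1.0
    return 2.0 * len(predicted & reference) / total


# Two toy segmentations sharing 2 of their 3 voxels each.
score = dice_coefficient({1, 2, 3}, {2, 3, 4})
```

A score of 1 means the two segmentations coincide and 0 means they do not overlap at all.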
Conditional Generative Refinement Adversarial Networks for Unbalanced Medical Image Semantic Segmentation (Mina Rezaei - 9 October, 2018)
The proposed architecture shows state-of-the-art results on LiTS-2017 for liver lesion segmentation, and two microscopic cell segmentation datasets MDA231, PhC-HeLa
Link: https://arxiv.org/abs/1810.03871
====================================================
Deep Attentive Tracking via Reciprocative Learning (Shi Pu - 9 October, 2018)
Extensive experiments on large-scale benchmark datasets show that the proposed attentive tracking method performs favorably against the state-of-the-art approaches.
Link: https://arxiv.org/abs/1810.03851
====================================================
Learning Bounds for Greedy Approximation with Explicit Feature Maps from Multiple Kernels (Shahin Shahrampour - 9 October, 2018)
Our empirical results show that given a fixed number of explicit features, the method can achieve a lower test error with a smaller time cost, compared to the state-of-the-art in data-dependent random features.
Link: https://arxiv.org/abs/1810.03817
====================================================
Deep residual networks for automatic sleep stage classification of raw polysomnographic waveforms (Alexander Neergaard Olesen - 8 October, 2018)
Briefly, the raw data is passed through 50 convolutional layers before subsequent classification into one of five sleep stages. Three model configurations were trained on 1850 polysomnogram recordings and subsequently tested on 230 independent recordings. Our best performing model yielded an accuracy of 84.1% and a Cohen's kappa of 0.746, improving on previous reported results by other groups also using only raw polysomnogram data. Most errors were made on non-REM stage 1 and 3 decisions, errors likely resulting from the definition of these stages
Link: https://arxiv.org/abs/1810.03745
====================================================
Bootstrapped CNNs for Building Segmentation on RGB-D Aerial Imagery (Clint Sebastian - 8 October, 2018)
Second, the proposed method outperforms the non-bootstrapped version by utilizing only one-sixth of the original training data and it obtains a precision-recall break-even of 95.10% on our aerial imagery dataset.
Link: https://arxiv.org/abs/1810.03570
====================================================
Multilingual sequence-to-sequence speech recognition: architecture, transfer learning, and language modeling (Jaejin Cho - 4 October, 2018)
In this work, we attempt to use data from 10 BABEL languages to build a multilingual seq2seq model as a prior model, and then port it to 4 other BABEL languages using a transfer learning approach. Experimental results show that the transfer learning approach from the multilingual model yields substantial gains over monolingual models across all 4 BABEL languages
Link: https://arxiv.org/abs/1810.03459
====================================================
Phrase-Based Attentions (Phi Xuan Nguyen - 30 September, 2018)
We incorporate our phrase-based attentions into the recently proposed Transformer network, and demonstrate that our approach yields improvements of 1.3 BLEU for English-to-German and 0.5 BLEU for German-to-English translation tasks on WMT newstest2014 using WMT'16 training data.
Link: https://arxiv.org/abs/1810.03444
====================================================
State of the Art Optical Character Recognition of 19th Century Fraktur Scripts using Open Source Engines (Christian Reul - 8 October, 2018)
The experiments show that training mixed models with real data is superior to training with synthetic data and that the novel OCR engine Calamari outperforms the other engines considerably, on average reducing ABBYY's character error rate (CER) by over 70%, resulting in an average CER below 1%.
Link: https://arxiv.org/abs/1810.03436
====================================================
Deep learning cardiac motion analysis for human survival prediction (Ghalib A. Bello - 8 October, 2018)
This dense motion model formed the input to a supervised denoising autoencoder (4Dsurvival), which is a hybrid network consisting of an autoencoder that learns a task-specific latent code representation trained on observed outcome data, yielding a latent representation optimised for survival prediction. In a study of 302 patients the predictive accuracy (quantified by Harrell's C-index) was significantly higher (p < .0001) for our model C=0.73 (95% CI: 0.68 - 0.78) than the human benchmark of C=0.59 (95% CI: 0.53 - 0.65)
Link: https://arxiv.org/abs/1810.03382
====================================================
Multi-Stream Opportunistic Network Decoupling: Relay Selection and Interference Management (Huifa Lin - 8 October, 2018)
For interference management, each source node sends $S \,(1 \le S \le M)$ data streams to selected relay nodes with random beamforming for the first hop, while each destination node receives its desired $S$ streams from the selected relay nodes via opportunistic interference alignment for the second hop, where $M$ is the number of antennas at each source or destination node
Link: https://arxiv.org/abs/1810.03298
====================================================
Guiding Intelligent Surveillance System by learning-by-synthesis gaze estimation (Tongtong Zhao - 8 October, 2018)
We show a significant improvement over using synthetic images, and achieve state-of-the-art results on various datasets including MPIIGaze dataset.
Link: https://arxiv.org/abs/1810.03286
====================================================
A look at the topology of convolutional neural networks (Rickard Brüel Gabrielsson - 7 October, 2018)
In this paper we use topological data analysis to investigate what various CNNs learn. We show that the weights of convolutional layers at depths from 1 through 13 learn simple global structures
Link: https://arxiv.org/abs/1810.03234
====================================================
SVIn2: Sonar Visual-Inertial SLAM with Loop Closure for Underwater Navigation (Sharmin Rahman - 7 October, 2018)
The state-of-the-art visual-inertial state estimation package OKVIS has been significantly augmented to accommodate acoustic data from sonar and depth measurements from pressure sensor, along with visual and inertial data in a non-linear optimization-based framework
Link: https://arxiv.org/abs/1810.03200
====================================================
Reinforcement Evolutionary Learning Method for self-learning (Kumarjit Pathak - 7 October, 2018)
Quantitative research is the most widespread application of data science in the marketing and financial domains, where the applicability of state-of-the-art reinforcement learning for auto-learning is a less explored paradigm
Link: https://arxiv.org/abs/1810.03198
====================================================
NCARD: Improving Neighborhood Construction by Apollonius Region Algorithm based on Density (Shahin Pourbahrami - 7 October, 2018)
The proposed algorithm is more accurate than the state-of-the-art and well-known algorithms up to almost 8-13% in real and artificial data sets.
Link: https://arxiv.org/abs/1810.03084
====================================================
Online Center of Mass Estimation for a Humanoid Wheeled Inverted Pendulum Robot (Munzir Zafar - 6 October, 2018)
Experiments were performed on a 19 DoF WIP, in which we manually acquired the data for the learned set of poses and show that the mass model produced by a gradient descent produces a CoM estimate that improves overall control and efficiency
Link: https://arxiv.org/abs/1810.03076
====================================================
Robustness via Retrying: Closed-Loop Robotic Manipulation with Self-Supervised Learning (Frederik Ebert - 6 October, 2018)
Our real-world experiments demonstrate that a model trained with 160 robot hours of autonomously collected, unlabeled data is able to successfully perform complex manipulation tasks with a wide range of objects not seen during training.
Link: https://arxiv.org/abs/1810.03043
====================================================
Characterizing Deep-Learning I/O Workloads in TensorFlow (Steven W. D. Chien - 6 October, 2018)
The performance of Deep-Learning (DL) computing frameworks relies on the performance of data ingestion and checkpointing. We find that increasing the number of threads increases TensorFlow bandwidth by a maximum of 2.3x and 7.8x on our benchmark environments. Using a burst buffer to checkpoint to fast, small-capacity storage and asynchronously copy the checkpoints to slower, large-capacity storage resulted in a performance improvement of 2.6x with respect to checkpointing directly to slower storage on our benchmark environment.
Link: https://arxiv.org/abs/1810.03035
====================================================
Gendered behavior as a disadvantage in open source software development (Balazs Vedres - 6 October, 2018)
Using data on entire careers of users from github.com, we develop a measure to capture the gendered pattern of behavior: We use a random forest prediction of being female (as opposed to being male) by behavioral choices in the level of activity, specialization in programming languages, and choice of partners. We find that 84.5% of women's disadvantage (compared to men) in success and 34.8% of their disadvantage in survival are due to the female pattern of their behavior
Link: https://arxiv.org/abs/1810.03005
====================================================
Camera Model Identification Using Convolutional Neural Networks (Artur Kuzin - 6 October, 2018)
In the current work, we describe our Deep Learning approach to the camera detection task of 10 cameras as a part of the Camera Model Identification Challenge hosted by Kaggle.com where our team finished 2nd out of 582 teams with the accuracy on the unseen data of 98%
Link: https://arxiv.org/abs/1810.02981
====================================================
Dissecting Apple's Meta-CDN during an iOS Update (Jeremias Blendin - 6 October, 2018)
Furthermore, by analyzing data from a European Eyeball ISP, we quantify third-party traffic offloading effects and find third-party CDNs increase their traffic by 438% while saturating seemingly unrelated links.
Link: https://arxiv.org/abs/1810.02978
====================================================
Understanding Recurrent Neural Architectures by Analyzing and Synthesizing Long Distance Dependencies in Benchmark Sequential Datasets (Abhijit Mahalunkar - 6 October, 2018)
At present, the state-of-the-art computational models across a range of sequential data processing tasks, including language modeling, are based on recurrent neural network architectures. Finally, we demonstrate how understanding the characteristics of the LDDs in a dataset can inform better hyper-parameter selection for current state-of-the-art recurrent neural architectures and also aid in understanding them...
Link: https://arxiv.org/abs/1810.02966
====================================================
Towards Self-Tuning Parameter Servers (Chris Liu - 6 October, 2018)
Nowadays, it is common to see industrial-strength machine learning jobs that involve millions of model parameters, terabytes of training data, and weeks of training. Experiments show that our techniques can reduce the completion times of a variety of long-running TensorFlow jobs from 1.4x to 18x.
Link: https://arxiv.org/abs/1810.02935
====================================================
POIReviewQA: A Semantically Enriched POI Retrieval and Question Answering Dataset (Gengchen Mai - 5 October, 2018)
To study the challenging task of semantically enriching POIs from unstructured data in order to support open-domain search and question answering (QA), we introduce a new dataset POIReviewQA. It consists of 20k questions (e.g., "is this restaurant dog friendly?") for 1022 Yelp business types. For each question we sampled 10 reviews, and annotated each sentence in the reviews whether it answers the question and what the corresponding answer is. We build a Lucene-based baseline model, which achieves 77.0% AUC and 48.8% MAP. A sentence embedding-based model achieves 79.2% AUC and 41.8% MAP, indicating that the dataset presents a challenging problem for future research by the GIR community
Link: https://arxiv.org/abs/1810.02802
====================================================
RCCNet: An Efficient Convolutional Neural Network for Histological Routine Colon Cancer Nuclei Classification (Shabbeer Basha S H - 30 September, 2018)
The experiments are conducted over publicly available routine colon cancer histological dataset "CRCHistoPhenotypes". The proposed method has achieved a classification accuracy of 80.61% and 0.7887 weighted average F1 score
Link: https://arxiv.org/abs/1810.02797
====================================================
Automatic Detection of Arousals during Sleep using Multiple Physiological Signals (Saman Parvaneh - 5 October, 2018)
The data for each subject in the training set was split to 30-second epochs with no overlap. A total of 428 features from EEG, EMG, EOG, airflow, and SaO2 in each epoch were extracted and used for creating subject-specific models based on an ensemble of bagged classification trees, resulting in 943 models. For marking arousal and non-arousal regions in the test set, the data in the test set was split to 30-second epochs with 50% overlaps. Using the PhysioNet/CinC Challenge 2018 scoring criteria, AUPRCs of 0.25 and 0.21 were achieved for the in-house test and blind test sets, respectively.
Link: https://arxiv.org/abs/1810.02726
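The epoching step in this entry (non-overlapping 30-second windows for training, 50% overlapping windows for testing) can be sketched as follows. This is a minimal illustration and not the authors' code: the function name `split_epochs`, the 10 Hz sampling rate, and the toy signal are all assumptions made for the example.

```python
def split_epochs(signal, fs, epoch_sec=30, overlap=0.0):
    """Split a 1-D sequence into fixed-length epochs.

    signal: sequence of samples; fs: sampling rate in Hz (assumed integer);
    overlap: fraction of each epoch shared with the next one.
    Trailing samples that do not fill a whole epoch are dropped.
    """
    size = int(epoch_sec * fs)
    step = int(size * (1.0 - overlap))
    if size <= 0 or step <= 0:
        raise ValueError("epoch length and step must be positive")
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, step)]


# Toy example: 120 s of samples at a hypothetical 10 Hz sampling rate.
fs = 10
signal = list(range(120 * fs))
train_epochs = split_epochs(signal, fs, overlap=0.0)  # training-style split
test_epochs = split_epochs(signal, fs, overlap=0.5)   # test-style split
```

With 50% overlap each sample (away from the edges) falls into two epochs, so per-epoch predictions can be combined when marking regions; how the authors combined them is not stated in this excerpt.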
====================================================
Convex Clustering: Model, Theoretical Guarantee and Efficient Algorithm (Defeng Sun - 4 October, 2018)
Extensive numerical experiments on both simulated and real data demonstrate that our algorithm is highly efficient and robust for solving large-scale problems. In particular, our algorithm is able to solve a convex clustering problem with 200,000 points in $\mathbb{R}^3$ in about 6 minutes.
Link: https://arxiv.org/abs/1810.02677
====================================================
FingerVision Tactile Sensor Design and Slip Detection Using Convolutional LSTM Network (Yazhan Zhang - 5 October, 2018)
The data collection process takes advantage of the human sense of slip: a human hand holds 12 everyday objects, interacts with the sensor skin, and labels the data as slip or non-slip based on the human feeling of slip. Our slip classification framework achieves a high accuracy of 97.62% on the test dataset
Link: https://arxiv.org/abs/1810.02653
====================================================
Integrating Weakly Supervised Word Sense Disambiguation into Neural Machine Translation (Xiao Pu - 5 October, 2018)
We first introduce three adaptive clustering algorithms for WSD, based on k-means, Chinese restaurant processes, and random walks, which are then applied to large word contexts represented in a low-rank space and evaluated on SemEval shared-task data. The improvements are above one BLEU point over strong NMT baselines, +4% accuracy over all ambiguous nouns and verbs, or +20% when scored manually over several challenging words.
Link: https://arxiv.org/abs/1810.02614
====================================================
A Comparative Survey of Optical Wireless Technologies: Architectures and Applications (Mostafa Zaman Chowdhury - 5 October, 2018)
A 100 Gb/s data rate has already been demonstrated through OWC. It offers services indoors as well as outdoors, and communication distances range from several nm to more than 10000 km
Link: https://arxiv.org/abs/1810.02594
====================================================
AIRNet: Self-Supervised Affine Registration for 3D Medical Images using Neural Networks (Evelyn Chee - 5 October, 2018)
But since it is costly to manually identify the transformation parameters between any two images, we leverage the abundance of cheap unlabelled data to generate a synthetic dataset for the training of the model. Experiments demonstrate that our approach achieves better overall performance on registration of images from different patients and modalities with 100x speed-up in execution time.
Link: https://arxiv.org/abs/1810.02583
====================================================
Towards High Resolution Video Generation with Progressive Growing of Sliced Wasserstein GANs (Dinesh Acharya - 4 October, 2018)
In addition, our model reaches a record inception score of 14.57 on the unsupervised action recognition dataset UCF-101.
Link: https://arxiv.org/abs/1810.02419
====================================================
Transfer Learning via Unsupervised Task Discovery for Visual Question Answering (Hyeonwoo Noh - 3 October, 2018)
However, it is not straightforward how the visual concepts should be captured and transferred to visual question answering models due to missing link between question dependent answering models and visual data without question or task specification. We tackle this problem in two steps: 1) learning a task conditional visual classifier based on unsupervised task discovery and 2) transferring and adapting the task conditional visual classifier to visual question answering models
Link: https://arxiv.org/abs/1810.02358
====================================================
Neural-Symbolic VQA: Disentangling Reasoning from Vision and Language Understanding (Kexin Yi - 4 October, 2018)
First, executing programs on a symbolic space is more robust to long program traces; our model can solve complex reasoning tasks better, achieving an accuracy of 99.8% on the CLEVR dataset
Link: https://arxiv.org/abs/1810.02338
====================================================
Computer vision-based framework for extracting geological lineaments from optical remote sensing data (Ehsan Farahbakhsh - 4 October, 2018)
We test the proposed framework on Landsat 8 data of a mineral-rich portion of the Gascoyne Province in Western Australia using different dimension reduction techniques and convolutional filters
Link: https://arxiv.org/abs/1810.02320
====================================================
A Convergence Analysis of Gradient Descent for Deep Linear Neural Networks (Sanjeev Arora - 4 October, 2018)
We analyze speed of convergence to global optimum for gradient descent training a deep linear neural network (parameterized as $x\mapsto W_N \cdots W_1x$) by minimizing the $\ell_2$ loss over whitened data. Moreover, in the important case of output dimension 1, i.e
Link: https://arxiv.org/abs/1810.02281
====================================================
DATA Agent (Michael Cerny Green - 28 September, 2018)
Findings from a user study with 30 participants playing through two games of DATA Agent show that the game is easy and fun to play, and that the mysteries it generates are straightforward to solve.
Link: https://arxiv.org/abs/1810.02251
====================================================
Real Differences between OT and CRDT for Co-Editors (Chengzheng Sun - 4 October, 2018)
CRDT (Commutative Replicated Data Type) for co-editors was first proposed around 2006, under the name of WOOT (WithOut Operational Transformation)
Link: https://arxiv.org/abs/1810.02137
====================================================
Learning Finer-class Networks for Universal Representations (Julien Girard - 4 October, 2018)
Many real-world visual recognition use cases cannot directly benefit from state-of-the-art CNN-based approaches because of the lack of sufficient annotated data. We show that our method learns more universal representations than the state of the art, leading to significantly better results on 10 target tasks from multiple domains, using several network architectures, either alone or combined with networks learned at a coarser semantic level.
Link: https://arxiv.org/abs/1810.02126
====================================================
Longest Property-Preserved Common Factor (Lorraine A. K. Ayad - 4 October, 2018)
In the first setting, we are given a string $x$ and we are asked to construct a data structure over $x$ answering the following type of on-line queries: given string $y$, find a longest square-free factor common to $x$ and $y$. In the second setting, we are given $k$ strings and an integer $1 < k'\leq k$ and we are asked to find a longest periodic factor common to at least $k'$ strings
Link: https://arxiv.org/abs/1810.02099
====================================================
FSS++ Workshop Report: Handling Uncertainty for Data Quality Management (Anna Wilbik - 4 October, 2018)
This report describes the results of the eSCF Awareness Workshop on Handling Uncertainty for Data Quality Management - Challenges from Transport and Supply Chain Management that was held on June 5, 2018 in Heeze, The Netherlands
Link: https://arxiv.org/abs/1810.02091
====================================================
Towards Fast and Energy-Efficient Binarized Neural Network Inference on FPGA (Cheng Fu - 4 October, 2018)
Binarized Neural Network (BNN) removes bitwidth redundancy in classical CNN by using a single bit (-1/+1) for network parameters and intermediate representations, which has greatly reduced the off-chip data transfer and storage overhead. By analyzing local properties of images and the learned BNN kernel weights, we observe an average of $\sim$78% input similarity and $\sim$59% weight similarity among weight kernels, measured by our proposed metric in common network architectures
Link: https://arxiv.org/abs/1810.02068
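To illustrate why single-bit (-1/+1) parameters and activations are cheap to process, the sketch below computes a dot product of two {-1,+1} vectors with XOR and a bit count. It shows the general binarized-arithmetic identity only; it is not the similarity-based acceleration proposed in the paper, and the helper names `encode` and `binary_dot` are hypothetical.

```python
def encode(values):
    """Pack a list of -1/+1 values into an integer (+1 -> bit 1, -1 -> bit 0)."""
    bits = 0
    for i, v in enumerate(values):
        if v not in (-1, 1):
            raise ValueError("values must be -1 or +1")
        if v == 1:
            bits |= 1 << i
    return bits


def binary_dot(a_bits, b_bits, n):
    """Dot product of two length-n {-1,+1} vectors from their bit encodings.

    Matching positions contribute +1 and differing positions contribute -1,
    so dot = n - 2 * popcount(a XOR b).
    """
    differing = bin(a_bits ^ b_bits).count("1")
    return n - 2 * differing


a = [1, -1, 1, 1, -1, -1, 1, -1]
b = [1, 1, -1, 1, -1, 1, 1, -1]
reference = sum(x * y for x, y in zip(a, b))          # ordinary dot product
fast = binary_dot(encode(a), encode(b), len(a))       # XOR + popcount version
```

Both computations give the same value; the bitwise form replaces n multiplications with one XOR and one population count per machine word.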
====================================================
Gradient descent aligns the layers of deep linear networks (Ziwei Ji - 3 October, 2018)
This paper establishes risk convergence and asymptotic weight matrix alignment --- a form of implicit regularization --- of gradient flow and gradient descent when applied to deep linear networks on linearly separable data. In more detail, for gradient flow applied to strictly decreasing loss functions (with similar results for gradient descent with particular decreasing step sizes): (i) the risk converges to 0; (ii) the normalized i-th weight matrix asymptotically equals its rank-1 approximation $u_iv_i^{\top}$; (iii) these rank-1 matrices are aligned across layers, meaning $|v_{i+1}^{\top}u_i|\to1$
Link: https://arxiv.org/abs/1810.02032
====================================================
Transfer Incremental Learning using Data Augmentation (Ghouthi Boukli Hacene - 3 October, 2018)
Deep learning-based methods have reached state of the art performances, relying on large quantity of available data and computational power
Link: https://arxiv.org/abs/1810.02020
====================================================
Improving High Contention OLTP Performance via Transaction Scheduling (Guna Prasaad - 3 October, 2018)
We observe that most transactional workloads, including those with high contention, can be divided into clusters of data conflict-free transactions and a small set of residuals. We evaluate Strife against the optimistic concurrency control protocol and several variants of two-phase locking, where the latter is known to perform better than other concurrency protocols under high contention, and show that Strife can improve transactional throughput by up to 2x
Link: https://arxiv.org/abs/1810.01997
====================================================
The Blackbird Dataset: A large-scale dataset for UAV perception in aggressive flight (Amado Antonini - 3 October, 2018)
The Blackbird unmanned aerial vehicle (UAV) dataset is a large-scale, aggressive indoor flight dataset collected using a custom-built quadrotor platform for use in evaluation of agile perception. Inspired by the potential of future high-speed fully-autonomous drone racing, the Blackbird dataset contains over 10 hours of flight data from 168 flights over 17 flight trajectories and 5 environments at velocities up to $7.0\,\mathrm{m\,s^{-1}}$
Link: https://arxiv.org/abs/1810.01987
====================================================
CoverBLIP: accelerated and scalable iterative matched-filtering for Magnetic Resonance Fingerprint reconstruction (Mohammad Golbabaee - 3 October, 2018)
Our further examinations on both synthetic and real-world datasets, using different sampling strategies, indicate a reduction of 2 to 3 orders of magnitude in total search computations
Link: https://arxiv.org/abs/1810.01967
====================================================
CRED: A Deep Residual Network of Convolutional and Recurrent Units for Earthquake Signal Detection (S. Mostafa Mousavi - 3 October, 2018)
It learns the time-frequency characteristics of the dominant phases in an earthquake signal from three-component data recorded on a single station. We train the network using 500,000 seismograms (250k associated with tectonic earthquakes and 250k identified as noise) recorded in Northern California and test it, obtaining an F-score of 99.95. Our model is able to detect more than 700 microearthquakes as small as -1.3 ML induced during hydraulic fracturing far away from the training region
Link: https://arxiv.org/abs/1810.01965
====================================================
Machine Learning Suites for Online Toxicity Detection (David Noever - 3 October, 2018)
We systematically evaluate 62 classifiers representing 19 major algorithmic families against features extracted from the Jigsaw dataset of Wikipedia comments. Among 28 features of syntax, sentiment, emotion and outlier word dictionaries, a simple bad word list proves most predictive of offensive commentary.
Link: https://arxiv.org/abs/1810.01869
====================================================
Deep processing of structured data (Łukasz Maziarka - 3 October, 2018)
Moreover, its direct application to text and graph data yields results close to SOTA, using simpler networks with fewer parameters than competing models.
Link: https://arxiv.org/abs/1810.01868
====================================================
Robust online identification of thermal models for in-production HPC clusters with machine learning-based data selection (Federico Pittino - 3 October, 2018)
However, we also show that: 1) not all real workloads allow for the identification of a good model; 2) starting from the theory of system identification, it is very difficult to evaluate whether a trace of data leads to a good estimated model. We also show that only with deep learning techniques can these traces be chosen correctly, up to 96% of the time.
Link: https://arxiv.org/abs/1810.01865
====================================================
Task-Oriented Hand Motion Retargeting for Dexterous Manipulation Imitation (Dafni Antotsiou - 3 October, 2018)
Imitating those actions with dexterous hand models involves different important and challenging steps: acquiring human hand information, retargeting it to a hand model, and learning a policy from acquired data. We tackle the retargeting problem from the hand pose to a 29 DoF hand model by combining inverse kinematics and PSO with a task objective optimisation
Link: https://arxiv.org/abs/1810.01845
====================================================
Reinventing Data Stores for Video Analytics (Tiantu Xu - 3 October, 2018)
It streams video data from disks through the decoder to operators and runs queries at up to 362x video realtime.
Link: https://arxiv.org/abs/1810.01794
====================================================
A Robot Localization Framework Using CNNs for Object Detection and Pose Estimation (Lukas Hoyer - 3 October, 2018)
Additionally, we propose a process to generate the necessary training data. The framework was evaluated with 3 different robot types and various identification patterns. We achieved up to 98% mAP@IOU0.5 and only 1.6° orientation error, running with a frame rate of 50 Hz on a GPU.
Link: https://arxiv.org/abs/1810.01665
====================================================
Extreme Augmentation : Can deep learning based medical image segmentation be trained using a single manually delineated scan? (Bilwaj Gaonkar - 3 October, 2018)
Almost every computer vision model trained on imaging data uses some form of augmentation. In the extreme, we observed that a model trained on patches extracted from just one scan, with each patch augmented 50 times, achieved a Dice score of 0.73 on a validation set of 40 cases. When the initial patches are extracted from nine scans, the average Dice coefficient jumps to 0.86 and most of the false positives disappear. While this still falls short of state-of-the-art deep learning based segmentation of discs reported in the literature, qualitative examination reveals that it does yield segmentations that expert clinicians can amend with minimal effort to generate additional data for training improved deep models
Link: https://arxiv.org/abs/1810.01621
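For readers unfamiliar with the Dice score quoted in this entry, the sketch below computes it for two binary masks using the standard definition, Dice = 2|A ∩ B| / (|A| + |B|). The function name, the flat-list mask representation, and the empty-mask convention are assumptions for illustration; the paper's own evaluation code is not shown in this excerpt.

```python
def dice_coefficient(pred, truth):
    """Dice overlap between two equal-length binary masks (flat sequences)."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Convention (an assumption): two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy masks: 3 predicted voxels, 3 true voxels, 2 in common.
pred = [0, 1, 1, 1, 0, 0]
truth = [0, 0, 1, 1, 1, 0]
score = dice_coefficient(pred, truth)  # 2 * 2 / (3 + 3)
```

A score of 1.0 means the masks coincide exactly and 0.0 means they share no foreground voxels.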
====================================================
A Deep Learning Architecture for De-identification of Patient Notes: Implementation and Evaluation (Kaung Khin - 2 October, 2018)
We test this architecture on two gold standard datasets and show that the architecture achieves state-of-the-art performance on both data sets while also converging faster than other systems without the use of dictionaries or other knowledge sources.
Link: https://arxiv.org/abs/1810.01570
====================================================
Deep Learning Based Caching for Self-Driving Car in Multi-access Edge Computing (Anselme Ndikumana - 2 October, 2018)
However, retrieving entertainment contents from the Data Center (DC) can hinder the content delivery service due to the high delay of car-to-DC communication. The simulation results show that our prediction of the contents that need to be cached in the areas of the self-driving car achieves an accuracy of 98.04% and that our approach can minimize delay.
Link: https://arxiv.org/abs/1810.01548
====================================================
An Analysis of Approaches Taken in the ACM RecSys Challenge 2018 for Automatic Music Playlist Continuation (Hamed Zamani - 2 October, 2018)
Given a playlist of arbitrary length with some additional meta-data, the task was to recommend up to 500 tracks that fit the target characteristics of the original playlist. In total, 113 teams submitted 1,228 runs to the main track; 33 teams submitted 239 runs to the creative track. The highest performing team in the main track achieved an R-precision of 0.2241, an NDCG of 0.3946, and an average number of recommended songs clicks of 1.784. In the creative track, an R-precision of 0.2233, an NDCG of 0.3939, and a click rate of 1.785 were obtained by the best team
Link: https://arxiv.org/abs/1810.01520
====================================================
Opinion Formation Threshold Estimates from Different Combinations of Social Media Data-Types (Derrik E. Asher - 2 October, 2018)
The present study estimates population opinion formation thresholds by querying 2222 participants about the number of various social media data-types (i.e., images, videos, and/or messages) that they would need to passively consume to form opinions. Opinion formation is assessed across three dimensions, 1) data-type(s), 2) context, and 3) source
Link: https://arxiv.org/abs/1810.01501
====================================================
Submodular Optimization in the MapReduce Model (Paul Liu - 2 October, 2018)
In practice, these problems often involve large amounts of data, and must be solved in a distributed way. In this paper, we present two simple algorithms for cardinality constrained submodular optimization in the MapReduce model: the first is a $(1/2-o(1))$-approximation in 2 MapReduce rounds, and the second is a $(1-1/e-\epsilon)$-approximation in $\frac{1+o(1)}{\epsilon}$ MapReduce rounds.
Link: https://arxiv.org/abs/1810.01489
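As background for the problem this entry addresses, the sketch below runs the classic sequential greedy algorithm for cardinality-constrained maximization of a monotone submodular function, here set coverage, for which greedy is known to give a (1-1/e)-approximation. It is a single-machine baseline and not either of the paper's MapReduce algorithms; the function name and the toy sets are assumptions.

```python
def greedy_max_coverage(sets, k):
    """Pick up to k sets greedily, each time taking the largest marginal gain.

    sets: dict mapping a name to a set of covered items.
    Returns the chosen names in selection order and the union they cover.
    """
    covered = set()
    chosen = []
    for _ in range(k):
        best, best_gain = None, 0
        for name, items in sets.items():
            if name in chosen:
                continue
            gain = len(items - covered)  # marginal gain of adding this set
            if gain > best_gain:
                best, best_gain = name, gain
        if best is None:  # no remaining set adds anything new
            break
        chosen.append(best)
        covered |= sets[best]
    return chosen, covered


# Toy instance with a cardinality constraint of k = 2.
sets = {"A": {1, 2, 3, 4}, "B": {3, 4, 5}, "C": {5, 6}, "D": {1, 6}}
chosen, covered = greedy_max_coverage(sets, 2)
```

Each greedy step depends on the previous one, which is what makes a direct distributed implementation need many rounds and motivates algorithms with few MapReduce rounds.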
====================================================
CELLO-3D: Estimating the Covariance of ICP in the Real World (David Landry - 2 October, 2018)
Then, we set out to estimate the covariance of ICP registrations through a data-driven approach, with over 5 100 000 registrations on 1020 pairs from real 3D point clouds
Link: https://arxiv.org/abs/1810.01470
====================================================
Unsupervised Machine Learning of Open Source Russian Twitter Data Reveals Global Scope and Operational Characteristics (Christopher Griffin - 2 October, 2018)
We developed and used a collection of statistical methods (unsupervised machine learning) to extract relevant information from a Twitter supplied data set consisting of alleged Russian trolls who (allegedly) attempted to influence the 2016 US Presidential election. Using natural language processing, manifold learning and Fourier analysis, we identify an operation that includes not only the 2016 US election, but also the French National and both local and national German elections
Link: https://arxiv.org/abs/1810.01466
====================================================
Efficient Dialog Policy Learning via Positive Memory Retention (Rui Zhao - 2 October, 2018)
However, the collection of the required data in the form of conversations between chat-bots and human agents is time-consuming and expensive. We show that our method is 10 times more sample-efficient than policy gradients in extensive experiments on a new synthetic number guessing game
Link: https://arxiv.org/abs/1810.01371
====================================================
On Self Modulation for Generative Adversarial Networks (Ting Chen - 2 October, 2018)
While reminiscent of other conditioning techniques, it requires no labeled data. In a large-scale empirical study we observe a relative decrease of $5\%-35\%$ in FID. Furthermore, all else being equal, adding this modification to the generator leads to improved performance in $124/144$ ($86\%$) of the studied settings
Link: https://arxiv.org/abs/1810.01365
====================================================
Landmine Detection Using Autoencoders on Multi-polarization GPR Volumetric Data (Paolo Bestagini - 2 October, 2018)
Experiments conducted on real data show that the proposed technique requires little training and no ad-hoc data pre-processing to achieve accuracy higher than 93% on challenging datasets.
Link: https://arxiv.org/abs/1810.01316
====================================================
Findings of the E2E NLG Challenge (Ondřej Dušek - 2 October, 2018)
The E2E NLG shared task aims to assess whether these novel approaches can generate better-quality output by learning from a dataset containing higher lexical richness, syntactic complexity and diverse discourse phenomena. We compare 62 systems submitted by 17 institutions, covering a wide range of approaches, including machine learning architectures -- with the majority implementing sequence-to-sequence models (seq2seq) -- as well as systems based on grammatical rules and templates.
Link: https://arxiv.org/abs/1810.01170
====================================================
Target Aware Network Adaptation for Efficient Representation Learning (Yang Zhong - 2 October, 2018)
Experimental results by the method on five datasets (Flower102, CUB200-2011, Dog120, MIT67, and Stanford40) show favorable accuracies over the related state-of-the-art techniques while enhancing the computational and storage efficiency of the transferred model.
Link: https://arxiv.org/abs/1810.01104
====================================================
A Unified Framework for Clustering Constrained Data without Locality Property (Hu Ding - 1 October, 2018)
The simplex lemma (or weaker simplex lemma) enables us to efficiently approximate the mean (or median) point of an unknown set of points by searching a small-size grid, independent of the dimensionality of the space, in a simplex (or the surrounding region of a simplex), and thus can be used to handle high dimensional data. If $k$ and $\frac{1}{\epsilon}$ are fixed numbers, our framework generates, in nearly linear time (i.e., $O(n(\log n)^{k+1}d)$), $O((\log n)^{k})$ $k$-tuple candidates for the $k$ mean or median points, and one of them induces a $(1+\epsilon)$-approximation for $k$-CMeans or $k$-CMedian, where $n$ is the number of points
Link: https://arxiv.org/abs/1810.01049
====================================================
Reinforcement Learning with Perturbed Rewards (Jingkang Wang - 5 October, 2018)
Our framework draws upon approaches for supervised learning with noisy data. For instance, the state-of-the-art PPO algorithm is able to obtain 67.5% and 46.7% improvements on average on five Atari games, when the error rates are 10% and 30% respectively.
Link: https://arxiv.org/abs/1810.01032
====================================================
Fighting Against XSS Attacks: A Usability Evaluation of OWASP ESAPI Output Encoding (Chamila Wijayarathna - 1 October, 2018)
However, XSS still being ranked as one of the most critical vulnerabilities in web applications suggests that programmers are not effectively using those APIs to encode untrusted data. Therefore, we conducted an experimental study with 10 programmers where they attempted to fix XSS vulnerabilities of a web application using the output encoding functionality of OWASP ESAPI. Results revealed 3 types of mistakes that programmers made which resulted in them failing to fix the application by removing XSS vulnerabilities. We also identified 16 usability issues of OWASP ESAPI
Link: https://arxiv.org/abs/1810.01017
====================================================
Heterogeneous MacroTasking (HeMT) for Parallel Processing in the Public Cloud (Yuquan Shan - 1 October, 2018)
As representative results, Spark with HeMT offers about 10% better average completion times for realistic data processing workloads over the default system.
Link: https://arxiv.org/abs/1810.00988
====================================================
Efficient and Accurate Abnormality Mining from Radiology Reports with Customized False Positive Reduction (Nithya Attaluri - 1 October, 2018)
The difficulty is heightened for medical imaging, where data itself is limited in accessibility and labeling requires costly time and effort by trained medical specialists. Using this approach, we label more than 175,000 Head CT studies for the presence of 33 features indicative of 11 clinically relevant conditions. For 27 of the 30 keywords that yielded positive results (3 had no occurrences), the lower bound of the confidence intervals created to estimate the percentage of accurately labeled reports was above 85%, with the average being above 95%
Link: https://arxiv.org/abs/1810.00967
====================================================
RGB-D Object Detection and Semantic Segmentation for Autonomous Manipulation in Clutter (Max Schwarz - 1 October, 2018)
We evaluate our approach on two challenging data sets: one captured for the Amazon Picking Challenge 2016, where our team NimbRo came in second in the Stowing and third in the Picking task, and one captured in disaster-response scenarios
Link: https://arxiv.org/abs/1810.00818
====================================================
Improving the Generalization of Adversarial Training with Domain Adaptation (Chuanbiao Song - 1 October, 2018)
Empirical evaluations demonstrate that ATDA can greatly improve the generalization of adversarial training and achieves state-of-the-art results on standard benchmark datasets.
Link: https://arxiv.org/abs/1810.00740
====================================================
Automatic Data Expansion for Customer-care Spoken Language Understanding (Shahab Jalalvand - 27 September, 2018)
The process starts with training a preliminary NLU model based on logistic regression on the in-domain data. Using these n-grams, we find the samples in the out-of-domain corpus that 1) contain the desired n-gram and/or 2) have a similar intent label. Our results on two divergent experimental setups show that the proposed approach reduces the absolute classification error rate (CER) by 30% compared to the preliminary models and significantly outperforms traditional data expansion algorithms such as those based on semi-supervised learning, TF-IDF and embedding vectors.
Link: https://arxiv.org/abs/1810.00670
====================================================
Wronging a Right: Generating Better Errors to Improve Grammatical Error Detection (Sudhanshu Kasewa - 26 September, 2018)
Our approach yields error-filled artificial data that helps a vanilla bi-directional LSTM to outperform the previous state of the art at grammatical error detection, and a previously introduced model to gain further improvements of over 5% $F_{0.5}$ score. When attempting to determine if a given sentence is synthetic, a human annotator at best achieves 39.39 $F_1$ score, indicating that our model generates mostly human-like instances.
Link: https://arxiv.org/abs/1810.00668
====================================================
Perfect Match: A Simple Method for Learning Representations For Counterfactual Inference With Neural Networks (Patrick Schwab - 3 October, 2018)
Our experiments demonstrate that PM outperforms a number of more complex state-of-the-art methods in inferring counterfactual outcomes across several real-world and semi-synthetic datasets.
Link: https://arxiv.org/abs/1810.00656
====================================================
One-Click Annotation with Guided Hierarchical Object Detection (Adithya Subramanian - 1 October, 2018)
The experiment conducted on the PASCAL VOC dataset revealed that annotations created with our approach achieve a mAP of 0.995 and a recall of 0.903. Our approach shows an overall improvement of 8.5% in mean average precision and 18.6% in recall on the KITTI dataset, and of 69.6% and 36% on the CITYSCAPES dataset
Link: https://arxiv.org/abs/1810.00609
====================================================
Unsupervised Trajectory Segmentation and Promoting of Multi-Modal Surgical Demonstrations (Zhenzhou Shao - 1 October, 2018)
Extensive experiments on the public JIGSAWS dataset show that our method achieves much higher segmentation accuracy than state-of-the-art methods in less time.
Link: https://arxiv.org/abs/1810.00599
====================================================
Generative Adversarial Network for Medical Images (MI-GAN) (Talha Iqbal - 1 October, 2018)
The proposed model achieves a Dice coefficient of 0.837 on the STARE dataset and 0.832 on the DRIVE dataset, which is state-of-the-art performance on both datasets.
Link: https://arxiv.org/abs/1810.00551
====================================================
End-To-End Alzheimer's Disease Diagnosis and Biomarker Identification (Soheil Esmaeilzadeh - 1 October, 2018)
Our model can diagnose AD with an accuracy of 94.1% on the popular ADNI dataset using only MRI data, which outperforms the previous state-of-the-art
Link: https://arxiv.org/abs/1810.00523
====================================================
Hybrid Noise Removal in Hyperspectral Imagery With a Spatial-Spectral Gradient Network (Qiang Zhang - 30 September, 2018)
The simulated and real-data experiments undertaken in this study confirmed that the proposed SSGN performs better at mixed noise removal than the other state-of-the-art HSI denoising algorithms, in evaluation indices, visual assessments, and time consumption.
Link: https://arxiv.org/abs/1810.00495
====================================================
AgriColMap: Aerial-Ground Collaborative 3D Mapping for Precision Farming (Ciro Potena - 30 September, 2018)
We evaluate our system using real world data for 3 fields with different crop species
Link: https://arxiv.org/abs/1810.00457
====================================================
Optical Illusions Images Dataset (Robert Max Williams - 30 September, 2018)
In this paper we present a dataset of 6725 illusion images gathered from two websites, and a smaller dataset of 500 hand-picked images
Link: https://arxiv.org/abs/1810.00415
====================================================
Resource Management in Fog/Edge Computing: A Survey (Cheol-Ho Hong - 29 September, 2018)
Contrary to using distant and centralized cloud data center resources, employing decentralized resources at the edge of a network for processing data closer to user devices, such as smartphones and tablets, is an upcoming computing paradigm, referred to as fog/edge computing. This article reviews publications as early as 1991, with 85% of the publications between 2013-2018, to identify and classify the architectures, infrastructure, and underlying algorithms for managing resources in fog/edge computing.
Link: https://arxiv.org/abs/1810.00305
====================================================
DIMENSION: Dynamic MR Imaging with Both K-space and Spatial Prior Knowledge Obtained via Multi-Supervised Network Training (Shanshan Wang - 9 October, 2018)
The comparisons with k-t FOCUSS, k-t SLR, L+S and the state-of-the-art CNN method on in vivo datasets show our method can achieve improved reconstruction results in shorter time.
Link: https://arxiv.org/abs/1810.00302
====================================================
Tithonus: A Bitcoin Based Censorship Resilient System (Ruben Recabarren - 29 September, 2018)
When compared to state-of-the-art Bitcoin writing solutions, Tithonus reduces the cost of transferring data to censored clients by 2 orders of magnitude and increases the goodput by 3 to 5 orders of magnitude
Link: https://arxiv.org/abs/1810.00279
====================================================
MultiWOZ - A Large-Scale Multi-Domain Wizard-of-Oz Dataset for Task-Oriented Dialogue Modelling (Paweł Budzianowski - 29 September, 2018)
To address this fundamental obstacle, we introduce the Multi-Domain Wizard-of-Oz dataset (MultiWOZ), a fully-labeled collection of human-human written conversations spanning over multiple domains and topics. At a size of 10k dialogues, it is at least one order of magnitude larger than all previous annotated task-oriented corpora
Link: https://arxiv.org/abs/1810.00278
====================================================
Towards Better Summarizing Bug Reports with Crowdsourcing Elicited Attributes (He Jiang - 28 September, 2018)
Then, we propose a new method named Crowd-Attribute to infer new effective attributes from the crowd-generated data in crowdsourcing and develop a new tool named Crowdsourcing Software Engineering Platform to facilitate this method. With Crowd-Attribute, we successfully construct 11 new attributes and propose a new supervised algorithm named Logistic Regression with Crowdsourced Attributes (LRCA). Experiments over both the public data set SDS with 36 manually annotated bug reports and new large-scale data sets demonstrate that LRCA can consistently outperform the state-of-the-art algorithms for bug report summarization.
Link: https://arxiv.org/abs/1810.00125
====================================================
Open-Ended Content-Style Recombination Via Leakage Filtering (Karl Ridgeway - 28 September, 2018)
Using this method for data-set augmentation, we obtain state-of-the-art performance on few-shot learning tasks.
Link: https://arxiv.org/abs/1810.00110
====================================================
Cell Grid Architecture for Maritime Route Prediction on AIS Data Streams (Ciprian Amariei - 28 September, 2018)
The 2018 Grand Challenge targets the problem of accurate predictions on data streams produced by automatic identification system (AIS) equipment, describing naval traffic
Link: https://arxiv.org/abs/1810.00090
====================================================
Active Fairness in Algorithmic Decision Making (Alejandro Noriega-Campero - 28 September, 2018)
We show on real-world datasets that these can achieve: 1) calibration and single error parity (e.g., equal opportunity); and 2) parity in both false positive and false negative rates (i.e., equal odds)
Link: https://arxiv.org/abs/1810.00031
====================================================
Universal and Dynamic Locally Repairable Codes with Maximal Recoverability via Sum-Rank Codes (Umberto Martínez-Peñas - 28 September, 2018)
Furthermore, the local linear codes (thus the localities, local distances and local fields) can be efficiently and dynamically modified without global recoding or changes in architecture or outer code, while preserving MR, easily adapting to new hot and cold data. Reed-Solomon codes with local replication and Cartesian products are recovered from the given construction when $ r=1 $ and $ h = 0 $, respectively
Link: https://arxiv.org/abs/1809.11158
====================================================
Learning Recurrent Binary/Ternary Weights (Arash Ardakani - 28 September, 2018)
Recurrent neural networks (RNNs) have shown excellent performance in processing sequence data. Ultimately, we show that LSTMs with binary/ternary weights can achieve up to 12x memory saving and 10x inference speedup compared to the full-precision implementation on an ASIC platform.
Link: https://arxiv.org/abs/1809.11086
====================================================
Reuse and Adaptation for Entity Resolution through Transfer Learning (Saravanan Thirumuruganathan - 28 September, 2018)
Entity resolution (ER) is one of the fundamental problems in data integration, where machine learning (ML) based classifiers often provide the state-of-the-art results. We have performed comprehensive experiments on 12 datasets from 5 different domains (publications, movies, songs, restaurants, and books)
Link: https://arxiv.org/abs/1809.11084
====================================================
Can female fertility management mobile apps be sustainable and contribute to female health care? Harnessing the power of patient generated data ; Analysis of the organizations active in this e-Health segment (Maki Miyamoto - 28 September, 2018)
These patient-generated data (PGD) reflect patients everyday behaviors including physical activity, mood, diet, sleep, and symptoms. The research question: Can female fertility management mobile apps be sustainable and contribute to female health care, is researched by a combination of academic literature study, testing of 7 essential hypotheses, and a limited user driven experimental demand analysis
Link: https://arxiv.org/abs/1809.11042
====================================================
CNNs Fusion for Building Detection in Aerial Images for the Building Detection Challenge (Rémi Delassus - 28 September, 2018)
We enhanced the SpaceNet Challenge winning solution by proposing a new fusion strategy based on a deep combiner that uses both the segmentation results of different CNNs and the input data to segment. Segmentation results for all cities have been significantly improved (from a 1% improvement over the baseline for the smallest one to more than 7% for the largest one)
Link: https://arxiv.org/abs/1809.10976
====================================================
Domain Generalization with Domain-Specific Aggregation Modules (Antonio D'Innocente - 28 September, 2018)
Experiments on two different benchmark databases show the power of our approach, reaching the new state of the art in domain generalization.
Link: https://arxiv.org/abs/1809.10966
====================================================
Pull-based Bloom Filter-based Routing for Information-Centric Networks (Ali Marandi - 28 September, 2018)
In Named Data Networking (NDN), there is a need for routing protocols to populate Forwarding Information Base (FIB) tables so that the Interest messages can be forwarded. Bloom Filter-based Routing approaches like BFR [1] use Bloom Filters (BFs) to advertise all provided content objects, which consumes valuable bandwidth and storage resources
Link: https://arxiv.org/abs/1809.10948
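As background for the entry above, the following is a minimal sketch of the Bloom filter data structure itself, not of BFR or the paper's pull-based protocol. The class name, the sizes, the SHA-256 salting scheme, and the content names are all assumptions chosen for illustration.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter: k hash positions per item over an m-slot array."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for readability

    def _positions(self, item):
        # Derive k positions by salting SHA-256 with the hash index
        # (an illustrative choice, not taken from the paper).
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def __contains__(self, item):
        # True may be a false positive; False is always correct.
        return all(self.bits[pos] for pos in self._positions(item))


# Hypothetical content names that a node might advertise.
advertised = BloomFilter()
for name in ["/video/a", "/video/b", "/docs/readme"]:
    advertised.add(name)
```

A membership test such as `"/video/a" in advertised` never misses an added name, while an absent name can occasionally test positive; this is the trade-off that lets a fixed-size filter stand in for a full list of content objects.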
====================================================
cISP: A Speed-of-Light Internet Service Provider (Debopam Bhattacherjee - 10 October, 2018)
We thus explore the design of cost-effective wide-area networks that move data over paths very close to great-circle paths, at speeds very close to the speed of light in vacuum. We show that instantiations of cISP across the contiguous United States and Europe would achieve mean latencies within 5% of that achievable using great-circle paths at the speed of light, over medium and long distances
Link: https://arxiv.org/abs/1809.10897
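The latency bound that the entry above refers to can be sketched as follows, assuming a spherical Earth and the haversine formula. The function names and the city coordinates are illustrative assumptions, and the 1.05 factor is only one reading of "within 5%".

```python
import math

EARTH_RADIUS_KM = 6371.0         # mean Earth radius (spherical approximation)
C_VACUUM_KM_PER_S = 299_792.458  # speed of light in vacuum


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def c_latency_ms(distance_km):
    """One-way latency in milliseconds at the speed of light in vacuum."""
    return distance_km / C_VACUUM_KM_PER_S * 1000.0


# Approximate coordinates for New York and Los Angeles (assumed for illustration).
d = great_circle_km(40.71, -74.01, 34.05, -118.24)
bound = c_latency_ms(d)   # lower bound on one-way latency for this city pair
budget = 1.05 * bound     # "within 5%" of the bound, under the reading above
```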
====================================================
A model for system developers to measure the privacy risk of data (Awanthika Senarath - 28 September, 2018)
In this paper, we propose a model that could be used by system developers to measure the privacy risk perceived by users when they disclose data into software systems. We first derive a model to measure the perceived privacy risk based on existing knowledge and then we test our model through a survey with 151 participants
Link: https://arxiv.org/abs/1809.10884
====================================================
Graph Generation via Scattering (Dongmian Zou - 28 September, 2018)
These results are in contrast to experience with Euclidean data, where it is difficult to form a generative scattering network that performs as well as state-of-the-art methods
Link: https://arxiv.org/abs/1809.10851
====================================================
Generative Adversarial Active Learning for Unsupervised Outlier Detection (Yezheng Liu - 27 September, 2018)
We empirically compare the proposed approach with several state-of-the-art outlier detection methods on both synthetic and real-world datasets
Link: https://arxiv.org/abs/1809.10816
====================================================
FanStore: Enabling Efficient and Scalable I/O for Distributed Deep Learning (Zhao Zhang - 27 September, 2018)
With the techniques of system call interception, distributed metadata management, and generic data compression, FanStore provides a POSIX-compliant interface with native hardware throughput in an efficient and scalable manner. Our experiments with benchmarks and real applications show that FanStore can scale DL training to 512 compute nodes with over 90\% scaling efficiency.
Link: https://arxiv.org/abs/1809.10799
====================================================
Estimation of Personalized Effects Associated With Causal Pathways (Razieh Nabi - 27 September, 2018)
For example, we may wish to maximize the chemical effect of a drug given data from an observational study where the chemical effect of the drug on the outcome is entangled with the indirect effect mediated by differential adherence. [16] shows how to combine mediation analysis and dynamic treatment regime ideas to define policies associated with causal pathways and counterfactual responses to these policies
Link: https://arxiv.org/abs/1809.10791
====================================================
Deep Object Pose Estimation for Semantic Robotic Grasping of Household Objects (Jonathan Tremblay - 27 September, 2018)
Using synthetic data generated in this manner, we introduce a one-shot deep neural network that is able to perform competitively against a state-of-the-art network trained on a combination of real and synthetic data. To our knowledge, this is the first deep network trained only on synthetic data that is able to achieve state-of-the-art performance on 6-DoF object pose estimation
Link: https://arxiv.org/abs/1809.10790
====================================================
Semantic Topic Analysis of Traffic Camera Images (Jeffrey Liu - 27 September, 2018)
We apply the Latent Dirichlet Allocation (LDA) topic model to decompose the label data into a small number of semantic topics. To illustrate our approach, we use freeway camera images collected from the Boston area between December 2017-January 2018
Link: https://arxiv.org/abs/1809.10707
====================================================
dynamicMF: A Matrix Factorization Approach to Monitor Resource Usage in High Performance Computing Systems (Niyazi Sorkunlu - 26 September, 2018)
Results on resource usage data collected from the Lonestar 4 system at the Texas Advanced Computing Center show that the identified anomalies are correlated with actual anomalous events reported in the system log messages.
Link: https://arxiv.org/abs/1809.10624
====================================================
Acoustic Probing for Estimating the Storage Time and Firmness of Tomatoes and Mandarin Oranges (Hidetomo Kataoka - 27 September, 2018)
We performed cross validation by using this data set. The average estimation errors of storage time and firmness for tomatoes were 0.89 days and 9.47 g/mm2. Those for mandarin oranges were 1.67 days and 15.67 g/mm2
Link: https://arxiv.org/abs/1809.10581
====================================================
Closest-Pair Queries in Fat Rectangles (Sang Won Bae - 27 September, 2018)
In the range closest pair problem, we want to construct a data structure storing a set $S$ of $n$ points in the plane, such that for any axes-parallel query rectangle $R$, the closest pair in the set $R \cap S$ can be reported. The currently best result for this problem is by Xue et al.~(SoCG 2018)
Link: https://arxiv.org/abs/1809.10531
====================================================
No New-Net (Fabian Isensee - 27 September, 2018)
By incorporating region based training, additional training data and a simple postprocessing technique, we obtain dice scores of 81.01, 90.83 and 85.44 and Hausdorff Distances (95th percentile) of 2.54, 4.97 and 7.
Link: https://arxiv.org/abs/1809.10483
====================================================
Sample Efficient Adaptive Text-to-Speech (Yutian Chen - 27 September, 2018)
The experiments show that these approaches are successful at adapting the multi-speaker neural network to new speakers, obtaining state-of-the-art results in both sample naturalness and voice similarity with merely a few minutes of audio data from new speakers.
Link: https://arxiv.org/abs/1809.10460
====================================================
Queue-based Resampling for Online Class Imbalance Learning (Kleanthis Malialis - 27 September, 2018)
Results on two popular benchmark datasets demonstrate the effectiveness of queue-based resampling over state-of-the-art methods in terms of learning speed and quality.
Link: https://arxiv.org/abs/1809.10388
====================================================
Deeply Informed Neural Sampling for Robot Motion Planning (Ahmed H. Qureshi - 26 September, 2018)
DeepSMP's neural architecture comprises a Contractive AutoEncoder, which encodes given workspaces directly from raw point cloud data, and a Dropout-based stochastic deep feedforward neural network, which takes the workspace encoding and the start and goal configurations and iteratively generates feasible samples for SMPs to compute end-to-end collision-free optimal paths. The results show that on average our method is at least 7 times faster in the point-mass and rigid-body cases and about 28 times faster in the 6-link robot case than the existing state-of-the-art.
Link: https://arxiv.org/abs/1809.10252
====================================================
Classifying Mammographic Breast Density by Residual Learning (Jingxu Xu - 21 September, 2018)
The proposed method was instantiated with the INbreast dataset and classification accuracies of 92.6% and 96.8% were obtained for the four BI-RADS (Breast Imaging and Reporting Data System) category task and the two BI-RADS category task, respectively
Link: https://arxiv.org/abs/1809.10241
====================================================
Left Ventricle Segmentation and Quantification from Cardiac Cine MR Images via Multi-task Learning (Shusil Dangi - 26 September, 2018)
We performed a five fold cross-validation of the myocardium segmentation obtained from the proposed multi-task network on 97 patient 4-dimensional cardiac cine-MRI datasets available through the STACOM LV segmentation challenge against the provided gold-standard myocardium segmentation, obtaining a Dice overlap of $0.849 \pm 0.036$ and mean surface distance of $0.274 \pm 0.083$ mm, while simultaneously estimating the myocardial area with mean absolute difference error of $205\pm198$ mm$^2$.
Link: https://arxiv.org/abs/1809.10221
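The Dice overlap reported in the entry above is a standard measure of agreement between two segmentations. A minimal sketch on flattened binary masks follows; the function name, the toy masks, and the empty-mask convention are assumptions for illustration and are not taken from the paper.

```python
def dice_coefficient(pred, truth):
    """Dice = 2 * |A intersect B| / (|A| + |B|) for two equal-length binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Convention assumed here: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy flattened masks (1 = myocardium pixel, 0 = background).
pred = [0, 1, 1, 1, 0, 0]
truth = [0, 1, 1, 0, 1, 0]
score = dice_coefficient(pred, truth)  # 2 * 2 / (3 + 3)
```

A score of 1 means the predicted and reference masks coincide, and 0 means they share no foreground pixels.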
====================================================
Unsupervised Adversarial Invariance (Ayush Jaiswal - 26 September, 2018)
Our unsupervised model outperforms state-of-the-art methods, which are supervised, at inducing invariance to inherent nuisance factors, effectively using synthetic data augmentation to learn invariance, and domain adaptation
Link: https://arxiv.org/abs/1809.10083
====================================================
Learning short-term past as predictor of human behavior in commercial buildings (Romana Markovic - 17 September, 2018)
The addressed sequence duration was in the range between 30 and 240 time-steps of indoor climate data, where the applied temporal discretization was one minute. The results showed that the optimal predictive performance was achieved when 60 time-steps of the indoor climate data were used as input. The prediction accuracy, measured as F1 score, dropped from 0.51 to 0.27 when the prediction target was shifted from 10 to 60 minutes into the future.
Link: https://arxiv.org/abs/1809.10020
====================================================
A Novel Online Stacked Ensemble for Multi-Label Stream Classification (Alican Büyükçakır - 26 September, 2018)
We conduct experiments with 4 GOOWE-ML-based multi-label ensembles and 7 baseline models on 7 real-world datasets from diverse areas of interest
Link: https://arxiv.org/abs/1809.09994
====================================================
Satellite Imagery Multiscale Rapid Detection with Windowed Networks (Adam Van Etten - 24 September, 2018)
The proposed approach allows comparison of the performance of these four frameworks, and can rapidly detect objects of vastly different scales with relatively little training data over multiple sensors. airplanes versus airports) we find that using two different detectors at different scales is very effective with negligible runtime cost. We evaluate large test images at native resolution and find mAP scores of 0.2 to 0.8 for vehicle localization, with the YOLT architecture achieving both the highest mAP and fastest inference speed.
Link: https://arxiv.org/abs/1809.09978
====================================================
Morphed Learning: Towards Privacy-Preserving for Deep Learning Based Applications (Juncheng Shen - 20 September, 2018)
Theoretical analyses on CIFAR-10 dataset and VGG-16 network show that our method is capable of providing 10^89 morphing possibilities with only 5% computational overhead and 10% transmission overhead under limited knowledge attack scenario
Link: https://arxiv.org/abs/1809.09968
====================================================
Time-Series Prediction of Proximal Aggression Onset in Minimally-Verbal Youth with Autism Spectrum Disorder Using Physiological Biosignals (Ozan Ozdenizci - 14 September, 2018)
We implement ridge-regularized logistic regression models on physiological biosensor data wirelessly recorded from 15 MV-ASD youth over 64 independent naturalistic observations in a hospital inpatient unit. Our results demonstrate proof-of-concept, feasibility, and incipient validity predicting aggression onset 1 minute before it occurs using global, person-dependent, and hybrid classifier models.
Link: https://arxiv.org/abs/1809.09948
====================================================
GPU Accelerated Similarity Self-Join for Multi-Dimensional Data (Michael Gowanlock - 26 September, 2018)
Across most scenarios on real-world and synthetic datasets, our algorithm outperforms the parallel state-of-the-art approach
Link: https://arxiv.org/abs/1809.09930
====================================================
Performance and sensitivities of home detection from mobile phone data (Maarten Vanhoof - 26 September, 2018)
In this paper, we present an extensive empirical analysis of home detection methods when performed on a nation-wide mobile phone dataset from France. We analyze the validity of 9 different Home Detection Algorithms (HDAs), and we assess different sources of uncertainty. Based on 225 different set-ups for the home detection of around 18 million users we discuss different measures for validation and investigate sensitivity to user choices such as HDA parameter choice and observation period restriction. Our findings show that nation-wide performance of home detection is moderate at best, with correlations to ground truth maximizing at 0.60 only
Link: https://arxiv.org/abs/1809.09911
====================================================
Active Learning for Deep Object Detection (Clemens-Alexander Brust - 26 September, 2018)
All methods are evaluated systematically in a continuous exploration context on the PASCAL VOC 2012 dataset.
Link: https://arxiv.org/abs/1809.09875
====================================================
Deep contextualized word representations for detecting sarcasm and irony (Suzana Ilić - 25 September, 2018)
We test our model on 7 different datasets derived from 3 different data sources, providing state-of-the-art performance in 6 of them, and otherwise offering competitive results.
Link: https://arxiv.org/abs/1809.09795
====================================================
Surface Type Estimation from GPS Tracked Bicycle Activities (Nitish Nag - 25 September, 2018)
In this work, we use a computationally inexpensive and simple method that relies only on GPS data from a human-powered cyclist. We show that, among our methods, decision trees performed best, with an accuracy of 86\%
Link: https://arxiv.org/abs/1809.09745
====================================================
Optimizing the Human-Machine Partnership with Zooniverse (Lucy Fortson - 25 September, 2018)
With over 120 projects built reaching nearly 1.7 million volunteers, the Zooniverse.org platform has led the way in the application of Citizen Science as a method for closing the Big Data analysis gap. Since the launch in 2007 of the Galaxy Zoo project, the Zooniverse platform has enabled significant contributions across many disciplines; e.g., in ecology, humanities, and astronomy. To cope with the larger datasets looming on the horizon such as astronomy's Large Synoptic Survey Telescope (LSST) or the 100's of TB from ecology projects annually, Zooniverse has been researching a system design that is optimized for efficiency in task assignment and incorporating human and machine classifiers into the classification engine
Link: https://arxiv.org/abs/1809.09738
====================================================
Security and Performance Considerations in ROS 2: A Balancing Act (Jongkil Kim - 24 September, 2018)
Robot Operating System (ROS) 2 is a ground-up re-design of ROS 1 to support performance critical cyber-physical systems (CPSs) using the Data Distribution Service (DDS) middleware. Accordingly, the security of ROS 2 is highly reliant on the security of its DDS communication protocol. To accomplish this, we evaluate the latency and throughput of the communication protocols of ROS 2 in both wired and wireless networks, and measure the efficiency loss caused by the enabling of security protocols such as Virtual Private Network (VPN) and DDS security protocol in ROS 2 in both network setups. The result can be directly used by robotics developers to find the optimal and balanced settings of ROS 2 applications. The results of this work can be used to enhance the security of ROS 2.
Link: https://arxiv.org/abs/1809.09566
====================================================
Fine-Tuning VGG Neural Network For Fine-grained State Recognition of Food Images (Kaoutar Ben Ahmed - 8 September, 2018)
A small-scale dataset consisting of 5978 images of seven categories was constructed and annotated manually
Link: https://arxiv.org/abs/1809.09529
====================================================
Antilizer: Run Time Self-Healing Security for Wireless Sensor Networks (Ivana Tomic - 25 September, 2018)
Our results show that Antilizer reduces data loss down to 1% (4% on average), with operational overheads of less than 1% and provides fast network-wide convergence.
Link: https://arxiv.org/abs/1809.09426
====================================================
RapidHARe: A computationally inexpensive method for real-time human activity recognition from wearable sensors (Roman Chereshnev - 25 September, 2018)
Here, we present a new method called RapidHARe for real-time human activity recognition based on modeling the distribution of raw data in a half-second context window using dynamic Bayesian networks. Moreover, in performance, RapidHARe achieves an F1 score of 94.27\% and accuracy of 98.94\%, and when compared to ANN, RNN, HMM, it reduces the F1-score error rate by 45\%, 65\%, and 63\% and the accuracy error rate by 41\%, 55\%, and 62\%, respectively
Link: https://arxiv.org/abs/1809.09412
====================================================
Pre and Post-hoc Diagnosis and Interpretation of Malignancy from Breast DCE-MRI (Gabriel Maicas - 25 September, 2018)
Relying on experiments on a breast DCE-MRI dataset that contains scans of 117 patients, our results show that the post-hoc method is more accurate for diagnosing the whole volume per patient, achieving an AUC of 0.91, while the pre-hoc method achieves an AUC of 0.81
Link: https://arxiv.org/abs/1809.09404
====================================================
An Efficient Framework for Implementing Persist Data Structures on Remote NVM (Teng Ma - 25 September, 2018)
Specifically, thanks to operation batching, local memory caching and efficient concurrency control, the throughput of operations on eight widely used data structures is improved by 6$\sim$22 $\times$ without weakening the consistency guarantees.
Link: https://arxiv.org/abs/1809.09395
====================================================
Why scatter plots suggest causality, and what we can do about it (Carl T. Bergstrom - 25 September, 2018)
To avoid suggesting a causal relationship between the x and y values in a scatter plot, we propose a new type of data visualization, the diamond plot. Diamond plots are essentially 45 degree rotations of ordinary scatter plots; by visually jarring the viewer they clearly indicate that she should not draw the usual distinction between independent/predictor variable and dependent/response variable
Link: https://arxiv.org/abs/1809.09328
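The 45 degree rotation behind the diamond plot in the entry above can be sketched with a plain 2D rotation. The counterclockwise direction, the function name, and the sample points are assumptions made for illustration; the summary does not state these details.

```python
import math


def rotate_45(points):
    """Rotate (x, y) points by 45 degrees counterclockwise about the origin."""
    c = math.cos(math.pi / 4)
    s = math.sin(math.pi / 4)
    return [(c * x - s * y, s * x + c * y) for x, y in points]


# Sample data points (hypothetical).
points = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
rotated = rotate_45(points)
```

Under this rotation the new horizontal coordinate is proportional to x - y and the new vertical coordinate to x + y, so neither original variable sits alone on the horizontal axis.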
====================================================
Object Detection from Scratch with Deep Supervision (Zhiqiang Shen - 24 September, 2018)
We evaluate our method on PASCAL VOC 2007, 2012 and COCO datasets
Link: https://arxiv.org/abs/1809.09294
====================================================
Covfefe: A Computer Vision Approach For Estimating Force Exertion (Vaneet Aggarwal - 24 September, 2018)
Based on the data collected from 20 subjects, features extracted from the face videos give 90\% accuracy in classifying between the 100\% dataset and the combination of the 0\% and 50\% datasets
Link: https://arxiv.org/abs/1809.09293
====================================================
Tunable Measures for Information Leakage and Applications to Privacy-Utility Tradeoffs (Jiachun Liao - 24 September, 2018)
This measure quantifies the maximal gain of an adversary in refining a tilted version of its posterior belief of any (potentially random) function of a data set conditioning on a released data set. For $α\in(1,\infty)$ this measure is shown to be the Arimoto channel capacity. We show that under a hard distortion constraint, both the optimal mechanism and the optimal tradeoff are invariant for any $α>1$, and the tunable leakage measure only behaves as either of the two extrema, i.e., mutual information for $α=1$ and maximal leakage for $α=\infty$.
Link: https://arxiv.org/abs/1809.09231
====================================================
Towards Automated Post-Earthquake Inspections with Deep Learning-based Condition-Aware Models (Vedhus Hoskere - 24 September, 2018)
Researchers typically envisage the use of unmanned aerial vehicles (UAV) for data acquisition and computer vision for data processing to extract actionable information. The proposed methodology was implemented on a damaged building that was surveyed by the authors after the Central Mexico Earthquake in September 2017 and qualitatively evaluated
Link: https://arxiv.org/abs/1809.09195
====================================================
Deep Confidence: A Computationally Efficient Framework for Calculating Reliable Errors for Deep Neural Networks (Isidro Cortes-Ciriano - 24 September, 2018)
Using a set of 24 diverse IC50 data sets from ChEMBL 23, we show that Snapshot Ensembles perform on par with Random Forest (RF) and ensembles of independently trained deep neural networks
Link: https://arxiv.org/abs/1809.09060
====================================================
Lexical Bias In Essay Level Prediction (Georgios Balikas - 21 September, 2018)
In this work I present the system "balikasg" that achieved the state-of-the-art performance in the CAp 2018 data science challenge among 14 systems
Link: https://arxiv.org/abs/1809.08935
====================================================
Robotics Rights and Ethics Rules (Tuncay Yigit - 24 September, 2018)
With Industry 4.0, the internet of things, data analysis and automation have begun to be of great importance in our lives. With the Japanese version of Industry 5.0, it has come to our attention that machine-human interaction and human intelligence are working in harmony with the cognitive computer
Link: https://arxiv.org/abs/1809.08885
====================================================
Classify, predict, detect, anticipate and synthesize: Hierarchical recurrent latent variable models for human activity modeling (Judith Bütepage - 24 September, 2018)
We train our models on data extracted from depth image streams from the Cornell Activity 120, the UTKinect-Action3D and the Stony Brook University Kinect Interaction Dataset
Link: https://arxiv.org/abs/1809.08875
====================================================
Person Identification using Seismic Signals generated from Footfalls (Bodhibrata Mukhopadhyay - 24 September, 2018)
We have tested our biometric system on an indigenous database (created by us) containing 46000 footfall events from 8 individuals and achieved an accuracy of 73%, 90% and 95% in case of 1, 5 and 10 footsteps per sample. DS8BP compresses the original footfall events (sampled at 8 kHz) by a factor of 108 and also acts as a smoothing filter
Link: https://arxiv.org/abs/1809.08783
====================================================
Neural Transductive Learning and Beyond: Morphological Generation in the Minimal-Resource Setting (Katharina Kann - 23 September, 2018)
On a 52-language benchmark dataset, we outperform the previous state of the art by up to 9.71% absolute accuracy.
Link: https://arxiv.org/abs/1809.08733
====================================================
Recognizing Film Entities in Podcasts (Ahmet Salih Gundogdu - 23 September, 2018)
Evaluating on a diverse set of podcasts, we demonstrate more than a 20% increase in F1 score across three baseline approaches when combining fuzzy-matching with a linear model aware of film-specific metadata.
Link: https://arxiv.org/abs/1809.08711
====================================================
Textually Enriched Neural Module Networks for Visual Question Answering (Khyathi Raghavi Chandu - 23 September, 2018)
We achieve 57.1% overall accuracy on the test-dev open-ended questions from the visual question answering (VQA 1.0) real image dataset.
Link: https://arxiv.org/abs/1809.08697
====================================================
Curvilinear Structure Enhancement by Multiscale Top-Hat Tensor in 2D/3D Images (Shuaa S. Alharbi - 23 September, 2018)
The proposed approach is validated on synthetic and real data and is also compared to the state-of-the-art approaches
Link: https://arxiv.org/abs/1809.08678
====================================================
Detecting Hate Speech and Offensive Language on Twitter using Machine Learning: An N-gram and TFIDF based Approach (Aditya Gaydhani - 23 September, 2018)
After tuning the model giving the best results, we achieve 95.6% accuracy upon evaluating it on test data
Link: https://arxiv.org/abs/1809.08651
====================================================
BrainNet: A Multi-Person Brain-to-Brain Interface for Direct Collaboration Between Brains (Linxing Jiang - 23 September, 2018)
Two of the three subjects are "Senders" whose brain signals are decoded using real-time EEG data analysis to extract decisions about whether to rotate a block in a Tetris-like game before it is dropped to fill a line. Five groups of three subjects successfully used BrainNet to perform the Tetris task, with an average accuracy of 0.813
Link: https://arxiv.org/abs/1809.08632
====================================================
Understanding the Gist of Images - Ranking of Concepts for Multimedia Indexing (Lydia Weiland - 23 September, 2018)
Nowadays, where multimedia data is continuously generated, stored, and distributed, multimedia indexing, with its purpose of grouping similar data, becomes more important than ever. Finally, with a MAP of 61.42, it can be shown that the multimedia indexing task benefits from understanding the gist
Link: https://arxiv.org/abs/1809.08593
====================================================
Multi-View Picking: Next-best-view Reaching for Improved Grasping in Clutter (Douglas Morrison - 23 September, 2018)
Where other approaches use a static camera position or fixed data collection routines, our Multi-View Picking (MVP) controller uses an active perception approach to choose informative viewpoints based directly on a distribution of grasp pose estimates in real time, reducing uncertainty in the grasp poses caused by clutter and occlusions. In trials of grasping 20 objects from clutter, our MVP controller achieves 80% grasp success, outperforming a single-viewpoint grasp detector by 12%
Link: https://arxiv.org/abs/1809.08564
====================================================
Adversarial Defense via Data Dependent Activation Function and Total Variation Minimization (Bao Wang - 22 September, 2018)
This data-dependent activation function remarkably improves both classification accuracy and stability to adversarial perturbations. Together with the total variation minimization of adversarial images and augmented training, under the strongest attack, we achieve up to 20.6%, 50.7%, and 68.7% accuracy improvement w.r.t
Link: https://arxiv.org/abs/1809.08516
====================================================
SqueezeSegV2: Improved Model Structure and Unsupervised Domain Adaptation for Road-Object Segmentation from a LiDAR Point Cloud (Bichen Wu - 22 September, 2018)
However, due to domain shift, models trained on synthetic data often do not generalize well to the real world. We address this problem with a domain-adaptation training pipeline consisting of three major components: 1) learned intensity rendering, 2) geodesic correlation alignment, and 3) progressive domain calibration. When training our new model on synthetic data using the proposed domain adaptation pipeline, we nearly double test accuracy on real-world data, from 29.0% to 57.4%
Link: https://arxiv.org/abs/1809.08495
====================================================
Geometric Multi-Model Fitting by Deep Reinforcement Learning (Zongliang Zhang - 22 September, 2018)
In this paper, we have compared our method against the state-of-the-art on simulated data
Link: https://arxiv.org/abs/1809.08397
====================================================
The Privacy Policy Landscape After the GDPR (Thomas Linden - 22 September, 2018)
Via a user study with 530 participants on Amazon Mturk, we discover that the visual presentation of privacy policies has slightly improved in limited data-sensitive categories in addition to the top European websites. We also find that the readability of privacy policies suffers under the GDPR, due to almost 30% more sentences and words, despite the efforts to reduce the reliance on passive sentences. We find evidence for positive changes triggered by the GDPR, with the ambiguity level, averaged over 8 metrics, improving in over 20.5% of the policies. Finally, we show that privacy policies cover more data practices, particularly around data retention, user access rights, and specific audiences, and that an average of 15.2% of the policies improved across 8 compliance metrics
Link: https://arxiv.org/abs/1809.08396
====================================================
Augmenting Input Method Language Model with user Location Type Information (Di He - 21 September, 2018)
This work queried micro-blog posts from Twitter API and location type of these posts from Google Place API, forming a dataset of around 500k samples. An LSTM based prediction experiment found a 2% edge in the accuracy from language models leveraging location type information when compared to a baseline without that information.
Link: https://arxiv.org/abs/1809.08349
====================================================
Generating GraphQL-Wrappers for REST(-like) APIs (Erik Wittern - 21 September, 2018)
We discuss the challenges for creating such wrappers, including dealing with data sanitation, authentication, or handling nested queries. We evaluate OASGraph by running it, as well as an existing open source alternative, against 959 publicly available OAS. This experiment shows that OASGraph outperforms the existing alternative and is able to create a GraphQL wrapper for 89.5% of the APIs -- however, with limitations in many cases
Link: https://arxiv.org/abs/1809.08319
====================================================
A Graphical Bayesian Game for Secure Sensor Activation in Internet of Battlefield Things (Nof Abuzainab - 21 September, 2018)
The utility of each sensor is expressed in terms of the redundancy of the data transmitted, the secrecy capacity and the energy consumed. The reduction in energy consumption reaches up to 98% compared to the baseline, when the number of sensors is 5000.
Link: https://arxiv.org/abs/1809.08207
====================================================
Exclusive Independent Probability Estimation using Deep 3D Fully Convolutional DenseNets for IsoIntense Infant Brain MRI Segmentation (Seyed Raein Hashemi - 27 September, 2018)
Using our training technique based on similarity loss functions and patch prediction fusion we decrease the number of parameters in the network, reduce the complexity of the training process by focusing the attention on fewer tasks, while mitigating the effects of data imbalance between labels and inaccuracies near patch borders. By taking advantage of these strategies we were able to perform fast image segmentation, using a network with fewer parameters than many state-of-the-art networks, being image size independent and overcoming issues such as 3D vs 2D training and large vs small patch size selection, while achieving the top performance in segmenting brain tissue among all methods in the 2017 iSeg challenge
Link: https://arxiv.org/abs/1809.08168
====================================================
Sampler Design for Bayesian Personalized Ranking by Leveraging View Data (Jingtao Ding - 21 September, 2018)
Compared to the vanilla BPR that applies a uniform sampler on all candidates, our view-enhanced sampler enhances BPR with a relative improvement over 37.03% and 16.40% on two real-world datasets
Link: https://arxiv.org/abs/1809.08162
====================================================
Learning Recommender Systems from Multi-Behavior Data (Chen Gao - 21 September, 2018)
Extensive experiments on two real-world datasets demonstrate that NMTR significantly outperforms state-of-the-art recommender systems that are designed to learn from both single-behavior data and multi-behavior data
Link: https://arxiv.org/abs/1809.08161
====================================================
Aspects on Finding the Optimal Practical Programming Exercise for MOOCs (Ralf Teusner - 21 September, 2018)
In this paper, we explore the data of three programming courses to find criteria for optimal practical programming exercises. Based on over 3 million executions and scoring runs of participants' task submissions, we aim to deduce exercise difficulty, student patterns in approaching the tasks and potential flaws in task descriptions and preparatory videos
Link: https://arxiv.org/abs/1809.08056
====================================================
SG-FCN: A Motion and Memory-Based Deep Learning Model for Video Saliency Detection (Meijun Sun - 21 September, 2018)
Extensive experiments in comparison with 11 state-of-the-art methods are carried out, and the results show that our proposed model outperforms all 11 methods across a number of publicly available datasets.
Link: https://arxiv.org/abs/1809.07988
====================================================
SLIDER: Fast and Efficient Computation of Banded Sequence Alignment (Mohammed Alser - 18 September, 2018)
Motivation: The ability to generate massive amounts of sequencing data continues to overwhelm the processing capacity of existing algorithms and compute infrastructures. The addition of SLIDER as a pre-alignment step reduces the execution time of five state-of-the-art sequence aligners by up to 18.8x
Link: https://arxiv.org/abs/1809.07858
====================================================
Internet Protocol Version 6: Dead or Alive? (Sumit Maheshwari - 17 August, 2018)
Internet Protocol (IP) is the narrow waist of the multilayered Internet protocol stack and defines the rules for data sent across networks. IPv4, the fourth version of IP and the first commercially available for deployment, set by ARPANET in 1983, uses 32-bit addresses and can support up to 2^32 devices. In April 2017, all Regional Internet Registries (RIRs) confirmed that IPv4 addresses are exhausted and cannot be allocated anymore, implying that any new organization requesting a block of Internet addresses will be allocated IPv6. Currently, with IPv4 unavailable and IPv6 still not adopted after around 20 years, the question arises whether IPv6 will still be accepted by the computer society or will soon reach its end of life, with a better alternative protocol such as ID-based networks taking its place
Link: https://arxiv.org/abs/1809.07836
====================================================
Rapid Customization for Event Extraction (Yee Seng Chan - 20 September, 2018)
Additionally, the system uses the ACE corpus to train an argument model for extracting Actor, Place, and Time arguments for any event types, including ones not seen in its training data. Experiments show that with less than 10 minutes of human effort per event type, the system achieves good performance for 67 novel event types
Link: https://arxiv.org/abs/1809.07783
====================================================
Specimens as research objects: reconciliation across distributed repositories to enable metadata propagation (Nicky Nicolson - 20 September, 2018)
Following a data mining exercise applied to an aggregated dataset of 19,827,998 specimen records from 292 separate specimen repositories, 36% or 7,102,710 specimens are assessed to participate in duplication relationships, allowing the propagation of metadata among the participants in these relationships, totalling: 93,044 type citations, 1,121,865 georeferences, 1,097,168 images and 2,191,179 scientific name determinations
Link: https://arxiv.org/abs/1809.07725
====================================================
Design and Implementation of High-throughput PCIe with DMA Architecture between FPGA and PowerPC (Kun Cheng - 17 September, 2018)
A data throughput of more than 666 MBytes/s (memory write with data from FPGA to PowerPC) has been achieved with the single PCIe Gen1 x8 endpoint of this design; the PowerPC and FPGA can send memory write requests to each other.
Link: https://arxiv.org/abs/1809.07702
====================================================
A Microbenchmark Characterization of the Emu Chick (Jeffrey Young - 7 September, 2018)
Rather than transferring large amounts of data across power-hungry, high-latency interconnects, the Emu Chick moves lightweight thread contexts to near-memory cores before the beginning of each memory read. AsHES 2018) of the memory bandwidth characteristics of the system through benchmarks like STREAM, pointer chasing, and sparse matrix-vector multiplication. Moreover, the Emu Chick provides stable, predictable performance with up to 65% of the peak bandwidth utilization on a random-access pointer chasing benchmark with weak locality.
Link: https://arxiv.org/abs/1809.07696
====================================================
Autonomous Driving System Design for Formula Student Driverless Racecar (Hanqing Tian - 19 September, 2018)
Detection algorithm of the racecar also implements a precise and high rate localization method which combines the GPS-INS data and LIDAR odometry. This paper also briefly introduces the Formula Student Autonomous Competition (FSAC) in 2017.
Link: https://arxiv.org/abs/1809.07636
====================================================
DuPLO: A DUal view Point deep Learning architecture for time series classificatiOn (Roberto Interdonato - 20 September, 2018)
Nowadays, modern Earth Observation systems continuously generate huge amounts of data. A notable example is represented by the Sentinel-2 mission, which provides images at high spatial resolution (up to 10m) with high temporal revisit period (every 5 days), which can be organized in Satellite Image Time Series (SITS)
Link: https://arxiv.org/abs/1809.07589
====================================================
Assessing the quality of home detection from mobile phone data for official statistics (Maarten Vanhoof - 20 September, 2018)
We support our argument by analysing the performance of five home detection algorithms (HDAs) that have been applied to a large, French, Call Detailed Record (CDR) dataset (~18 million users, 5 months). Our results show that criteria choice in HDAs influences the detection of home locations for up to about 40% of users, that HDAs perform poorly when compared with a validation dataset (the 35°-gap), and that their performance is sensitive to the time period and the duration of observation
Link: https://arxiv.org/abs/1809.07567
====================================================
OxIOD: The Dataset for Deep Inertial Odometry (Changhao Chen - 20 September, 2018)
Our dataset contains 158 sequences totalling more than 42 km in total distance, much larger than previous inertial datasets
Link: https://arxiv.org/abs/1809.07491
====================================================
SoaAlloc: Accelerating Single-Method Multiple-Objects Applications on GPUs (Matthias Springer - 19 September, 2018)
SoaAlloc is the first allocator for GPUs that (a) arranges allocations in a SIMD-friendly Structure of Arrays (SOA) data layout, (b) provides a do-all operation for maximizing the benefit of SOA, and (c) is on par with state-of-the-art memory allocators for raw (de)allocation time. Our benchmarks show that the SOA layout leads to significantly better memory bandwidth utilization, resulting in a 2x speedup of application code.
Link: https://arxiv.org/abs/1809.07444
====================================================
Ranking Distillation: Learning Compact Ranking Models With High Performance for Recommender System (Jiaxi Tang - 19 September, 2018)
The experiments on public data sets and state-of-the-art recommendation models showed that RD achieves its design purposes: the student model learnt with RD has a model size less than half of the teacher model while achieving a ranking performance similar to the teacher model and much better than the student model learnt without RD.
Link: https://arxiv.org/abs/1809.07428
====================================================
The Read-Optimized Burrows-Wheeler Transform (Travis Gagie - 19 September, 2018)
The advent of high-throughput sequencing has resulted in massive genomic datasets, some consisting of assembled genomes but others consisting of raw reads. The best current fully-functional index for repetitive collections (Gagie et al., SODA 2018) uses space proportional to this number.
Link: https://arxiv.org/abs/1809.07320
====================================================
MTLE: A Multitask Learning Encoder of Visual Feature Representations for Video and Movie Description (Oliver Nina - 19 September, 2018)
Many of the current state of the art methods for video captioning and movie description rely on simple encoding mechanisms through recurrent neural networks to encode temporal visual information extracted from video data. Our method shows improved performance over current state of the art methods in several metrics on multi-caption and single-caption datasets. Our method demonstrates its robustness on the Large Scale Movie Description Challenge (LSMDC) 2017 where our method won the movie description task and its results were ranked among other competitors as the most helpful for the visually impaired.
Link: https://arxiv.org/abs/1809.07257
====================================================
Unbalanced Three-Phase Distribution Grid Topology Estimation and Bus Phase Identification (Yizheng Liao - 9 October, 2018)
For validation, we extensively simulate on IEEE 37- and 123-bus systems using real data from PG&E, ADRES Project, and Pecan Street
Link: https://arxiv.org/abs/1809.07192
====================================================
NICT's Corpus Filtering Systems for the WMT18 Parallel Corpus Filtering Task (Rui Wang - 19 September, 2018)
Using the clean data of the WMT18 shared news translation task, we designed several features and trained a classifier to score each sentence pair in the noisy data. Finally, we sampled 100 million and 10 million words and built corresponding NMT systems
Link: https://arxiv.org/abs/1809.07043
====================================================
Generating 3D Adversarial Point Clouds (Chong Xiang - 19 September, 2018)
In addition, we propose 7 perturbation measurement metrics tailored to different attacks and conduct extensive experiments to evaluate the proposed algorithms on the ModelNet40 dataset. Overall, our attack algorithms achieve about 100% attack success rate for all targeted attacks.
Link: https://arxiv.org/abs/1809.07016
====================================================
Wearable-based Mediation State Detection in Individuals with Parkinson's Disease (Murtadha D. Hssayeni - 18 September, 2018)
The developed algorithm is evaluated using a dataset with 19 PD subjects and a total duration of 1,052.24 minutes (17.54 hours). The algorithm resulted in an average classification accuracy of 90.5%, sensitivity of 94.2%, and specificity of 85.4%.
Link: https://arxiv.org/abs/1809.06973
====================================================
A Study on Deep Learning Based Sauvegrain Method for Measurement of Puberty Bone Age (Seung Bin Baik - 18 September, 2018)
The selected reference images were learned without being included in the evaluation data, and at the same time, the data was extended to accommodate the number of cases. The mean absolute error of the Sauvegrain method based on deep learning is 2.8 months and the Mean Absolute Percentage Error (MAPE) is 0.018
Link: https://arxiv.org/abs/1809.06965
====================================================
Multi-Task Learning for Machine Reading Comprehension (Yichong Xu - 18 September, 2018)
Experiments on the Stanford Question Answering Dataset (SQuAD), the Microsoft MAchine Reading COmprehension Dataset (MS MARCO), NewsQA and other datasets show that our multi-task learning approach achieves significant improvement over state-of-the-art models in most MRC tasks.
Link: https://arxiv.org/abs/1809.06963
====================================================
Document Informed Neural Autoregressive Topic Models with Distributional Prior (Pankaj Gupta - 15 September, 2018)
We present novel neural autoregressive topic model variants that consistently outperform state-of-the-art generative topic models in terms of generalization, interpretability (topic coherence) and applicability (retrieval and classification) over 6 long-text and 8 short-text datasets from diverse domains.
Link: https://arxiv.org/abs/1809.06709
====================================================
Capsule Deep Neural Network for Recognition of Historical Graffiti Handwriting (Nikita Gordienko - 11 September, 2018)
CGCL dataset contains >4000 images for glyphs of 34 letters which are hardly recognized by experts even in contrast to notMNIST dataset with the better images of 10 letters taken from different fonts. The area under curve (AUC) values for receiver operating characteristic (ROC) were also higher for the capsule network model than for CNN model: 0.88-0.93 (capsule network) and 0.50 (CNN) without data augmentation, 0.91-0.95 (capsule network) and 0.51 (CNN) with lossless data augmentation, and similar results of 0.91-0.93 (capsule network) and 0.9 (CNN) in the regime of lossless data augmentation only
Link: https://arxiv.org/abs/1809.06693
====================================================
Dynamically Weighted Ensemble-based Prediction System for Adaptively Modeling Driver Reaction Time (Chun-Hsiang Chuang - 18 September, 2018)
This system comprises a set of prediction submodels that are individually trained using groups of data with similar EEG-RT relationships. To obtain a final prediction, the prediction outcomes of the sub-models are then multiplied by weights that are derived from the EEG alpha coherences of 10 channels plus theta band powers of 30 channels, whose changes were found to be indicators of variations in the EEG-RT relationship
Link: https://arxiv.org/abs/1809.06675
====================================================
Attribute Enhanced Face Aging with Wavelet-based Generative Adversarial Networks (Yunfan Liu - 18 September, 2018)
Qualitative results demonstrate the ability of our model to synthesize visually plausible face images, and extensive quantitative evaluation results show that the proposed method achieves state-of-the-art performance on existing databases.
Link: https://arxiv.org/abs/1809.06647
====================================================
Talking to myself: self-dialogues as data for conversational agents (Joachim Fainberg - 19 September, 2018)
This paper presents a novel method for gathering topical, unstructured conversational data in an efficient way: self-dialogues through crowd-sourcing. Alongside this paper, we include a corpus of 3.6 million words across 23 topics
Link: https://arxiv.org/abs/1809.06641
====================================================
Learning Universal Sentence Representations with Mean-Max Attention Autoencoder (Minghua Zhang - 18 September, 2018)
By training our model on a large collection of unlabelled data, we obtain high-quality representations of sentences. Experimental results on a broad range of 10 transfer tasks demonstrate that our model outperforms the state-of-the-art unsupervised single methods, including the classical skip-thoughts and the advanced skip-thoughts+LN model
Link: https://arxiv.org/abs/1809.06590
====================================================
User Information Augmented Semantic Frame Parsing using Coarse-to-Fine Neural Networks (Yilin Shen - 18 September, 2018)
Although state-of-the-art approaches showed good results, they require large annotated training data and long training time. The results show that our approach leverages such simple user information to outperform state-of-the-art approaches by 0.25% for intent detection and 0.31% for slot filling using standard training data. When using smaller training data, the performance improvement on intent detection and slot filling reaches up to 1.35% and 1.20% respectively. We also show that our approach can achieve similar performance as state-of-the-art approaches by using less than 80% annotated training data. Moreover, the training time to achieve the similar performance is also reduced by over 60%.
Link: https://arxiv.org/abs/1809.06559
====================================================
Joint User Association and Resource Allocation Optimization for Ultra Reliable Low Latency HetNets (Mohammad Yousefvand - 18 September, 2018)
In our scheme, CBSs share portions of the available spectrum with SBSs, and they, in exchange, provide data service to the users in their coverage area. In our simulations, the spectrum access delay for cellular users is reduced by 93% and the energy consumption is reduced by 33%, while maintaining the full service rate.
Link: https://arxiv.org/abs/1809.06550
====================================================
Nanopublications: A Growing Resource of Provenance-Centric Scientific Linked Data (Tobias Kuhn - 18 September, 2018)
More than 10 million such nanopublications have been published, which now form a valuable resource for studies on the domain level of the given Life Science domains as well as on the more technical levels of provenance modeling and heterogeneous Linked Data
Link: https://arxiv.org/abs/1809.06532
====================================================
Active Anomaly Detection via Ensembles (Shubhomoy Das - 17 September, 2018)
Our results show that in addition to discovering significantly more anomalies than state-of-the-art unsupervised baselines, our active learning algorithms under the streaming-data setup are competitive with the batch setup.
Link: https://arxiv.org/abs/1809.06477
====================================================
Bridging the Simulated-to-Real Gap: Benchmarking Super-Resolution on Real Data (Thomas Köhler - 17 September, 2018)
To bridge this simulated-to-real gap, we introduce the Super-Resolution Erlangen (SupER) database, the first comprehensive laboratory SR database of all-real acquisitions with pixel-wise ground truth. It consists of more than 80k images of 14 scenes combining different facets: CMOS sensor noise, real sampling at four resolution levels, nine scene motion types, two photometric conditions, and lossy video coding at five levels. This paper also benchmarks 19 popular single-image and multi-frame algorithms on our data
Link: https://arxiv.org/abs/1809.06420
====================================================
The Rosario Dataset: Multisensor Data for Localization and Mapping in Agricultural Environments (Taihú Pire - 17 September, 2018)
The dataset is motivated by the lack of realistic sensor data gathered by a mobile robot in such environments. It consists of 6 sequences recorded in soybean fields showing real and challenging cases: highly repetitive scenes, reflection and burned images caused by direct sunlight and rough terrain among others
Link: https://arxiv.org/abs/1809.06413
====================================================
Crowdsourcing Lung Nodules Detection and Annotation (Saeed Boorboor - 17 September, 2018)
Using our crowdsourcing workflow, we achieved a lung nodule detection sensitivity of over 90% for 20 patient CT datasets (containing 178 lung nodules with sizes between 1-30mm), and only 47 false positives from a total of 1021 annotations on nodules of all sizes (96% sensitivity for nodules >4mm)
Link: https://arxiv.org/abs/1809.06402
====================================================
Effective Predictions of Gaokao Admission Scores for College Applications in Mainland China (Hao Zhang - 12 September, 2018)
Early prediction methods are empirical without the backing of in-depth data studies. We show that our methods significantly outperform the methods commonly used by teachers and experts, and can predict admission scores with an accuracy of 91% within a 7-point margin in an exam of a 750-point grading scale.
Link: https://arxiv.org/abs/1809.06362
====================================================
"FabSearch" : A 3D CAD Model Based Search Engine for Sourcing Manufacturing Services (Atin Angrish - 17 September, 2018)
Second, FabSearch utilizes meta-data about each part, such as material specification, tolerance requirements to help improve the search results based on the specific query model requirements. The algorithm is tested against a repository containing more than 2000 models distributed across various job shop service providers
Link: https://arxiv.org/abs/1809.06329
====================================================
Industrial Smoke Detection and Visualization (Yen-Chia Hsu - 17 September, 2018)
As sensing technology proliferates and becomes affordable to the general public, there is a growing trend in citizen science where scientists and volunteers form a strong partnership in conducting scientific research including problem finding, data collection, analysis, visualization, and storytelling. We have helped the community members build a live camera system which captures and visualizes high resolution timelapse imagery starting from November 2014
Link: https://arxiv.org/abs/1809.06263
====================================================
GANs for Medical Image Analysis (Salome Kazeminia - 13 September, 2018)
Furthermore, their ability to synthesize images at unprecedented levels of realism also gives hope that the chronic scarcity of labeled data in the medical field can be resolved with the help of these generative models. A total of 63 papers published until the end of July 2018 are reviewed
Link: https://arxiv.org/abs/1809.06222
====================================================
Context-Dependent Diffusion Network for Visual Relationship Detection (Zhen Cui - 10 September, 2018)
Experiments on two widely-used datasets demonstrate that our proposed method is more effective and achieves the state-of-the-art performance.
Link: https://arxiv.org/abs/1809.06213
====================================================
Study and Observation of the Variation of Accuracies of KNN, SVM, LMNN, ENN Algorithms on Eleven Different Datasets from UCI Machine Learning Repository (Mohammad Mahmudur Rahman Khan - 22 September, 2018)
Machine learning enables computers to learn from data without being explicitly programmed [1, 2]. In supervised learning, computers learn a function that maps an input to an output based on training input-output pairs [3]
Link: https://arxiv.org/abs/1809.06186
====================================================
Dynamics Estimation Using Recurrent Neural Network (Astha Sharma - 17 September, 2018)
The loss obtained with this test data is 4.5920
Link: https://arxiv.org/abs/1809.06148
====================================================
Feature2Mass: Visual Feature Processing in Latent Space for Realistic Labeled Mass Generation (Jae-Hyeok Lee - 17 September, 2018)
However, in many bioimaging fields, large labeled datasets are scarcely available. Although a few studies have been dedicated to solving this problem through generative models, some problems remain: 1) the generated bio-image does not seem realistic; 2) the variation of generated bio-images is limited; and 3) an additional label annotation task is needed
Link: https://arxiv.org/abs/1809.06147
====================================================
Open Subtitles Paraphrase Corpus for Six Languages (Mathias Creutz - 17 September, 2018)
For each target language, the Opusparcus data have been partitioned into three types of data sets: training, development and test sets. The development and test sets consist of sentence pairs that have been checked manually; each set contains approximately 1000 sentence pairs that have been verified to be acceptable paraphrases by two annotators.
Link: https://arxiv.org/abs/1809.06142
====================================================
Revisit Multinomial Logistic Regression in Deep Learning: Data Dependent Model Initialization for Image Recognition (Bowen Cheng - 17 September, 2018)
Then we adopt this approximate solution to initialize the task-specific linear layer and demonstrate superior performance over random initialization in terms of both accuracy and convergence speed on various tasks and datasets. For example, for image classification, our approach can reduce the training time by 10 times and achieve 3.2% gain in accuracy for Flickr-style classification. For object detection, our approach can also be 10 times faster in training for the same accuracy, or 5% better in terms of mAP for VOC 2007 with slightly longer training.
Link: https://arxiv.org/abs/1809.06131
====================================================
Span error bound for weighted SVM with applications in hyperparameter selection (Ioannis Sarafis - 17 September, 2018)
Experiments on 14 benchmark data sets and data sets with importance scores for the training instances show that: (a) the condition for the existence of span in weighted SVM is satisfied almost always; (b) the span-rule is the most effective method for weighted SVM hyperparameter selection; (c) the span-rule is the best predictor of the test error in the mean square error sense; and (d) the span-rule is efficient and, for certain problems, it can be calculated faster than K-fold cross-validation.
Link: https://arxiv.org/abs/1809.06124
====================================================
cf2vec: Collaborative Filtering algorithm selection using graph distributed representations (Tiago Cunha - 17 September, 2018)
Experimental results show that the proposed procedure creates representations that are competitive with state-of-the-art metafeatures, while requiring significantly less data and virtually no human input.
Link: https://arxiv.org/abs/1809.06120
====================================================
AlSub: Fully Parallel Subdivision for Modeling and Rendering (Daniel Mlakar - 1 October, 2018)
To fully parallelize the subdivision process, we discard traditional linked list data structures in favor of a sparse matrix linear algebra formalism. To substantiate the versatility of our approach, we apply it to $\sqrt{3}$, Loop and Catmull-Clark subdivision schemes and show support for adaptive subdivision, semi-sharp creases, and a split evaluation scheme that separates topology and topological changes from positional updates.
Link: https://arxiv.org/abs/1809.06047
====================================================
BSE: A Minimal Simulation of a Limit-Order-Book Stock Exchange (Dave Cliff - 17 September, 2018)
Research aimed at understanding the dynamics of this new style of financial market is hampered by the fact that no operational real-world exchange is ever likely to allow experimental probing of that market while it is open and running live, forcing researchers to work primarily from time-series of past trading data. BSE as described here addresses both those needs: it has been successfully used for teaching and research in a leading UK university since 2012, and the BSE program code is freely available as open-source on GitHub.
Link: https://arxiv.org/abs/1809.06027
====================================================
DASNet: Reducing Pixel-level Annotations for Instance and Semantic Segmentation (Chuang Niu - 17 September, 2018)
Our method demonstrates substantially improved performance compared to existing semi-supervised approaches on the PASCAL VOC 2012 dataset.
Link: https://arxiv.org/abs/1809.06013
====================================================
A Distributed Learning Architecture for Scientific Imaging Problems (A. Panousopoulou - 27 September, 2018)
We conduct evaluation studies considering relevant datasets, and the results show at least a 60% improvement in response time over conventional computing solutions.
Link: https://arxiv.org/abs/1809.05956
====================================================
Performance Analysis of Molecular Spatial Modulation (MSM) in Diffusion based Molecular MIMO Communication Systems (Tayyebeh Jahani-Nezhad - 16 September, 2018)
In this paper, we introduce molecular spatial modulation (MSM) in molecular MIMO communication to increase the data rate of the system. Also, for a 2$\times$1 system, we define an optimization problem to obtain a suitable number of molecules to transmit in order to reduce the BER of the system.
Link: https://arxiv.org/abs/1809.05954
====================================================
Memory Efficient Experience Replay for Streaming Learning (Tyler L. Hayes - 16 September, 2018)
Streaming learning will cause conventional deep neural networks (DNNs) to fail for two reasons: 1) they need multiple passes through the entire dataset; and 2) non-iid data will cause catastrophic forgetting.
Link: https://arxiv.org/abs/1809.05922
====================================================
An investigation of a deep learning based malware detection system (Mohit Sewak - 16 September, 2018)
In the investigation, we experiment with different combinations of Deep Learning architectures, including Auto-Encoders and Deep Neural Networks with varying layers, on the Malicia malware dataset, on which earlier studies have obtained an accuracy of 98% with an acceptable False Positive Rate of 1.07%. Besides improving on the previous best results (99.21% accuracy and a False Positive Rate of 0.19%), our proposed approach indicates that Deep Learning based systems could deliver an effective defense against malware.
Link: https://arxiv.org/abs/1809.05888
====================================================
Energy Efficient Cloud Control and Pricing in Geographically Distributed Data Centers (Dražen Lučanin - 16 September, 2018)
It is estimated that data centers constitute 1.5% of global electricity usage.
Link: https://arxiv.org/abs/1809.05853
====================================================
Performance-Based Pricing in Multi-Core Geo-Distributed Cloud Computing (Dražen Lučanin - 16 September, 2018)
With such new pricing schemes and the increasing energy costs in data centres, balancing energy savings with performance and revenue losses is a challenging problem for cloud providers. We evaluate the proposed approach using simulations with realistic VM workloads, electricity price and temperature traces and estimate energy savings of up to 14.57%.
Link: https://arxiv.org/abs/1809.05842
====================================================
A Generic Multi-modal Dynamic Gesture Recognition System using Machine Learning (Gautham Krishna G - 16 September, 2018)
From an initial set of seven classifiers, three were chosen to evaluate each dataset across all modes rendering the system towards mode-neutrality and dataset-independence. Moreover, this system was found to run on a low-cost embedded platform - Raspberry Pi Zero (USD 5), making it economically viable.
Link: https://arxiv.org/abs/1809.05839
====================================================
Pervasive Cloud Controller for Geotemporal Inputs (Dražen Lučanin - 16 September, 2018)
In this paper, we propose a pervasive cloud controller for dynamic resource reallocation adapting to volatile time- and location-dependent factors, while considering the QoS impact of too frequent migrations and the data quality limits of time series forecasting methods. By optimising for these additional factors, we estimate 28.6% energy cost savings compared to baseline dynamic VM consolidation.
Link: https://arxiv.org/abs/1809.05838
====================================================
Real-Time, Highly Accurate Robotic Grasp Detection using Fully Convolutional Neural Networks with High-Resolution Images (Dongwon Park - 16 September, 2018)
Robotic grasp detection for novel objects is a challenging task, but for the last few years, deep learning based approaches have achieved remarkable performance improvements, up to 96.1% accuracy, with RGB-D data. Our methods also achieved state-of-the-art detection accuracy (up to 96.6%) with state-of-the-art real-time computation time for high-resolution images (6-20ms per 360x360 image) on the Cornell dataset. With accurate vision-robot coordinate calibration through our proposed learning-based, fully automatic approach, our proposed method yielded a 90% success rate.
Link: https://arxiv.org/abs/1809.05828
====================================================
Segmenting Unknown 3D Objects from Real Depth Images using Mask R-CNN Trained on Synthetic Point Clouds (Michael Danielczuk - 16 September, 2018)
SD Mask R-CNN outperforms point cloud clustering baselines by an absolute 15% in Average Precision and 20% in Average Recall, and achieves performance levels similar to a Mask R-CNN trained on a massive, hand-labeled RGB dataset and fine-tuned on real images from the experimental setup.
Link: https://arxiv.org/abs/1809.05825
====================================================
Development of deep learning algorithms to categorize free-text notes pertaining to diabetes: convolution neural networks achieve higher accuracy than support vector machines (Boyi Yang - 16 September, 2018)