docs/ResearchResources/arxiv/ArxivData.txt

A Comparative Study of Consistent Snapshot Algorithms for Main-Memory Database Systems (Liang Li - 11 October, 2018)
Formally, the in-memory consistent snapshot problem refers to taking an in-memory consistent time-in-point snapshot with the constraints that 1) clients can read the latest data items and 2) any data item in the snapshot should not be overwritten
Link: https://arxiv.org/abs/1810.04915
====================================================
Perfusion parameter estimation using neural networks and data augmentation (David Robben - 11 October, 2018)
A comparison on simulated CT Perfusion data shows that the neural network provides better estimations for both CBF and Tmax than a state of the art deconvolution method, and this over a wide range of noise levels. The proposed data augmentation enables to achieve these results with less than 100 datasets.
Link: https://arxiv.org/abs/1810.04898
====================================================
Applications of PageRank to Function Comparison and Malware Classification (Michael A. Slawinski - 10 October, 2018)
The model was trained on 2.5 million samples of .NET and has an accuracy of 98.3% on test data
Link: https://arxiv.org/abs/1810.04789
====================================================
Deep Inertial Poser: Learning to Reconstruct Human Pose from Sparse Inertial Measurements in Real Time (Yinghao Huang - 10 October, 2018)
To evaluate our method, we recorded DIP-IMU, a dataset consisting of 10 subjects wearing 17 IMUs for validation in 64 sequences with 330,000 time instants; this constitutes the largest IMU dataset publicly available
Link: https://arxiv.org/abs/1810.04703
====================================================
Intrusion Detection Using Mouse Dynamics (Margit Antal - 10 October, 2018)
The Balabit data set was released in 2016 for a data science competition, which against the few subjects, can be considered the first adequate publicly available one. Set of actions-based evaluation achieves 0.92 AUC on the test part of the data set. However, the same type of evaluation conducted on the training part of the data set resulted in maximal AUC (1) using only 13 actions
Link: https://arxiv.org/abs/1810.04668
====================================================
Multimodal Speech Emotion Recognition Using Audio and Text (Seunghyun Yoon - 10 October, 2018)
Our proposed model outperforms previous state-of-the-art methods in assigning data to one of four emotion categories (i.e., angry, happy, sad and neutral) when the model is applied to the IEMOCAP dataset, as reflected by accuracies ranging from 68.8% to 71.8%.
Link: https://arxiv.org/abs/1810.04635
====================================================
Revitalizing Copybacks in Modern SSDs: Why and How (Duwon Hong - 10 October, 2018)
By limiting the number of successive copybacks, it guarantees that no data reliability problem occurs when data is internally migrated using rCopyback. Our evaluation results show that rcFTL can improve the overall I/O throughput by 54% on average over an existing FTL which does not use copybacks.
Link: https://arxiv.org/abs/1810.04603
====================================================
LIRS: Enabling efficient machine learning on NVM-based storage via a lightweight implementation of random shuffling (Zhi-Lin Ke - 10 October, 2018)
With the emerging non-volatile memory-based storage device, such as Intel Optane SSD, which provides fast random accesses, we propose a lightweight implementation of random shuffling (LIRS) to randomly shuffle the indexes of the entire training dataset, and the selected training instances are directly accessed from the storage and packed into batches. Experimental results show that LIRS can reduce the total training time of SVM and DNN by 49.9% and 43.5% on average, and improve the final testing accuracy on DNN by 1.01%.
Link: https://arxiv.org/abs/1810.04509
====================================================
Is your Statement Purposeless? Predicting Computer Science Graduation Admission Acceptance based on Statement Of Purpose (Diptesh Kanojia - 9 October, 2018)
We present a quantitative, data-driven machine learning approach to mitigate the problem of unpredictability of Computer Science Graduate School Admissions. We train a model over fifty manually verified SOPs for which it uses an SVM classifier and achieves the highest accuracy of 92% with 10-fold cross-validation
Link: https://arxiv.org/abs/1810.04502
====================================================
Invariance Analysis of Saliency Models versus Human Gaze During Scene Free Viewing (Zhaohui Che - 10 October, 2018)
In this paper, we first create a large-scale database including eye movements of 10 observers over 1900 images degraded by 19 types of distortions
Link: https://arxiv.org/abs/1810.04456
====================================================
Improving Neural Text Simplification Model with Simplified Corpora (Jipeng Qiang - 10 October, 2018)
We train encoder-decoder model using synthetic sentence pairs and original sentence pairs, which can obtain substantial improvements on the available WikiLarge data and WikiSmall data compared with the state-of-the-art methods.
Link: https://arxiv.org/abs/1810.04428
====================================================
V3C - a Research Video Collection (Luca Rossetto - 10 October, 2018)
V3C comes with a shot segmentation for each video, together with the resulting keyframes in original as well as reduced resolution and additional metadata. It is intended to be used from 2019 at the International large-scale TREC Video Retrieval Evaluation campaign (TRECVid).
Link: https://arxiv.org/abs/1810.04401
====================================================
Filtration Simplification for Persistent Homology via Edge Contraction (Tamal K. Dey - 10 October, 2018)
Persistent homology is a popular data analysis technique that is used to capture the changing topology of a filtration associated with some simplicial complex $K$. The first assumes that the underlying space of $K$ is a $2$-manifold and ensures that simplices are paired with the same simplices in the contracted complex as they are in the original
Link: https://arxiv.org/abs/1810.04388
====================================================
Task Runtime Prediction in Scientific Workflows Using an Online Incremental Learning Approach (Muhammad H. Hilman - 9 October, 2018)
We compare our solution to a state-of-the-art approach that exploits the resources monitoring data based on regression machine learning technique. From our experiments, the proposed strategy improves the performance, in terms of the error, up to 29.89%, compared to the state-of-the-art solutions.
Link: https://arxiv.org/abs/1810.04329
====================================================
Smtlink 2.0 (Yan Peng - 9 October, 2018)
Smtlink 2.0 provides support for FTY defprod, deflist, defalist, and defoption types by using Z3's arrays and user-defined data types
Link: https://arxiv.org/abs/1810.04317
====================================================
Inter-Scanner Harmonization of High Angular Resolution DW-MRI using Null Space Deep Learning (Vishwesh Nath - 9 October, 2018)
To use these data, we propose a new network architecture, the null space deep network (NSDN), to simultaneously learn on traditional observed/truth pairs (e.g., MRI-histology voxels) along with repeated observations without a known truth (e.g., scan-rescan MRI). NSDN significantly improved absolute performance relative to histology by 3.87% over CSD and 1.42% over a recently proposed deep neural network approach. Moreover, it improved reproducibility on the paired data by 21.19% over CSD and 10.09% over a recently proposed deep approach. Finally, NSDN improved generalizability of the model to a third in vivo human scanner (which was not used in training) by 16.08% over CSD and 10.41% over a recently proposed deep learning approach
Link: https://arxiv.org/abs/1810.04260
====================================================
Doubly Reparameterized Gradient Estimators for Monte Carlo Objectives (George Tucker - 9 October, 2018)
These approaches maximize a variational lower bound on the intractable log likelihood of the observed data. Counterintuitively, the typical inference network gradient estimator for the IWAE bound performs poorly as the number of samples increases (Rainforth et al., 2018; Le et al., 2018)
Link: https://arxiv.org/abs/1810.04152
====================================================
Semi-supervised Deep Reinforcement Learning in Support of IoT and Smart City Services (Mehdi Mohammadi - 9 October, 2018)
In this paper, we propose a semi-supervised deep reinforcement learning model that fits smart city applications as it consumes both labeled and unlabeled data to improve the performance and accuracy of the learning agent. Our model learns the best action policies that lead to a close estimation of the target locations with an improvement of 23% in terms of distance to the target and at least 67% more received rewards compared to the supervised DRL model.
Link: https://arxiv.org/abs/1810.04118
====================================================
Detecting object region and working state of aerator based on computer vision and machine learning (Yeqi Liu - 9 October, 2018)
In the work state detection module, this paper proposes a novel method called reference frame Kanade-Lucas-Tomasi (RF-KLT) algorithm, and constructs a classification procedure for the unlabeled time series data. The results of this study show that the accuracy of detecting object region and working state of aerator in the complex background is 100% and 99.9% respectively, and the detection speed is 77-333 frames per second (FPS) according to the different types of surveillance camera
Link: https://arxiv.org/abs/1810.04108
====================================================
Geometry meets semantics for semi-supervised monocular depth estimation (Pierluigi Zama Ramirez - 9 October, 2018)
In particular, on the KITTI dataset our network outperforms state-of-the-art methods for monocular depth estimation.
Link: https://arxiv.org/abs/1810.04093
====================================================
Deep Geodesic Learning for Segmentation and Anatomical Landmarking (Neslisah Torosdagli - 6 October, 2018)
In step 1, we propose a deep neural network architecture with carefully designed regularization, and network hyper-parameters to perform image segmentation without the need for data augmentation and complex post-processing refinement. In step 2, we formulate the landmark localization problem directly on the geodesic space for sparsely-spaced anatomical landmarks. In step 3, we propose to use a long short-term memory (LSTM) network to identify closely-spaced landmarks, which is rather difficult to obtain using other standard detection networks. We used a very challenging CBCT dataset of 50 patients with a high-degree of craniomaxillofacial (CMF) variability that is realistic in clinical practice. Complementary to the quantitative analysis, the qualitative visual inspection was conducted for distinct CBCT scans from 250 patients with high anatomical variability. We have also shown feasibility of the proposed work in an independent dataset from MICCAI Head-Neck Challenge (2015) achieving the state-of-the-art performance
Link: https://arxiv.org/abs/1810.04021
====================================================
Glioma Segmentation with Cascaded Unet (Dmitry Lachinov - 9 October, 2018)
We evaluate presented approach on BraTS 2018 dataset and discuss results.
Link: https://arxiv.org/abs/1810.04008
====================================================
Image-to-Video Person Re-Identification by Reusing Cross-modal Embeddings (Zhongwei Xie - 4 October, 2018)
Currently, state-of-the-art approaches mainly focus on the task-specific data, neglecting the extra information on the different but related tasks
Link: https://arxiv.org/abs/1810.03989
====================================================
Explicit optimal-length locally repairable codes of distance 5 (Allison Beemer - 9 October, 2018)
Locally repairable codes (LRCs) have received significant recent attention as a method of designing data storage systems robust to server failure. For optimal LRCs with minimum distance greater than or equal to 5, block length is bounded by a polynomial function of alphabet size. In this paper, we give explicit constructions of optimal-length (in terms of alphabet size), optimal LRCs with minimum distance equal to 5.
Link: https://arxiv.org/abs/1810.03980
====================================================
Extended Bit-Plane Compression for Convolutional Neural Network Accelerators (Lukas Cavigelli - 1 October, 2018)
We show that an average compression ratio of 4.4x relative to uncompressed data and a gain of 60% over existing method can be achieved for ResNet-34 with a compression block requiring <300 bit of sequential cells and minimal combinational logic.
Link: https://arxiv.org/abs/1810.03979
====================================================
Improving Myocardium Segmentation in Cardiac CT Angiography using Spectral Information (Steffen Bruns - 27 September, 2018)
We propose augmentation of the training data with virtual mono-energetic reconstructions from a spectral CT scanner which show different attenuation levels of the contrast agent. We train a 3D fully convolutional network (FCN) with 10 conventional CCTA images and corresponding virtual mono-energetic reconstructions acquired on a spectral CT scanner, and evaluate on 40 CCTA scans acquired on a conventional CT scanner. We show that training with data augmentation using virtual mono-energetic images improves upon training with only conventional images (Dice similarity coefficient (DSC) 0.895 $\pm$ 0.039 vs. 0.846 $\pm$ 0.125). In comparison, training with data augmentation using linear scaling improves the DSC to 0.890 $\pm$ 0.039. Moreover, combining the results of both augmentation methods leads to a DSC of 0.901 $\pm$ 0.036, showing that both augmentations lead to different local improvements of the segmentations
Link: https://arxiv.org/abs/1810.03968
====================================================
Conditional Generative Refinement Adversarial Networks for Unbalanced Medical Image Semantic Segmentation (Mina Rezaei - 9 October, 2018)
The proposed architecture shows state-of-the-art results on LiTS-2017 for liver lesion segmentation, and two microscopic cell segmentation datasets MDA231, PhC-HeLa
Link: https://arxiv.org/abs/1810.03871
====================================================
Deep Attentive Tracking via Reciprocative Learning (Shi Pu - 9 October, 2018)
Extensive experiments on large-scale benchmark datasets show that the proposed attentive tracking method performs favorably against the state-of-the-art approaches.
Link: https://arxiv.org/abs/1810.03851
====================================================
Learning Bounds for Greedy Approximation with Explicit Feature Maps from Multiple Kernels (Shahin Shahrampour - 9 October, 2018)
Our empirical results show that given a fixed number of explicit features, the method can achieve a lower test error with a smaller time cost, compared to the state-of-the-art in data-dependent random features.
Link: https://arxiv.org/abs/1810.03817
====================================================
Deep residual networks for automatic sleep stage classification of raw polysomnographic waveforms (Alexander Neergaard Olesen - 8 October, 2018)
Briefly, the raw data is passed through 50 convolutional layers before subsequent classification into one of five sleep stages. Three model configurations were trained on 1850 polysomnogram recordings and subsequently tested on 230 independent recordings. Our best performing model yielded an accuracy of 84.1% and a Cohen's kappa of 0.746, improving on previous reported results by other groups also using only raw polysomnogram data. Most errors were made on non-REM stage 1 and 3 decisions, errors likely resulting from the definition of these stages
Link: https://arxiv.org/abs/1810.03745
====================================================
Bootstrapped CNNs for Building Segmentation on RGB-D Aerial Imagery (Clint Sebastian - 8 October, 2018)
Second, the proposed method outperforms the non-bootstrapped version by utilizing only one-sixth of the original training data and it obtains a precision-recall break-even of 95.10% on our aerial imagery dataset.
Link: https://arxiv.org/abs/1810.03570
====================================================
Multilingual sequence-to-sequence speech recognition: architecture, transfer learning, and language modeling (Jaejin Cho - 4 October, 2018)
In this work, we attempt to use data from 10 BABEL languages to build a multi-lingual seq2seq model as a prior model, and then port them towards 4 other BABEL languages using transfer learning approach. Experimental results show that the transfer learning approach from the multilingual model shows substantial gains over monolingual models across all 4 BABEL languages
Link: https://arxiv.org/abs/1810.03459
====================================================
Phrase-Based Attentions (Phi Xuan Nguyen - 30 September, 2018)
We incorporate our phrase-based attentions into the recently proposed Transformer network, and demonstrate that our approach yields improvements of 1.3 BLEU for English-to-German and 0.5 BLEU for German-to-English translation tasks on WMT newstest2014 using WMT'16 training data.
Link: https://arxiv.org/abs/1810.03444
====================================================
State of the Art Optical Character Recognition of 19th Century Fraktur Scripts using Open Source Engines (Christian Reul - 8 October, 2018)
The experiments show that training mixed models with real data is superior to training with synthetic data and that the novel OCR engine Calamari outperforms the other engines considerably, on average reducing ABBYY's character error rate (CER) by over 70%, resulting in an average CER below 1%.
Link: https://arxiv.org/abs/1810.03436
====================================================
Deep learning cardiac motion analysis for human survival prediction (Ghalib A. Bello - 8 October, 2018)
This dense motion model formed the input to a supervised denoising autoencoder (4Dsurvival), which is a hybrid network consisting of an autoencoder that learns a task-specific latent code representation trained on observed outcome data, yielding a latent representation optimised for survival prediction. In a study of 302 patients the predictive accuracy (quantified by Harrell's C-index) was significantly higher (p < .0001) for our model C=0.73 (95% CI: 0.68 - 0.78) than the human benchmark of C=0.59 (95% CI: 0.53 - 0.65)
Link: https://arxiv.org/abs/1810.03382
====================================================
Multi-Stream Opportunistic Network Decoupling: Relay Selection and Interference Management (Huifa Lin - 8 October, 2018)
For interference management, each source node sends $S \,(1 \le S \le M)$ data streams to selected relay nodes with random beamforming for the first hop, while each destination node receives its desired $S$ streams from the selected relay nodes via opportunistic interference alignment for the second hop, where $M$ is the number of antennas at each source or destination node
Link: https://arxiv.org/abs/1810.03298
====================================================
Guiding Intelligent Surveillance System by learning-by-synthesis gaze estimation (Tongtong Zhao - 8 October, 2018)
We show a significant improvement over using synthetic images, and achieve state-of-the-art results on various datasets including MPIIGaze dataset.
Link: https://arxiv.org/abs/1810.03286
====================================================
A look at the topology of convolutional neural networks (Rickard Brüel Gabrielsson - 7 October, 2018)
In this paper we use topological data analysis to investigate what various CNNs learn. We show that the weights of convolutional layers at depths from 1 through 13 learn simple global structures
Link: https://arxiv.org/abs/1810.03234
====================================================
SVIn2: Sonar Visual-Inertial SLAM with Loop Closure for Underwater Navigation (Sharmin Rahman - 7 October, 2018)
The state-of-the-art visual-inertial state estimation package OKVIS has been significantly augmented to accommodate acoustic data from sonar and depth measurements from pressure sensor, along with visual and inertial data in a non-linear optimization-based framework
Link: https://arxiv.org/abs/1810.03200
====================================================
Reinforcement Evolutionary Learning Method for self-learning (Kumarjit Pathak - 7 October, 2018)
Quantitative research is the most widely spread application of data science in Marketing or financial domain where applicability of state of the art reinforcement learning for auto-learning is less explored paradigm
Link: https://arxiv.org/abs/1810.03198
====================================================
NCARD: Improving Neighborhood Construction by Apollonius Region Algorithm based on Density (Shahin Pourbahrami - 7 October, 2018)
The proposed algorithm is more accurate than the state-of-the-art and well-known algorithms up to almost 8-13% in real and artificial data sets.
Link: https://arxiv.org/abs/1810.03084
====================================================
Online Center of Mass Estimation for a Humanoid Wheeled Inverted Pendulum Robot (Munzir Zafar - 6 October, 2018)
Experiments were performed on a 19 DoF WIP, in which we manually acquired the data for the learned set of poses and show that the mass model produced by a gradient descent produces a CoM estimate that improves overall control and efficiency
Link: https://arxiv.org/abs/1810.03076
====================================================
Robustness via Retrying: Closed-Loop Robotic Manipulation with Self-Supervised Learning (Frederik Ebert - 6 October, 2018)
Our real-world experiments demonstrate that a model trained with 160 robot hours of autonomously collected, unlabeled data is able to successfully perform complex manipulation tasks with a wide range of objects not seen during training.
Link: https://arxiv.org/abs/1810.03043
====================================================
Characterizing Deep-Learning I/O Workloads in TensorFlow (Steven W. D. Chien - 6 October, 2018)
The performance of Deep-Learning (DL) computing frameworks relies on the performance of data ingestion and checkpointing. We find that increasing the number of threads increases TensorFlow bandwidth by a maximum of 2.3x and 7.8x on our benchmark environments. The use of a burst buffer to checkpoint to a fast small capacity storage and copy asynchronously the checkpoints to a slower large capacity storage resulted in a performance improvement of 2.6x with respect to checkpointing directly to slower storage on our benchmark environment.
Link: https://arxiv.org/abs/1810.03035
====================================================
Gendered behavior as a disadvantage in open source software development (Balazs Vedres - 6 October, 2018)
Using data on entire careers of users from github.com, we develop a measure to capture the gendered pattern of behavior: We use a random forest prediction of being female (as opposed to being male) by behavioral choices in the level of activity, specialization in programming languages, and choice of partners. We find that 84.5% of women's disadvantage (compared to men) in success and 34.8% of their disadvantage in survival are due to the female pattern of their behavior
Link: https://arxiv.org/abs/1810.03005
====================================================
Camera Model Identification Using Convolutional Neural Networks (Artur Kuzin - 6 October, 2018)
In the current work, we describe our Deep Learning approach to the camera detection task of 10 cameras as a part of the Camera Model Identification Challenge hosted by Kaggle.com where our team finished 2nd out of 582 teams with the accuracy on the unseen data of 98%
Link: https://arxiv.org/abs/1810.02981
====================================================
Dissecting Apple's Meta-CDN during an iOS Update (Jeremias Blendin - 6 October, 2018)
Furthermore, by analyzing data from a European Eyeball ISP, we quantify third-party traffic offloading effects and find third-party CDNs increase their traffic by 438% while saturating seemingly unrelated links.
Link: https://arxiv.org/abs/1810.02978
====================================================
Understanding Recurrent Neural Architectures by Analyzing and Synthesizing Long Distance Dependencies in Benchmark Sequential Datasets (Abhijit Mahalunkar - 6 October, 2018)
At present, the state-of-the-art computational models across a range of sequential data processing tasks, including language modeling, are based on recurrent neural network architectures. Finally, we demonstrate how understanding the characteristics of the LDDs in a dataset can inform better hyper-parameter selection for current state-of-the-art recurrent neural architectures and also aid in understanding them...
Link: https://arxiv.org/abs/1810.02966
====================================================
Towards Self-Tuning Parameter Servers (Chris Liu - 6 October, 2018)
Nowadays, it is common to see industrial-strength machine learning jobs that involve millions of model parameters, terabytes of training data, and weeks of training. Experiments show that our techniques can reduce the completion times of a variety of long-running TensorFlow jobs from 1.4x to 18x.
Link: https://arxiv.org/abs/1810.02935
====================================================
POIReviewQA: A Semantically Enriched POI Retrieval and Question Answering Dataset (Gengchen Mai - 5 October, 2018)
To study the challenging task of semantically enriching POIs from unstructured data in order to support open-domain search and question answering (QA), we introduce a new dataset POIReviewQA. It consists of 20k questions (e.g. "is this restaurant dog friendly?") for 1022 Yelp business types. For each question we sampled 10 reviews, and annotated each sentence in the reviews whether it answers the question and what the corresponding answer is. We build a Lucene-based baseline model, which achieves 77.0% AUC and 48.8% MAP. A sentence embedding-based model achieves 79.2% AUC and 41.8% MAP, indicating that the dataset presents a challenging problem for future research by the GIR community
Link: https://arxiv.org/abs/1810.02802
====================================================
RCCNet: An Efficient Convolutional Neural Network for Histological Routine Colon Cancer Nuclei Classification (Shabbeer Basha S H - 30 September, 2018)
The experiments are conducted over publicly available routine colon cancer histological dataset "CRCHistoPhenotypes". The proposed method has achieved a classification accuracy of 80.61% and 0.7887 weighted average F1 score
Link: https://arxiv.org/abs/1810.02797
====================================================
Automatic Detection of Arousals during Sleep using Multiple Physiological Signals (Saman Parvaneh - 5 October, 2018)
The data for each subject in the training set was split to 30-second epochs with no overlap. A total of 428 features from EEG, EMG, EOG, airflow, and SaO2 in each epoch were extracted and used for creating subject-specific models based on an ensemble of bagged classification trees, resulting in 943 models. For marking arousal and non-arousal regions in the test set, the data in the test set was split to 30-second epochs with 50% overlaps. Using the PhysioNet/CinC Challenge 2018 scoring criteria, AUPRCs of 0.25 and 0.21 were achieved for the in-house test and blind test sets, respectively.
Link: https://arxiv.org/abs/1810.02726
====================================================
Convex Clustering: Model, Theoretical Guarantee and Efficient Algorithm (Defeng Sun - 4 October, 2018)
Extensive numerical experiments on both simulated and real data demonstrate that our algorithm is highly efficient and robust for solving large-scale problems. In particular, our algorithm is able to solve a convex clustering problem with 200,000 points in $\mathbb{R}^3$ in about 6 minutes.
Link: https://arxiv.org/abs/1810.02677
====================================================
FingerVision Tactile Sensor Design and Slip Detection Using Convolutional LSTM Network (Yazhan Zhang - 5 October, 2018)
The data collection process takes advantage of the human sense of slip, during which human hand holds 12 daily objects, interacts with sensor skin and labels data with a slip or non-slip identity based on human feeling of slip. Our slip classification framework performs high accuracy of 97.62% on the test dataset
Link: https://arxiv.org/abs/1810.02653
====================================================
Integrating Weakly Supervised Word Sense Disambiguation into Neural Machine Translation (Xiao Pu - 5 October, 2018)
We first introduce three adaptive clustering algorithms for WSD, based on k-means, Chinese restaurant processes, and random walks, which are then applied to large word contexts represented in a low-rank space and evaluated on SemEval shared-task data. The improvements are above one BLEU point over strong NMT baselines, +4% accuracy over all ambiguous nouns and verbs, or +20% when scored manually over several challenging words.
Link: https://arxiv.org/abs/1810.02614
====================================================
A Comparative Survey of Optical Wireless Technologies: Architectures and Applications (Mostafa Zaman Chowdhury - 5 October, 2018)
A 100 Gb/s data rate has already been demonstrated through OWC. It offers services indoors as well as outdoors, and communication distances range from several nm to more than 10000 km
Link: https://arxiv.org/abs/1810.02594
====================================================
AIRNet: Self-Supervised Affine Registration for 3D Medical Images using Neural Networks (Evelyn Chee - 5 October, 2018)
But since it is costly to manually identify the transformation parameters between any two images, we leverage the abundance of cheap unlabelled data to generate a synthetic dataset for the training of the model. Experiments demonstrate that our approach achieves better overall performance on registration of images from different patients and modalities with 100x speed-up in execution time.
Link: https://arxiv.org/abs/1810.02583
====================================================
Towards High Resolution Video Generation with Progressive Growing of Sliced Wasserstein GANs (Dinesh Acharya - 4 October, 2018)
In addition, our model also reaches a record inception score of 14.57 in unsupervised action recognition dataset UCF-101.
Link: https://arxiv.org/abs/1810.02419
====================================================
Transfer Learning via Unsupervised Task Discovery for Visual Question Answering (Hyeonwoo Noh - 3 October, 2018)
However, it is not straightforward how the visual concepts should be captured and transferred to visual question answering models due to missing link between question dependent answering models and visual data without question or task specification. We tackle this problem in two steps: 1) learning a task conditional visual classifier based on unsupervised task discovery and 2) transferring and adapting the task conditional visual classifier to visual question answering models
Link: https://arxiv.org/abs/1810.02358
====================================================
Neural-Symbolic VQA: Disentangling Reasoning from Vision and Language Understanding (Kexin Yi - 4 October, 2018)
First, executing programs on a symbolic space is more robust to long program traces; our model can solve complex reasoning tasks better, achieving an accuracy of 99.8% on the CLEVR dataset
Link: https://arxiv.org/abs/1810.02338
====================================================
Computer vision-based framework for extracting geological lineaments from optical remote sensing data (Ehsan Farahbakhsh - 4 October, 2018)
We test the proposed framework on Landsat 8 data of a mineral-rich portion of the Gascoyne Province in Western Australia using different dimension reduction techniques and convolutional filters
Link: https://arxiv.org/abs/1810.02320
====================================================
A Convergence Analysis of Gradient Descent for Deep Linear Neural Networks (Sanjeev Arora - 4 October, 2018)
We analyze speed of convergence to global optimum for gradient descent training a deep linear neural network (parameterized as $x\mapsto W_N \cdots W_1x$) by minimizing the $\ell_2$ loss over whitened data. Moreover, in the important case of output dimension 1, i.e
Link: https://arxiv.org/abs/1810.02281
====================================================
DATA Agent (Michael Cerny Green - 28 September, 2018)
Findings from a user study with 30 participants playing through two games of DATA Agent show that the game is easy and fun to play, and that the mysteries it generates are straightforward to solve.
Link: https://arxiv.org/abs/1810.02251
====================================================
Real Differences between OT and CRDT for Co-Editors (Chengzheng Sun - 4 October, 2018)
CRDT (Commutative Replicated Data Type) for co-editors was first proposed around 2006, under the name of WOOT (WithOut Operational Transformation)
Link: https://arxiv.org/abs/1810.02137
====================================================
Learning Finer-class Networks for Universal Representations (Julien Girard - 4 October, 2018)
Many real-world visual recognition use-cases can not directly benefit from state-of-the-art CNN-based approaches because of the lack of many annotated data. We show that our method learns more universal representations than state-of-the-art, leading to significantly better results on 10 target-tasks from multiple domains, using several network architectures, either alone or combined with networks learned at a coarser semantic level.
Link: https://arxiv.org/abs/1810.02126
====================================================
Longest Property-Preserved Common Factor (Lorraine A. K Ayad - 4 October, 2018)
In the first setting, we are given a string $x$ and we are asked to construct a data structure over $x$ answering the following type of on-line queries: given string $y$, find a longest square-free factor common to $x$ and $y$. In the second setting, we are given $k$ strings and an integer $1 < k'\leq k$ and we are asked to find a longest periodic factor common to at least $k'$ strings
Link: https://arxiv.org/abs/1810.02099
====================================================
FSS++ Workshop Report: Handling Uncertainty for Data Quality Management (Anna Wilbik - 4 October, 2018)
This report describes the results of the eSCF Awareness Workshop on Handling Uncertainty for Data Quality Management - Challenges from Transport and Supply Chain Management that was held on June 5, 2018 in Heeze, The Netherlands
Link: https://arxiv.org/abs/1810.02091
====================================================
Towards Fast and Energy-Efficient Binarized Neural Network Inference on FPGA (Cheng Fu - 4 October, 2018)
Binarized Neural Network (BNN) removes bitwidth redundancy in classical CNN by using a single bit (-1/+1) for network parameters and intermediate representations, which has greatly reduced the off-chip data transfer and storage overhead. By analyzing local properties of images and the learned BNN kernel weights, we observe an average of $\sim$78% input similarity and $\sim$59% weight similarity among weight kernels, measured by our proposed metric in common network architectures
Link: https://arxiv.org/abs/1810.02068
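The single-bit (-1/+1) representation described in the abstract above is commonly exploited by replacing multiply-accumulate with XNOR and popcount. The following is a minimal Python sketch of that general idea, assuming +1 is encoded as bit 1 and -1 as bit 0. The function names `pack_bits` and `binary_dot` are hypothetical, and this is not the paper's FPGA design or its similarity metric.

```python
def pack_bits(values):
    """Pack a sequence of -1/+1 values into an int (+1 -> bit 1, -1 -> bit 0)."""
    word = 0
    for i, v in enumerate(values):
        if v not in (-1, 1):
            raise ValueError("values must be -1 or +1")
        if v == 1:
            word |= 1 << i
    return word


def binary_dot(a_bits, b_bits, n):
    """Dot product of two packed -1/+1 vectors of length n.

    XNOR marks the positions where the two vectors agree. Each agreement
    contributes +1 and each disagreement -1, so dot = 2 * matches - n.
    """
    mask = (1 << n) - 1
    xnor = ~(a_bits ^ b_bits) & mask
    matches = bin(xnor).count("1")  # popcount
    return 2 * matches - n


a = [1, -1, 1, 1]
b = [1, 1, -1, 1]
result = binary_dot(pack_bits(a), pack_bits(b), len(a))
reference = sum(x * y for x, y in zip(a, b))
```

Here `result` and `reference` agree, which is the property that lets a hardware implementation swap multipliers for bitwise logic.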
====================================================
Gradient descent aligns the layers of deep linear networks (Ziwei Ji - 3 October, 2018)
This paper establishes risk convergence and asymptotic weight matrix alignment --- a form of implicit regularization --- of gradient flow and gradient descent when applied to deep linear networks on linearly separable data. In more detail, for gradient flow applied to strictly decreasing loss functions (with similar results for gradient descent with particular decreasing step sizes): (i) the risk converges to 0; (ii) the normalized i-th weight matrix asymptotically equals its rank-1 approximation $u_iv_i^{\top}$; (iii) these rank-1 matrices are aligned across layers, meaning $|v_{i+1}^{\top}u_i|\to1$
Link: https://arxiv.org/abs/1810.02032
====================================================
Transfer Incremental Learning using Data Augmentation (Ghouthi Boukli Hacene - 3 October, 2018)
Deep learning-based methods have reached state of the art performances, relying on large quantity of available data and computational power
Link: https://arxiv.org/abs/1810.02020
====================================================
Improving High Contention OLTP Performance via Transaction Scheduling (Guna Prasaad - 3 October, 2018)
We observe that most transactional workloads, including those with high contention, can be divided into clusters of data conflict-free transactions and a small set of residuals. We evaluate Strife against the optimistic concurrency control protocol and several variants of two-phase locking, where the latter is known to perform better than other concurrency protocols under high contention, and show that Strife can improve transactional throughput by up to 2x
Link: https://arxiv.org/abs/1810.01997
====================================================
The Blackbird Dataset: A large-scale dataset for UAV perception in aggressive flight (Amado Antonini - 3 October, 2018)
The Blackbird unmanned aerial vehicle (UAV) dataset is a large-scale, aggressive indoor flight dataset collected using a custom-built quadrotor platform for use in evaluation of agile perception. Inspired by the potential of future high-speed fully-autonomous drone racing, the Blackbird dataset contains over 10 hours of flight data from 168 flights over 17 flight trajectories and 5 environments at velocities up to $7.0\,ms^{-1}$
Link: https://arxiv.org/abs/1810.01987
====================================================
CoverBLIP: accelerated and scalable iterative matched-filtering for Magnetic Resonance Fingerprint reconstruction (Mohammad Golbabaee - 3 October, 2018)
Our further examinations on both synthetic and real-world datasets and using different sampling strategies, indicates between 2 to 3 orders of magnitude reduction in total search computations
Link: https://arxiv.org/abs/1810.01967
====================================================
CRED: A Deep Residual Network of Convolutional and Recurrent Units for Earthquake Signal Detection (S. Mostafa Mousavi - 3 October, 2018)
It learns the time-frequency characteristics of the dominant phases in an earthquake signal from three component data recorded on a single station. We train the network using 500,000 seismograms (250k associated with tectonic earthquakes and 250k identified as noise) recorded in Northern California and tested it with an F-score of 99.95. Our model is able to detect more than 700 microearthquakes as small as -1.3 ML induced during hydraulic fracturing far away from the training region
Link: https://arxiv.org/abs/1810.01965
====================================================
Machine Learning Suites for Online Toxicity Detection (David Noever - 3 October, 2018)
We systematically evaluate 62 classifiers representing 19 major algorithmic families against features extracted from the Jigsaw dataset of Wikipedia comments. Among 28 features of syntax, sentiment, emotion and outlier word dictionaries, a simple bad word list proves most predictive of offensive commentary.
Link: https://arxiv.org/abs/1810.01869
====================================================
Deep processing of structured data (Łukasz Maziarka - 3 October, 2018)
Moreover, its direct application to text and graph data allows to obtain results close to SOTA, by simpler networks with smaller number of parameters than competitive models.
Link: https://arxiv.org/abs/1810.01868
====================================================
Robust online identification of thermal models for in-production HPC clusters with machine learning-based data selection (Federico Pittino - 3 October, 2018)
However, we also show that: 1) not all real workloads allow for the identification of a good model; 2) starting from the theory of system identification it is very difficult to evaluate if a trace of data leads to a good estimated model. We also show that only via deep learning techniques these traces can be correctly chosen up to 96% of the times.
Link: https://arxiv.org/abs/1810.01865
====================================================
Task-Oriented Hand Motion Retargeting for Dexterous Manipulation Imitation (Dafni Antotsiou - 3 October, 2018)
Imitating those actions with dexterous hand models involves different important and challenging steps: acquiring human hand information, retargeting it to a hand model, and learning a policy from acquired data. We tackle the retargeting problem from the hand pose to a 29 DoF hand model by combining inverse kinematics and PSO with a task objective optimisation
Link: https://arxiv.org/abs/1810.01845
====================================================
Reinventing Data Stores for Video Analytics (Tiantu Xu - 3 October, 2018)
It streams video data from disks through decoder to operators and runs queries as fast as 362x of video realtime.
Link: https://arxiv.org/abs/1810.01794
====================================================
A Robot Localization Framework Using CNNs for Object Detection and Pose Estimation (Lukas Hoyer - 3 October, 2018)
Additionally, we propose a process to generate the necessary training data. The framework was evaluated with 3 different robot types and various identification patterns. We achieved up to 98% mAP@IOU0.5 and only 1.6° orientation error, running with a frame rate of 50 Hz on a GPU.
Link: https://arxiv.org/abs/1810.01665
====================================================
Extreme Augmentation : Can deep learning based medical image segmentation be trained using a single manually delineated scan? (Bilwaj Gaonkar - 3 October, 2018)
Almost every computer vision model trained on imaging data uses some form of augmentation. In the extreme, we observed that a model trained on patches extracted from just one scan, with each patch augmented 50 times; achieved a Dice score of 0.73 in a validation set of 40 cases. When the initial patches are extracted from nine scans the average Dice coefficient jumps to 0.86 and most of the false positives disappear. While this still falls short of state-of-the-art deep learning based segmentation of discs reported in literature, qualitative examination reveals that it does yield segmentation, which can be amended by expert clinicians with minimal effort to generate additional data for training improved deep models
Link: https://arxiv.org/abs/1810.01621
====================================================
A Deep Learning Architecture for De-identification of Patient Notes: Implementation and Evaluation (Kaung Khin - 2 October, 2018)
We test this architecture on two gold standard datasets and show that the architecture achieves state-of-the-art performance on both data sets while also converging faster than other systems without the use of dictionaries or other knowledge sources.
Link: https://arxiv.org/abs/1810.01570
====================================================
Deep Learning Based Caching for Self-Driving Car in Multi-access Edge Computing (Anselme Ndikumana - 2 October, 2018)
However, retrieving entertainment contents at the Data Center (DC) can hinder content delivery service due to high delay of car-to-DC communication. The simulation results show that the accuracy of our prediction for the contents need to be cached in the areas of the self-driving car is achieved at 98.04% and our approach can minimize delay.
Link: https://arxiv.org/abs/1810.01548
====================================================
An Analysis of Approaches Taken in the ACM RecSys Challenge 2018 for Automatic Music Playlist Continuation (Hamed Zamani - 2 October, 2018)
Given a playlist of arbitrary length with some additional meta-data, the task was to recommend up to 500 tracks that fit the target characteristics of the original playlist. In total, 113 teams submitted 1,228 runs to the main track; 33 teams submitted 239 runs to the creative track. The highest performing team in the main track achieved an R-precision of 0.2241, an NDCG of 0.3946, and an average number of recommended songs clicks of 1.784. In the creative track, an R-precision of 0.2233, an NDCG of 0.3939, and a click rate of 1.785 was obtained by the best team
Link: https://arxiv.org/abs/1810.01520
====================================================
Opinion Formation Threshold Estimates from Different Combinations of Social Media Data-Types (Derrik E. Asher - 2 October, 2018)
The present study estimates population opinion formation thresholds by querying 2222 participants about the number of various social media data-types (i.e., images, videos, and/or messages) that they would need to passively consume to form opinions. Opinion formation is assessed across three dimensions, 1) data-type(s), 2) context, and 3) source
Link: https://arxiv.org/abs/1810.01501
====================================================
Submodular Optimization in the MapReduce Model (Paul Liu - 2 October, 2018)
In practice, these problems often involve large amounts of data, and must be solved in a distributed way. In this paper, we present two simple algorithms for cardinality constrained submodular optimization in the MapReduce model: the first is a $(1/2-o(1))$-approximation in 2 MapReduce rounds, and the second is a $(1-1/e-ε)$-approximation in $\frac{1+o(1)}{ε}$ MapReduce rounds.
Link: https://arxiv.org/abs/1810.01489
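For context on the approximation factors quoted above, the classic sequential greedy algorithm attains a $(1-1/e)$-approximation for monotone submodular maximization under a cardinality constraint. The sketch below illustrates that sequential baseline on maximum coverage, a standard monotone submodular objective. It is not the paper's MapReduce algorithm, and the function name `greedy_max_coverage` is hypothetical.

```python
def greedy_max_coverage(sets, k):
    """Pick at most k sets greedily, each time taking the largest marginal gain.

    Returns (chosen_indices, covered_elements). Coverage is monotone and
    submodular, so this is the textbook (1 - 1/e) greedy, run sequentially.
    """
    chosen = []
    covered = set()
    for _ in range(min(k, len(sets))):
        best_idx, best_gain = None, 0
        for i, s in enumerate(sets):
            if i in chosen:
                continue
            gain = len(s - covered)  # marginal gain of adding set i
            if gain > best_gain:
                best_idx, best_gain = i, gain
        if best_idx is None:  # no remaining set adds anything new
            break
        chosen.append(best_idx)
        covered |= sets[best_idx]
    return chosen, covered


example_sets = [{1, 2, 3}, {3, 4}, {4, 5, 6, 7}, {1}]
chosen, covered = greedy_max_coverage(example_sets, 2)
```

The greedy loop needs k sequential passes over the data, which is the cost a round-limited distributed algorithm has to avoid.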
====================================================
CELLO-3D: Estimating the Covariance of ICP in the Real World (David Landry - 2 October, 2018)
Then, we set out to estimate the covariance of ICP registrations through a data-driven approach, with over 5 100 000 registrations on 1020 pairs from real 3D point clouds
Link: https://arxiv.org/abs/1810.01470
====================================================
Unsupervised Machine Learning of Open Source Russian Twitter Data Reveals Global Scope and Operational Characteristics (Christopher Griffin - 2 October, 2018)
We developed and used a collection of statistical methods (unsupervised machine learning) to extract relevant information from a Twitter supplied data set consisting of alleged Russian trolls who (allegedly) attempted to influence the 2016 US Presidential election. Using natural language processing, manifold learning and Fourier analysis, we identify an operation that includes not only the 2016 US election, but also the French National and both local and national German elections
Link: https://arxiv.org/abs/1810.01466
====================================================
Efficient Dialog Policy Learning via Positive Memory Retention (Rui Zhao - 2 October, 2018)
However, the collection of the required data in form of conversations between chat-bots and human agents is time-consuming and expensive. We show that our method is 10 times more sample-efficient than policy gradients in extensive experiments on a new synthetic number guessing game
Link: https://arxiv.org/abs/1810.01371
====================================================
On Self Modulation for Generative Adversarial Networks (Ting Chen - 2 October, 2018)
While reminiscent of other conditioning techniques, it requires no labeled data. In a large-scale empirical study we observe a relative decrease of $5\%-35\%$ in FID. Furthermore, all else being equal, adding this modification to the generator leads to improved performance in $124/144$ ($86\%$) of the studied settings
Link: https://arxiv.org/abs/1810.01365
====================================================
Landmine Detection Using Autoencoders on Multi-polarization GPR Volumetric Data (Paolo Bestagini - 2 October, 2018)
Experiments conducted on real data show that the proposed technique requires little training and no ad-hoc data pre-processing to achieve accuracy higher than 93% on challenging datasets.
Link: https://arxiv.org/abs/1810.01316
====================================================
Findings of the E2E NLG Challenge (Ondřej Dušek - 2 October, 2018)
The E2E NLG shared task aims to assess whether these novel approaches can generate better-quality output by learning from a dataset containing higher lexical richness, syntactic complexity and diverse discourse phenomena. We compare 62 systems submitted by 17 institutions, covering a wide range of approaches, including machine learning architectures -- with the majority implementing sequence-to-sequence models (seq2seq) -- as well as systems based on grammatical rules and templates.
Link: https://arxiv.org/abs/1810.01170
====================================================
Target Aware Network Adaptation for Efficient Representation Learning (Yang Zhong - 2 October, 2018)
Experimental results by the method on five datasets (Flower102, CUB200-2011, Dog120, MIT67, and Stanford40) show favorable accuracies over the related state-of-the-art techniques while enhancing the computational and storage efficiency of the transferred model.
Link: https://arxiv.org/abs/1810.01104
====================================================
A Unified Framework for Clustering Constrained Data without Locality Property (Hu Ding - 1 October, 2018)
The simplex lemma (or weaker simplex lemma) enables us to efficiently approximate the mean (or median) point of an unknown set of points by searching a small-size grid, independent of the dimensionality of the space, in a simplex (or the surrounding region of a simplex), and thus can be used to handle high dimensional data. If $k$ and $\frac{1}{ε}$ are fixed numbers, our framework generates, in nearly linear time (i.e., $O(n(\log n)^{k+1}d)$), $O((\log n)^{k})$ $k$-tuple candidates for the $k$ mean or median points, and one of them induces a $(1+ε)$-approximation for $k$-CMeans or $k$-CMedian, where $n$ is the number of points
Link: https://arxiv.org/abs/1810.01049
====================================================
Reinforcement Learning with Perturbed Rewards (Jingkang Wang - 5 October, 2018)
Our framework draws upon approaches for supervised learning with noisy data. For instance, the state-of-the-art PPO algorithm is able to obtain 67.5% and 46.7% improvements in average on five Atari games, when the error rates are 10% and 30% respectively.
Link: https://arxiv.org/abs/1810.01032
====================================================
Fighting Against XSS Attacks: A Usability Evaluation of OWASP ESAPI Output Encoding (Chamila Wijayarathna - 1 October, 2018)
However, XSS still being ranked as one of the most critical vulnerabilities in web applications suggests that programmers are not effectively using those APIs to encode untrusted data. Therefore, we conducted an experimental study with 10 programmers where they attempted to fix XSS vulnerabilities of a web application using the output encoding functionality of OWASP ESAPI. Results revealed 3 types of mistakes that programmers made which resulted in them failing to fix the application by removing XSS vulnerabilities. We also identified 16 usability issues of OWASP ESAPI
Link: https://arxiv.org/abs/1810.01017
====================================================
Heterogeneous MacroTasking (HeMT) for Parallel Processing in the Public Cloud (Yuquan Shan - 1 October, 2018)
As representative results, Spark with HeMT offers about 10% better average completion times for realistic data processing workloads over the default system.
Link: https://arxiv.org/abs/1810.00988
====================================================
Efficient and Accurate Abnormality Mining from Radiology Reports with Customized False Positive Reduction (Nithya Attaluri - 1 October, 2018)
The difficulty is heightened for medical imaging, where data itself is limited in accessibility and labeling requires costly time and effort by trained medical specialists. Using this approach, we label more than 175,000 Head CT studies for the presence of 33 features indicative of 11 clinically relevant conditions. For 27 of the 30 keywords that yielded positive results (3 had no occurrences), the lower bound of the confidence intervals created to estimate the percentage of accurately labeled reports was above 85%, with the average being above 95%
Link: https://arxiv.org/abs/1810.00967
====================================================
RGB-D Object Detection and Semantic Segmentation for Autonomous Manipulation in Clutter (Max Schwarz - 1 October, 2018)
We evaluate our approach on two challenging data sets: one captured for the Amazon Picking Challenge 2016, where our team NimbRo came in second in the Stowing and third in the Picking task, and one captured in disaster-response scenarios
Link: https://arxiv.org/abs/1810.00818
====================================================
Improving the Generalization of Adversarial Training with Domain Adaptation (Chuanbiao Song - 1 October, 2018)
Empirical evaluations demonstrate that ATDA can greatly improve the generalization of adversarial training and achieves state-of-the-art results on standard benchmark datasets.
Link: https://arxiv.org/abs/1810.00740
====================================================
Automatic Data Expansion for Customer-care Spoken Language Understanding (Shahab Jalalvand - 27 September, 2018)
The process starts with training a preliminary NLU model based on logistic regression on the in-domain data. Using these n-grams, we find the samples in the out-of-domain corpus that 1) contain the desired n-gram and/or 2) have similar intent label. Our results on two divergent experimental setups show that the proposed approach reduces by 30% the absolute classification error rate (CER) compared to the preliminary models and it significantly outperforms the traditional data expansion algorithms such as the ones based on semi-supervised learning, TF-IDF and embedding vectors.
Link: https://arxiv.org/abs/1810.00670
====================================================
Wronging a Right: Generating Better Errors to Improve Grammatical Error Detection (Sudhanshu Kasewa - 26 September, 2018)
Our approach yields error-filled artificial data that helps a vanilla bi-directional LSTM to outperform the previous state of the art at grammatical error detection, and a previously introduced model to gain further improvements of over 5% $F_{0.5}$ score. When attempting to determine if a given sentence is synthetic, a human annotator at best achieves 39.39 $F_1$ score, indicating that our model generates mostly human-like instances.
Link: https://arxiv.org/abs/1810.00668
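The $F_{0.5}$ and $F_1$ scores quoted above are both instances of the general $F_\beta$ measure, which combines precision and recall and weights precision more heavily when $\beta < 1$. A minimal sketch follows. The function name `f_beta` is hypothetical, and returning 0.0 when both inputs are zero is an assumed convention.

```python
def f_beta(precision, recall, beta=1.0):
    """F_beta = (1 + beta^2) * P * R / (beta^2 * P + R)."""
    if precision < 0 or recall < 0:
        raise ValueError("precision and recall must be non-negative")
    denom = beta * beta * precision + recall
    if denom == 0:
        return 0.0  # assumed convention when P = R = 0
    return (1 + beta * beta) * precision * recall / denom


# With high precision and low recall, F_0.5 rewards the system more than F_1.
f_half = f_beta(1.0, 0.5, beta=0.5)
f_one = f_beta(1.0, 0.5, beta=1.0)
```

Grammatical error detection is typically scored with $\beta = 0.5$ because a false alarm is considered more costly than a missed error.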
====================================================
Perfect Match: A Simple Method for Learning Representations For Counterfactual Inference With Neural Networks (Patrick Schwab - 3 October, 2018)
Our experiments demonstrate that PM outperforms a number of more complex state-of-the-art methods in inferring counterfactual outcomes across several real-world and semi-synthetic datasets.
Link: https://arxiv.org/abs/1810.00656
====================================================
One-Click Annotation with Guided Hierarchical Object Detection (Adithya Subramanian - 1 October, 2018)
The experiment conducted on the PASCAL VOC dataset revealed that annotation created from our approach achieves a mAP of 0.995 and a recall of 0.903. Our approach has shown an overall improvement of 8.5% and 18.6% in mean average precision and recall score for KITTI, and of 69.6% and 36% for the CITYSCAPES dataset
Link: https://arxiv.org/abs/1810.00609
====================================================
Unsupervised Trajectory Segmentation and Promoting of Multi-Modal Surgical Demonstrations (Zhenzhou Shao - 1 October, 2018)
Extensive experiments on a public dataset JIGSAWS show that our method achieves much higher accuracy of segmentation than state-of-the-art methods in the shorter time.
Link: https://arxiv.org/abs/1810.00599
====================================================
Generative Adversarial Network for Medical Images (MI-GAN) (Talha Iqbal - 1 October, 2018)
The proposed model achieves a dice coefficient of 0.837 on STARE dataset and 0.832 on DRIVE dataset which is state-of-the-art performance on both the datasets.
Link: https://arxiv.org/abs/1810.00551
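The Dice coefficient reported above measures the overlap between a predicted and a reference segmentation mask as $2|A \cap B| / (|A| + |B|)$. The sketch below computes it for flat binary masks. The function name `dice_coefficient` is hypothetical, and returning 1.0 for two empty masks is an assumed convention, not something stated in the abstract.

```python
def dice_coefficient(pred, target):
    """Dice overlap between two equal-length binary masks (sequences of 0/1)."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / total


pred_mask = [1, 1, 0, 0]
true_mask = [1, 0, 1, 0]
score = dice_coefficient(pred_mask, true_mask)
```

A 2D or 3D mask can be flattened to a single sequence before calling the function, since Dice depends only on per-pixel agreement.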
====================================================
End-To-End Alzheimer's Disease Diagnosis and Biomarker Identification (Soheil Esmaeilzadeh - 1 October, 2018)
Our model can diagnose AD with an accuracy of 94.1\% on the popular ADNI dataset using only MRI data, which outperforms the previous state-of-the-art
Link: https://arxiv.org/abs/1810.00523
====================================================
Hybrid Noise Removal in Hyperspectral Imagery With a Spatial-Spectral Gradient Network (Qiang Zhang - 30 September, 2018)
The simulated and real-data experiments undertaken in this study confirmed that the proposed SSGN performs better at mixed noise removal than the other state-of-the-art HSI denoising algorithms, in evaluation indices, visual assessments, and time consumption.
Link: https://arxiv.org/abs/1810.00495
====================================================
AgriColMap: Aerial-Ground Collaborative 3D Mapping for Precision Farming (Ciro Potena - 30 September, 2018)
We evaluate our system using real world data for 3 fields with different crop species
Link: https://arxiv.org/abs/1810.00457
====================================================
Optical Illusions Images Dataset (Robert Max Williams - 30 September, 2018)
In this paper we present a dataset of 6725 illusion images gathered from two websites, and a smaller dataset of 500 hand-picked images
Link: https://arxiv.org/abs/1810.00415
====================================================
Resource Management in Fog/Edge Computing: A Survey (Cheol-Ho Hong - 29 September, 2018)
Contrary to using distant and centralized cloud data center resources, employing decentralized resources at the edge of a network for processing data closer to user devices, such as smartphones and tablets, is an upcoming computing paradigm, referred to as fog/edge computing. This article reviews publications as early as 1991, with 85% of the publications between 2013-2018, to identify and classify the architectures, infrastructure, and underlying algorithms for managing resources in fog/edge computing.
Link: https://arxiv.org/abs/1810.00305
====================================================
DIMENSION: Dynamic MR Imaging with Both K-space and Spatial Prior Knowledge Obtained via Multi-Supervised Network Training (Shanshan Wang - 9 October, 2018)
The comparisons with k-t FOCUSS, k-t SLR, L+S and the state-of-the-art CNN method on in vivo datasets show our method can achieve improved reconstruction results in shorter time.
Link: https://arxiv.org/abs/1810.00302
====================================================
Tithonus: A Bitcoin Based Censorship Resilient System (Ruben Recabarren - 29 September, 2018)
When compared to state-of-the-art Bitcoin writing solutions, Tithonus reduces the cost of transferring data to censored clients by 2 orders of magnitude and increases the goodput by 3 to 5 orders of magnitude
Link: https://arxiv.org/abs/1810.00279
====================================================
MultiWOZ - A Large-Scale Multi-Domain Wizard-of-Oz Dataset for Task-Oriented Dialogue Modelling (Paweł Budzianowski - 29 September, 2018)
To address this fundamental obstacle, we introduce the Multi-Domain Wizard-of-Oz dataset (MultiWOZ), a fully-labeled collection of human-human written conversations spanning over multiple domains and topics. At a size of $10$k dialogues, it is at least one order of magnitude larger than all previous annotated task-oriented corpora
Link: https://arxiv.org/abs/1810.00278
====================================================
Towards Better Summarizing Bug Reports with Crowdsourcing Elicited Attributes (He Jiang - 28 September, 2018)
Then, we propose a new method named Crowd-Attribute to infer new effective attributes from the crowd-generated data in crowdsourcing and develop a new tool named Crowdsourcing Software Engineering Platform to facilitate this method. With Crowd-Attribute, we successfully construct 11 new attributes and propose a new supervised algorithm named Logistic Regression with Crowdsourced Attributes (LRCA). Experiments over both the public data set SDS with 36 manually annotated bug reports and new large-scale data sets demonstrate that LRCA can consistently outperform the state-of-the-art algorithms for bug report summarization.
Link: https://arxiv.org/abs/1810.00125
====================================================
Open-Ended Content-Style Recombination Via Leakage Filtering (Karl Ridgeway - 28 September, 2018)
Using this method for data-set augmentation, we obtain state-of-the-art performance on few-shot learning tasks.
Link: https://arxiv.org/abs/1810.00110
====================================================
Cell Grid Architecture for Maritime Route Prediction on AIS Data Streams (Ciprian Amariei - 28 September, 2018)
The 2018 Grand Challenge targets the problem of accurate predictions on data streams produced by automatic identification system (AIS) equipment, describing naval traffic
Link: https://arxiv.org/abs/1810.00090
====================================================
Active Fairness in Algorithmic Decision Making (Alejandro Noriega-Campero - 28 September, 2018)
We show on real-world datasets that these can achieve: 1) calibration and single error parity (e.g., equal opportunity); and 2) parity in both false positive and false negative rates (i.e., equal odds)
Link: https://arxiv.org/abs/1810.00031
====================================================
Universal and Dynamic Locally Repairable Codes with Maximal Recoverability via Sum-Rank Codes (Umberto Martínez-Peñas - 28 September, 2018)
Furthermore, the local linear codes (thus the localities, local distances and local fields) can be efficiently and dynamically modified without global recoding or changes in architecture or outer code, while preserving MR, easily adapting to new hot and cold data. Reed-Solomon codes with local replication and Cartesian products are recovered from the given construction when $ r=1 $ and $ h = 0 $, respectively
Link: https://arxiv.org/abs/1809.11158
====================================================
Learning Recurrent Binary/Ternary Weights (Arash Ardakani - 28 September, 2018)
Recurrent neural networks (RNNs) have shown excellent performance in processing sequence data. Ultimately, we show that LSTMs with binary/ternary weights can achieve up to 12x memory saving and 10x inference speedup compared to the full-precision implementation on an ASIC platform.
Link: https://arxiv.org/abs/1809.11086
====================================================
Reuse and Adaptation for Entity Resolution through Transfer Learning (Saravanan Thirumuruganathan - 28 September, 2018)
Entity resolution (ER) is one of the fundamental problems in data integration, where machine learning (ML) based classifiers often provide the state-of-the-art results. We have performed comprehensive experiments on 12 datasets from 5 different domains (publications, movies, songs, restaurants, and books)
Link: https://arxiv.org/abs/1809.11084
====================================================
Can female fertility management mobile apps be sustainable and contribute to female health care? Harnessing the power of patient generated data ; Analysis of the organizations active in this e-Health segment (Maki Miyamoto - 28 September, 2018)
These patient-generated data (PGD) reflect patients everyday behaviors including physical activity, mood, diet, sleep, and symptoms. The research question: Can female fertility management mobile apps be sustainable and contribute to female health care, is researched by a combination of academic literature study, testing of 7 essential hypotheses, and a limited user driven experimental demand analysis
Link: https://arxiv.org/abs/1809.11042
====================================================
CNNs Fusion for Building Detection in Aerial Images for the Building Detection Challenge (Rémi Delassus - 28 September, 2018)
We enhanced the SpaceNet Challenge winning solution by proposing a new fusion strategy based on a deep combiner using segmentation both results of different CNN and input data to segment. Segmentation results for all cities have been significantly improved (between 1% improvement over the baseline for the smallest one to more than 7% for the largest one)
Link: https://arxiv.org/abs/1809.10976
====================================================
Domain Generalization with Domain-Specific Aggregation Modules (Antonio D'Innocente - 28 September, 2018)
Experiments on two different benchmark databases show the power of our approach, reaching the new state of the art in domain generalization.
Link: https://arxiv.org/abs/1809.10966
====================================================
Pull-based Bloom Filter-based Routing for Information-Centric Networks (Ali Marandi - 28 September, 2018)
In Named Data Networking (NDN), there is a need for routing protocols to populate Forwarding Information Base (FIB) tables so that the Interest messages can be forwarded. Bloom Filter-based Routing approaches like BFR [1], use Bloom Filters (BFs) to advertise all provided content objects, which consumes valuable bandwidth and storage resources
Link: https://arxiv.org/abs/1809.10948
====================================================
cISP: A Speed-of-Light Internet Service Provider (Debopam Bhattacherjee - 10 October, 2018)
We thus explore the design of cost-effective wide-area networks that move data over paths very close to great-circle paths, at speeds very close to the speed of light in vacuum. We show that instantiations of cISP across the contiguous United States and Europe would achieve mean latencies within 5% of that achievable using great-circle paths at the speed of light, over medium and long distances
Link: https://arxiv.org/abs/1809.10897
====================================================
A model for system developers to measure the privacy risk of data (Awanthika Senarath - 28 September, 2018)
In this paper, we propose a model that could be used by system developers to measure the privacy risk perceived by users when they disclose data into software systems. We first derive a model to measure the perceived privacy risk based on existing knowledge and then we test our model through a survey with 151 participants
Link: https://arxiv.org/abs/1809.10884
====================================================
Graph Generation via Scattering (Dongmian Zou - 28 September, 2018)
These results are in contrast to experience with Euclidean data, where it is difficult to form a generative scattering network that performs as well as state-of-the-art methods
Link: https://arxiv.org/abs/1809.10851
====================================================
Generative Adversarial Active Learning for Unsupervised Outlier Detection (Yezheng Liu - 27 September, 2018)
We empirically compare the proposed approach with several state-of-the-art outlier detection methods on both synthetic and real-world datasets
Link: https://arxiv.org/abs/1809.10816
====================================================
FanStore: Enabling Efficient and Scalable I/O for Distributed Deep Learning (Zhao Zhang - 27 September, 2018)
With the techniques of system call interception, distributed metadata management, and generic data compression, FanStore provides a POSIX-compliant interface with native hardware throughput in an efficient and scalable manner. Our experiments with benchmarks and real applications show that FanStore can scale DL training to 512 compute nodes with over 90\% scaling efficiency.
Link: https://arxiv.org/abs/1809.10799
====================================================
Estimation of Personalized Effects Associated With Causal Pathways (Razieh Nabi - 27 September, 2018)
For example, we may wish to maximize the chemical effect of a drug given data from an observational study where the chemical effect of the drug on the outcome is entangled with the indirect effect mediated by differential adherence. [16] shows how to combine mediation analysis and dynamic treatment regime ideas to define policies associated with causal pathways and counterfactual responses to these policies
Link: https://arxiv.org/abs/1809.10791
====================================================
Deep Object Pose Estimation for Semantic Robotic Grasping of Household Objects (Jonathan Tremblay - 27 September, 2018)
Using synthetic data generated in this manner, we introduce a one-shot deep neural network that is able to perform competitively against a state-of-the-art network trained on a combination of real and synthetic data. To our knowledge, this is the first deep network trained only on synthetic data that is able to achieve state-of-the-art performance on 6-DoF object pose estimation
Link: https://arxiv.org/abs/1809.10790
====================================================
Semantic Topic Analysis of Traffic Camera Images (Jeffrey Liu - 27 September, 2018)
We apply the Latent Dirichlet Allocation (LDA) topic model to decompose the label data into a small number of semantic topics. To illustrate our approach, we use freeway camera images collected from the Boston area between December 2017-January 2018
Link: https://arxiv.org/abs/1809.10707
====================================================
dynamicMF: A Matrix Factorization Approach to Monitor Resource Usage in High Performance Computing Systems (Niyazi Sorkunlu - 26 September, 2018)
Results on resource usage data collected from the Lonestar 4 system at the Texas Advanced Computing Center show that the identified anomalies are correlated with actual anomalous events reported in the system log messages.
Link: https://arxiv.org/abs/1809.10624
====================================================
Acoustic Probing for Estimating the Storage Time and Firmness of Tomatoes and Mandarin Oranges (Hidetomo Kataoka - 27 September, 2018)
We performed cross validation by using this data set. The average estimation errors of storage time and firmness for tomatoes were 0.89 days and 9.47 g/mm2. Those for mandarin oranges were 1.67 days and 15.67 g/mm2
Link: https://arxiv.org/abs/1809.10581
====================================================
Closest-Pair Queries in Fat Rectangles (Sang Won Bae - 27 September, 2018)
In the range closest pair problem, we want to construct a data structure storing a set $S$ of $n$ points in the plane, such that for any axes-parallel query rectangle $R$, the closest pair in the set $R \cap S$ can be reported. The currently best result for this problem is by Xue et al.~(SoCG 2018)
Link: https://arxiv.org/abs/1809.10531
====================================================
No New-Net (Fabian Isensee - 27 September, 2018)
By incorporating region based training, additional training data and a simple postprocessing technique, we obtain dice scores of 81.01, 90.83 and 85.44 and Hausdorff Distances (95th percentile) of 2.54, 4.97 and 7.
Link: https://arxiv.org/abs/1809.10483
====================================================
Sample Efficient Adaptive Text-to-Speech (Yutian Chen - 27 September, 2018)
The experiments show that these approaches are successful at adapting the multi-speaker neural network to new speakers, obtaining state-of-the-art results in both sample naturalness and voice similarity with merely a few minutes of audio data from new speakers.
Link: https://arxiv.org/abs/1809.10460
====================================================
Queue-based Resampling for Online Class Imbalance Learning (Kleanthis Malialis - 27 September, 2018)
Results on two popular benchmark datasets demonstrate the effectiveness of queue-based resampling over state-of-the-art methods in terms of learning speed and quality.
Link: https://arxiv.org/abs/1809.10388
====================================================
Deeply Informed Neural Sampling for Robot Motion Planning (Ahmed H. Qureshi - 26 September, 2018)
DeepSMP's neural architecture comprises of a Contractive AutoEncoder which encodes given workspaces directly from a raw point cloud data, and a Dropout-based stochastic deep feedforward neural network which takes the workspace encoding, start and goal configuration, and iteratively generates feasible samples for SMPs to compute end-to-end collision-free optimal paths. The results show that on average our method is at least 7 times faster in point-mass and rigid-body case and about 28 times faster in 6-link robot case than the existing state-of-the-art.
Link: https://arxiv.org/abs/1809.10252
====================================================
Classifying Mammographic Breast Density by Residual Learning (Jingxu Xu - 21 September, 2018)
The proposed method was instantiated with the INbreast dataset and classification accuracies of 92.6% and 96.8% were obtained for the four BI-RADS (Breast Imaging and Reporting Data System) category task and the two BI-RADS category task, respectively
Link: https://arxiv.org/abs/1809.10241
====================================================
Left Ventricle Segmentation and Quantification from Cardiac Cine MR Images via Multi-task Learning (Shusil Dangi - 26 September, 2018)
We performed a five fold cross-validation of the myocardium segmentation obtained from the proposed multi-task network on 97 patient 4-dimensional cardiac cine-MRI datasets available through the STACOM LV segmentation challenge against the provided gold-standard myocardium segmentation, obtaining a Dice overlap of $0.849 \pm 0.036$ and mean surface distance of $0.274 \pm 0.083$ mm, while simultaneously estimating the myocardial area with mean absolute difference error of $205\pm198$ mm$^2$.
Link: https://arxiv.org/abs/1809.10221
====================================================
Unsupervised Adversarial Invariance (Ayush Jaiswal - 26 September, 2018)
Our unsupervised model outperforms state-of-the-art methods, which are supervised, at inducing invariance to inherent nuisance factors, effectively using synthetic data augmentation to learn invariance, and domain adaptation
Link: https://arxiv.org/abs/1809.10083
====================================================
Learning short-term past as predictor of human behavior in commercial buildings (Romana Markovic - 17 September, 2018)
The addressed sequence duration was in the range between 30 and 240 time-steps of indoor climate data, where the applied temporal discretization was one minute. The results pointed out, that the optimal predictive performance was achieved for the case where 60 time-steps of the indoor climate data were used as input. The analysis of the prediction accuracy in the form of F1 score for the different time-lag of future window states dropped from 0.51 to 0.27, when shifting the prediction target from 10 to 60 minutes in future.
Link: https://arxiv.org/abs/1809.10020
====================================================
A Novel Online Stacked Ensemble for Multi-Label Stream Classification (Alican Büyükçakır - 26 September, 2018)
We conduct experiments with 4 GOOWE-ML-based multi-label ensembles and 7 baseline models on 7 real-world datasets from diverse areas of interest
Link: https://arxiv.org/abs/1809.09994
====================================================
Satellite Imagery Multiscale Rapid Detection with Windowed Networks (Adam Van Etten - 24 September, 2018)
The proposed approach allows comparison of the performance of these four frameworks, and can rapidly detect objects of vastly different scales with relatively little training data over multiple sensors. airplanes versus airports) we find that using two different detectors at different scales is very effective with negligible runtime cost. We evaluate large test images at native resolution and find mAP scores of 0.2 to 0.8 for vehicle localization, with the YOLT architecture achieving both the highest mAP and fastest inference speed.
Link: https://arxiv.org/abs/1809.09978
====================================================
Morphed Learning: Towards Privacy-Preserving for Deep Learning Based Applications (Juncheng Shen - 20 September, 2018)
Theoretical analyses on CIFAR-10 dataset and VGG-16 network show that our method is capable of providing 10^89 morphing possibilities with only 5% computational overhead and 10% transmission overhead under limited knowledge attack scenario
Link: https://arxiv.org/abs/1809.09968
====================================================
Time-Series Prediction of Proximal Aggression Onset in Minimally-Verbal Youth with Autism Spectrum Disorder Using Physiological Biosignals (Ozan Ozdenizci - 14 September, 2018)
We implement ridge-regularized logistic regression models on physiological biosensor data wirelessly recorded from 15 MV-ASD youth over 64 independent naturalistic observations in a hospital inpatient unit. Our results demonstrate proof-of-concept, feasibility, and incipient validity predicting aggression onset 1 minute before it occurs using global, person-dependent, and hybrid classifier models.
Link: https://arxiv.org/abs/1809.09948
====================================================
GPU Accelerated Similarity Self-Join for Multi-Dimensional Data (Michael Gowanlock - 26 September, 2018)
Across most scenarios on real-world and synthetic datasets, our algorithm outperforms the parallel state-of-the-art approach
Link: https://arxiv.org/abs/1809.09930
====================================================
Performance and sensitivities of home detection from mobile phone data (Maarten Vanhoof - 26 September, 2018)
In this paper, we present an extensive empirical analysis of home detection methods when performed on a nation-wide mobile phone dataset from France. We analyze the validity of 9 different Home Detection Algorithms (HDAs), and we assess different sources of uncertainty. Based on 225 different set-ups for the home detection of around 18 million users we discuss different measures for validation and investigate sensitivity to user choices such as HDA parameter choice and observation period restriction. Our findings show that nation-wide performance of home detection is moderate at best, with correlations to ground truth maximizing at 0.60 only
Link: https://arxiv.org/abs/1809.09911
====================================================
Active Learning for Deep Object Detection (Clemens-Alexander Brust - 26 September, 2018)
All methods are evaluated systematically in a continuous exploration context on the PASCAL VOC 2012 dataset.
Link: https://arxiv.org/abs/1809.09875
====================================================
Deep contextualized word representations for detecting sarcasm and irony (Suzana Ilić - 25 September, 2018)
We test our model on 7 different datasets derived from 3 different data sources, providing state-of-the-art performance in 6 of them, and otherwise offering competitive results.
Link: https://arxiv.org/abs/1809.09795
====================================================
Surface Type Estimation from GPS Tracked Bicycle Activities (Nitish Nag - 25 September, 2018)
In this work, we use a computationally inexpensive and simple method by using only GPS data from a human powered cyclist. We show in our methods, the decision trees performed the best with an accuracy of 86\%
Link: https://arxiv.org/abs/1809.09745
====================================================
Optimizing the Human-Machine Partnership with Zooniverse (Lucy Fortson - 25 September, 2018)
With over 120 projects built reaching nearly 1.7 million volunteers, the Zooniverse.org platform has led the way in the application of Citizen Science as a method for closing the Big Data analysis gap. Since the launch in 2007 of the Galaxy Zoo project, the Zooniverse platform has enabled significant contributions across many disciplines; e.g., in ecology, humanities, and astronomy. To cope with the larger datasets looming on the horizon such as astronomy's Large Synoptic Survey Telescope (LSST) or the 100's of TB from ecology projects annually, Zooniverse has been researching a system design that is optimized for efficiency in task assignment and incorporating human and machine classifiers into the classification engine
Link: https://arxiv.org/abs/1809.09738
====================================================
Security and Performance Considerations in ROS 2: A Balancing Act (Jongkil Kim - 24 September, 2018)
Robot Operating System (ROS) 2 is a ground-up re-design of ROS 1 to support performance critical cyber-physical systems (CPSs) using the Data Distribution Service (DDS) middleware. Accordingly, the security of ROS 2 is highly reliant on the security of its DDS communication protocol. To accomplish this, we evaluate the latency and throughput of the communication protocols of ROS 2 in both wired and wireless networks, and measure the efficiency loss caused by the enabling of security protocols such as Virtual Private Network (VPN) and DDS security protocol in ROS 2 in both network setups. The result can be directly used by robotics developers to find the optimal and balanced settings of ROS 2 applications. The results of this work can be used to enhance the security of ROS 2.
Link: https://arxiv.org/abs/1809.09566
====================================================
Fine-Tuning VGG Neural Network For Fine-grained State Recognition of Food Images (Kaoutar Ben Ahmed - 8 September, 2018)
A small-scale dataset consisting of 5978 images of seven categories was constructed and annotated manually
Link: https://arxiv.org/abs/1809.09529
====================================================
Antilizer: Run Time Self-Healing Security for Wireless Sensor Networks (Ivana Tomic - 25 September, 2018)
Our results show that Antilizer reduces data loss down to 1% (4% on average), with operational overheads of less than 1% and provides fast network-wide convergence.
Link: https://arxiv.org/abs/1809.09426
====================================================
RapidHARe: A computationally inexpensive method for real-time human activity recognition from wearable sensors (Roman Chereshnev - 25 September, 2018)
Here, we present a new method called RapidHARe for real-time human activity recognition based on modeling the distribution of a raw data in a half-second context window using dynamic Bayesian networks. Moreover, in performance, RapidHare achieves an F1 score of 94.27\% and accuracy of 98.94\%, and when compared to ANN, RNN, HMM, it reduces the F1-score error rate by 45\%, 65\%, and 63\% and the accuracy error rate by 41\%, 55\%, and 62\%, respectively
Link: https://arxiv.org/abs/1809.09412
====================================================
Pre and Post-hoc Diagnosis and Interpretation of Malignancy from Breast DCE-MRI (Gabriel Maicas - 25 September, 2018)
Relying on experiments on a breast DCE-MRI dataset that contains scans of 117 patients, our results show that the post-hoc method is more accurate for diagnosing the whole volume per patient, achieving an AUC of 0.91, while the pre-hoc method achieves an AUC of 0.81
Link: https://arxiv.org/abs/1809.09404
====================================================
An Efficient Framework for Implementing Persist Data Structures on Remote NVM (Teng Ma - 25 September, 2018)
Specifically, thanks to operation batching, local memory caching and efficient concurrency control, the throughput of operations on eight widely used data structures is improved by 6$\sim$22 $\times$ without lowering the consistency promising.
Link: https://arxiv.org/abs/1809.09395
====================================================
Why scatter plots suggest causality, and what we can do about it (Carl T. Bergstrom - 25 September, 2018)
To avoid suggesting a causal relationship between the x and y values in a scatter plot, we propose a new type of data visualization, the diamond plot. Diamond plots are essentially 45 degree rotations of ordinary scatter plots; by visually jarring the viewer they clearly indicate that she should not draw the usual distinction between independent/predictor variable and dependent/response variable
Link: https://arxiv.org/abs/1809.09328
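The 45 degree rotation mentioned in the entry above can be sketched in a few lines of Python. This is only an illustration of the geometric idea, not the authors' implementation: the function name `to_diamond`, the counterclockwise direction, and rotation about the origin are all assumptions made here.

```python
import math

def to_diamond(points):
    """Rotate (x, y) points by 45 degrees counterclockwise about the origin.

    The direction and center of rotation are assumptions for illustration;
    the abstract only says the plot is a 45 degree rotation.
    """
    c = s = math.sqrt(0.5)  # cos(45 deg) == sin(45 deg)
    return [(c * x - s * y, s * x + c * y) for x, y in points]

# A point on the diagonal x == y lands on the vertical axis after rotation.
rotated = to_diamond([(1.0, 1.0), (1.0, 0.0)])
```

After the rotation neither original variable lies along the horizontal axis, which is the property the abstract relies on.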
====================================================
Object Detection from Scratch with Deep Supervision (Zhiqiang Shen - 24 September, 2018)
We evaluate our method on PASCAL VOC 2007, 2012 and COCO datasets
Link: https://arxiv.org/abs/1809.09294
====================================================
Covfefe: A Computer Vision Approach For Estimating Force Exertion (Vaneet Aggarwal - 24 September, 2018)
Based on the data collected from 20 subjects, features extracted from the face videos give 90\% accuracy in classification among the 100\% and the combination of 0\% and 50\% datasets
Link: https://arxiv.org/abs/1809.09293
====================================================
Tunable Measures for Information Leakage and Applications to Privacy-Utility Tradeoffs (Jiachun Liao - 24 September, 2018)
This measure quantifies the maximal gain of an adversary in refining a tilted version of its posterior belief of any (potentially random) function of a data set conditioning on a released data set. For $α\in(1,\infty)$ this measure is shown to be the Arimoto channel capacity. We show that under a hard distortion constraint, both the optimal mechanism and the optimal tradeoff are invariant for any $α>1$, and the tunable leakage measure only behaves as either of the two extrema, i.e., mutual information for $α=1$ and maximal leakage for $α=\infty$.
Link: https://arxiv.org/abs/1809.09231
====================================================
Towards Automated Post-Earthquake Inspections with Deep Learning-based Condition-Aware Models (Vedhus Hoskere - 24 September, 2018)
Researchers typically envisage the use of unmanned aerial vehicles (UAV) for data acquisition and computer vision for data processing to extract actionable information. The proposed methodology was implemented on a damaged building that was surveyed by the authors after the Central Mexico Earthquake in September 2017 and qualitatively evaluated
Link: https://arxiv.org/abs/1809.09195
====================================================
Deep Confidence: A Computationally Efficient Framework for Calculating Reliable Errors for Deep Neural Networks (Isidro Cortes-Ciriano - 24 September, 2018)
Using a set of 24 diverse IC50 data sets from ChEMBL 23, we show that Snapshot Ensembles perform on par with Random Forest (RF) and ensembles of independently trained deep neural networks
Link: https://arxiv.org/abs/1809.09060
====================================================
Lexical Bias In Essay Level Prediction (Georgios Balikas - 21 September, 2018)
In this work I present the system "balikasg" that achieved the state-of-the-art performance in the CAp 2018 data science challenge among 14 systems
Link: https://arxiv.org/abs/1809.08935
====================================================
Robotics Rights and Ethics Rules (Tuncay Yigit - 24 September, 2018)
With Industry 4.0, the internet of things, data analysis and automation have begun to be of great importance in our lives. With the Japanese version of Industry 5.0, it has come to our attention that machine-human interaction and human intelligence are working in harmony with the cognitive computer
Link: https://arxiv.org/abs/1809.08885
====================================================
Classify, predict, detect, anticipate and synthesize: Hierarchical recurrent latent variable models for human activity modeling (Judith Bütepage - 24 September, 2018)
We train our models on data extracted from depth image streams from the Cornell Activity 120, the UTKinect-Action3D and the Stony Brook University Kinect Interaction Dataset
Link: https://arxiv.org/abs/1809.08875
====================================================
Person Identification using Seismic Signals generated from Footfalls (Bodhibrata Mukhopadhyay - 24 September, 2018)
We have tested our biometric system on an indigenous database (created by us) containing 46000 footfall events from 8 individuals and achieved an accuracy of 73%, 90% and 95% in case of 1, 5 and 10 footsteps per sample. DS8BP compresses the original footfall events (sampled at 8 kHz) by a factor of 108 and also acts as a smoothing filter
Link: https://arxiv.org/abs/1809.08783
====================================================
Neural Transductive Learning and Beyond: Morphological Generation in the Minimal-Resource Setting (Katharina Kann - 23 September, 2018)
On a 52-language benchmark dataset, we outperform the previous state of the art by up to 9.71% absolute accuracy.
Link: https://arxiv.org/abs/1809.08733
====================================================
Recognizing Film Entities in Podcasts (Ahmet Salih Gundogdu - 23 September, 2018)
Evaluating on a diverse set of podcasts, we demonstrate more than a 20% increase in F1 score across three baseline approaches when combining fuzzy-matching with a linear model aware of film-specific metadata.
Link: https://arxiv.org/abs/1809.08711
====================================================
Textually Enriched Neural Module Networks for Visual Question Answering (Khyathi Raghavi Chandu - 23 September, 2018)
We achieve 57.1% overall accuracy on the test-dev open-ended questions from the visual question answering (VQA 1.0) real image dataset.
Link: https://arxiv.org/abs/1809.08697
====================================================
Curvilinear Structure Enhancement by Multiscale Top-Hat Tensor in 2D/3D Images (Shuaa S. Alharbi - 23 September, 2018)
The proposed approach is validated on synthetic and real data and is also compared to the state-of-the-art approaches
Link: https://arxiv.org/abs/1809.08678
====================================================
Detecting Hate Speech and Offensive Language on Twitter using Machine Learning: An N-gram and TFIDF based Approach (Aditya Gaydhani - 23 September, 2018)
After tuning the model giving the best results, we achieve 95.6% accuracy upon evaluating it on test data
Link: https://arxiv.org/abs/1809.08651
====================================================
BrainNet: A Multi-Person Brain-to-Brain Interface for Direct Collaboration Between Brains (Linxing Jiang - 23 September, 2018)
Two of the three subjects are "Senders" whose brain signals are decoded using real-time EEG data analysis to extract decisions about whether to rotate a block in a Tetris-like game before it is dropped to fill a line. Five groups of three subjects successfully used BrainNet to perform the Tetris task, with an average accuracy of 0.813
Link: https://arxiv.org/abs/1809.08632
====================================================
Understanding the Gist of Images - Ranking of Concepts for Multimedia Indexing (Lydia Weiland - 23 September, 2018)
Nowadays, where multimedia data is continuously generated, stored, and distributed, multimedia indexing, with its purpose of grouping similar data, becomes more important than ever. Finally, with a MAP of 61.42, it can be shown that the multimedia indexing task benefits from understanding the gist
Link: https://arxiv.org/abs/1809.08593
====================================================
Multi-View Picking: Next-best-view Reaching for Improved Grasping in Clutter (Douglas Morrison - 23 September, 2018)
Where other approaches use a static camera position or fixed data collection routines, our Multi-View Picking (MVP) controller uses an active perception approach to choose informative viewpoints based directly on a distribution of grasp pose estimates in real time, reducing uncertainty in the grasp poses caused by clutter and occlusions. In trials of grasping 20 objects from clutter, our MVP controller achieves 80% grasp success, outperforming a single-viewpoint grasp detector by 12%
Link: https://arxiv.org/abs/1809.08564
====================================================
Adversarial Defense via Data Dependent Activation Function and Total Variation Minimization (Bao Wang - 22 September, 2018)
This data-dependent activation function remarkably improves both classification accuracy and stability to adversarial perturbations. Together with the total variation minimization of adversarial images and augmented training, under the strongest attack, we achieve up to 20.6$\%$, 50.7$\%$, and 68.7$\%$ accuracy improvement w.r.t
Link: https://arxiv.org/abs/1809.08516
====================================================
SqueezeSegV2: Improved Model Structure and Unsupervised Domain Adaptation for Road-Object Segmentation from a LiDAR Point Cloud (Bichen Wu - 22 September, 2018)
However, due to domain shift, models trained on synthetic data often do not generalize well to the real world. We address this problem with a domain-adaptation training pipeline consisting of three major components: 1) learned intensity rendering, 2) geodesic correlation alignment, and 3) progressive domain calibration. When training our new model on synthetic data using the proposed domain adaptation pipeline, we nearly double test accuracy on real-world data, from 29.0% to 57.4%
Link: https://arxiv.org/abs/1809.08495
====================================================
Geometric Multi-Model Fitting by Deep Reinforcement Learning (Zongliang Zhang - 22 September, 2018)
In this paper, we have compared our method against the state-of-the-art on simulated data
Link: https://arxiv.org/abs/1809.08397
====================================================
The Privacy Policy Landscape After the GDPR (Thomas Linden - 22 September, 2018)
Via a user study with 530 participants on Amazon MTurk, we discover that the visual presentation of privacy policies has slightly improved in limited data-sensitive categories in addition to the top European websites. We also find that the readability of privacy policies suffers under the GDPR, due to almost 30% more sentences and words, despite the efforts to reduce the reliance on passive sentences. We find evidence for positive changes triggered by the GDPR, with the ambiguity level, averaged over 8 metrics, improving in over 20.5% of the policies. Finally, we show that privacy policies cover more data practices, particularly around data retention, user access rights, and specific audiences, and that an average of 15.2% of the policies improved across 8 compliance metrics
Link: https://arxiv.org/abs/1809.08396
====================================================
Augmenting Input Method Language Model with user Location Type Information (Di He - 21 September, 2018)
This work queried micro-blog posts from Twitter API and location type of these posts from Google Place API, forming a dataset of around 500k samples. An LSTM based prediction experiment found a 2% edge in the accuracy from language models leveraging location type information when compared to a baseline without that information.
Link: https://arxiv.org/abs/1809.08349
====================================================
Generating GraphQL-Wrappers for REST(-like) APIs (Erik Wittern - 21 September, 2018)
We discuss the challenges for creating such wrappers, including dealing with data sanitation, authentication, or handling nested queries. We evaluate OASGraph by running it, as well as an existing open source alternative, against 959 publicly available OAS. This experiment shows that OASGraph outperforms the existing alternative and is able to create a GraphQL wrapper for 89.5% of the APIs -- however, with limitations in many cases
Link: https://arxiv.org/abs/1809.08319
====================================================
A Graphical Bayesian Game for Secure Sensor Activation in Internet of Battlefield Things (Nof Abuzainab - 21 September, 2018)
The utility of each sensor is expressed in terms of the redundancy of the data transmitted, the secrecy capacity and the energy consumed. The reduction in energy consumption reaches up to 98% compared to the baseline, when the number of sensors is 5000.
Link: https://arxiv.org/abs/1809.08207
====================================================
Exclusive Independent Probability Estimation using Deep 3D Fully Convolutional DenseNets for IsoIntense Infant Brain MRI Segmentation (Seyed Raein Hashemi - 27 September, 2018)
Using our training technique based on similarity loss functions and patch prediction fusion we decrease the number of parameters in the network, reduce the complexity of the training process focusing the attention on less number of tasks, while mitigating the effects of data imbalance between labels and inaccuracies near patch borders. By taking advantage of these strategies we were able to perform fast image segmentation, using a network with less parameters than many state-of-the-art networks, being image size independent overcoming issues such as 3D vs 2D training and large vs small patch size selection, while achieving the top performance in segmenting brain tissue among all methods in the 2017 iSeg challenge
Link: https://arxiv.org/abs/1809.08168
====================================================
Sampler Design for Bayesian Personalized Ranking by Leveraging View Data (Jingtao Ding - 21 September, 2018)
Compared to the vanilla BPR that applies a uniform sampler on all candidates, our view-enhanced sampler enhances BPR with a relative improvement over 37.03% and 16.40% on two real-world datasets
Link: https://arxiv.org/abs/1809.08162
====================================================
Learning Recommender Systems from Multi-Behavior Data (Chen Gao - 21 September, 2018)
Extensive experiments on two real-world datasets demonstrate that NMTR significantly outperforms state-of-the-art recommender systems that are designed to learn from both single-behavior data and multi-behavior data
Link: https://arxiv.org/abs/1809.08161
====================================================
Aspects on Finding the Optimal Practical Programming Exercise for MOOCs (Ralf Teusner - 21 September, 2018)
In this paper, we explore the data of three programming courses to find criteria for optimal practical programming exercises. Based on over 3 million executions and scoring runs of participants' task submissions, we aim to deduce exercise difficulty, student patterns in approaching the tasks and potential flaws in task descriptions and preparatory videos
Link: https://arxiv.org/abs/1809.08056
====================================================
SG-FCN: A Motion and Memory-Based Deep Learning Model for Video Saliency Detection (Meijun Sun - 21 September, 2018)
Extensive experiments in comparison with 11 state-of-the-art methods are carried out, and the results show that our proposed model outperforms all 11 methods across a number of publicly available datasets.
Link: https://arxiv.org/abs/1809.07988
====================================================
SLIDER: Fast and Efficient Computation of Banded Sequence Alignment (Mohammed Alser - 18 September, 2018)
Motivation: The ability to generate massive amounts of sequencing data continues to overwhelm the processing capacity of existing algorithms and compute infrastructures. The addition of SLIDER as a pre-alignment step reduces the execution time of five state-of-the-art sequence aligners by up to 18.8x
Link: https://arxiv.org/abs/1809.07858
====================================================
Internet Protocol Version 6: Dead or Alive? (Sumit Maheshwari - 17 August, 2018)
Internet Protocol (IP) is the narrow waist of multilayered Internet protocol stack which defines the rules for data sent across networks. IPv4 is the fourth version of IP and first commercially available for deployment set by ARPANET in 1983 which is a 32 bit long address and can support up to 2^32 devices. In April 2017, all Regional Internet Registries (RIRs) confirmed that IPv4 addresses are exhausted and cannot be allocated anymore implying any new organization requesting a block of Internet addresses will be allocated IPv6. Currently, when IPv4 is not available, and IPv6 is not adopted for around 20 years, the question arises whether IPv6 will still be accepted by the computer society or will it have an end of life soon with alternate better protocol such as ID based networks taking its place
Link: https://arxiv.org/abs/1809.07836
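The 32-bit address width mentioned in the entry above bounds the IPv4 address space. The short Python sketch below works out that bound and, for comparison, the bound for IPv6's 128-bit addresses. The function name `address_space` is illustrative only and does not come from the paper.

```python
# Worked arithmetic: how many distinct addresses fit in an n-bit address field.
# `address_space` is a hypothetical helper name used only for this illustration.

IPV4_BITS = 32    # IPv4 address width, as stated in the entry above
IPV6_BITS = 128   # IPv6 address width (standard value, not taken from the entry)


def address_space(bits: int) -> int:
    """Return the number of distinct values an address of `bits` bits can take."""
    return 2 ** bits


ipv4_total = address_space(IPV4_BITS)
ipv6_total = address_space(IPV6_BITS)

print(f"IPv4: {ipv4_total:,} addresses")
print(f"IPv6: {ipv6_total:,} addresses")
print(f"IPv6 is 2^{IPV6_BITS - IPV4_BITS} times larger than IPv4")
```

These figures are upper bounds on the raw address count; reserved and special-purpose ranges reduce the number of addresses that can actually be assigned to devices.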
====================================================
Rapid Customization for Event Extraction (Yee Seng Chan - 20 September, 2018)
Additionally, the system uses the ACE corpus to train an argument model for extracting Actor, Place, and Time arguments for any event types, including ones not seen in its training data. Experiments show that with less than 10 minutes of human effort per event type, the system achieves good performance for 67 novel event types
Link: https://arxiv.org/abs/1809.07783
====================================================
Specimens as research objects: reconciliation across distributed repositories to enable metadata propagation (Nicky Nicolson - 20 September, 2018)
Following a data mining exercise applied to an aggregated dataset of 19,827,998 specimen records from 292 separate specimen repositories, 36% or 7,102,710 specimens are assessed to participate in duplication relationships, allowing the propagation of metadata among the participants in these relationships, totalling: 93,044 type citations, 1,121,865 georeferences, 1,097,168 images and 2,191,179 scientific name determinations
Link: https://arxiv.org/abs/1809.07725
====================================================
Design and Implementation of High-throughput PCIe with DMA Architecture between FPGA and PowerPC (Kun Cheng - 17 September, 2018)
A data throughput of more than 666 MBytes/s(memory write with data from FPGA to PowerPC) has been achieved with the single PCIe Gen1 x8 lanes endpoint of this design, PowerPC and FPGA can send memory write request to each other.
Link: https://arxiv.org/abs/1809.07702
====================================================
A Microbenchmark Characterization of the Emu Chick (Jeffrey Young - 7 September, 2018)
Rather than transferring large amounts of data across power-hungry, high-latency interconnects, the Emu Chick moves lightweight thread contexts to near-memory cores before the beginning of each memory read. AsHES 2018) of the memory bandwidth characteristics of the system through benchmarks like STREAM, pointer chasing, and sparse matrix-vector multiplication. Moreover, the Emu Chick provides stable, predictable performance with up to 65% of the peak bandwidth utilization on a random-access pointer chasing benchmark with weak locality.
Link: https://arxiv.org/abs/1809.07696
====================================================
Autonomous Driving System Design for Formula Student Driverless Racecar (Hanqing Tian - 19 September, 2018)
Detection algorithm of the racecar also implements a precise and high rate localization method which combines the GPS-INS data and LIDAR odometry. This paper also briefly introduces the Formula Student Autonomous Competition (FSAC) in 2017.
Link: https://arxiv.org/abs/1809.07636
====================================================
DuPLO: A DUal view Point deep Learning architecture for time series classificatiOn (Roberto Interdonato - 20 September, 2018)
Nowadays, modern Earth Observation systems continuously generate huge amounts of data. A notable example is represented by the Sentinel-2 mission, which provides images at high spatial resolution (up to 10m) with high temporal revisit period (every 5 days), which can be organized in Satellite Image Time Series (SITS)
Link: https://arxiv.org/abs/1809.07589
====================================================
Assessing the quality of home detection from mobile phone data for official statistics (Maarten Vanhoof - 20 September, 2018)
We support our argument by analysing the performance of five home detection algorithms (HDAs) that have been applied to a large, French, Call Detailed Record (CDR) dataset (~18 million users, 5 months). Our results show that criteria choice in HDAs influences the detection of home locations for up to about 40% of users, that HDAs perform poorly when compared with a validation dataset (the 35°-gap), and that their performance is sensitive to the time period and the duration of observation
Link: https://arxiv.org/abs/1809.07567
====================================================
OxIOD: The Dataset for Deep Inertial Odometry (Changhao Chen - 20 September, 2018)
Our dataset contains 158 sequences totalling more than 42 km in total distance, much larger than previous inertial datasets
Link: https://arxiv.org/abs/1809.07491
====================================================
SoaAlloc: Accelerating Single-Method Multiple-Objects Applications on GPUs (Matthias Springer - 19 September, 2018)
SoaAlloc is the first allocator for GPUs that (a) arranges allocations in a SIMD-friendly Structure of Arrays (SOA) data layout, (b) provides a do-all operation for maximizing the benefit of SOA, and (c) is on par with state-of-the-art memory allocators for raw (de)allocation time. Our benchmarks show that the SOA layout leads to significantly better memory bandwidth utilization, resulting in a 2x speedup of application code.
Link: https://arxiv.org/abs/1809.07444
====================================================
Ranking Distillation: Learning Compact Ranking Models With High Performance for Recommender System (Jiaxi Tang - 19 September, 2018)
The experiments on public data sets and state-of-the-art recommendation models showed that RD achieves its design purposes: the student model learnt with RD has a model size less than half of the teacher model while achieving a ranking performance similar to the teacher model and much better than the student model learnt without RD.
Link: https://arxiv.org/abs/1809.07428
====================================================
The Read-Optimized Burrows-Wheeler Transform (Travis Gagie - 19 September, 2018)
The advent of high-throughput sequencing has resulted in massive genomic datasets, some consisting of assembled genomes but others consisting of raw reads. The best current fully-functional index for repetitive collections (Gagie et al., SODA 2018) uses space proportional to this number.
Link: https://arxiv.org/abs/1809.07320
====================================================
MTLE: A Multitask Learning Encoder of Visual Feature Representations for Video and Movie Description (Oliver Nina - 19 September, 2018)
Many of the current state of the art methods for video captioning and movie description rely on simple encoding mechanisms through recurrent neural networks to encode temporal visual information extracted from video data. Our method shows improved performance over current state of the art methods in several metrics on multi-caption and single-caption datasets. Our method demonstrates its robustness on the Large Scale Movie Description Challenge (LSMDC) 2017 where our method won the movie description task and its results were ranked among other competitors as the most helpful for the visually impaired.
Link: https://arxiv.org/abs/1809.07257
====================================================
Unbalanced Three-Phase Distribution Grid Topology Estimation and Bus Phase Identification (Yizheng Liao - 9 October, 2018)
For validation, we extensively simulate on IEEE $37$- and $123$-bus systems using real data from PG\&E, ADRES Project, and Pecan Street
Link: https://arxiv.org/abs/1809.07192
====================================================
NICT's Corpus Filtering Systems for the WMT18 Parallel Corpus Filtering Task (Rui Wang - 19 September, 2018)
Using the clean data of the WMT18 shared news translation task, we designed several features and trained a classifier to score each sentence pairs in the noisy data. Finally, we sampled 100 million and 10 million words and built corresponding NMT systems
Link: https://arxiv.org/abs/1809.07043
====================================================
Generating 3D Adversarial Point Clouds (Chong Xiang - 19 September, 2018)
In addition, we propose 7 perturbation measurement metrics tailored to different attacks and conduct extensive experiments to evaluate the proposed algorithms on the ModelNet40 dataset. Overall, our attack algorithms achieve about 100% attack success rate for all targeted attacks.
Link: https://arxiv.org/abs/1809.07016
====================================================
Wearable-based Mediation State Detection in Individuals with Parkinson's Disease (Murtadha D. Hssayeni - 18 September, 2018)
The developed algorithm is evaluated using a dataset with 19 PD subjects and a total duration of 1,052.24 minutes (17.54 hours). The algorithm resulted in an average classification accuracy of 90.5%, sensitivity of 94.2%, and specificity of 85.4%.
Link: https://arxiv.org/abs/1809.06973
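For readers unfamiliar with the three metrics quoted in the entry above, the Python sketch below shows the standard definitions of accuracy, sensitivity, and specificity for a binary classifier. The confusion-matrix counts are made-up illustration values, not data from the paper, and the function name is hypothetical.

```python
# Standard binary-classification metrics computed from confusion-matrix counts.
# The counts used below are invented for illustration; they are NOT from the paper.

def classification_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Return accuracy, sensitivity and specificity as fractions in [0, 1]."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,   # share of all samples classified correctly
        "sensitivity": tp / (tp + fn),   # true positive rate (recall)
        "specificity": tn / (tn + fp),   # true negative rate
    }


# Hypothetical example: 100 positive and 100 negative samples.
metrics = classification_metrics(tp=90, fn=10, tn=80, fp=20)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Because sensitivity and specificity are computed on different subsets of the data, accuracy always lies between them, weighted by how many positive and negative samples there are.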
====================================================
A Study on Deep Learning Based Sauvegrain Method for Measurement of Puberty Bone Age (Seung Bin Baik - 18 September, 2018)
The selected reference images were learned without being included in the evaluation data, and at the same time, the data was extended to accommodate the number of cases. The mean absolute error of the Sauvegrain method based on deep learning is 2.8 months and the Mean Absolute Percentage Error (MAPE) is 0.018
Link: https://arxiv.org/abs/1809.06965
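The two error measures quoted in the entry above can be sketched as follows. This is a minimal Python illustration of the usual definitions of mean absolute error (MAE) and mean absolute percentage error (MAPE, expressed here as a fraction, which is an assumption about how the 0.018 figure is reported). The sample ages are invented and the function names are hypothetical.

```python
# Usual definitions of MAE and MAPE; sample values are invented for illustration.

def mean_absolute_error(y_true: list, y_pred: list) -> float:
    """Average of |true - predicted|, in the same unit as the inputs."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_percentage_error(y_true: list, y_pred: list) -> float:
    """Average of |true - predicted| / |true|, returned as a fraction (0.05 = 5%)."""
    return sum(abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical bone ages in months (true values must be non-zero for MAPE).
true_ages = [100.0, 200.0]
predicted_ages = [110.0, 190.0]

print("MAE (months):", mean_absolute_error(true_ages, predicted_ages))
print("MAPE (fraction):", mean_absolute_percentage_error(true_ages, predicted_ages))
```

MAE keeps the unit of the measurement (months here), while MAPE is unitless and weights errors relative to the true value.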
====================================================
Multi-Task Learning for Machine Reading Comprehension (Yichong Xu - 18 September, 2018)
Experiments on the Stanford Question Answering Dataset (SQuAD), the Microsoft MAchine Reading COmprehension Dataset (MS MARCO), NewsQA and other datasets show that our multi-task learning approach achieves significant improvement over state-of-the-art models in most MRC tasks.
Link: https://arxiv.org/abs/1809.06963
====================================================
Document Informed Neural Autoregressive Topic Models with Distributional Prior (Pankaj Gupta - 15 September, 2018)
We present novel neural autoregressive topic model variants that consistently outperform state-of-the-art generative topic models in terms of generalization, interpretability (topic coherence) and applicability (retrieval and classification) over 6 long-text and 8 short-text datasets from diverse domains.
Link: https://arxiv.org/abs/1809.06709
====================================================
Capsule Deep Neural Network for Recognition of Historical Graffiti Handwriting (Nikita Gordienko - 11 September, 2018)
CGCL dataset contains >4000 images for glyphs of 34 letters which are hardly recognized by experts even in contrast to notMNIST dataset with the better images of 10 letters taken from different fonts. The area under curve (AUC) values for receiver operating characteristic (ROC) were also higher for the capsule network model than for CNN model: 0.88-0.93 (capsule network) and 0.50 (CNN) without data augmentation, 0.91-0.95 (capsule network) and 0.51 (CNN) with lossless data augmentation, and similar results of 0.91-0.93 (capsule network) and 0.9 (CNN) in the regime of lossless data augmentation only
Link: https://arxiv.org/abs/1809.06693
====================================================
Dynamically Weighted Ensemble-based Prediction System for Adaptively Modeling Driver Reaction Time (Chun-Hsiang Chuang - 18 September, 2018)
This system comprises a set of prediction submodels that are individually trained using groups of data with similar EEG-RT relationships. To obtain a final prediction, the prediction outcomes of the sub-models are then multiplied by weights that are derived from the EEG alpha coherences of 10 channels plus theta band powers of 30 channels, whose changes were found to be indicators of variations in the EEG-RT relationship
Link: https://arxiv.org/abs/1809.06675
====================================================
Attribute Enhanced Face Aging with Wavelet-based Generative Adversarial Networks (Yunfan Liu - 18 September, 2018)
Qualitative results demonstrate the ability of our model to synthesize visually plausible face images, and extensive quantitative evaluation results show that the proposed method achieves state-of-the-art performance on existing databases.
Link: https://arxiv.org/abs/1809.06647
====================================================
Talking to myself: self-dialogues as data for conversational agents (Joachim Fainberg - 19 September, 2018)
This paper presents a novel method for gathering topical, unstructured conversational data in an efficient way: self-dialogues through crowd-sourcing. Alongside this paper, we include a corpus of 3.6 million words across 23 topics
Link: https://arxiv.org/abs/1809.06641
====================================================
Learning Universal Sentence Representations with Mean-Max Attention Autoencoder (Minghua Zhang - 18 September, 2018)
By training our model on a large collection of unlabelled data, we obtain high-quality representations of sentences. Experimental results on a broad range of 10 transfer tasks demonstrate that our model outperforms the state-of-the-art unsupervised single methods, including the classical skip-thoughts and the advanced skip-thoughts+LN model
Link: https://arxiv.org/abs/1809.06590
====================================================
User Information Augmented Semantic Frame Parsing using Coarse-to-Fine Neural Networks (Yilin Shen - 18 September, 2018)
Although state-of-the-art approaches showed good results, they require large annotated training data and long training time. The results show that our approach leverages such simple user information to outperform state-of-the-art approaches by 0.25% for intent detection and 0.31% for slot filling using standard training data. When using smaller training data, the performance improvement on intent detection and slot filling reaches up to 1.35% and 1.20% respectively. We also show that our approach can achieve similar performance as state-of-the-art approaches by using less than 80% annotated training data. Moreover, the training time to achieve the similar performance is also reduced by over 60%.
Link: https://arxiv.org/abs/1809.06559
====================================================
Joint User Association and Resource Allocation Optimization for Ultra Reliable Low Latency HetNets (Mohammad Yousefvand - 18 September, 2018)
In our scheme, CBSs share portions of the available spectrum with SBSs, and they in exchange, provide data service to the users in their coverage area. In our simulations, the spectrum access delay for cellular users is reduced by 93\% and the energy consumption is reduced by 33\%, while maintaining the full service rate.
Link: https://arxiv.org/abs/1809.06550
====================================================
Nanopublications: A Growing Resource of Provenance-Centric Scientific Linked Data (Tobias Kuhn - 18 September, 2018)
More than 10 million such nanopublications have been published, which now form a valuable resource for studies on the domain level of the given Life Science domains as well as on the more technical levels of provenance modeling and heterogeneous Linked Data
Link: https://arxiv.org/abs/1809.06532
====================================================
Active Anomaly Detection via Ensembles (Shubhomoy Das - 17 September, 2018)
Our results show that in addition to discovering significantly more anomalies than state-of-the-art unsupervised baselines, our active learning algorithms under the streaming-data setup are competitive with the batch setup.
Link: https://arxiv.org/abs/1809.06477
====================================================
Bridging the Simulated-to-Real Gap: Benchmarking Super-Resolution on Real Data (Thomas Köhler - 17 September, 2018)
To bridge this simulated-to-real gap, we introduce the Super-Resolution Erlangen (SupER) database, the first comprehensive laboratory SR database of all-real acquisitions with pixel-wise ground truth. It consists of more than 80k images of 14 scenes combining different facets: CMOS sensor noise, real sampling at four resolution levels, nine scene motion types, two photometric conditions, and lossy video coding at five levels. This paper also benchmarks 19 popular single-image and multi-frame algorithms on our data
Link: https://arxiv.org/abs/1809.06420
====================================================
The Rosario Dataset: Multisensor Data for Localization and Mapping in Agricultural Environments (Taihú Pire - 17 September, 2018)
The dataset is motivated by the lack of realistic sensor data gathered by a mobile robot in such environments. It consists of 6 sequences recorded in soybean fields showing real and challenging cases: highly repetitive scenes, reflection and burned images caused by direct sunlight and rough terrain among others
Link: https://arxiv.org/abs/1809.06413
====================================================
Crowdsourcing Lung Nodules Detection and Annotation (Saeed Boorboor - 17 September, 2018)
Using our crowdsourcing workflow, we achieved a lung nodule detection sensitivity of over 90% for 20 patient CT datasets (containing 178 lung nodules with sizes between 1-30mm), and only 47 false positives from a total of 1021 annotations on nodules of all sizes (96% sensitivity for nodules$>$4mm)
Link: https://arxiv.org/abs/1809.06402
====================================================
Effective Predictions of Gaokao Admission Scores for College Applications in Mainland China (Hao Zhang - 12 September, 2018)
Early prediction methods are empirical without the backing of in-depth data studies. We show that our methods significantly outperform the methods commonly used by teachers and experts, and can predict admission scores with an accuracy of 91% within a 7-point margin in an exam of a 750-point grading scale.
Link: https://arxiv.org/abs/1809.06362
====================================================
"FabSearch" : A 3D CAD Model Based Search Engine for Sourcing Manufacturing Services (Atin Angrish - 17 September, 2018)
Second, FabSearch utilizes meta-data about each part, such as material specification, tolerance requirements to help improve the search results based on the specific query model requirements. The algorithm is tested against a repository containing more than 2000 models distributed across various job shop service providers
Link: https://arxiv.org/abs/1809.06329
====================================================
Industrial Smoke Detection and Visualization (Yen-Chia Hsu - 17 September, 2018)
As sensing technology proliferates and becomes affordable to the general public, there is a growing trend in citizen science where scientists and volunteers form a strong partnership in conducting scientific research including problem finding, data collection, analysis, visualization, and storytelling. We have helped the community members build a live camera system which captures and visualizes high resolution timelapse imagery starting from November 2014
Link: https://arxiv.org/abs/1809.06263
====================================================
GANs for Medical Image Analysis (Salome Kazeminia - 13 September, 2018)
Furthermore, their ability to synthesize images at unprecedented levels of realism also gives hope that the chronic scarcity of labeled data in the medical field can be resolved with the help of these generative models. A total of 63 papers published until end of July 2018 are reviewed
Link: https://arxiv.org/abs/1809.06222
====================================================
Context-Dependent Diffusion Network for Visual Relationship Detection (Zhen Cui - 10 September, 2018)
Experiments on two widely-used datasets demonstrate that our proposed method is more effective and achieves the state-of-the-art performance.
Link: https://arxiv.org/abs/1809.06213
====================================================
Study and Observation of the Variation of Accuracies of KNN, SVM, LMNN, ENN Algorithms on Eleven Different Datasets from UCI Machine Learning Repository (Mohammad Mahmudur Rahman Khan - 22 September, 2018)
Machine learning qualifies computers to assimilate with data, without being solely programmed [1, 2]. In supervised learning, computers learn an objective that portrays an input to an output hinged on training input-output pairs [3]
Link: https://arxiv.org/abs/1809.06186
====================================================
Dynamics Estimation Using Recurrent Neural Network (Astha Sharma - 17 September, 2018)
The loss obtained with this test data is 4.5920
Link: https://arxiv.org/abs/1809.06148
====================================================
Feature2Mass: Visual Feature Processing in Latent Space for Realistic Labeled Mass Generation (Jae-Hyeok Lee - 17 September, 2018)
However, in many bioimaging fields, the large-size of labeled dataset is scarcely available. Although a few researches have been dedicated to solving this problem through generative model, there are some problems as follows: 1) The generated bio-image does not seem realistic; 2) the variation of generated bio-image is limited; and 3) additional label annotation task is needed
Link: https://arxiv.org/abs/1809.06147
====================================================
Open Subtitles Paraphrase Corpus for Six Languages (Mathias Creutz - 17 September, 2018)
For each target language, the Opusparcus data have been partitioned into three types of data sets: training, development and test sets. The development and test sets consist of sentence pairs that have been checked manually; each set contains approximately 1000 sentence pairs that have been verified to be acceptable paraphrases by two annotators.
Link: https://arxiv.org/abs/1809.06142
====================================================
Revisit Multinomial Logistic Regression in Deep Learning: Data Dependent Model Initialization for Image Recognition (Bowen Cheng - 17 September, 2018)
Then we adopt this approximate solution to initialize the task-specific linear layer and demonstrate superior performance over random initialization in terms of both accuracy and convergence speed on various tasks and datasets. For example, for image classification, our approach can reduce the training time by 10 times and achieve 3.2% gain in accuracy for Flickr-style classification. For object detection, our approach can also be 10 times faster in training for the same accuracy, or 5% better in terms of mAP for VOC 2007 with slightly longer training.
Link: https://arxiv.org/abs/1809.06131
====================================================
Span error bound for weighted SVM with applications in hyperparameter selection (Ioannis Sarafis - 17 September, 2018)
Experiments on 14 benchmark data sets and data sets with importance scores for the training instances show that: (a) the condition for the existence of span in weighted SVM is satisfied almost always; (b) the span-rule is the most effective method for weighted SVM hyperparameter selection; (c) the span-rule is the best predictor of the test error in the mean square error sense; and (d) the span-rule is efficient and, for certain problems, it can be calculated faster than $K$-fold cross-validation.
Link: https://arxiv.org/abs/1809.06124
====================================================
cf2vec: Collaborative Filtering algorithm selection using graph distributed representations (Tiago Cunha - 17 September, 2018)
Experimental results show that the proposed procedure creates representations that are competitive with state-of-the-art metafeatures, while requiring significantly less data and without virtually any human input.
Link: https://arxiv.org/abs/1809.06120
====================================================
AlSub: Fully Parallel Subdivision for Modeling and Rendering (Daniel Mlakar - 1 October, 2018)
To fully parallelize the subdivision process, we discard traditional linked list data structures in favor of a sparse matrix linear algebra formalism. To substantiate the versatility of our approach we apply it to $\sqrt{3}$, Loop and Catmull-Clark subdivision schemes and show support for adaptive subdivision, semi-sharp creases, and a split evaluation scheme that separates topology and topological changes from positional updates
Link: https://arxiv.org/abs/1809.06047
====================================================
BSE: A Minimal Simulation of a Limit-Order-Book Stock Exchange (Dave Cliff - 17 September, 2018)
Research aimed at understanding the dynamics of this new style of financial market is hampered by the fact that no operational real-world exchange is ever likely to allow experimental probing of that market while it is open and running live, forcing researchers to work primarily from time-series of past trading data. BSE as described here addresses both those needs: it has been successfully used for teaching and research in a leading UK university since 2012, and the BSE program code is freely available as open-source on GitHub.
Link: https://arxiv.org/abs/1809.06027
====================================================
DASNet: Reducing Pixel-level Annotations for Instance and Semantic Segmentation (Chuang Niu - 17 September, 2018)
Our method demonstrates substantially improved performance compared to existing semi-supervised approaches on PASCAL VOC 2012 dataset.
Link: https://arxiv.org/abs/1809.06013
====================================================
A Distributed Learning Architecture for Scientific Imaging Problems (A. Panousopoulou - 27 September, 2018)
We conduct evaluation studies considering relevant datasets, and the results report at least 60\% improvement in time response against the conventional computing solutions
Link: https://arxiv.org/abs/1809.05956
====================================================
Performance Analysis of Molecular Spatial Modulation (MSM) in Diffusion based Molecular MIMO Communication Systems (Tayyebeh Jahani-Nezhad - 16 September, 2018)
In this paper, we introduce molecular spatial modulation (MSM) in molecular MIMO communication to increase the data rate of the system. Also, for a 2$\times$1 system, we define an optimization problem to obtain the suitable number of molecules for transmitting to reduce BER of this systems
Link: https://arxiv.org/abs/1809.05954
====================================================
Memory Efficient Experience Replay for Streaming Learning (Tyler L. Hayes - 16 September, 2018)
Streaming learning will cause conventional deep neural networks (DNNs) to fail for two reasons: 1) they need multiple passes through the entire dataset; and 2) non-iid data will cause catastrophic forgetting
Link: https://arxiv.org/abs/1809.05922
====================================================
An investigation of a deep learning based malware detection system (Mohit Sewak - 16 September, 2018)
In the investigation, we experiment with different combinations of Deep Learning architectures, including Auto-Encoders and Deep Neural Networks with varying layers, over the Malicia malware dataset, on which earlier studies have obtained an accuracy of 98% with an acceptable False Positive Rate of 1.07%. Our proposed approach, besides improving on the previous best results (99.21% accuracy and a False Positive Rate of 0.19%), indicates that Deep Learning based systems could deliver an effective defense against malware
Link: https://arxiv.org/abs/1809.05888
====================================================
Energy Efficient Cloud Control and Pricing in Geographically Distributed Data Centers (Dražen Lučanin - 16 September, 2018)
It is estimated that data centers constitute 1.5% of global electricity usage
Link: https://arxiv.org/abs/1809.05853
====================================================
Performance-Based Pricing in Multi-Core Geo-Distributed Cloud Computing (Dražen Lučanin - 16 September, 2018)
With such new pricing schemes and the increasing energy costs in data centres, balancing energy savings with performance and revenue losses is a challenging problem for cloud providers. We evaluate the proposed approach using simulations with realistic VM workloads, electricity price and temperature traces and estimate energy savings of up to 14.57%.
Link: https://arxiv.org/abs/1809.05842
====================================================
A Generic Multi-modal Dynamic Gesture Recognition System using Machine Learning (Gautham Krishna G - 16 September, 2018)
From an initial set of seven classifiers, three were chosen to evaluate each dataset across all modes rendering the system towards mode-neutrality and dataset-independence. Moreover, this system was found to run on a low-cost embedded platform - Raspberry Pi Zero (USD 5), making it economically viable.
Link: https://arxiv.org/abs/1809.05839
====================================================
Pervasive Cloud Controller for Geotemporal Inputs (Dražen Lučanin - 16 September, 2018)
In this paper, we propose a pervasive cloud controller for dynamic resource reallocation adapting to volatile time- and location-dependent factors, while considering the QoS impact of too frequent migrations and the data quality limits of time series forecasting methods. By optimising for these additional factors, we estimate 28.6% energy cost savings compared to baseline dynamic VM consolidation
Link: https://arxiv.org/abs/1809.05838
====================================================
Real-Time, Highly Accurate Robotic Grasp Detection using Fully Convolutional Neural Networks with High-Resolution Images (Dongwon Park - 16 September, 2018)
Robotic grasp detection for novel objects is a challenging task, but for the last few years, deep learning based approaches have achieved remarkable performance improvements, up to 96.1% accuracy, with RGB-D data. Our methods also achieved state-of-the-art detection accuracy (up to 96.6%) with state-of-the-art real-time computation time for high-resolution images (6-20ms per 360x360 image) on the Cornell dataset. With accurate vision-robot coordinate calibration through our proposed learning-based, fully automatic approach, our proposed method yielded a 90% success rate.
Link: https://arxiv.org/abs/1809.05828
====================================================
Segmenting Unknown 3D Objects from Real Depth Images using Mask R-CNN Trained on Synthetic Point Clouds (Michael Danielczuk - 16 September, 2018)
SD Mask R-CNN outperforms point cloud clustering baselines by an absolute 15% in Average Precision and 20% in Average Recall, and achieves performance levels similar to a Mask R-CNN trained on a massive, hand-labeled RGB dataset and fine-tuned on real images from the experimental setup
Link: https://arxiv.org/abs/1809.05825
====================================================
Development of deep learning algorithms to categorize free-text notes pertaining to diabetes: convolution neural networks achieve higher accuracy than support vector machines (Boyi Yang - 16 September, 2018)
The data used are 2,000 EHR progress notes retrieved from patients with diabetes, and all notes were annotated manually as diabetic or non-diabetic. The convolutional neural network (CNN) model with a separable convolution layer accurately identified diabetes-related notes in the Brigham and Women's Hospital testing set with the highest AUC of 0.975
Link: https://arxiv.org/abs/1809.05814
====================================================
Accident Forecasting in CCTV Traffic Camera Videos (Ankit Shah - 15 September, 2018)
Our Car Accident Detection and Prediction (CADP) dataset consists of 1,416 video segments collected from YouTube, with 205 video segments having full spatio-temporal annotations. For the person (pedestrian) category, we observed significant improvements: +46.45% for CM and 45.22% for ACM, compared to Faster R-CNN. We achieved an average of 1.359 seconds in terms of the Time-To-Accident measure with an Average Precision of 47.36%
Link: https://arxiv.org/abs/1809.05782
====================================================
Using Artificial Intelligence to Support Compliance with the General Data Protection Regulation (John KC Kingston - 15 September, 2018)
The General Data Protection Regulation (GDPR) is a European Union regulation that will replace the existing Data Protection Directive on 25 May 2018
Link: https://arxiv.org/abs/1809.05762
====================================================
Graph Convolutional Networks for Text Classification (Liang Yao - 15 September, 2018)
Our experimental results on multiple benchmark datasets demonstrate that a vanilla Text GCN without any external word embeddings or knowledge outperforms state-of-the-art methods for text classification. In addition, experimental results show that the improvement of Text GCN over state-of-the-art comparison methods becomes more prominent as we lower the percentage of training data, suggesting the robustness of Text GCN to less training data in text classification.
Link: https://arxiv.org/abs/1809.05679
====================================================
Wasserstein Autoencoders for Collaborative Filtering (Jingbin Zhong - 19 September, 2018)
Experiments are evaluated on three widely adopted data sets, i.e., ML-20M, Netflix and LASTFM. The proposed approach outperforms the compared methods with respect to the evaluation criteria Recall@1, Recall@5 and NDCG@10, which demonstrates its efficacy.
Link: https://arxiv.org/abs/1809.05662
====================================================
Detecting and Explaining Drifts in Yearly Grant Applications (Stephen Pauwels - 15 September, 2018)
We test our approach on the BPI Challenge 2018 data, consisting of applications for EU direct payment from farmers in Germany, where we use it to detect Concept Drift
Link: https://arxiv.org/abs/1809.05650
====================================================
Media Accessibility Policy in Theory and Reality: Empirical Outreach to Audio Description Users in the United States (Philipp Jordan - 14 September, 2018)
Yet this paper presents quantitative and qualitative survey data on its challenges and opportunities, through the analysis of responses from 483 participants in a national sample, with 334 of these respondents being blind
Link: https://arxiv.org/abs/1809.05585
====================================================
Socially Aware Kalman Neural Networks for Trajectory Prediction (Ce Ju - 14 September, 2018)
The evaluation of our approach on the NGSIM dataset demonstrates that SAKNN achieves state-of-the-art prediction effectiveness over a relatively long-term horizon and significantly improves the signal-to-noise ratio of the predicted signal.
Link: https://arxiv.org/abs/1809.05408
====================================================
Multi-Kernel Diffusion CNNs for Graph-Based Learning on Point Clouds (Lasse Hansen - 14 September, 2018)
They are predestined to overcome certain limitations of conventional grid-based architectures and will enable efficient handling of point clouds or related graphical data representations, e.g. We validated our approach for learning point descriptors as well as semantic classification on real 3D point clouds of human poses and demonstrate an improvement from 85% to 95% in Dice overlap with our multi-kernel approach.
Link: https://arxiv.org/abs/1809.05370
====================================================
Numeral Understanding in Financial Tweets for Fine-grained Crowd-based Forecasting (Chung-Chi Chen - 14 September, 2018)
This work is the first attempt to understand numerals in financial social media data, and we provide the first comparison of fine-grained opinion of individual investors and analysts based on their forecast price. The numeral corpus used in our experiments, called FinNum 1.0, is available for research purposes.
Link: https://arxiv.org/abs/1809.05356
====================================================
A Domain Agnostic Normalization Layer for Unsupervised Adversarial Domain Adaptation (Rob Romijnders - 14 September, 2018)
In our evaluation, we adapt from the synthetic GTA5 data set to the real Cityscapes data set, a common benchmark experiment, and surpass the state-of-the-art
Link: https://arxiv.org/abs/1809.05298
====================================================
Random Warping Series: A Random Features Method for Time-Series Embedding (Lingfei Wu - 14 September, 2018)
Our extensive experiments on 16 benchmark datasets demonstrate that RWS outperforms or matches state-of-the-art classification and clustering methods in both accuracy and computational time
Link: https://arxiv.org/abs/1809.05259
====================================================
Distributed and Efficient Resource Balancing Among Many Suppliers and Consumers (Kamal Chaturvedi - 14 September, 2018)
Achieving a balance of supply and demand in a multi-agent system with many individual self-interested and rational agents that act as suppliers and consumers is a natural problem in a variety of real-life domains, such as smart power grids, data centers, and others. Each agent has a concave utility function whose derivative tends to 0 when an optimum quantity is supplied/consumed
Link: https://arxiv.org/abs/1809.05245
====================================================
Enhanced Optic Disk and Cup Segmentation with Glaucoma Screening from Fundus Images using Position encoded CNNs (Vismay Agrawal - 13 September, 2018)
On the REFUGE validation data (n=400), the segmentation network achieved a Dice score of 0.88 and 0.64 for optic disc and optic cup respectively. For the task of differentiating images affected with glaucoma from healthy images, the area under the ROC curve was observed to be 0.85.
Link: https://arxiv.org/abs/1809.05216
====================================================
A Time Series Graph Cut Image Segmentation Scheme for Liver Tumors (Laramie Paxton - 13 September, 2018)
First, we create a feature vector for each pixel in a novel way that consists of the 59 intensity values in the time series data and propose a simplified perimeter cost term in the energy functional. It was evaluated against the ground truth on a clinical CT dataset of 10 tumors and yielded segmentations with a mean Dice similarity coefficient (DSC) of 0.77 and mean volume overlap error (VOE) of 36.7%. The average processing time was 1.25 minutes per slice.
Link: https://arxiv.org/abs/1809.05210
====================================================
An Incentive Mechanism for Crowd Sensing with Colluding Agents (Susu Xu - 13 September, 2018)
Experiments based on synthesized data and real-world data reveal gains of over 30% attained by our mechanism compared to past literature.
Link: https://arxiv.org/abs/1809.05161
====================================================
Learning under Misspecified Objective Spaces (Andreea Bobu - 11 October, 2018)
We test our inference method in an experiment with human interaction data, and demonstrate that this alleviates unintended learning in an in-person user study with a 7DoF robot manipulator.
Link: https://arxiv.org/abs/1810.05157
====================================================
A Theory-Based Evaluation of Nearest Neighbor Models Put Into Practice (Hendrik Fichtenberger - 11 October, 2018)
This concept is crucial in many areas of data analysis and data processing, e.g., computer vision, document retrieval and machine learning
Link: https://arxiv.org/abs/1810.05064
====================================================
Deep Learning for Image Denoising: A Survey (Chunwei Tian - 11 October, 2018)
Since the proposal of big data analysis and the Graphics Processing Unit (GPU), deep learning technology has received a great deal of attention and has been widely applied in the field of image processing
Link: https://arxiv.org/abs/1810.05052
====================================================
Towards Cytoskeleton Computers. A proposal (Andrew Adamatzky - 11 October, 2018)
Data are fed into the AF/MT computing networks via electrical and optical means. Data signals are travelling localisations (solitons, conformational defects) at the network terminals
Link: https://arxiv.org/abs/1810.04981
====================================================
Which Generation Shows the Most Prudent Data Sharing Behaviour? (Wolfgang Leister - 11 October, 2018)
We report from a study performed in ten European countries, where we asked about attitudes and behaviour towards data sharing. The use of learning and practising tools seems the right way to increase the privacy and data sharing awareness of citizens.
Link: https://arxiv.org/abs/1810.04964
====================================================
Globally Continuous and Non-Markovian Activity Analysis from Videos (He Wang - 11 October, 2018)
Given video data, we discover recurring activity patterns that appear, peak, wane and disappear over time. Also, our method fits data better and detects anomalies that were difficult to detect previously.
Link: https://arxiv.org/abs/1810.04954
====================================================
Online Visual Robot Tracking and Identification using Deep LSTM Networks (Hafez Farazi - 11 October, 2018)
A deep LSTM network was trained on a simulated dataset and fine-tuned on a small set of real data. Experiments on two challenging datasets, one synthetic and one real, which include long-term occlusions, show promising results.
Link: https://arxiv.org/abs/1810.04941
====================================================
MOANOFS: Multi-Objective Automated Negotiation based Online Feature Selection System for Big Data Classification (Fatma Ben Said - 11 October, 2018)
Considering the huge number of features in real-world applications, FS methods using batch learning techniques cannot resolve the big data problem, especially when data arrive sequentially
Link: https://arxiv.org/abs/1810.04903
====================================================
Dense Object Reconstruction from RGBD Images with Embedded Deep Shape Representations (Lan Hu - 11 October, 2018)
We demonstrate a general ability to improve mapping accuracy with respect to each modality alone, and present a successful application to real data.
Link: https://arxiv.org/abs/1810.04891
====================================================
Monitoring spatial sustainable development: Semi-automated analysis of satellite and aerial images for energy transition and sustainability indicators (R. L. Curier - 11 October, 2018)
Further, this project takes place in a wider framework which investigates how official statistics can benefit from new digital data sources
Link: https://arxiv.org/abs/1810.04881
====================================================
Generating Shared Latent Variables for Robots to Imitate Human Movements and Understand their Physical Limitations (Maxime Devanne - 11 October, 2018)
Our model is able to map visual human body features to robot data in order to facilitate the robot learning and imitation
Link: https://arxiv.org/abs/1810.04879
====================================================
Learning a Set of Interrelated Tasks by Using Sequences of Motor Policies for a Strategic Intrinsically Motivated Learner (Nicolas Duminy - 11 October, 2018)
Our model is able to map visual human body features to robot data in order to facilitate the robot learning and imitation
Link: https://arxiv.org/abs/1810.04877
====================================================
A Data-Efficient Framework for Training and Sim-to-Real Transfer of Navigation Policies (Homanga Bharadhwaj - 11 October, 2018)
Learning effective visuomotor policies for robots purely from data is challenging, but also appealing since a learning-based system should not require manual tuning or calibration. For this reason, it is desirable to be able to leverage simulation and off-policy data to the extent possible to train the robot
Link: https://arxiv.org/abs/1810.04871
====================================================
Sequence-to-Sequence Models for Data-to-Text Natural Language Generation: Word- vs. Character-based Processing and Output Diversity (Glorianna Jagfeld - 11 October, 2018)
On the datasets of two recent generation challenges, our models achieve comparable or better automatic evaluation results than the best challenge submissions. In a controlled experiment with synthetic training data generated from templates, we demonstrate the ability of neural models to learn novel combinations of the templates and thereby generalize beyond the linguistic structures they were trained on.
Link: https://arxiv.org/abs/1810.04864
====================================================
A new clustering algorithm for prolonging the lifetime of wireless sensor networks (Seyedakbar Mostafavi - 10 October, 2018)
The ARO-WSN algorithm, which has been extensively used in the field of image processing, runs in the order of O(n) for a large data set; therefore, it can be applied to WSNs
Link: https://arxiv.org/abs/1810.04831
====================================================
A Blended Deep Learning Approach for Predicting User Intended Actions (Fei Tan - 10 October, 2018)
We evaluate our methodology on two public data repositories and one private user usage dataset provided by Adobe Creative Cloud
Link: https://arxiv.org/abs/1810.04824
====================================================
Persistent 1-Cycles: Definition, Computation, and Its Application (Tamal K. Dey - 10 October, 2018)
Persistence diagrams, which summarize the birth and death of homological features extracted from data, are employed as stable signatures for applications in image analysis and other areas. We design software that applies our algorithm to various datasets
Link: https://arxiv.org/abs/1810.04807
====================================================
Redirect2Own: Protecting the Intellectual Property of User-uploaded Content through Off-site Indirect Access (Georgios Kontaxis - 10 October, 2018)
Our design suggests that user data are kept off the social networking service, in third parties that enable the hosting of user-generated content under terms of service and overall environment (e.g., a different location) that better suit the user's needs and wishes. At the same time, indirection schemata are seamlessly integrated in the social networking service, without any cooperation from the server side necessary, so that users can transparently access the off-site data just as they would if hosted in-site
Link: https://arxiv.org/abs/1810.04779
====================================================
Probabilistic Safety Analysis using Traffic Microscopic Simulation (Carlos Lima Azevedo - 10 October, 2018)
The model was estimated and validated using simulated microscopic data. To obtain consistent simulated data, a two-step simulation calibration procedure was adopted: (1) using real trajectories collected on site for detailed behavior representation; and (2) using aggregate data from each event used in safety model estimation
Link: https://arxiv.org/abs/1810.04776
====================================================
Towards Differentially Private Truth Discovery for Crowd Sensing Systems (Yaliang Li - 10 October, 2018)
The key idea of the proposed mechanism is to perturb data from each user independently and then conduct weighted aggregation among users' perturbed data. We formally quantify utility and privacy trade-off and further verify the claim by experiments on both synthetic data and a real-world crowd sensing system.
Link: https://arxiv.org/abs/1810.04760
====================================================
Technical Report: KNN Joins Using a Hybrid Approach: Exploiting CPU/GPU Workload Characteristics (Michael Gowanlock - 10 October, 2018)
Since the CPU and GPU are considerably different architectures that are best exploited using different algorithms, we advocate for splitting the work between both architectures based on the characteristic workloads defined by the query points in the dataset. Critically, we find that the relative performance gains over the reference implementation across four real-world datasets are a function of the data properties (size, dimensionality, distribution), and number of neighbors, K.
Link: https://arxiv.org/abs/1810.04758
====================================================
Learning Tensor Latent Features (Sung-En Chang - 10 October, 2018)
In this work, we formulate a tensor latent feature learning problem by representing the data as a mixture of high-order latent features and binary codes, which are memory efficient and easy to interpret. When evaluated on both synthetic and real datasets, our experiments show superior performance over baseline methods.
Link: https://arxiv.org/abs/1810.04754
====================================================
End-to-End Content and Plan Selection for Data-to-Text Generation (Sebastian Gehrmann - 10 October, 2018)
Learning to generate fluent natural language from structured data with neural networks has become a common approach for NLG. This problem can be challenging when the form of the structured data varies between examples
Link: https://arxiv.org/abs/1810.04700
====================================================
Time Efficient Data Migration among Clouds (Syeda Munazza Marium - 24 September, 2018)
The desired objective of achieving time efficiency during data migration has been accomplished. Results were obtained for data transmission between the Azure and GearHost clouds using an implementation of the proposed framework, with some size limitations.
Link: https://arxiv.org/abs/1810.04609
====================================================
Blockchain access control Ecosystem for Big Data security (Uchi Ugobame Uchibeke - 10 October, 2018)
In recent years, advances in modern technologies have produced an explosion of huge data sets being captured and recorded in different fields, but have also given rise to concerns about the security and protection of data storage, transmission, processing, and access. In this paper, we have developed a blockchain access control ecosystem that gives asset owners the sovereign right to effectively manage access control of large data sets and protect against data breaches
Link: https://arxiv.org/abs/1810.04607
====================================================
Building an Ontology for the Domain of Plant Science using Protégé (Sara Hosseinzadeh Kassani - 11 October, 2018)
Biological data is also growing considerably in terms of quantity and quality. Despite attempts to build a uniform platform to handle data management in Plant Science, researchers are facing the challenge of not only accessing and integrating data stored in heterogeneous data sources but also representing the implicit and explicit domain knowledge based on the available plant genomic and phenomic data
Link: https://arxiv.org/abs/1810.04606
====================================================
A Similarity Measure for Weaving Patterns in Textiles (Sven Helmer - 10 October, 2018)
We evaluate the different variants of our similarity measure experimentally, showing that it can be implemented efficiently and illustrating its quality by using it to cluster and query a data set containing more than a thousand textile samples.
Link: https://arxiv.org/abs/1810.04604
====================================================
Understanding Data Science Lifecycle Provenance via Graph Segmentation and Summarization (Hui Miao - 10 October, 2018)
Increasingly, modern data science platforms today have non-intrusive and extensible provenance ingestion mechanisms to collect rich provenance and context information, handle modifications to the same file using distinguishable versions, and use graph data models (e.g., property graphs) and query languages (e.g., Cypher) to represent and manipulate the stored provenance/context information. Due to the schema-later nature of the metadata, multiple versions of the same files, and unfamiliar artifacts introduced by team members, the "provenance graph" is verbose and evolving, and hard to understand; using a standard graph query model, it is difficult to compose queries and utilize this valuable information.
Link: https://arxiv.org/abs/1810.04599
====================================================
Broadband Internet and Social Capital (Andrea Geraci - 9 October, 2018)
We study how the diffusion of broadband Internet affects social capital using two data sets from the UK. Merging unique information about the topology of the voice network with geocoded longitudinal data about individual social capital, we show that access to broadband Internet caused a significant decline in forms of offline interaction and civic engagement
Link: https://arxiv.org/abs/1810.04575
====================================================
Building a Reproducible Machine Learning Pipeline (Peter Sugimura - 9 October, 2018)
The framework is comprised of four main components (data, feature, scoring, and evaluation layers), which are themselves comprised of well defined transformations
Link: https://arxiv.org/abs/1810.04570
====================================================
Improvement of K Mean Clustering Algorithm Based on Density (Su Chang - 9 October, 2018)
In the traditional K mean clustering algorithm, the initial clustering centers are generated randomly in the data set
Link: https://arxiv.org/abs/1810.04559
====================================================
A Deep Learning Approach to the Inversion of Borehole Resistivity Measurements (M. Shahriari - 5 October, 2018)
Herein, we build a DNN that approximates the following inverse problem: given a set of borehole resistivity measurements, the DNN is designed to deliver a physically meaningful and data-consistent piecewise one-dimensional layered model of the surrounding subsurface. We illustrate the performance of the DNN on logging-while-drilling measurements acquired on high-angle wells via synthetic data.
Link: https://arxiv.org/abs/1810.04522
====================================================
Deep Reinforcement Learning for Time Scheduling in RF-Powered Backscatter Cognitive Radio Networks (Tran The Anh - 3 October, 2018)
In an RF-powered backscatter cognitive radio network, multiple secondary users communicate with a secondary gateway by backscattering or harvesting energy and actively transmitting their data depending on the primary channel state
Link: https://arxiv.org/abs/1810.04520
====================================================
Multi-class Classification Model Inspired by Quantum Detection Theory (Prayag Tiwari - 10 October, 2018)
Machine Learning, which assists in identifying patterns in raw data, has recently become very popular
Link: https://arxiv.org/abs/1810.04491
====================================================
Domain Confusion with Self Ensembling for Unsupervised Adaptation (Jiawei Wang - 10 October, 2018)
Data collection and annotation are time-consuming in machine learning, especially for large-scale problems
Link: https://arxiv.org/abs/1810.04472
====================================================
Cutting Throughput on the Edge: App-Aware Placement in Fog Computing (Francescomaria Faticanti - 10 October, 2018)
By displacing workloads from the central cloud to the edge devices, fog computing overcomes communication bottlenecks, avoiding raw data transfer to the central cloud and thus paving the way for the next generation of IoT-based applications. This results in a mixed-integer nonlinear problem involving constraints on both application data flows and computation placement
Link: https://arxiv.org/abs/1810.04442
====================================================
Global Search with Bernoulli Alternation Kernel for Task-oriented Grasping Informed by Simulation (Rika Antonova - 10 October, 2018)
We learn task scores from a labeled dataset with a convolutional network, which is used to construct an informed kernel for our variant of Bayesian optimization. Experiments on an ABB Yumi robot with real sensor data demonstrate success of our approach, despite the challenge of fulfilling task requirements and high uncertainty over physical properties of objects.
Link: https://arxiv.org/abs/1810.04438
====================================================
Performance analysis and optimization of the JOREK code for many-core CPUs (T. B. Fehér - 10 October, 2018)
The matrix construction subroutine was vectorized, and the data locality was also improved
Link: https://arxiv.org/abs/1810.04413
====================================================
Fast Approximation of EEG Forward Problem and Application to Tissue Conductivity Estimation (Kostiantyn Maksymenko - 10 October, 2018)
Our method is tested for brain and skull conductivity estimation, with simulated and measured EEG data, corresponding to evoked somato-sensory potentials
Link: https://arxiv.org/abs/1810.04410
====================================================
Semi-supervised clustering for de-duplication (Shrinu Kushagra - 10 October, 2018)
Data de-duplication is the task of detecting multiple records that correspond to the same real-world entity in a database
Link: https://arxiv.org/abs/1810.04361
====================================================
Using ACL2 in the Design of Efficient, Verifiable Data Structures for High-Assurance Systems (David Hardin - 9 October, 2018)
Proof techniques for these data structures exist, but are oriented to unbounded, functional realizations, which are not typically efficient in either space or time. Furthermore, high-assurance design rules frown on dynamic memory allocation, preferring simple array-based data structure implementations.
Link: https://arxiv.org/abs/1810.04312
====================================================
Multi-Institutional Deep Learning Modeling Without Sharing Patient Data: A Feasibility Study on Brain Tumor Segmentation (Micah J Sheller - 9 October, 2018)
In this study, we introduce the first use of federated learning for multi-institutional collaboration, enabling deep learning modeling without sharing patient data. Our quantitative results demonstrate that the performance of federated semantic segmentation models (Dice=0.852) on multimodal brain scans is similar to that of models trained by sharing data (Dice=0.862)
Link: https://arxiv.org/abs/1810.04304
====================================================
Batch Active Preference-Based Learning of Reward Functions (Erdem Bıyık - 9 October, 2018)
Data generation and labeling are usually an expensive part of learning for robotics. In this paper, we will develop a new algorithm, batch active preference-based learning, that enables efficient learning of reward functions using as few data samples as possible while still having short query generation times
Link: https://arxiv.org/abs/1810.04303
====================================================
Deep clustering: On the link between discriminative models and K-means (Mohammed Jabi - 9 October, 2018)
It is generally acknowledged that discriminative objective functions (e.g., those based on the mutual information or the KL divergence) are more flexible than generative approaches (e.g., K-means) in the sense that they make fewer assumptions about the data distributions and, typically, yield much better unsupervised deep learning results. Our theoretical analysis not only connects directly several recent state-of-the-art discriminative models to K-means, but also leads to a new soft and regularized deep K-means algorithm, which yields competitive performance on several image clustering benchmarks.
Link: https://arxiv.org/abs/1810.04246
====================================================
Autonomous Urban Localization and Navigation with Limited Information (Jordan Chipka - 9 October, 2018)
Detailed a priori maps of the environment with sufficient information for autonomous navigation typically require driving the area multiple times to collect large amounts of data, substantial post-processing on that data to obtain the map, and then maintaining updates on the map as the environment changes
Link: https://arxiv.org/abs/1810.04243
====================================================
Rethinking multiscale cardiac electrophysiology with machine learning and predictive modelling (Chris D. Cantwell - 9 October, 2018)
We review some of the latest approaches to analysing cardiac electrophysiology data using machine learning and predictive modelling
Link: https://arxiv.org/abs/1810.04227
====================================================
Seeing Beyond Appearance - Mapping Real Images into Geometrical Domains for Unsupervised CAD-based Recognition (Benjamin Planche - 9 October, 2018)
As this mapping is easier to learn than the opposite one (i.e., learning to generate realistic features to augment the source samples), we demonstrate how our whole solution can be trained purely on augmented synthetic data, and still perform better than methods trained with domain-relevant information (e.g., real images or realistic textures for the 3D models). Applying our approach to object recognition from texture-less CAD data, we present a custom generative network which fully utilizes the purely geometrical information to learn robust features and achieve a more refined mapping for unseen color images.
Link: https://arxiv.org/abs/1810.04158
====================================================
Entropic GANs meet VAEs: A Statistical Approach to Compute Sample Likelihoods in GANs (Yogesh Balaji - 9 October, 2018)
GANs, however, compute a generative model by minimizing a distance between observed and generated probability distributions without considering an explicit model for the observed data. Our numerical results on several datasets demonstrate consistent trends with the proposed theory.
Link: https://arxiv.org/abs/1810.04147
====================================================
Learning One-hidden-layer Neural Networks under General Input Distributions (Weihao Gao - 9 October, 2018)
However, existing approaches to address this issue crucially rely on a restrictive assumption: the training data is drawn from a Gaussian distribution
Link: https://arxiv.org/abs/1810.04133
====================================================
Discovering General-Purpose Active Learning Strategies (Ksenia Konyushkova - 9 October, 2018)
We propose a general-purpose approach to discovering active learning (AL) strategies from data. We evaluate the learned strategies on multiple unrelated domains and show that they consistently outperform state-of-the-art baselines.
Link: https://arxiv.org/abs/1810.04114
====================================================
Enabling Cognitive Smart Cities Using Big Data and Machine Learning: Approaches and Challenges (Mehdi Mohammadi - 9 October, 2018)
We also propose a three-level learning framework for smart cities that matches the hierarchical nature of big data generated by smart cities, with the goal of providing different levels of knowledge abstraction. Fundamentally, the framework benefits from semi-supervised deep reinforcement learning, where a small amount of data with users' feedback serves as labeled data while a larger amount without such feedback serves as unlabeled data
Link: https://arxiv.org/abs/1810.04107
====================================================
A Family of Maximum Margin Criterion for Adaptive Learning (Miao Cheng - 9 October, 2018)
In the literature, high dimensionality of data samples is a familiar issue, and both high dimensionality and large data volumes have become commonplace in real-world applications. Experimental results on a diversity of data sets demonstrate that the discriminant ability of the proposed MMC methods is competent for adoption in complicated application scenarios.
Link: https://arxiv.org/abs/1810.04064
====================================================
On Learning and Learned Representation with Dynamic Routing in Capsule Networks (Ancheng Lin - 7 October, 2018)
In this work, we investigate i) how the routing affects the CapsNet model fitting, ii) how the representation by capsules helps discover global structures in data distribution and iii) how learned data representation adapts and generalizes to new tasks. Our investigation shows: i) routing operation determines the certainty with which one layer of capsules pass information to the layer above, and the appropriate level of certainty is related to the model fitness, ii) in a designed experiment using data with a known 2D structure, capsule representations allow more meaningful 2D manifold embedding than neurons in a standard CNN do and iii) compared to neurons of standard CNN, capsules of successive layers are less coupled and more adaptive to new data distribution.
Link: https://arxiv.org/abs/1810.04041
====================================================
Person-Job Fit: Adapting the Right Talent for the Right Job with Joint Representation Learning (Chen Zhu - 8 October, 2018)
Finally, the extensive experiments on a large-scale real-world dataset clearly validate the performance of PJFNN in terms of Person-Job Fit prediction. Also, we provide effective data visualization to show some job and talent benchmark insights obtained by PJFNN.
Link: https://arxiv.org/abs/1810.04040
====================================================
Conversational Group Detection With Deep Convolutional Networks (Mason Swofford - 7 October, 2018)
We present accuracies which demonstrate the ability to rival and sometimes outperform the best models, but due to a data imbalance issue we do not yet outperform existing models in our test results.
Link: https://arxiv.org/abs/1810.04039
====================================================
Coloured and task-based stencil codes (Benjamin Hazelwood - 9 October, 2018)
New OpenMP versions alternatively allow users to specify data dependencies explicitly and to outsource the decision how to distribute the work to the runtime system
Link: https://arxiv.org/abs/1810.04033
====================================================
Selective Distillation of Weakly Annotated GTD for Vision-based Slab Identification System (Sang Jun Lee - 9 October, 2018)
In the development of a deep-learning-based system, manual labeling for preparing ground truth data (GTD) is an important but expensive task. Experiments were thoroughly conducted on actual industry data collected at a steelworks to demonstrate the effectiveness of the proposed method.
Link: https://arxiv.org/abs/1810.04029
====================================================
Learning Converged Propagations with Deep Prior Ensemble for Image Enhancement (Risheng Liu - 9 October, 2018)
Therefore, DPE actually provides a generic ensemble methodology to integrate both knowledge and data-based cues for different image enhancement tasks. Experimental results demonstrate that the proposed DPE outperforms state-of-the-arts on a variety of image enhancement tasks in terms of both quantitative measure and visual perception quality.
Link: https://arxiv.org/abs/1810.04012
====================================================
Computationally Efficient Cascaded Training for Deep Unrolled Network in CT Imaging (Dufan Wu - 5 October, 2018)
Local image patches could be utilized for the neural network training, which made it fully scalable to 3D CT data. The proposed method was validated with both simulated and real data and demonstrated competing performance against the end-to-end networks.
Link: https://arxiv.org/abs/1810.03999
====================================================
Learning Noun Cases Using Sequential Neural Networks (Sina Ahmadi - 9 October, 2018)
Given the challenge of data sparsity in processing morphologically rich languages and also, the flexibility of sentence structures in such languages, we believe that modeling morphological dependencies can improve the performance of neural network models
Link: https://arxiv.org/abs/1810.03996
====================================================
Local Frequency Interpretation and Non-Local Self-Similarity on Graph for Point Cloud Inpainting (Zeqing Fu - 28 September, 2018)
However, point clouds usually exhibit holes of missing data, mainly due to the limitation of acquisition techniques and complicated structure
Link: https://arxiv.org/abs/1810.03973
====================================================
Adaptive Image Stream Classification via Convolutional Neural Network with Intrinsic Similarity Metrics (Yang Gao - 27 September, 2018)
Unfortunately, this assumption may not have large support when dealing with high dimensional data such as images. We empirically measure the performance of CSIM over multiple real-world image datasets and demonstrate its superiority by comparing its performance with existing semi-supervised methods.
Link: https://arxiv.org/abs/1810.03966
====================================================
Fixing Variational Bayes: Deterministic Variational Inference for Bayesian Neural Networks (Anqi Wu - 9 October, 2018)
Bayesian neural networks (BNNs) hold great promise as a flexible and principled solution to deal with uncertainty when learning from finite data
Link: https://arxiv.org/abs/1810.03958
====================================================
Cognitive Architecture for a Connected World (Shaun C. D'Souza - 26 September, 2018)
It takes computing out of the data center and into end user platform
Link: https://arxiv.org/abs/1810.03955
====================================================
textTOvec: Deep Contextualized Neural Autoregressive Models of Language with Distributed Compositional Prior (Pankaj Gupta - 10 October, 2018)
(2) Limited Context and/or Smaller training corpus of documents: In settings with a small number of word occurrences (i.e., lack of context) in short text or data sparsity in a corpus of few documents, the application of TMs is challenging
Link: https://arxiv.org/abs/1810.03947
====================================================
Convolutional Neural Networks In Convolution (Xiaobo Huang - 9 October, 2018)
In contrast, we propose a novel wider Convolutional Neural Network (CNN) architecture, motivated by the Multi-column Deep Neural Networks and the Network In Network (NIN), aiming for higher accuracy without input data transmutation
Link: https://arxiv.org/abs/1810.03946
====================================================
Studies on the energy and deep memory behaviour of a cache-oblivious, task-based hyperbolic PDE solver (Dominic E. Charrier - 9 October, 2018)
We thus propose that upcoming supercomputing simulation codes with dynamic, inhomogeneous task graphs require algorithms and task schedulers which actively prefetch data, intermix tasks of different character, and apply frequency control to or switch off the memory if appropriate.
Link: https://arxiv.org/abs/1810.03940
====================================================
A Survey on Recent Advances in Transport Layer Protocols (Michele Polese - 9 October, 2018)
Over the years, the Internet has been enriched with new available communication technologies, for both fixed and mobile networks and devices, exhibiting an impressive growth in terms of performance, with steadily increasing available data rates
Link: https://arxiv.org/abs/1810.03884
====================================================
Continual State Representation Learning for Reinforcement Learning using Generative Replay (Hugo Caselles-Dupré - 9 October, 2018)
The resulting model is capable of incrementally learning information without using past data and with a bounded system size.
Link: https://arxiv.org/abs/1810.03880
====================================================
Functionally Modular and Interpretable Temporal Filtering for Robust Segmentation (Jörg Wagner - 9 October, 2018)
These failures are often introduced by data-inherent perturbations, which significantly reduce the information provided to the perception system. Using photorealistic, synthetic video data, we show the ability of the proposed architecture to overcome data-inherent perturbations
Link: https://arxiv.org/abs/1810.03867
====================================================
Realizing Learned Quadruped Locomotion Behaviors through Kinematic Motion Primitives (Abhik Singla - 9 October, 2018)
D-RL is a data driven approach, which has been shown to be very effective for realizing all kinds of robust locomotion behaviors, both in simulation and in experiment
Link: https://arxiv.org/abs/1810.03842
====================================================
How FAIR can you get? Image Retrieval as a Use Case to calculate FAIR Metrics (Tobias Weber - 9 October, 2018)
Suggestions on how to increase the score include automatic annotation based on the metadata inside the image file and support for content negotiation to retrieve the images. These and other insights can lead to an improvement of data integration workflows, resulting in a better and more FAIR approach to manage research data.
Link: https://arxiv.org/abs/1810.03824
====================================================
A software-defined architecture for control of IoT Cyberphysical Systems (Ala' Darabseh - 9 October, 2018)
Our design especially capitalizes on the computational units possessed by smart agents, which may be utilized for decentralized control and in-network data processing. We characterize the data flow, communication flow, and control flow that assimilate a set of components such as sensors, actuators, controllers, and coordinators in a systemic programmable fashion
Link: https://arxiv.org/abs/1810.03822
====================================================
The Adversarial Attack and Detection under the Fisher Information Metric (Chenxiao Zhao - 9 October, 2018)
By considering the data space as a non-linear space with the Fisher information metric induced from a neural network, we first propose an adversarial attack algorithm termed one-step spectral attack (OSSA). Both our attack and detection algorithms are numerically optimized to work efficiently on large datasets
Link: https://arxiv.org/abs/1810.03806
====================================================
What made you do this? Understanding black-box decisions with sufficient input subsets (Brandon Carter - 9 October, 2018)
General principles that globally govern a model's decision-making can also be revealed by searching for clusters of such input patterns across many data points. We demonstrate the utility of our interpretation method on various neural network models trained on text, image, and genomic data.
Link: https://arxiv.org/abs/1810.03805
====================================================
Average Margin Regularization for Classifiers (Matt Olfat - 8 October, 2018)
We conclude by using both synthetic and real data to empirically show that AM regularization can strictly improve both accuracy and robustness for support vector machines (SVMs) and deep neural networks, relative to unregularized classifiers and adversarially trained classifiers.
Link: https://arxiv.org/abs/1810.03773
====================================================
SPIGAN: Privileged Adversarial Learning from Simulation (Kuan-Hui Lee - 8 October, 2018)
We train the networks on real-world Cityscapes and Vistas datasets, using only unlabeled real-world images and synthetic labeled data with z-buffer (depth) PI from the SYNTHIA dataset. Our method improves over no adaptation and state-of-the-art unsupervised domain adaptation techniques.
Link: https://arxiv.org/abs/1810.03756
====================================================
Efficient Two-Step Adversarial Defense for Deep Neural Networks (Ting-Jui Chang - 8 October, 2018)
Adversarial training, which augments the training data with adversarial examples during the training process, is a well known defense to improve the robustness of the model against adversarial attacks
Link: https://arxiv.org/abs/1810.03739
====================================================
Deep Tractable Probabilistic Models for Moral Responsibility (Lewis Hammond - 8 October, 2018)
From the viewpoint of automated systems, the urgent questions are: (a) How can models of moral scenarios and blameworthiness be extracted and learnt automatically from data? (b) How can judgements be computed tractably, given the split-second decision points faced by the system? By building on deep tractable probabilistic learning, we propose a learning regime for inducing models of such scenarios automatically from data and reasoning tractably from them
Link: https://arxiv.org/abs/1810.03736
====================================================
Find the dimension that counts: Fast dimension estimation and Krylov PCA (Shashanka Ubaru - 8 October, 2018)
The proposed method avoids forming the sample covariance matrix (associated with the data) explicitly and computing the complete eigen-decomposition. Therefore, the method is inexpensive, which is particularly advantageous in modern data applications where the covariance matrices can be very large
Link: https://arxiv.org/abs/1810.03733
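
To illustrate what "avoids forming the sample covariance matrix explicitly" can look like in practice, here is a minimal sketch. It uses plain power iteration, which is a swapped-in, simpler technique and not the paper's Krylov-based method; all function names and the toy data are hypothetical. The key idea shown is that the product C v can be computed directly from the centered data rows, so the d-by-d matrix C is never stored:

```python
import math
import random


def cov_matvec(X, mean, v):
    """Return C @ v for the sample covariance C of X, without building C."""
    n, d = len(X), len(v)
    out = [0.0] * d
    for row in X:
        centered = [row[j] - mean[j] for j in range(d)]
        # Project the centered row onto v, then accumulate its contribution.
        s = sum(c * vj for c, vj in zip(centered, v))
        for j in range(d):
            out[j] += s * centered[j]
    return [o / (n - 1) for o in out]


def top_eigenpair(X, iters=200, seed=0):
    """Estimate the largest covariance eigenvalue and its eigenvector."""
    n, d = len(X), len(X[0])
    mean = [sum(row[j] for row in X) / n for j in range(d)]
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    eigenvalue = 0.0
    for _ in range(iters):
        w = cov_matvec(X, mean, v)
        eigenvalue = math.sqrt(sum(x * x for x in w))
        if eigenvalue == 0.0:
            break  # data has no variance along any direction reachable from v
        v = [x / eigenvalue for x in w]
    return eigenvalue, v


# Toy data (hypothetical): variance 8/3 along the first axis, 2/3 along the second.
data = [[2.0, 0.0], [-2.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
value, vector = top_eigenpair(data)
print(value, vector)
```

Each iteration costs O(n d) time and O(d) extra memory, compared with O(d^2) memory for an explicit covariance matrix, which is the saving the excerpt alludes to for large d.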
====================================================
Efficient Non-parametric Bayesian Hawkes Processes (Rui Zhang - 8 October, 2018)
On synthetic data, we show our method to be flexible and scalable, and on two large-scale Twitter diffusion datasets, we show our method to outperform the parametric Hawkes model
Link: https://arxiv.org/abs/1810.03730
====================================================
Saliency Prediction in the Deep Learning Era: An Empirical Investigation (Ali Borji - 8 October, 2018)
In this work, I explore the landscape of the field, emphasizing new deep saliency models, benchmarks, and datasets. A large number of image and video saliency models are reviewed and compared over two image benchmarks and two large-scale video datasets
Link: https://arxiv.org/abs/1810.03716
====================================================
A Hybrid Approach for Trajectory Control Design (Luigi Freda - 8 October, 2018)
The proposed approach has the benefit of avoiding complex terramechanics analysis to directly estimate from data the robot dynamics on a wide class of trajectories
Link: https://arxiv.org/abs/1810.03711
====================================================
The Wireless Control Plane: An Overview and Directions for Future Research (EmadelDin A. Mazied - 8 October, 2018)
Software-defined networking (SDN), which has been successfully deployed in the management of complex data centers, has recently been incorporated into a myriad of 5G networks to intelligently manage a wide range of heterogeneous wireless devices, software systems, and wireless access technologies. Thus, the SDN control plane needs to communicate wirelessly with the wireless data plane either directly or indirectly
Link: https://arxiv.org/abs/1810.03670
====================================================
Understanding the Origins of Bias in Word Embeddings (Marc-Etienne Brunet - 8 October, 2018)
Although methods have been developed to measure these biases and alter word embeddings to mitigate their biased representations, there is a lack of understanding in how word embedding bias depends on the training data
Link: https://arxiv.org/abs/1810.03611
====================================================
SFV: Reinforcement Learning of Physical Skills from Videos (Xue Bin Peng - 8 October, 2018)
Motion capture remains the most popular source of motion data, but collecting mocap data typically requires heavily instrumented environments and actors. Our approach, based on deep pose estimation and deep reinforcement learning, allows data-driven animation to leverage the abundance of publicly available video clips from the web, such as those from YouTube
Link: https://arxiv.org/abs/1810.03599
====================================================
Towards Robot-Centric Conceptual Knowledge Acquisition (Georg Jäger - 8 October, 2018)
To overcome this discrepancy, we propose an approach to enable robots to generate robot-centric symbolic knowledge about objects from their own sensory data, thus, allowing them to assemble their own conceptual understanding of objects
Link: https://arxiv.org/abs/1810.03583
====================================================
The Long Road to Computational Location Privacy: A Survey (Primault Vincent - 8 October, 2018)
Indeed, mobility data can reveal sensitive information about users, among which one's home, work place or even religious and political preferences
Link: https://arxiv.org/abs/1810.03568
====================================================
Zero-Resource Multilingual Model Transfer: Learning What to Share (Xilun Chen - 8 October, 2018)
In this work, we propose the first zero-resource multilingual transfer learning model that can utilize training data in multiple source languages, while not requiring target language training data nor cross-lingual supervision. It results in significant performance gains over prior art, as shown in an extensive set of experiments over multiple text classification and sequence tagging tasks including a large-scale real-world industry dataset.
Link: https://arxiv.org/abs/1810.03552
====================================================
Meta-Learning: A Survey (Joaquin Vanschoren - 8 October, 2018)
Not only does this dramatically speed up and improve the design of machine learning pipelines or neural architectures, it also allows us to replace hand-engineered algorithms with novel approaches learned in a data-driven way. In this chapter, we provide an overview of the state of the art in this fascinating and continuously evolving field.
Link: https://arxiv.org/abs/1810.03548
====================================================
Effective Parallelisation for Machine Learning (Michael Kamp - 8 October, 2018)
We present a novel parallelisation scheme that simplifies the adaptation of learning algorithms to growing amounts of data as well as growing needs for accurate and confident predictions in critical applications
Link: https://arxiv.org/abs/1810.03530
====================================================
Trace Quotient with Sparsity Priors for Learning Low Dimensional Image Representations (Xian Wei - 8 October, 2018)
The former is a well-known powerful tool to identify underlying self-explanatory factors of data, while the latter is known for disentangling underlying low dimensional discriminative factors in data. Performance of the proposed SparLow algorithmic framework is investigated on several image processing tasks, such as 3D data visualization, face/digit recognition, and object/scene categorization.
Link: https://arxiv.org/abs/1810.03523
====================================================
Handover between Macrocell and Femtocell for UMTS based Networks (Mostafa Zaman Chowdhury - 4 October, 2018)
The femtocell networks that use home base station and existing xDSL or other cable line as backhaul connectivity can fulfill the upcoming demand of high data rate for wireless communication system as well as can extend the coverage area
Link: https://arxiv.org/abs/1810.03469
====================================================
Interface Selection for Power Management in UMTS/WLAN Overlaying Network (Mostafa Zaman Chowdhury - 4 October, 2018)
The access of both interfaces simultaneously can reduce the handover latency and data loss in heterogeneous handover
Link: https://arxiv.org/abs/1810.03468
====================================================
Wide and Deep Learning for Peer-to-Peer Lending (Kaveh Bastani - 8 October, 2018)
Extensive numerical studies are conducted based on real-world data to verify the effectiveness of the proposed approach
Link: https://arxiv.org/abs/1810.03466
====================================================
A Query Tool for Efficiently Investigating Risky Software Behaviors (Peng Gao - 4 October, 2018)
In particular, AIQL provides: (1) domain-specific data model and storage for storing the massive system monitoring data, (2) a domain-specific query language, Attack Investigation Query Language, which integrates critical primitives for risky behavior specification, and (3) an optimized query engine based on the characteristics of the data and the query to efficiently schedule the execution
Link: https://arxiv.org/abs/1810.03464
====================================================
Robust 6D Object Pose Estimation in Cluttered Scenes using Semantic Segmentation and Pose Regression Networks (Arul Selvam Periyasamy - 8 October, 2018)
We present a pipeline that requires minimal human intervention and circumvents the reliance on the availability of 3D models by a fast data acquisition method and a synthetic data generation procedure. The proposed method is evaluated on a synthetic validation dataset and cluttered real-world scenes.
Link: https://arxiv.org/abs/1810.03410
====================================================
On Breiman's Dilemma in Neural Networks: Phase Transitions of Margin Dynamics (Weizhi Zhu - 8 October, 2018)
When data complexity is comparable to the model expressiveness in the sense that both training and test data share similar phase transitions in normalized margin dynamics, two efficient ways are derived to predict the trend of generalization or test error via classic margin-based generalization bounds with restricted Rademacher complexities. Experiments are conducted to show the validity of the proposed method with some basic convolutional networks, AlexNet, VGG-16, and ResNet-18, on several datasets including Cifar10/100 and mini-ImageNet.
Link: https://arxiv.org/abs/1810.03389
====================================================
IriTrack: Liveness Detection Using Irises Tracking for Preventing Face Spoofing Attacks (Meng Shen - 8 October, 2018)
IriTrack allows checking liveness by using data collected during user-device interactions
Link: https://arxiv.org/abs/1810.03323
====================================================
Sanity Checks for Saliency Maps (Julius Adebayo - 8 October, 2018)
Consequently, methods that fail the proposed tests are inadequate for tasks that are sensitive to either data or model, such as, finding outliers in the data, explaining the relationship between inputs and outputs that the model learned, and debugging the model. We interpret our findings through an analogy with edge detection in images, a technique that requires neither training data nor model
Link: https://arxiv.org/abs/1810.03292
====================================================
Internet Congestion Control via Deep Reinforcement Learning (Nathan Jay - 7 October, 2018)
Congestion control is the core networking task of modulating traffic sources' data-transmission rates so as to efficiently and fairly allocate network resources
Link: https://arxiv.org/abs/1810.03259
====================================================
Fully Homomorphic Image Processing (William Fu - 7 October, 2018)
Fully homomorphic encryption has allowed devices to outsource computation to third parties while preserving the secrecy of the data being computed on. Then, we introduce our schemes for JPEG encoding and decoding, as well as schemes for bilinear and bicubic image resizing, as well as some data and analysis of our homomorphic schemes
Link: https://arxiv.org/abs/1810.03249
====================================================
Diagnosing Convolutional Neural Networks using their Spectral Response (Victor Stamatescu - 7 October, 2018)
We argue that the gain of CNNs can act as a diagnostic tool and potential replacement for the validation loss when hold-out validation data are not available.
Link: https://arxiv.org/abs/1810.03241
====================================================
Rethinking Recurrent Latent Variable Model for Music Composition (Eunjeong Stella Koh - 7 October, 2018)
To generate sequential data, the model uses an encoder-decoder architecture with latent probabilistic connections to capture the hidden structure of music. Our results suggest that the proposed model has a better statistical resemblance to the musical structure of the training data, which improves the creation of new sequences of music in the style of the originals.
Link: https://arxiv.org/abs/1810.03226
====================================================
Principled Deep Neural Network Training through Linear Programming (Daniel Bienstock - 7 October, 2018)
In this work we show that large classes of deep neural networks with various architectures (e.g., DNNs, CNNs, Binary Neural Networks, and ResNets), activation functions (e.g., ReLUs and leaky ReLUs), and loss functions (e.g., Hinge loss, Euclidean loss, etc) can be trained to near optimality with desired target accuracy using linear programming in time that is exponential in the size of the architecture and polynomial in the size of the data set; this is the best one can hope for due to the NP-Hardness of the problem and in line with previous work
Link: https://arxiv.org/abs/1810.03218
====================================================
Recycled ADMM: Improve Privacy and Accuracy with Less Computation in Distributed Algorithms (Xueru Zhang - 7 October, 2018)
In distributed settings, each node performs computation with its local data and the local results are exchanged among neighboring nodes in an iterative fashion. During this iterative process the leakage of data privacy arises and can accumulate significantly over many iterations, making it difficult to balance the privacy-utility tradeoff
Link: https://arxiv.org/abs/1810.03197
====================================================
Efficient Crowd Exploration of Large Networks: The Case of Causal Attribution (Daniel Berenberg - 7 October, 2018)
Worker interactions reveal important characteristics of causal perception and the network data they generate can improve our understanding of causality and causal inference.
Link: https://arxiv.org/abs/1810.03163
====================================================
Real-Time Workload Classification during Driving using HyperNetworks (Ruohan Wang - 7 October, 2018)
The problem is challenging due to the data variability among individual users, and sensor artefacts. Evaluating the proposed approach on an eye-gaze pattern dataset collected from simulated driving scenarios of different cognitive demands, we show that the proposed framework outperforms previous baseline methods and achieves 83.9% precision and 87.8% recall during test
Link: https://arxiv.org/abs/1810.03145
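
Precision and recall are often summarized by their harmonic mean, the F1 score. The excerpt above does not report F1, so the value below is only a derived figure from the two quoted numbers, shown as a worked arithmetic sketch; the function name is illustrative:

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both given as fractions in [0, 1])."""
    if precision + recall == 0:
        return 0.0  # assumed convention when both inputs are zero
    return 2.0 * precision * recall / (precision + recall)


# Precision and recall as quoted in the excerpt (83.9% and 87.8%).
f1 = f1_score(0.839, 0.878)
print(round(f1, 3))  # 0.858
```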
====================================================
European Court of Human Right Open Data project (Alexandre Quemy - 7 October, 2018)
Contrary to many datasets, the creation process, from the collection of raw data to the feature transformation, is provided in the form of a collection of fully automated and open-source scripts. It ensures reproducibility and a high level of confidence in the processed data, which are among the most important issues in data governance nowadays.
Link: https://arxiv.org/abs/1810.03115
====================================================
A Fast Text Similarity Measure for Large Document Collections using Multi-reference Cosine and Genetic Algorithm (Hamid Mohammadi - 7 October, 2018)
The proposed method is examined on popular text document data-sets such as CiteseerX, Enron, Gold Set of Near-duplicate News Articles, etc.
Link: https://arxiv.org/abs/1810.03102
====================================================
A Survey of Neighbourhood Construction Models for Categorizing Data Points (Shahin Pourbahrami - 7 October, 2018)
Finding data point neighbourhood in data mining and pattern recognition should generally improve knowledge extraction from databases. Several algorithms of data point neighbourhood construction have been proposed to analyse the data in this sense
Link: https://arxiv.org/abs/1810.03083
====================================================
Graph Classification with Geometric Scattering (Feng Gao - 6 October, 2018)
Furthermore, ConvNets inspired recent advances in geometric deep learning, which aim to generalize these networks to graph data by applying notions from graph signal processing to learn deep graph filter cascades. We demonstrate the utility of features extracted with this designed deep filter bank in graph classification, and show its competitive performance relative to other methods, including graph kernel methods and geometric deep learning ones, on both social and biochemistry data.
Link: https://arxiv.org/abs/1810.03068
====================================================
Geocoding Without Geotags: A Text-based Approach for reddit (Keith Harrigian - 6 October, 2018)
After evaluating the accuracy of our labeling procedure, we train and test several geolocation inference models across our reddit data set and three benchmark Twitter geolocation data sets. Ultimately, we show that geolocation models trained and applied on the same domain substantially outperform models attempting to transfer training data across domains, even more so on reddit where platform-specific interest-group metadata can be used to improve inferences.
Link: https://arxiv.org/abs/1810.03067
====================================================
Deep Model-Based 6D Pose Refinement in RGB (Fabian Manhardt - 6 October, 2018)
The approach can run in real-time and produces pose accuracies that come close to 3D ICP without the need for depth data. Furthermore, our networks are trained from purely synthetic data and will be published together with the refinement code to ensure reproducibility.
Link: https://arxiv.org/abs/1810.03065
====================================================
Eiffel: Efficient and Flexible Software Packet Scheduling (Ahmed Saeed - 6 October, 2018)
To support flexibility, Eiffel introduces novel programming abstractions to express scheduling policies that cannot be captured by current, state-of-the-art scheduler programming models. We show that it outperforms state of the art systems by 3-40x in terms of either number of cores utilized for network processing or number of flows given fixed processing capacity.
Link: https://arxiv.org/abs/1810.03060
====================================================
Supporting High-Performance and High-Throughput Computing for Experimental Science (E. A. Huerta - 6 October, 2018)
The advent of experimental science facilities, instruments and observatories, such as the Large Hadron Collider (LHC), the Laser Interferometer Gravitational Wave Observatory (LIGO), and the upcoming Large Synoptic Survey Telescope (LSST), has brought about challenging, large-scale computational and data processing requirements
Link: https://arxiv.org/abs/1810.03056
====================================================
Subspace Tracking from Missing and Outlier Corrupted Data (Praneeth Narayanamurthy - 6 October, 2018)
In recent work, we have studied the RST problem without missing data. This means we are able to show that, under assumptions on only the algorithm inputs (input data and/or initialization), the output subspace estimates are close to the true data subspaces at all times
Link: https://arxiv.org/abs/1810.03051
====================================================
Over-parameterization Improves Generalization in the XOR Detection Problem (Alon Brutzkus - 6 October, 2018)
Specifically, we prove data-dependent sample complexity bounds which show that over-parameterization improves the generalization performance of gradient descent.
Link: https://arxiv.org/abs/1810.03037
====================================================
Text-based Sentiment Analysis and Music Emotion Recognition (Erion Çano - 6 October, 2018)
This thesis addresses the above problems to provide methodological and practical insights for utilizing neural networks on sentiment analysis of texts and achieving state of the art results. Regarding the first problem, the effectiveness of various crowdsourcing alternatives is explored and two medium-sized and emotion-labeled song data sets are created utilizing social tags
Link: https://arxiv.org/abs/1810.03031
====================================================
The Intuitive Power of Graph Pivots For User Exploration and Adaptive Data Abstraction (Alex Bigelow - 6 October, 2018)
This paper reports on a simple visual technique that boils extracting a subgraph down to two operations---pivots and filters---that is agnostic to both the data abstraction and the size of the graph. They also reveal ways that a series of graph pivots can expose the semantics of the data from the user's perspective, and how this information could be leveraged to create adaptive data abstractions that do not rely as heavily on a system designer to create a comprehensive abstraction that anticipates all the user's tasks.
Link: https://arxiv.org/abs/1810.03019
====================================================
Context-Aware Deep Spatio-Temporal Network for Hand Pose Estimation from Depth Images (Yiming Wu - 6 October, 2018)
Typically, the problem is modeled as learning a mapping function from images to hand joint coordinates in a data-driven manner. Our method is examined on two common benchmarks; the experimental results demonstrate that our proposed approach achieves the best or second-best performance compared with state-of-the-art methods and runs at 60fps.
Link: https://arxiv.org/abs/1810.02994
====================================================
Anytime Stochastic Gradient Descent: A Time to Hear from all the Workers (Nuwan Ferdinand - 6 October, 2018)
In this paper, we focus on approaches to parallelizing stochastic gradient descent (SGD) wherein data is farmed out to a set of workers, the results of which, after a number of updates, are then combined at a central master node
Link: https://arxiv.org/abs/1810.02976
====================================================
Alpha Entanglement Codes: Practical Erasure Codes to Archive Data in Unreliable Environments (Vero Estrada-Galiñanes - 6 October, 2018)
Alpha increases storage overhead linearly but increases the possible paths to recover data exponentially. As a result, an entangled storage system can provide high availability, durability and offer additional integrity: it is more difficult to modify data undetectably
Link: https://arxiv.org/abs/1810.02974
====================================================
Managing Smartphones Signalling Load in UMTS Networks: A Practical Analysis (Ayman Elnashar - 6 October, 2018)
More specifically, the signaling domain becomes overloaded while the data domain is underutilized
Link: https://arxiv.org/abs/1810.02972
====================================================
Performance Evaluation of VoLTE Based on Field Measurement Data (Ayman Elnashar - 6 October, 2018)
This paper provides guidelines for best practices of VoLTE deployment as well as practical performance evaluation based on field measurement data from commercial LTE networks.
Link: https://arxiv.org/abs/1810.02968
====================================================
Mining Novel Multivariate Relationships in Time Series Data Using Correlation Networks (Saurabh Agrawal - 6 October, 2018)
In particular, we discovered several multipole relationships that are reproducible in multiple other independent datasets and lead to novel domain insights.
Link: https://arxiv.org/abs/1810.02950
====================================================
Sifaka: Text Mining Above a Search API (Cameron VandenBerg - 5 October, 2018)
Often they are built from scratch using special-purpose software and data structures, which increases their cost and complexity
Link: https://arxiv.org/abs/1810.02907
====================================================
CDF Transform-Shift: An effective way to deal with inhomogeneous density datasets (Ye Zhu - 5 October, 2018)
It effectively converts a dataset with clusters of inhomogeneous density to one with clusters of homogeneous density, i.e., the data distribution is converted to one in which all locally low/high-density locations become globally low/high-density locations. Thus, after performing the proposed Transform-Shift, a single global density threshold can be used to separate the data into clusters and their surrounding noise points
Link: https://arxiv.org/abs/1810.02897
====================================================
HG-DAgger: Interactive Imitation Learning with Human Experts (Michael Kelly - 5 October, 2018)
Imitation learning has proven to be useful for many real-world problems, but approaches such as behavioral cloning suffer from data mismatch and compounding error issues
Link: https://arxiv.org/abs/1810.02890
====================================================
Scalable Micro-planned Generation of Discourse from Structured Data (Anirban Laha - 5 October, 2018)
Experiments on a benchmark mixed-domain dataset curated for paragraph description from tables reveal the superiority of our system over existing data-to-text approaches. We also demonstrate the robustness of our system in accepting other data types such as Knowledge-Graphs and Key-Value dictionaries.
Link: https://arxiv.org/abs/1810.02889
====================================================
Physics Guided Recurrent Neural Networks For Modeling Dynamical Systems: Application to Monitoring Water Temperature And Quality In Lakes (Xiaowei Jia - 5 October, 2018)
We will first describe the use of outputs from physics-based models in learning a hybrid-physics-data model. By using scientific knowledge to guide the construction and learning the data-driven model, we demonstrate that this method can achieve better prediction accuracy as well as scientific consistency of results.
Link: https://arxiv.org/abs/1810.02880
====================================================
Learning to Encode Text as Human-Readable Summaries using Generative Adversarial Networks (Yau-Shian Wang - 5 October, 2018)
Auto-encoders compress input data into a latent-space representation and reconstruct the original data from the representation. By taking the generator output as the summary of the input text, abstractive summarization is achieved without document-summary pairs as training data
Link: https://arxiv.org/abs/1810.02851
====================================================
Deep Probabilistic Video Compression (Jun Han - 5 October, 2018)
Our model uses advances in variational autoencoders (VAEs) for sequential data and combines them with recent work on neural image compression. Evaluation on small videos from public data sets with varying complexity and diversity shows that our model yields competitive results when trained on generic video content
Link: https://arxiv.org/abs/1810.02845
====================================================
ResumeNet: A Learning-based Framework for Automatic Resume Quality Assessment (Yong Luo - 5 October, 2018)
By investigating the dataset, we identify some factors or features that could be useful to discriminate good resumes from bad ones, e.g., the consistency between different parts of a resume. To deal with the label deficiency issue in the dataset, we propose several variants of the model by either utilizing the pair/triplet-based loss, or introducing some semi-supervised learning technique to make use of the abundant unlabeled data
Link: https://arxiv.org/abs/1810.02832
====================================================
Linear Queries Estimation with Local Differential Privacy (Raef Bassily - 5 October, 2018)
We study the problem of estimating a set of $d$ linear queries with respect to some unknown distribution $\mathbf{p}$ over a domain $\mathcal{J}=[J]$ based on a sensitive data set of $n$ individuals under the constraint of local differential privacy
Link: https://arxiv.org/abs/1810.02810
====================================================
Interpretable Convolutional Neural Networks via Feedforward Design (C.-C. Jay Kuo - 5 October, 2018)
It derives network parameters of the current layer based on data statistics from the output of the previous layer in a one-pass manner. The classification and robustness (against adversarial attacks) performances of BP- and FF-designed CNNs applied to the MNIST and the CIFAR-10 datasets are compared
Link: https://arxiv.org/abs/1810.02786
====================================================
Hierarchical Recurrent Filtering for Fully Convolutional DenseNets (Jörg Wagner - 5 October, 2018)
Using a synthetic dataset, we show the ability of our model to cope with data perturbations and highlight the importance of recurrent and hierarchical filtering.
Link: https://arxiv.org/abs/1810.02766
====================================================
Clustering-based Anomaly Detection for microservices (Roman Nikiforov - 4 October, 2018)
Anomaly detection is an important step in the management and monitoring of data centers and cloud computing platforms
Link: https://arxiv.org/abs/1810.02762
====================================================
From direct tagging to Tagging with sentences compression (Peihui Chen - 5 October, 2018)
If the data we need to extract is not regular but its surrounding words are relatively regular, then we can use information compression to cut the information we do not need before tagging the data we need. In this way we can increase the precision of the data without undermining its recall.
Link: https://arxiv.org/abs/1810.02741
====================================================
Clust-LDA: Joint Model for Text Mining and Author Group Inference (Shaoyang Ning - 5 October, 2018)
We develop an inference procedure for clust-LDA and demonstrate its performance on simulated data, showing that clust-LDA out-performs the "vanilla" LDA on the topic identification task where authors exhibit distinctive topical preference. We also showcase the empirical performance of clust-LDA based on a real-world social media dataset from Reddit.
Link: https://arxiv.org/abs/1810.02717
====================================================
Wikistat 2.0: Educational Resources for Artificial Intelligence (Philippe Besse - 28 September, 2018)
Big data, data science, deep learning, and artificial intelligence are the key words of an intense hype tied to a rapidly evolving job market, which requires adapting the contents of our university professional training. Which artificial intelligence is mostly concerned by the job offers? Which methodologies and technologies should be favored in the training programs? Which objectives, tools and educational resources do we need to put in place to meet these pressing needs? We answer these questions by describing the contents and operational resources in the Data Science orientation of the Applied Mathematics speciality at INSA Toulouse
Link: https://arxiv.org/abs/1810.02688
====================================================
Thinging Ethics for Software Engineers (Sabah Al-Fedaghi - 25 September, 2018)
This is particularly clear in the area of software engineering, which focuses on software and associated tools such as algorithms, diagramming, documentation, modeling and design as applied to various types of data and conceptual artifacts
Link: https://arxiv.org/abs/1810.02685
====================================================
Generating Diffusion MRI scalar maps from T1 weighted images using generative adversarial networks (Xuan Gu - 5 October, 2018)
However, it is costly and time consuming to collect high quality diffusion data. We demonstrate how Generative Adversarial Networks (GANs) can be used to generate diffusion scalar measures from structural MR images in a single optimized step, without diffusion models and diffusion data
Link: https://arxiv.org/abs/1810.02683
====================================================
On Collaborative Predictive Blacklisting (Luca Melis - 5 October, 2018)
To this end, we reproduce and measure two systems: a non privacy-friendly one that uses a trusted coordinating party with access to all alerts (Soldo et al., 2010) and a peer-to-peer one using privacy-preserving data sharing (Freudiger et al., 2015)
Link: https://arxiv.org/abs/1810.02649
====================================================
ReTiCaM: Real-time Human Performance Capture from Monocular Video (Marc Habermann - 5 October, 2018)
The two resulting non-linear optimization problems per-frame are solved with specially-tailored data-parallel Gauss-Newton solvers
Link: https://arxiv.org/abs/1810.02648
====================================================
SLIC Based Digital Image Enlargement (M. Z. F. Amara - 5 October, 2018)
Selecting the best method to reconstruct an image to a higher resolution with the limited data available in the low-resolution image is quite a challenge
Link: https://arxiv.org/abs/1810.02643
====================================================
An Implementation Approach and Performance Analysis of Image Sensor Based Multilateral Indoor Localization and Navigation System (Md. Shahjalal - 5 October, 2018)
An Android application is developed to support data acquisition from multiple simultaneous transmitter links. Experimentally, we received data from four links, which are required to ensure a higher positioning accuracy.
Link: https://arxiv.org/abs/1810.02600
====================================================
Interference Management Based on RT/nRT Traffic Classification for FFR-Aided Small Cell/Macrocell Heterogeneous Networks (Mostafa Zaman Chowdhury - 5 October, 2018)
Cellular networks are constantly lagging in terms of the bandwidth needed to support the growing high data rate demands
Link: https://arxiv.org/abs/1810.02596
====================================================
Game-Based Approach for QoS Provisioning and Interference Management in Heterogeneous Networks (A. S. M. Zadid Shifat - 5 October, 2018)
These are important for service-providing operators because the system capacity and achievable data rates mainly depend on interference
Link: https://arxiv.org/abs/1810.02592
====================================================
A New Vehicle Localization Scheme Based on Combined Optical Camera Communication and Photogrammetry (Md. Tanvir Hossan - 5 October, 2018)
The FV transmits modulated data from the tail (or back) light, and the camera of the HV receives that signal using optical camera communication (OCC). In addition, the streetlight (SL) data are considered to ensure the position accuracy of the HV
Link: https://arxiv.org/abs/1810.02589
====================================================
C-DLSI: An Extended LSI Tailored for Federated Text Retrieval (Qijun Zhu - 5 October, 2018)
As the web expands in data volume and in geographical distribution, centralized search methods become inefficient, leading to increasing interest in cooperative information retrieval, e.g., federated text retrieval (FTR)
Link: https://arxiv.org/abs/1810.02579
====================================================
Service Aware Fuzzy Logic Based Handover Decision in Heterogeneous Wireless Networks (Mehek-Moutushy Rahman Mou - 5 October, 2018)
This work considered service types like voice, video, and data and their QoS requirements for handover decision using fuzzy logic in heterogeneous network environment. received signal strength indicator (RSSI), data rate, user's velocity, and interference level (signal-to-noise plus interference ratio) to make handover from femtocell to macrocell, macrocell to femtocell or femtocell to femtocell
Link: https://arxiv.org/abs/1810.02570
====================================================
Fixed-Mobile Convergence in the 5G era: From Hybrid Access to Converged Core (Massimo Condoluci - 5 October, 2018)
We present some testbed results on hybrid access and analyze some primary performance indicators such as achievable data rates, link utilization for aggregated traffic and session setup latency
Link: https://arxiv.org/abs/1810.02553
====================================================
Intelligent Interference Management Based on On-Demand Service Connectivity for Femtocellular Networks (Kazi Nawshad Azam - 5 October, 2018)
The femto-access-point (FAP), a low-power small cellular base station, provides better signal quality for indoor users so as to provide high data-rate communications with improved coverage, access network capacity and quality of service
Link: https://arxiv.org/abs/1810.02537
====================================================
PAPR Reduction in OFDM-IM Using Multilevel Dither Signals (Kee-Hoon Kim - 5 October, 2018)
Orthogonal frequency division multiplexing with index modulation (OFDM-IM) is a novel multicarrier scheme, which uses the indices of the active subcarriers to transmit data
Link: https://arxiv.org/abs/1810.02533
====================================================
Local Stability and Performance of Simple Gradient Penalty mu-Wasserstein GAN (Cheolhyeong Kim - 5 October, 2018)
Wasserstein GAN (WGAN) is a model that minimizes the Wasserstein distance between a data distribution and a sample distribution. Based on this analysis, we claim that penalizing the data manifold or sample manifold is the key to regularizing the original WGAN with a gradient penalty
Link: https://arxiv.org/abs/1810.02528
====================================================
Learning To Simulate (Nataniel Ruiz - 5 October, 2018)
In this work, we propose a reinforcement learning-based method for automatically adjusting the parameters of any (non-differentiable) simulator, thereby controlling the distribution of synthesized data in order to maximize the accuracy of a model trained on that data. In contrast to prior art that hand-crafts these simulation parameters or adjusts only parts of the available parameters, our approach fully controls the simulator with the actual underlying goal of maximizing accuracy, rather than mimicking the real data distribution or randomly generating a large volume of data
Link: https://arxiv.org/abs/1810.02513
====================================================
CAC and Traffic Modeling for Integrated Macrocell/Femtocell Networks (Mostafa Zaman Chowdhury - 4 October, 2018)
Integrated macrocell/femtocell networks are able to provide high data rates for indoor users as well as to offload huge traffic from the macrocellular networks to femtocellular networks
Link: https://arxiv.org/abs/1810.02490
====================================================
Correcting the bias in least squares regression with volume-rescaled sampling (Michał Dereziński - 4 October, 2018)
Furthermore, we propose algorithms to sample from this volume-rescaled distribution when the data distribution is only known through an i.i.d. sample.
Link: https://arxiv.org/abs/1810.02453
====================================================
Visual Designs for Binned Aggregation of Multi-Class Scatterplots (Florian Heimerl - 4 October, 2018)
In this paper, we explore the space of visual designs for such data, and provide design guidelines for different analysis scenarios. To support these guidelines, we compile a set of abstract tasks and ground them in concrete examples using multiple sample datasets
Link: https://arxiv.org/abs/1810.02445
====================================================
FashionNet: Personalized Outfit Recommendation with Deep Neural Network (Tong He - 4 October, 2018)
Experiments on a large scale data set collected from a popular fashion-focused social network validate the effectiveness of the proposed networks.
Link: https://arxiv.org/abs/1810.02443
====================================================
AutoLoss: Learning Discrete Schedules for Alternate Optimization (Haowen Xu - 4 October, 2018)
AutoLoss provides a generic way to represent and learn the discrete optimization schedule from metadata, and allows for a dynamic and data-driven schedule in ML problems that involve alternating updates of different parameters or from different loss objectives. The trained AutoLoss controller is generalizable: it can guide and improve the learning of a new task model with different specifications, or on different datasets.
Link: https://arxiv.org/abs/1810.02442
====================================================
Relative Saliency and Ranking: Models, Metrics, Data, and Benchmarks (Mahmoud Kalash - 2 October, 2018)
Furthermore, we present data, analysis and benchmark baseline results towards addressing the problem of salient object ranking. In addition, we show how a derived dataset can be successively refined to provide cleaned results that correlate well with pristine ground truth
Link: https://arxiv.org/abs/1810.02426
====================================================
Feature prioritization and regularization improve standard accuracy and adversarial robustness (Chihuang Liu - 4 October, 2018)
In addition to qualitative evaluation, we also propose a novel experimental strategy that quantitatively demonstrates that our model is almost ideally aligned with salient data characteristics. Additional experimental results illustrate the power of our model relative to the state of the art methods.
Link: https://arxiv.org/abs/1810.02424
====================================================
A method to Suppress Facial Expression in Posed and Spontaneous Videos (Ghada Zamzmi - 4 October, 2018)
Experimental results of testing the method on various expressions namely happiness, sadness, and anger for two publicly available data sets (i.e., BU-4DFE and AM-FED) show the ability of our method in suppressing facial expressions.
Link: https://arxiv.org/abs/1810.02401
====================================================
Privacy-Preserving Multiparty Learning For Logistic Regression (Wei Du - 4 October, 2018)
Specifically, we consider a logistic regression model for data training and propose two approaches for perturbing the objective function to preserve ε-differential privacy. The proposed solutions are tested on real datasets, including Bank Marketing and Credit Card Default prediction
Link: https://arxiv.org/abs/1810.02400
====================================================
Map Memorization and Forgetting in the IARA Autonomous Car (Thomas Teixeira - 4 October, 2018)
It consists in merging sensory information obtained during runtime (online) with a priori data from a high-precision map constructed offline
Link: https://arxiv.org/abs/1810.02355
====================================================
Multi-view X-ray R-CNN (Jan-Martin O. Steitz - 4 October, 2018)
Motivated by the detection of prohibited objects in carry-on luggage as a part of avionic security screening, we develop a CNN-based object detection approach for multi-view X-ray image data
Link: https://arxiv.org/abs/1810.02344
====================================================
Unsupervised Learning via Meta-Learning (Kyle Hsu - 6 October, 2018)
To do so, we construct tasks from unlabeled data in an automatic way and run meta-learning over the constructed tasks. Our experiments across four image datasets indicate that our unsupervised meta-learning approach acquires a learning algorithm without any labeled data that is applicable to a wide range of downstream classification tasks, improving upon the representation learned by four prior unsupervised learning methods.
Link: https://arxiv.org/abs/1810.02334
====================================================
A Practical Approach to Sizing Neural Networks (Gerald Friedland - 4 October, 2018)
Based on MacKay's information theoretic model of supervised machine learning, this article discusses how to practically estimate the maximum size of a neural network given a training data set. Second, we introduce and experimentally validate a heuristic method to estimate the neural network capacity requirement for a given dataset and labeling
Link: https://arxiv.org/abs/1810.02328
====================================================
Compound Binary Search Tree and Algorithms (Yong Tan - 4 October, 2018)
The Binary Search Tree (BST) is common in computer science: it supports a compact data structure in memory and itself enables a series of fast algorithms, which is why it is often applied in dynamic settings. In this paper, we will develop this data structure into a synthesis to show a series of novel features residing in
Link: https://arxiv.org/abs/1810.02270
====================================================
Concept-drifting Data Streams are Time Series; The Case for Continuous Adaptation (Jesse Read - 4 October, 2018)
Learning from data streams is an increasingly important topic in data mining, machine learning, and artificial intelligence in general. A major focus in the data stream literature is on designing methods that can deal with concept drift, a challenge where the generating distribution changes over time
Link: https://arxiv.org/abs/1810.02266
====================================================
Memristor-based Deep Convolution Neural Network: A Case Study (Fan Zhang - 14 September, 2018)
An improved conversion algorithm is developed to convert convolution kernels to memristor-based circuits, which minimizes the error with consideration of the data and kernel patterns in CNNs
Link: https://arxiv.org/abs/1810.02225
====================================================
Improved generalization bounds for robust learning (Idan Attias - 4 October, 2018)
The learner gets uncorrupted training data with access to possible corruptions that may be used by the adversary during testing. Their aim is to build a robust classifier that would be tested on future adversarially corrupted data
Link: https://arxiv.org/abs/1810.02180
====================================================
Adaptive Policies for Perimeter Surveillance Problems (James A. Grant - 4 October, 2018)
We consider a scenario where the decision-maker may sequentially update the searchers' allocation, learning from the observed data to improve decisions over time
Link: https://arxiv.org/abs/1810.02176
====================================================
On the Performance of Space-Time MIMO Multiplexing for Free Space Optical Communications (Mohammad Taghi Dabiri - 4 October, 2018)
However, in order to have a practical role in the physical layer of future communication systems, data rate of FSO links must be improved. To this aim, in this paper we employ a multiple-input multiple-output (MIMO) multiplexing scheme with two transceivers to increase the data rate of the considered FSO system
Link: https://arxiv.org/abs/1810.02167
====================================================
A Single Approach to Decide Chase Termination on Linear Existential Rules (Michel Leclere - 4 October, 2018)
Existential rules, long known as tuple-generating dependencies in database theory, have been intensively studied in the last decade as a powerful formalism to represent ontological knowledge in the context of ontology-based query answering. A knowledge base is then composed of an instance that contains incomplete data and a set of existential rules, and answers to queries are logically entailed from the knowledge base
Link: https://arxiv.org/abs/1810.02132
====================================================
Monte Carlo Dependency Estimation (Edouard Fouché - 4 October, 2018)
We show that MWP satisfies a number of desirable properties and can accommodate any kind of numerical data. We demonstrate the superiority of our estimator by comparing it to the state-of-the-art multivariate dependency measures.
Link: https://arxiv.org/abs/1810.02112
====================================================
Semi-Supervised Methods for Out-of-Domain Dependency Parsing (Juntao Yu - 4 October, 2018)
Our approaches use easily obtainable unlabelled data to improve out-of-domain parsing accuracies without the need of expensive corpora annotation. The evaluations on several English domains and multi-lingual data show quite good improvements on parsing accuracy
Link: https://arxiv.org/abs/1810.02100
====================================================
Dual Convolutional Neural Network for Graph of Graphs Link Prediction (Shonosuke Harada - 4 October, 2018)
Graphs are general and powerful data representations which can model complex real-world phenomena, ranging from chemical compounds to social networks; however, effective feature extraction from graphs is not a trivial task, and much work has been done in the field of machine learning and data mining. Experiments on link prediction tasks using several chemical network datasets demonstrate the effectiveness of the proposed method.
Link: https://arxiv.org/abs/1810.02080
====================================================
Transferring Physical Motion Between Domains for Neural Inertial Tracking (Changhao Chen - 4 October, 2018)
However, they are affected greatly by changes in sensor placement/orientation or motion dynamics, and it is infeasible to collect labelled data from every domain. To overcome the challenges of domain adaptation on long sensory sequences, we propose a novel framework that extracts domain-invariant features of raw sequences from arbitrary domains, and transforms to new domains without any paired data
Link: https://arxiv.org/abs/1810.02076
====================================================
Design and Evaluation of A Data Partitioning-Based Intrusion Management Architecture for Database Systems (Muhamad Felemban - 5 October, 2018)
The novelty in PIMS is the ability to contain the damage into data partitions, termed Intrusion Boundaries (IBs, for short). Finally, empirical and experimental performance evaluation of PIMS are conducted to demonstrate that intelligent partitioning of data tuples improves the overall availability of the DBMS under intrusion attacks.
Link: https://arxiv.org/abs/1810.02061
====================================================
Gradient Descent Provably Optimizes Over-parameterized Neural Networks (Simon S. Du - 4 October, 2018)
For an $m$ hidden node shallow neural network with ReLU activation and $n$ training data, we show as long as $m$ is large enough and the data is non-degenerate, randomly initialized gradient descent converges to a globally optimal solution with a linear convergence rate for the quadratic loss function.
Link: https://arxiv.org/abs/1810.02054
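The setting in the summary above (a shallow ReLU network, quadratic loss, randomly initialized gradient descent) can be illustrated with a small sketch. This is not the authors' code. The function name, the width `m`, the step size `lr`, the fixed random output signs, and the toy data are all assumptions chosen for illustration; only the hidden-layer weights are trained.

```python
import math
import random


def train_two_layer_relu(xs, ys, m=200, lr=0.1, steps=200, seed=0):
    """Run plain gradient descent on a width-m ReLU network.

    Hypothetical illustration: the model is
        f(x) = (1 / sqrt(m)) * sum_r a_r * relu(w_r . x)
    with fixed random signs a_r and Gaussian-initialized w_r.
    Returns the quadratic loss recorded before each update.
    """
    rng = random.Random(seed)
    d = len(xs[0])
    w = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(m)]
    a = [rng.choice((-1.0, 1.0)) for _ in range(m)]
    scale = 1.0 / math.sqrt(m)

    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    def predict(x):
        return scale * sum(a_r * max(0.0, dot(w_r, x)) for a_r, w_r in zip(a, w))

    losses = []
    for _ in range(steps):
        # Residuals are computed once, so every neuron is updated
        # against the same network output (a simultaneous step).
        residuals = [predict(x) - y for x, y in zip(xs, ys)]
        losses.append(0.5 * sum(r * r for r in residuals))
        for r_idx in range(m):
            w_r = w[r_idx]
            grad = [0.0] * d
            for x, res in zip(xs, residuals):
                if dot(w_r, x) > 0.0:  # ReLU is active for this sample
                    coeff = res * a[r_idx] * scale
                    for k in range(d):
                        grad[k] += coeff * x[k]
            w[r_idx] = [wi - lr * g for wi, g in zip(w_r, grad)]
    return losses
```

Calling `train_two_layer_relu` on a handful of unit-norm, pairwise non-parallel inputs and inspecting the returned list shows how the loss evolves; the toy sizes here are far too small to say anything about the convergence rate proved in the paper.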
====================================================
Seq2Slate: Re-ranking and Slate Optimization with RNNs (Irwan Bello - 3 October, 2018)
We show how to learn the model end-to-end from weak supervision in the form of easily obtained click-through data
Link: https://arxiv.org/abs/1810.02019
====================================================
Balancing Efficiency and Coverage in Human-Robot Dialogue Collection (Matthew Marge - 7 October, 2018)
Comparison of the data gathered in these phases shows that the GUI enabled a faster pace of dialogue while still maintaining high coverage of suitable responses, enabling more efficient targeted data collection, and improvements in natural language understanding using GUI-collected data. As a promising first step towards interactive learning, this work shows that our approach enables the collection of useful training data for navigation-based HRI tasks.
Link: https://arxiv.org/abs/1810.02017
====================================================
Improving Community Detection by Mining Social Interactions (Jeancarlo Campos Leão - 4 October, 2018)
In this context, in this paper we propose a process to handle social network data that exploits temporal features to improve the detection of communities by existing algorithms
Link: https://arxiv.org/abs/1810.02002
====================================================
Image and Encoded Text Fusion for Multi-Modal Classification (Ignazio Gallo - 3 October, 2018)
Multi-modal approaches employ data from multiple input streams such as textual and visual domains. We compare our approach with individual sources on two large-scale multi-modal classification datasets while obtaining encouraging results
Link: https://arxiv.org/abs/1810.02001
====================================================
Action Model Acquisition using LSTM (Ankuj Arora - 3 October, 2018)
It is, however, becoming increasingly cumbersome to codify this model, and is more efficient to learn it from observed plan execution sequences (training data). We use the sequence labelling capabilities of LSTMs to isolate from an exhaustive model set a model identical to the one responsible for producing the training data
Link: https://arxiv.org/abs/1810.01992
====================================================
Discriminative Data-driven Self-adaptive Fraud Control Decision System with Incomplete Information (Junxuan Li - 3 October, 2018)
The proposed models are purely data-driven and self-adaptive in a real-time manner. The field test on Microsoft real online transaction data suggested that new systems could sizably improve the company's profit.
Link: https://arxiv.org/abs/1810.01982
====================================================
Learning Scheduling Algorithms for Data Processing Clusters (Hongzi Mao - 3 October, 2018)
Efficiently scheduling data processing jobs on distributed compute clusters requires complex algorithms
Link: https://arxiv.org/abs/1810.01963
====================================================
Generating Labeled Flow Data from MAWILab Traces for Network Intrusion Detection (Jinoh Kim - 3 October, 2018)
Thus, in this paper, we introduce a method to construct labeled flow data by combining the packet meta-information with IDS logs to infer labels for intrusion detection research. In doing so, the introduced method at hand would aid researchers to access relevant network flow datasets along with label information.
Link: https://arxiv.org/abs/1810.01945
====================================================
PADDIT: Probabilistic Augmentation of Data using Diffeomorphic Image Transformation (Mauricio Orbes Arteaga - 3 October, 2018)
To induce invariance in CNNs to such transformations, we propose Probabilistic Augmentation of Data using Diffeomorphic Image Transformation (PADDIT) -- a systematic framework for generating realistic transformations that can be used to augment data for training CNNs
Link: https://arxiv.org/abs/1810.01928
====================================================
Generalized Inverse Optimization through Online Learning (Chaosheng Dong - 3 October, 2018)
Specifically, we develop an online learning algorithm that uses an implicit update rule which can handle noisy data. Moreover, under additional regularity assumptions in terms of the data and the model, we prove that our algorithm converges at a rate of $\mathcal{O}(1/\sqrt{T})$ and is statistically consistent
Link: https://arxiv.org/abs/1810.01920
====================================================
Real-time Clustering Algorithm Based on Predefined Level-of-Similarity (Rabindra Lamsal - 3 October, 2018)
The implementation of the proposed algorithm clearly demonstrates how efficiently a data-point with high dimensionality of features is assigned to an appropriate cluster with minimal operations. The proposed algorithm is very application specific and is applicable when the need is to perform clustering analysis of real-time data-points, where the similarity measure between an incoming data-point and the cluster with which the data-point is to be associated is greater than a predefined Level-of-Similarity.
Link: https://arxiv.org/abs/1810.01878
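The idea of assigning an incoming data-point to a cluster only when its similarity exceeds a predefined level can be sketched as follows. This is a generic threshold-based online clustering sketch, not the paper's algorithm: the class name `ThresholdClusterer`, the use of cosine similarity, and the running-mean centroid update are assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class ThresholdClusterer:
    """Hypothetical online clusterer driven by a fixed similarity threshold."""

    def __init__(self, level_of_similarity):
        self.level_of_similarity = level_of_similarity
        self.centroids = []  # one running-mean vector per cluster
        self.counts = []     # number of points absorbed by each cluster

    def assign(self, point):
        """Return the index of the cluster the point joins (or creates)."""
        best_index, best_sim = -1, -1.0
        for i, centroid in enumerate(self.centroids):
            sim = cosine_similarity(point, centroid)
            if sim > best_sim:
                best_index, best_sim = i, sim

        if best_index >= 0 and best_sim > self.level_of_similarity:
            # Fold the point into the running mean of the matched cluster.
            n = self.counts[best_index]
            old = self.centroids[best_index]
            self.centroids[best_index] = [
                (c * n + p) / (n + 1) for c, p in zip(old, point)
            ]
            self.counts[best_index] = n + 1
            return best_index

        # No cluster is similar enough: open a new one.
        self.centroids.append(list(point))
        self.counts.append(1)
        return len(self.centroids) - 1
```

Each call to `assign` makes one pass over the current centroids, so the cost per incoming point grows with the number of clusters rather than with the number of points seen so far.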
====================================================
Learning an internal representation of the end-effector configuration space (Alban Laflaquière - 3 October, 2018)
An internal representation of the end-effector configuration is generated from unstructured proprioceptive and exteroceptive data flow under very limited assumptions
Link: https://arxiv.org/abs/1810.01866
====================================================
Contextual Multi-Armed Bandits for Causal Marketing (Neela Sawant - 2 October, 2018)
Multi-armed bandit methods allow us to scale to multiple treatments and to perform off-policy policy evaluation on logged data. Preliminary offline experiments on a retail Fashion marketing dataset show merits of our proposal.
Link: https://arxiv.org/abs/1810.01859
====================================================
EPIC: Efficient Privacy-Preserving Scheme with E2E Data Integrity and Authenticity for AMI Networks (Ahmad Alsharif - 3 October, 2018)
In this paper, we propose EPIC, an efficient and privacy-preserving data collection scheme with E2E data integrity verification for AMI networks. In addition, we compare EPIC to existing data collection schemes in terms of overhead and security/privacy features.
Link: https://arxiv.org/abs/1810.01851
====================================================
Cloud4IoT: a heterogeneous, distributed and autonomic cloud platform for the IoT (Daniele Pizzolli - 3 October, 2018)
We introduce Cloud4IoT, a platform offering automatic deployment, orchestration and dynamic configuration of IoT support software components and data-intensive applications for data processing and analytics, thus enabling plug-and-play integration of new sensor objects and dynamic workload scalability. Overall, the platform is designed in order to support systems where IoT-based and data intensive applications may pose specific requirements for low latency, restricted available bandwidth, or data locality
Link: https://arxiv.org/abs/1810.01839
====================================================
Shrinkwrap: Differentially-Private Query Processing in Private Data Federations (Johes Bater - 3 October, 2018)
Owing to privacy concerns, these systems do not have a trusted data collector that can see all their data and their member databases cannot learn about individual records of other engines. Hence, existing private data federations do not scale well to complex SQL queries over large datasets.
Link: https://arxiv.org/abs/1810.01816
====================================================
Disambiguating Music Artists at Scale with Audio Metric Learning (Jimena Royo-Letelier - 3 October, 2018)
We explore the use of metric learning techniques to learn artist embeddings directly from audio, and using a dedicated homonym artists dataset, we compare our method with a recent approach that learns similar embeddings using artist classifiers. While both systems have the ability to disambiguate unknown artists relying exclusively on audio, we show that our system is more suitable in the case when enough audio data is available for each artist in the training dataset
Link: https://arxiv.org/abs/1810.01807
====================================================
Weighted dynamic finger in binary search trees (John Iacono - 3 October, 2018)
It is shown that the online binary search tree data structure GreedyASS performs asymptotically as well on a sufficiently long sequence of searches as any static binary search tree where each search begins from the previous search (rather than the root)
Link: https://arxiv.org/abs/1810.01785
====================================================
Reinforcement Learning for Model-Free Power Management of Networked Microgrids (Qianzhi Zhang - 1 October, 2018)
This paper presents an approximate Reinforcement Learning (RL) methodology for data-driven power management of networked Microgrids (MG) in electric distribution systems
Link: https://arxiv.org/abs/1810.01758
====================================================
Distributed transactional reads: the strong, the quick, the fresh & the impossible (Alejandro Z. Tomsic - 3 October, 2018)
We show that there is a three-way trade-off between them, which can be summarised as follows: (i) it is not possible to ensure at the same time order-preserving (e.g., causally-consistent) or atomic reads, Minimal Delay, and maximal freshness; thus, reading data that is the most fresh without delay is possible only in a weakly-isolated mode; (ii) ensuring atomic or order-preserving reads at Minimal Delay requires reading data from the past (not fresh); (iii) however, order-preserving minimal-delay reads can be fresher than atomic; (iv) reading atomic or order-preserving data at maximal freshness may block reads or writes indefinitely. Our impossibility results hold independently of other features of the database, such as update semantics (totally ordered or not) or data model (structured or unstructured)
Link: https://arxiv.org/abs/1810.01698
====================================================
Towards Low-level Cryptographic Primitives for JavaCards (Vasilios Mavroudis - 3 October, 2018)
We attribute this to the restricted access to low-level cryptographic primitives (e.g., elliptic curve operations) and the lack of essential data types (e.g., Integers)
Link: https://arxiv.org/abs/1810.01662
====================================================
SecGrid: A Secure and Efficient SGX-enabled Smart Grid System with Rich Functionalities (Shaohua Li - 3 October, 2018)
Our system leverages trusted hardware SGX to ensure that grid utilities can efficiently execute rich functionalities on customers' private data, while guaranteeing their privacy
Link: https://arxiv.org/abs/1810.01651
====================================================
Image as Data: Automated Visual Content Analysis for Political Science (Jungseock Joo - 2 October, 2018)
Scholars have already recognized the importance of visual data and a variety of large visual datasets have become available. The lack of scalable analytic methods, however, has prevented the incorporation of large-scale image data in political analysis
Link: https://arxiv.org/abs/1810.01544
====================================================
The Effect of Data Marshalling on Computation Offloading Decisions (Julio A. Reyes-Munoz - 2 October, 2018)
We conducted an extensive set of experiments with an offloading testbed to understand the impact that data marshalling techniques have on computation offloading decisions. We find that the popular JSON format to marshall data between client and server comes at a significant computational expense compared to a minimalistic raw data transfer
Link: https://arxiv.org/abs/1810.01540
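The contrast between JSON marshalling and a minimal raw transfer can be made concrete with a small sketch. This is not the paper's testbed: the function names and the raw wire layout (a 4-byte little-endian count followed by little-endian float64 values) are assumptions chosen for illustration, and only Python's standard `json` and `struct` modules are used.

```python
import json
import struct


def marshal_json(values):
    """Encode a list of floats as UTF-8 JSON text."""
    return json.dumps(values).encode("utf-8")


def unmarshal_json(payload):
    """Decode UTF-8 JSON text back into a list of floats."""
    return json.loads(payload.decode("utf-8"))


def marshal_raw(values):
    """Encode a list of floats as a count plus packed float64 values.

    Hypothetical layout: '<' selects little-endian with no padding,
    'I' is a 4-byte unsigned count, 'd' is an 8-byte double.
    """
    return struct.pack(f"<I{len(values)}d", len(values), *values)


def unmarshal_raw(payload):
    """Decode the layout written by marshal_raw."""
    (count,) = struct.unpack_from("<I", payload, 0)
    return list(struct.unpack_from(f"<{count}d", payload, 4))
```

Timing the two marshal/unmarshal pairs on the same payload (for example with the standard `timeit` module) gives a rough, machine-dependent feel for the overhead the summary refers to; the raw payload is always `4 + 8 * len(values)` bytes.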
====================================================
Harnessing Correlations in Distributed Erasure Coded Key-Value Stores (Ramy E. Ali - 2 October, 2018)
In this paper, we study the storage cost of ensuring consistency for the case where the data versions are correlated, in contrast to previous work where data versions were treated as being independent. We provide multi-version code constructions that show that the storage cost can be significantly smaller than the previous constructions depending on the degree of correlation between the different versions of the data
Link: https://arxiv.org/abs/1810.01527
====================================================
Mixing patterns and individual differences in networks (George T. Cantwell - 2 October, 2018)
We introduce a network model that captures this type of variance in assortativity along with an expectation-maximization algorithm for fitting it to observed network data. The fit allows us to make best estimates of the preferences of individual nodes, define metrics to quantify individual variation in assortativity, perform sensitive community detection even in the absence of traditional assortative structure, and accurately predict missing data in unlabeled or partially labeled networks.
Link: https://arxiv.org/abs/1810.01432
====================================================
GLAD: GLocalized Anomaly Detection via Active Feature Space Suppression (Shubhomoy Das - 8 October, 2018)
We propose an algorithm called GLAD (GLocalized Anomaly Detection) that allows end-users to retain the use of simple and understandable global anomaly detectors by automatically learning their local relevance to specific data instances using label feedback. Our experiments on synthetic and real-world data show the effectiveness of GLAD in learning the local relevance of ensemble members and discovering anomalies via label feedback.
Link: https://arxiv.org/abs/1810.01403
====================================================
Sketching for Latent Dirichlet-Categorical Models (Joseph Tassarotti - 2 October, 2018)
Recent work has explored transforming data sets into smaller, approximate summaries in order to scale Bayesian inference
Link: https://arxiv.org/abs/1810.01400
====================================================
Validation of a PETSc based software implementing a 4DVAR Data Assimilation algorithm: a case study related with an Oceanic Model based on Shallow Water equation (Luisa Carracciuolo - 3 October, 2018)
This work presents and discusses some results related to the validation of a PETSc-based software module that implements a Data Assimilation algorithm.
Link: https://arxiv.org/abs/1810.01361
====================================================
An Exploration of Blockchain Enabled Decentralized Capability based Access Control Strategy for Space Situation Awareness (Ronghua Xu - 1 October, 2018)
To enhance SSA, the dynamic data-driven applications systems (DDDAS) framework couples on-line data with off-line models to enhance system performance. For information management, there is a need for identity authentication and access control strategy to ensure the integrity of exchanged data as well as to grant authorized entities access right to data and services
Link: https://arxiv.org/abs/1810.01291
====================================================
META-DES: A Dynamic Ensemble Selection Framework using Meta-Learning (Rafael M. O. Cruz - 29 September, 2018)
Experiments are conducted over several small sample size classification problems, i.e., problems with a high degree of uncertainty due to the lack of training data. Experimental results show the proposed meta-learning framework greatly improves classification accuracy when compared against current state-of-the-art dynamic ensemble selection techniques.
Link: https://arxiv.org/abs/1810.01270
====================================================
Sparse Gaussian Process Temporal Difference Learning for Marine Robot Navigation (John Martin - 2 October, 2018)
For improved data efficiency, our method reduces TD updates to Gaussian Process regression. Our results show SPGP-SARSA can outperform the state-of-the-art sparse method, replicate the prediction quality of its exact counterpart, and be applied to solve underwater navigation tasks.
Link: https://arxiv.org/abs/1810.01217
====================================================
Inference Over Programs That Make Predictions (Yura Perov - 2 October, 2018)
It describes possible further steps to extend that work, such that, ultimately, automatic probabilistic program synthesis can generalise over any reasonable set of inputs and outputs, in particular in regard to text, image and video data.
Link: https://arxiv.org/abs/1810.01190
====================================================
Semi-supervised Text Regression with Conditional Generative Adversarial Networks (Tao Li - 2 October, 2018)
Besides its promising predictive capability, the model has two advantages: (i) it works with unbalanced datasets of limited labelled data, which aligns with real-world scenarios; and (ii) predictions are obtained by an end-to-end framework, without explicitly selecting high-level representations. Finally we point out related datasets for experiments and future research directions.
Link: https://arxiv.org/abs/1810.01165
====================================================
Predicate learning in neural systems: Discovering latent generative structures (Andrea E. Martin - 2 October, 2018)
But how do the structures that these models invoke arise in neural systems in the first place? To answer this question, we explain how a system can learn latent representational structures (i.e., predicates) from experience with wholly unstructured data. The ability to learn predicates from experience, to represent structures compositionally, and to extrapolate to unseen data offers an inroads to understanding and modeling the most complex human behaviors.
Link: https://arxiv.org/abs/1810.01127
====================================================
Sinkhorn AutoEncoders (Giorgio Patrini - 3 October, 2018)
We prove that in the non-parametric limit the autoencoder generates the data distribution if and only if the two distributions match exactly, and that the optimum can be obtained by deterministic autoencoders
Link: https://arxiv.org/abs/1810.01118
====================================================
The Dreaming Variational Autoencoder for Reinforcement Learning Environments (Per-Arne Andersen - 2 October, 2018)
Reinforcement learning has shown great potential in generalizing over raw sensory data using only a single neural network for value optimization. There are several challenges in the current state-of-the-art reinforcement learning algorithms that prevent them from converging towards the global optima
Link: https://arxiv.org/abs/1810.01112
====================================================
A New Approach to Privacy-Preserving Clinical Decision Support Systems for HIV Treatment (Thomas Attema - 2 October, 2018)
These support systems are based on clinical trials and expert knowledge; however, the amount of data available to these systems is limited
Link: https://arxiv.org/abs/1810.01107
====================================================
Cloud Chaser: Real Time Deep Learning Computer Vision on Low Computing Power Devices (Zhengyi Luo - 2 October, 2018)
However, to provide time critical services such as emergency response, home assistance, surveillance, etc, these devices often need real time analysis of their camera data
Link: https://arxiv.org/abs/1810.01069
====================================================
Improving Sentence Representations with Multi-view Frameworks (Shuai Tang - 2 October, 2018)
Multi-view learning can provide self-supervision when different views are available of the same data
Link: https://arxiv.org/abs/1810.01064
====================================================
Privacy-Preserving Outsourcing of Large-Scale Nonlinear Programming to the Cloud (Ang Li - 1 October, 2018)
The increasingly massive data generated by various sources has given birth to big data analytics. Solving large-scale nonlinear programming problems (NLPs) is one important big data analytics task that has applications in many domains such as transport and logistics
Link: https://arxiv.org/abs/1810.01048
====================================================
Heterogeneous Replica for Query on Cassandra (Jialin Qiao - 1 October, 2018)
Cassandra is a popular structured storage system with high-performance, scalability and high availability, and is usually used to store data that has some sortable attributes
Link: https://arxiv.org/abs/1810.01037
====================================================
Real-Time Planning with Multi-Fidelity Models for Agile Flights in Unknown Environments (Jesus Tordesillas - 1 October, 2018)
In addition, we address the interaction between a fast planner and a slower mapper by considering the sensor data not yet fused into the map during the collision check. This novel mapping and planning framework for agile flights is validated in simulation, showing replanning times of 5-40 ms in cluttered environments, a value that is 3-30 times faster than similar state-of-the-art planning algorithms.
Link: https://arxiv.org/abs/1810.01035
====================================================
Inertial-aided Motion Deblurring with Deep Networks (Janne Mustaniemi - 1 October, 2018)
To train our network, we also introduce a novel way of generating realistic training data using the gyroscope. The evaluation shows a clear improvement in visual quality over the state-of-the-art while achieving real-time performance
Link: https://arxiv.org/abs/1810.00986
====================================================
Utilizing a Transparency-driven Environment toward Trusted Automatic Genre Classification: A Case Study in Journalism History (Aysenur Bilgin - 1 October, 2018)
With the growing abundance of unlabeled data in real-world tasks, researchers have to rely on the predictions given by black-boxed computational models. For this purpose, we developed an environment that empowers non-computer scientists to become practicing data scientists in their own research field
Link: https://arxiv.org/abs/1810.00968
====================================================
Challenges of Using Text Classifiers for Causal Inference (Zach Wood-Doughty - 1 October, 2018)
To facilitate causal analyses based on language data, we consider the role that text classifiers can play in causal inference through established modeling mechanisms from the causality literature on missing data and measurement error. We demonstrate how to conduct causal analyses using text classifiers on simulated and Yelp data, and discuss the opportunities and challenges of future work that uses text data in causal inference.
Link: https://arxiv.org/abs/1810.00956
====================================================
Wikidata: A New Paradigm of Human-Bot Collaboration? (Alessandro Piscopo - 1 October, 2018)
Wikidata is a collaborative knowledge graph which has already drawn the attention of practitioners and researchers. In this paper, we highlight some of the most salient aspects of human-bot collaboration in Wikidata
Link: https://arxiv.org/abs/1810.00931
====================================================
Joint On-line Learning of a Zero-shot Spoken Semantic Parser and a Reinforcement Learning Dialogue Manager (Matthieu Riou - 1 October, 2018)
Unlike many other language processing applications, dialogue systems require interactions with users, therefore it is complex to develop them with pre-recorded data. Data collection, annotation and use in learning algorithms are performed in a single process
Link: https://arxiv.org/abs/1810.00924
====================================================
Training Machine Learning Models by Regularizing their Explanations (Andrew Slavin Ross - 29 September, 2018)
However, they do not always scale to explaining predictions for entire datasets, are not always at the right level of abstraction, and most importantly cannot correct the problems they reveal. These methods let us train models which can not only provide more interpretable rationales for their predictions but also generalize better when training data is confounded or meaningfully different from test data (even adversarially so).
Link: https://arxiv.org/abs/1810.00869
====================================================
Domain-Adversarial Multi-Task Framework for Novel Therapeutic Property Prediction of Compounds (Lingwei Xie - 28 September, 2018)
Moreover, there is complex nonlinear dependency among heterogeneous data. Experiments on two real-world datasets illustrate that the performance of our approach obtains an obvious improvement over competitive baselines
Link: https://arxiv.org/abs/1810.00867
====================================================
Classification from Positive, Unlabeled and Biased Negative Data (Yu-Guan Hsieh - 1 October, 2018)
The fact that the training N data are biased also makes our work very different from those of standard semi-supervised learning. Experimental results demonstrate the effectiveness of our algorithm in not only PUbN learning scenarios but also ordinary PU learning scenarios on several benchmark datasets.
Link: https://arxiv.org/abs/1810.00846
====================================================
CHET: Compiler and Runtime for Homomorphic Evaluation of Tensor Programs (Roshan Dathathri - 1 October, 2018)
Fully Homomorphic Encryption (FHE) refers to a set of encryption schemes that allow computations to be applied directly on encrypted data without requiring a secret key
Link: https://arxiv.org/abs/1810.00845
====================================================
Set Transformer (Juho Lee - 1 October, 2018)
We show that our model is theoretically attractive and we evaluate it on a range of tasks, demonstrating increased performance compared to recent methods for set-structured data.
Link: https://arxiv.org/abs/1810.00825
====================================================