This code reproduces the results presented in the paper "DAEMA: Denoising Autoencoder with Mask Attention" accepted at the ICANN 2021 conference.
DAEMA: Denoising Autoencoder with Mask Attention

This repository contains the code used for the paper DAEMA: Denoising Autoencoder with Mask Attention. The code documentation, generated by Sphinx, is available here.

Please cite as:

@article{tihon2021daema,
  title={DAEMA: Denoising Autoencoder with Mask Attention},
  author={Tihon, Simon and Javaid, Muhammad Usama and Fourure, Damien and Posocco, Nicolas and Peel, Thomas},
  journal={arXiv preprint arXiv:2106.16057},
  year={2021}
}

How to set up the environment

On a Local Machine

Create and activate a conda environment with Python 3.8.2:

conda create --name <env-name> python=3.8.2
conda activate <env-name>

Install the libraries listed in requirements.txt:

pip install -r requirements.txt

Run the code:

cd src
python run.py

With Docker

The repo also contains a Dockerfile to run the code:

docker build -t <image_name>:<tag> .
docker run -t --name <container-name> <image_name>:<tag> <experiment-to-run>

Example:

docker build -t daema:latest .
docker run -t --name daema_container daema:latest python run.py

Test your installation

You can test your installation by running:

PYTHONPATH=src/ pytest tests

How to reproduce the results of the paper

MCAR state-of-the-art comparison:

  • DAEMA: python run.py
  • DAE: python run.py --daema_attention_mode no --daema_ways 1
  • AimNet: python run.py --model Holoclean --batch_size 0 --lr 0.05 --metric_steps 18 19 20 21 22
  • MIDA: python run.py --model MIDA --batch_size -1 --metric_steps 492 494 496 498 500 --scaler MinMax
  • MissForest: python run.py --model MissForest --metric_steps 0 --scaler MinMax
  • Mean: python run.py --model Mean --metric_steps 0
  • Real: python run.py --model Real --metric_steps 0

MNAR state-of-the-art comparison:

  • Same as above, but with an additional argument: --ms_setting mnar
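
For intuition, the `--ms_setting` flag controls how artificial missingness is generated: MCAR (missing completely at random) removes values independently of the data, whereas MNAR ties missingness to the data itself. The following is a generic illustrative sketch of an MCAR mask, not the repository's implementation (`mcar_mask` is a hypothetical name):

```python
import numpy as np

# Illustrative MCAR mask: each entry is dropped independently of the data.
# This is a generic sketch, not the repository's implementation.
rng = np.random.default_rng(0)

def mcar_mask(shape, prop):
    """Return a boolean mask, True where a value is to be removed (MCAR)."""
    return rng.random(shape) < prop

X = rng.normal(size=(1000, 5))
mask = mcar_mask(X.shape, 0.2)          # --ms_prop 0.2 would correspond to 20%
X_missing = np.where(mask, np.nan, X)   # the imputer sees X_missing and mask
```

Under MNAR, the drop probability would instead depend on the values of X (e.g. dropping larger values more often), which is generally harder for imputation models.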

Missingness proportions:

  • Same as above, but with an additional argument (e.g. for 10% missingness): --ms_prop 0.1

Ablation study part 1 (not part of the paper in the end):

  • Full: python run.py
  • Classic: python run.py --daema_attention_mode classic
  • Sep.: python run.py --daema_attention_mode sep

Ablation study part 2 (not part of the paper in the end):

  • DAEMA: python run.py
  • Reduced loss: python run.py --daema_loss_type dropout_only
  • Full loss: python run.py --daema_loss_type full
  • No art. miss.: python run.py --daema_pre_drop 0

How to add a dataset

To test the code on a local dataset:

  • put the dataset in files/data/<name>.csv;
  • update the DATASETS variable in src/pipeline/datasets to add your dataset;
  • run the tests;
  • use the --datasets argument to select it for the experiments (e.g. python run.py --datasets <name>).
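
The steps above can be sketched as follows. The exact structure of the DATASETS registry depends on the repository (check the existing entries before copying this); the loader below is a hypothetical illustration:

```python
import numpy as np

# Hypothetical loader and registry entry; the real structure of the
# DATASETS variable in src/pipeline/datasets may differ.
def load_my_dataset(path="files/data/my_dataset.csv"):
    # Skip the header row and parse the remaining values as floats.
    return np.loadtxt(path, delimiter=",", skiprows=1, dtype=np.float32)

DATASETS = {
    "my_dataset": load_my_dataset,  # then select it via --datasets my_dataset
}
```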

How to add a model

To test the code on a custom model:

  • implement the model following the expected interface (see src/models/baseline_imputations/Identity for the basic structure);
  • update the MODELS variable in src/models/__init__.py to add your model;
  • run the tests;
  • use the --model argument to select it for the experiments (e.g. python run.py --model <Name>).
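
As a hedged illustration of what such a model might look like: an imputation model minimally needs to produce a completed matrix from the data and a missingness mask. The class and method names below are assumptions for illustration only; mirror the actual interface of src/models/baseline_imputations/Identity when adding a real model:

```python
import numpy as np

class ColumnMeanImputer:
    """Illustrative imputer: fills missing entries with column means.
    The fit/transform names are assumptions -- mirror the actual
    interface of src/models/baseline_imputations/Identity instead."""

    def fit(self, X, mask):
        # mask is True where a value is missing
        observed = np.where(mask, np.nan, X)
        self.col_means_ = np.nanmean(observed, axis=0)
        return self

    def transform(self, X, mask):
        # Keep observed values; replace missing ones with the column mean.
        return np.where(mask, self.col_means_, X)
```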
