This repository contains the code used in the paper "CLEAR: Ranked Multi-Positive Contrastive Representation Learning for Robust Trajectory Similarity Computation".
- Ubuntu OS
- Python 3.9.13 (tested)
- PyTorch 1.13.0 (tested)
We mainly follow t2vec to preprocess the datasets, but reimplement all of its Julia scripts in Python. We support two trajectory datasets of different moving objects: Porto and GeoLife. Taking Porto as an example, preprocessing consists of the following steps:
- Unify the datasets in different formats:
python preprocess/preprocess.py -dataset_name "porto"
Then you'll get a .pkl file called "porto.pkl" in "data/porto".
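If you want to sanity-check the unified file, you can load it directly. The sketch below is only illustrative; the exact contents of "porto.pkl" depend on preprocess.py.

```python
# Illustrative only: inspect the unified data file produced by preprocess.py.
# The actual schema (DataFrame, list of trajectories, etc.) may differ.
import pickle

with open("data/porto/porto.pkl", "rb") as f:
    data = pickle.load(f)

print(type(data))  # check the top-level container before going further
```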
- Data augmentation.
python preprocess/augmentation.py
Then you'll get a .pkl file named like "porto_distort_rate_0.2.pkl" in "data/porto/augmentation". Feel free to use multiprocessing :-)
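The script itself handles the distortion/downsampling variants. The sketch below only illustrates the general idea (a hypothetical point-dropping augmentation run with multiprocessing); it is not the repository's implementation.

```python
# Hypothetical sketch: randomly drop points from each trajectory in parallel.
# augmentation.py implements the actual distortion/downsampling variants.
import random
from multiprocessing import Pool

def downsample(traj, rate=0.2):
    """Drop roughly `rate` of the points, always keeping both endpoints."""
    last = len(traj) - 1
    return [p for i, p in enumerate(traj)
            if i in (0, last) or random.random() > rate]

if __name__ == "__main__":
    # dummy trajectories of (lon, lat) pairs
    trajs = [[(x * 0.001, x * 0.001) for x in range(20)] for _ in range(8)]
    with Pool(4) as pool:
        augmented = pool.map(downsample, trajs)
    print(len(augmented), len(augmented[0]))
```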
- Token generation.
python preprocess/grid_partitioning.py -dataset_name "porto"
Then you'll get a series of .pkl files named like "porto_distort_rate_0.2_token.pkl" in "data/porto/token/cell-100_minfreq-50". Again, feel free to use multiprocessing. Node2vec pretraining over the partitioned space is also run in this step; the resulting node embedding file is "../data/porto/porto_size-256_cellsize-100_minfreq-50_node2vec.pkl".
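Conceptually, grid partitioning maps each GPS point to a fixed-size cell and keeps cells whose frequency exceeds minfreq. The sketch below only shows the cell-index idea under assumed coordinate handling; grid_partitioning.py defines the real boundaries, token ids, and filtering.

```python
# Conceptual sketch of grid tokenization; grid_partitioning.py defines the
# real study-area boundaries, cell ids, and the minfreq filtering.
def point_to_cell(lon, lat, min_lon, min_lat, cell_size_m=100.0):
    """Map a GPS point to (col, row) of a grid with ~cell_size_m metre cells."""
    deg_per_cell = cell_size_m / 111_000.0  # rough metres-to-degrees conversion
    col = int((lon - min_lon) / deg_per_cell)
    row = int((lat - min_lat) / deg_per_cell)
    return col, row

print(point_to_cell(-8.61, 41.15, min_lon=-8.73, min_lat=41.10))
```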
You can train CLEAR with the following settings.
python main.py -dataset_name "porto" -combination "multi" -loss "pos-rank-out-all" -model_name "clear-DualRNN" -pretrain_mode "pf" -pretrain_method "node2vec" -batch_size 64 -cell_size 100 -minfreq 50 -aug1_name "distort" -aug1_rate 0.4 -aug2_name "downsampling" -aug2_rate 0.4
The trained model will be saved as "{}_checkpoint.pt" and "{}_best.pt". To facilitate the ablation study, the files are named like "clear-DualRNN_grid_cell-100_minfreq-50_multi-downsampling-distort-246_pos-rank-out-all_batch-64_pretrain-node2vec-pf_porto_checkpoint.pt" and saved in "data/porto". To reproduce the results of the other variants mentioned in our paper, modify parameters such as combination, loss, batch_size, cell_size, and minfreq accordingly.
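To check a saved model afterwards, you can load the checkpoint with PyTorch. The file name below follows the naming pattern above, and the stored keys depend on how main.py writes the checkpoint, so adapt as needed.

```python
# Load a saved checkpoint for inspection; the exact keys (model state,
# optimizer state, epoch, ...) depend on how main.py writes the file.
import torch

ckpt_path = ("data/porto/clear-DualRNN_grid_cell-100_minfreq-50_"
             "multi-downsampling-distort-246_pos-rank-out-all_batch-64_"
             "pretrain-node2vec-pf_porto_checkpoint.pt")
state = torch.load(ckpt_path, map_location="cpu")
print(state.keys() if isinstance(state, dict) else type(state))
```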
We support three types of evaluation metrics, i.e., "self-similarity", "cross-similarity", and "knn". Taking "self-similarity" as an example, you can follow the steps below to reproduce the results.
- Prepare the experimental dataset.
python ./experiment/experiment.py -mode data -dataset_name "porto" -exp_list "self-similarity" -partition_method "grid" -cell_size 100 -minfreq 50
Then you'll get the experimental dataset in "experiment/self-similarity/porto/cellsize-100-minfreq-50".
- Encode.
python ./experiment/experiment.py -mode encode -dataset_name "porto" -exp_list "self-similarity" -partition_method "grid" -cell_size 100 -minfreq 50 -combination "multi" -loss "pos-rank-out-all" -batch_size 64 -aug1_name "distort" -aug1_rate 0.4 -aug2_name "downsampling" -aug2_rate 0.4 -model_name "clear-DualRNN" -pretrain_mode "pf" -pretrain_method "node2vec"
Then you'll get the encoded vectors for the self-similarity experimental set, named with a suffix corresponding to your model, in "experiment/self-similarity/porto/cellsize-100-minfreq-50".
- Experiment.
python ./experiment/experiment.py -mode experiment -dataset_name "porto" -exp_list "self-similarity" -partition_method "grid" -cell_size 100 -minfreq 50 -combination "multi" -loss "pos-rank-out-all" -batch_size 64 -aug1_name "distort" -aug1_rate 0.4 -aug2_name "downsampling" -aug2_rate 0.4 -model_name "clear-DualRNN" -pretrain_mode "pf" -pretrain_method "node2vec"
Then you'll get the experimental results (.csv file) in "experiment".
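As a rough intuition for the self-similarity metric, the evaluation ranks each query trajectory's counterpart among the database vectors. The sketch below computes a mean rank from two embedding matrices; it is only a conceptual illustration, not the script's exact protocol or file format.

```python
# Conceptual illustration of a mean-rank style self-similarity measure:
# query_vecs[i] and db_vecs[i] are assumed to come from the same trajectory.
import numpy as np

def mean_rank(query_vecs, db_vecs):
    dists = np.linalg.norm(query_vecs[:, None, :] - db_vecs[None, :, :], axis=-1)
    true_dist = np.diag(dists)[:, None]
    ranks = (dists < true_dist).sum(axis=1) + 1  # 1 = counterpart is nearest
    return ranks.mean()

q = np.random.randn(5, 8)
d = q + 0.01 * np.random.randn(5, 8)
print(mean_rank(q, d))  # close to 1.0 when each counterpart is nearest
```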
We provide the unified data file and our trained model for Porto in "data".
CLEAR has been accepted by IEEE MDM 2024 and was selected as the Best Paper Runner-up.
If you find our work useful and inspiring, please consider citing our paper:
@inproceedings{li2024clear,
title={CLEAR: Ranked Multi-Positive Contrastive Representation Learning for Robust Trajectory Similarity Computation},
author={Li, Jialiang and Liu, Tiantian and Lu, Hua},
booktitle={2024 25th IEEE International Conference on Mobile Data Management (MDM)},
pages={21--30},
year={2024},
organization={IEEE}
}