Datasets

The input to the CCGL model is a .txt file in which each line represents one cascade:

cascade_id \t adoption_id \t adoption_time \t num_adoptions \t [a list of adoptions separated by spaces] \n

An example of Weibo cascade:

1	1	1464710400	41	1:0 1/2:22032 1/3:30685 1/4:32169 1/5:34580 1/6:29372 1/7:16459 1/8:11292 1/9:22293 1/10:6970 1/11:5530 1/12:2822 1/13:12772 1/14:1019 1/15:3360 1/16:21422 1/17:1333 1/18:1643 1/19:1518 1/20:669 1/21:2191 1/22:207 1/23:2880 1/24:445 1/25:23626 1/26:2514 1/27:681 1/28:2038 1/29:4815 1/30:99 1/31:2329 1/32:884 1/33:243 1/34:1931 1/35:236 1/36:908 1/37:7108 1/38:1501 1/39:1287 1/40:549 1/41:376

Each adoption, e.g., 1/2:22032, means that user 2 retweeted user 1 at time 22032.
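The line format above can be read with a few lines of Python. This is a minimal sketch, not the loader shipped with CCGL: the names `Adoption`, `Cascade`, and `parse_cascade_line` are hypothetical, and it assumes that every adoption has the form `path:time`, where `path` is a `/`-separated list of user ids.

```python
from typing import List, NamedTuple


class Adoption(NamedTuple):
    path: List[str]  # user ids along the retweet path, root user first
    time: int        # time value stored after the colon


class Cascade(NamedTuple):
    cascade_id: str
    adoption_id: str
    adoption_time: int
    num_adoptions: int
    adoptions: List[Adoption]


def parse_cascade_line(line: str) -> Cascade:
    """Parse one tab-separated cascade line (hypothetical helper)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 5:
        raise ValueError(f"expected 5 tab-separated fields, got {len(fields)}")
    cascade_id, adoption_id, adoption_time, num_adoptions, raw = fields

    adoptions = []
    for item in raw.split():
        # Split on the last colon: "1/2:22032" -> path "1/2", time "22032".
        path, _, time = item.rpartition(":")
        adoptions.append(Adoption(path.split("/"), int(time)))

    return Cascade(cascade_id, adoption_id, int(adoption_time),
                   int(num_adoptions), adoptions)


# Shortened version of the Weibo example, for illustration only.
example = "1\t1\t1464710400\t3\t1:0 1/2:22032 1/3:30685\n"
cascade = parse_cascade_line(example)
```

In each parsed adoption, the last element of `path` is the adopting user and the element before it is the user they retweeted.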

Caveat: About the seed

Because of legacy issues in the code, reproducing the correct dataset split requires a specific seed: use 'xovee' (a string) for the Weibo, ACM, and DBLP datasets, and 0 (an integer) for the Twitter and APS datasets.
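The seed rule can be kept in one place so that the string/integer distinction is not lost. The sketch below is only an illustration: `SPLIT_SEEDS` and `shuffled_split` are hypothetical names, the split ratios are assumptions, and the function is not the split routine used by CCGL, so it should not be expected to reproduce the official split.

```python
import random

# Seeds stated in this README; the type matters ('xovee' is a str, 0 is an int).
SPLIT_SEEDS = {
    "weibo": "xovee",
    "acm": "xovee",
    "dblp": "xovee",
    "twitter": 0,
    "aps": 0,
}


def shuffled_split(items, dataset, train_ratio=0.7, val_ratio=0.15):
    """Shuffle with the dataset's seed and cut into train/val/test.

    The ratios are assumed values for illustration, not taken from CCGL.
    """
    rng = random.Random(SPLIT_SEEDS[dataset.lower()])
    shuffled = list(items)
    rng.shuffle(shuffled)

    n_train = int(len(shuffled) * train_ratio)
    n_val = int(len(shuffled) * val_ratio)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


train, val, test = shuffled_split(range(100), "Weibo")
```

Using a dedicated `random.Random` instance keeps the split independent of any other code that touches the global random state.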

Caveat: About Weibo dataset

As described in the DeepHawkes paper, the cascades in the Weibo dataset are those published between 8 AM and 6 PM (note, however, that the time used in its code is different). If you want to compare CCGL with DeepHawkes, CasCN, and other baselines, make sure the time window is set consistently. Golden rule: make sure the train, validation, and test sets are identical for all baselines.
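One way to keep the time window consistent across baselines is to apply a single, explicit filter to the publication time of every cascade. The sketch below is an assumption-heavy illustration: `in_publication_window` is a hypothetical helper, and the default UTC offset of +8 hours, the half-open interval (8 AM inclusive, 6 PM exclusive), and the interpretation of `adoption_time` as a Unix timestamp in seconds are all assumptions that should be checked against the code you are comparing with.

```python
from datetime import datetime, timedelta, timezone


def in_publication_window(unix_time, start_hour=8, end_hour=18,
                          utc_offset_hours=8):
    """Return True if the local publication hour is in [start_hour, end_hour).

    utc_offset_hours=8 is an assumed local time zone, not a value
    confirmed by this README.
    """
    tz = timezone(timedelta(hours=utc_offset_hours))
    hour = datetime.fromtimestamp(unix_time, tz).hour
    return start_hour <= hour < end_hour


# Apply the same call, with the same arguments, when preparing data
# for every baseline.
keep = in_publication_window(1464710400 + 8 * 3600)
```

Changing `utc_offset_hours` shifts which cascades pass the filter, which is why the same setting has to be used for CCGL and every baseline.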
