Questions about reproducing the CMU-MOSEI task #31

Open
yanzichuan opened this issue Dec 20, 2024 · 1 comment


yanzichuan commented Dec 20, 2024

I hope this message finds you well. Thank you in advance for your time and expertise. I have tried to describe my questions clearly and completely, but please let me know if anything needs clarification. I deeply value your insights and guidance.

Thank you so much for your time and help!

**1. Should the dataset be split according to the standard folds in the official CMU-MOSEI SDK, or with the same split used in the CelebV-HQ preprocessing?**
CelebV-HQ preprocessing:

import os

def gen_split(root: str):
    videos = list(filter(lambda x: x.endswith('.mp4'), os.listdir(os.path.join(root, 'cropped'))))
    total_num = len(videos)

    with open(os.path.join(root, "train.txt"), "w") as f:
        for i in range(int(total_num * 0.8)):
            f.write(videos[i][:-4] + "\n")

    with open(os.path.join(root, "val.txt"), "w") as f:
        for i in range(int(total_num * 0.8), int(total_num * 0.9)):
            f.write(videos[i][:-4] + "\n")

    with open(os.path.join(root, "test.txt"), "w") as f:
        for i in range(int(total_num * 0.9), total_num):
            f.write(videos[i][:-4] + "\n")
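
To make the ratio-based split above easier to compare with the SDK folds, here is a minimal sketch of the same 80/10/10 logic as a pure function. The name `split_by_ratio` is hypothetical and not part of the repository. Sorting the names first is my own assumption, added because `os.listdir` returns entries in arbitrary order, so the unsorted version may not give the same split on every machine.

```python
from typing import Dict, List


def split_by_ratio(names: List[str]) -> Dict[str, List[str]]:
    """Split names into train/val/test by position (80/10/10).

    Hypothetical helper for illustration only. It mirrors the index
    arithmetic of gen_split, but sorts the names first (an assumption)
    so the result does not depend on directory listing order.
    """
    ordered = sorted(names)
    n = len(ordered)
    train_end = int(n * 0.8)  # same boundary as gen_split
    val_end = int(n * 0.9)    # same boundary as gen_split
    return {
        "train": ordered[:train_end],
        "val": ordered[train_end:val_end],
        "test": ordered[val_end:],
    }


if __name__ == "__main__":
    demo = [f"clip_{i:02d}" for i in range(10)]
    for split, members in split_by_ratio(demo).items():
        print(split, len(members))
```

Note that a positional split like this ignores which source video a clip came from, so clips from one video can end up in different splits.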

CMU-MOSEI Official SDK

standard_train_fold=['hh04W3xXa5s', 'GdFP_p4eQX0', '4iG0ffmnCOw', '81406', ...]
standard_test_fold=['7l3BNtSE0xc', 'dZFV0lyedX4', '286943', '126872', ...]
standard_valid_fold=['188343', 'VAXhC2U9-2A', 'AxNy9TeTLq8', ...]
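
If the SDK folds are the intended split, the assignment could look roughly like the sketch below. This is only an illustration under assumptions: `split_by_folds` is a hypothetical helper, and it assumes each clip is named `<video_id>_<segment>`, which may not match the actual preprocessing output. Clips whose video id is in none of the folds are collected separately instead of being silently dropped.

```python
from typing import Dict, Iterable, List


def split_by_folds(
    clip_names: Iterable[str],
    train_ids: Iterable[str],
    valid_ids: Iterable[str],
    test_ids: Iterable[str],
) -> Dict[str, List[str]]:
    """Assign clips to splits by the fold of their source video.

    Hypothetical helper. Assumes clip names look like
    "<video_id>_<segment>"; adjust the parsing if the files differ.
    """
    fold_of: Dict[str, str] = {}
    for fold, ids in (("train", train_ids), ("val", valid_ids), ("test", test_ids)):
        for video_id in ids:
            fold_of[video_id] = fold

    splits: Dict[str, List[str]] = {"train": [], "val": [], "test": [], "unassigned": []}
    for name in sorted(clip_names):
        # Split on the last underscore only, since video ids such as
        # 'GdFP_p4eQX0' can contain underscores themselves.
        video_id = name.rsplit("_", 1)[0]
        splits[fold_of.get(video_id, "unassigned")].append(name)
    return splits
```

With this approach all segments of one video stay in the same split, which is the main practical difference from the ratio-based split.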

2. For the emotion task, do we still train it as a multilabel task, so that the evaluation code needs no changes and only num_classes = 6 has to be set? And for the sentiment tasks, is the setup below correct?

def train_mosei(args, config):
    if task == "emotion": # multilabel
        num_classes = 6
    elif task == "sentiment-2": # binary
        num_classes = 1
    elif task == "sentiment-7": # multiclass
        num_classes = 7
    model = Classifier(
        num_classes, config["backbone"], True, args.marlin_ckpt, "multilabel", config["learning_rate"],
        args.n_gpus > 1,
    )
    dm = MoseiDataModule(
        data_path, finetune, task,
        batch_size=args.batch_size,
        num_workers=args.num_workers,
        clip_frames=backbone_config.n_frames,
        temporal_sample_rate=2
    )

    def step(self, batch: Optional[Union[Tensor, Sequence[Tensor]]]) -> Dict[str, Tensor]:
        x, y = batch
        y_hat = self(x)
        if self.task == "multilabel":
            y_hat = y_hat.flatten()
            y = y.flatten()
            prob = y_hat.sigmoid()
            acc = self.acc_fn(prob, y)
            auc = self.auc_fn(prob, y)
            loss = self.loss_fn(y_hat, y.float())
        elif self.task == "multiclass":
            prob = y_hat.softmax(dim=1)
            print(prob)
            print(y)
            acc = self.acc_fn(prob, y)
            auc = self.auc_fn(prob, y)
            loss = self.loss_fn(y_hat, y)
        elif self.task == "binary":
            prob = y_hat.sigmoid()
            acc = self.acc_fn(prob.squeeze(0), y)
            auc = self.auc_fn(prob.squeeze(0),  y.float())
            loss = self.loss_fn(y_hat.squeeze(0), y.float())


        return {"loss": loss, "acc": acc, "auc": auc}

The sentiment-7 task is a multi-category (multiclass) classification task. Is there a problem with the code as I have written it?
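
To make the question concrete: the `train_mosei` excerpt passes `"multilabel"` to `Classifier` for every task, while `step` branches on `"multilabel"`, `"multiclass"`, and `"binary"`. Below is a minimal sketch of how the task name could select both values together, plus one possible way to turn raw sentiment scores into labels. All three function names are hypothetical. CMU-MOSEI sentiment scores lie in [-3, 3]; the rounding rule and the "positive means score > 0" rule are assumptions on my side, not something confirmed by the repository.

```python
from typing import Tuple


def task_config(task: str) -> Tuple[int, str]:
    """Return (num_classes, classifier task type) for a MOSEI task name.

    Hypothetical helper; the task names follow the excerpt above.
    """
    if task == "emotion":
        return 6, "multilabel"
    if task == "sentiment-2":
        return 1, "binary"
    if task == "sentiment-7":
        return 7, "multiclass"
    raise ValueError(f"unknown task: {task!r}")


def sentiment_to_class7(score: float) -> int:
    """Map a sentiment score in [-3, 3] to a class index in 0..6.

    Assumption: round to the nearest integer and shift by 3. Python's
    round() sends exact .5 ties to the even neighbour.
    """
    clipped = max(-3.0, min(3.0, score))
    return int(round(clipped)) + 3


def sentiment_to_binary(score: float) -> int:
    """Map a sentiment score to 1 (positive) or 0 (otherwise).

    Assumption: a score of exactly 0 counts as not positive; other
    conventions exist.
    """
    return int(score > 0)


if __name__ == "__main__":
    num_classes, task_type = task_config("sentiment-7")
    print(num_classes, task_type)
    print(sentiment_to_class7(-1.2), sentiment_to_binary(-1.2))
```

For the multiclass branch, the class index has to be an integer label in 0..6 rather than the raw score, which is why the conversion is shown next to the task mapping.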

@yanzichuan
Author

1
