In this project, we summarize the paper "RelTR: Relation Transformer for Scene Graph Generation" and test a claim made by its authors.
The authors claim that RelTR[1] has fewer parameters than other scene graph generation models such as FCSGG[2].
FCSGG[2] has multiple configurations with different backbone architectures. The authors of RelTR do not specify which FCSGG configuration they compared against. Because RelTR uses ResNet50 as its backbone, we consider the comparison fair only when both models use the same backbone, so we compare RelTR against the ResNet50 configuration of FCSGG.
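The parameter counts in such a comparison are typically obtained by summing the element counts of a model's parameter tensors. The following is a minimal sketch of that idea, not the exact script used for this project. The function name `count_parameters_m` and the `trainable_only` flag are hypothetical, and whether frozen parameters were included in the reported numbers is an assumption left open here.

```python
def count_parameters_m(model, trainable_only=False):
    """Return the number of parameters of `model` in millions.

    Assumption: `model` exposes `parameters()` yielding objects that have
    `numel()` and `requires_grad`, as `torch.nn.Module` does. No PyTorch
    import is needed for the function itself.
    """
    total = sum(
        p.numel()
        for p in model.parameters()
        # Optionally skip frozen parameters (an assumed option, not taken
        # from the RelTR or FCSGG papers).
        if p.requires_grad or not trainable_only
    )
    return total / 1e6
```

With PyTorch installed, the function could be called on any instantiated `torch.nn.Module`, for example the RelTR and FCSGG models built with their ResNet50 configurations, and the two results compared directly.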
| Model | Parameters (M) ↓ |
|---|---|
| RelTR | 67.9 |
| FCSGG | 26.4 |
Table: Number of parameters, in millions (M), for RelTR[1] and FCSGG[2]. ↓ indicates that lower is better.
We observe that RelTR has more parameters than FCSGG (67.9 M versus 26.4 M) when both architectures use ResNet50 as their backbone.
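As a quick check of the arithmetic, the comparison can be reproduced from the two values reported in the table. The variable names below are illustrative only.

```python
# Parameter counts in millions, copied from the table in this report.
reported_m = {"RelTR": 67.9, "FCSGG": 26.4}

# The claim under test: RelTR has fewer parameters than FCSGG.
claim_holds = reported_m["RelTR"] < reported_m["FCSGG"]

# How far apart the two models are, as a ratio and as a difference.
ratio = reported_m["RelTR"] / reported_m["FCSGG"]
difference_m = reported_m["RelTR"] - reported_m["FCSGG"]

print(f"claim holds: {claim_holds}")
print(f"RelTR / FCSGG = {ratio:.2f}x, difference = {difference_m:.1f} M")
```

Under these numbers, RelTR has roughly 2.57 times as many parameters as the ResNet50 configuration of FCSGG.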
Result: The claim that RelTR has fewer parameters than FCSGG is false when both models use a ResNet50 backbone.