Supplementary materials for the paper "Evaluating ChatGPT-4 Vision on Brazil's National Undergraduate Computer Science Exam."

nabormendonca/gpt-4v-enade-cs-2021

ChatGPT-4 Vision Evaluation on ENADE 2021 Bachelor in Computer Science Exam

This repository contains supplementary materials for the study reported in the paper "Evaluating ChatGPT-4 Vision on Brazil's National Undergraduate Computer Science Exam," which has been accepted for publication in ACM Transactions on Computing Education.

The two tables below give an overview of the open questions and the multiple-choice questions, respectively, of the ENADE 2021 Bachelor in Computer Science exam. Each table lists the subject, modality, reasoning strategy, and scoring status of each question, along with the accuracy of ChatGPT-4 Vision and the main challenges or errors it encountered in answering it. Follow the links in the rightmost column of each table to view the model's full conversations and the expert assessments (when available) for each question, in English and Portuguese.

Open Questions

| # | Subject | Modality | Reasoning Strategy | Status | Model Accuracy | Model Challenges / Error Categories | Model Conversations and Expert Assessments |
|---|---|---|---|---|---|---|---|
| 03 | Theory of Computing | Visual | Direct | Scored | Partly correct | Logical Reasoning / Incorrect Multi-step Reasoning, Visual Acuity / Misidentification of Visual Elements | English version, Portuguese version |
| 04 | Computer Architecture | Visual | Direct | Scored | Partly correct | Visual Acuity / Lack of Domain-specific Visual Output | English version, Portuguese version |
| 05 | Algorithms | Visual | Direct | Scored | Partly correct | Logical Reasoning / Incorrect Algorithmic Reasoning | English version, Portuguese version |

Multiple-choice Questions

| # | Subject | Modality | Reasoning Strategy | Status | Model Accuracy | Model Challenges / Error Categories | Model Conversations and Expert Assessments |
|---|---|---|---|---|---|---|---|
| 09 | Operating Systems | Text | Direct | Scored | Incorrect | Logical Reasoning / Insufficient Domain Knowledge | English version, Portuguese version |
| 10 | Programming | Text | Direct | Scored | Correct | | English version, Portuguese version |
| 11 | Artificial Intelligence | Visual | Indirect | Scored | Correct | | English version, Portuguese version |
| 12 | Computers in Society | Text | Indirect | Not scored | Incorrect | Logical Reasoning / Evaluation Leniency, Logical Reasoning / Incorrect Multi-step Reasoning | English version, Portuguese version |
| 13 | Software Engineering | Text | Direct | Not scored | Incorrect | Question Interpretation / Inconsistent Responses, Logical Reasoning / Insufficient Domain Knowledge | English version, Portuguese version |
| 14 | Computer Architecture | Text | Indirect | Scored | Incorrect | Logical Reasoning / Evaluation Leniency, Logical Reasoning / Insufficient Domain Knowledge | English version, Portuguese version |
| 15 | Software Engineering | Text | Indirect | Scored | Correct | | English version, Portuguese version |
| 16 | Computer Architecture | Visual | Direct | Scored | Correct | | English version, Portuguese version |
| 17 | Operating Systems | Visual | Indirect | Not scored | Correct | | English version, Portuguese version |
| 18 | Artificial Intelligence | Text | Indirect | Scored | Incorrect | Question Interpretation / Inconsistent Responses, Logical Reasoning / Incorrect Multi-step Reasoning | English version, Portuguese version |
| 19 | Human-Computer Interaction | Text | Indirect | Scored | Correct | | English version, Portuguese version |
| 20 | Programming | Text | Indirect | Scored | Correct | | English version, Portuguese version |
| 21 | Computer Networks | Visual | Direct | Not scored | Incorrect | Logical Reasoning / Evaluation Leniency | English version, Portuguese version |
| 22 | Information Systems | Visual | Direct | Scored | Correct | | English version, Portuguese version |
| 23 | Programming | Visual | Direct | Scored | Correct | | English version, Portuguese version |
| 24 | Computer Security | Text | Indirect | Scored | Correct | | English version, Portuguese version |
| 25 | Distributed Systems | Text | Indirect | Not scored | Incorrect | Question Interpretation / Inconsistent Responses, Logical Reasoning / Incorrect Multi-step Reasoning | English version, Portuguese version |
| 26 | Web Development | Text | Indirect | Scored | Incorrect | Logical Reasoning / Insufficient Domain Knowledge, Logical Reasoning / Incorrect Multi-step Reasoning | English version, Portuguese version |
| 27 | Performance Analysis | Visual | Direct | Scored | Correct | | English version, Portuguese version |
| 28 | Computer Architecture | Visual | Indirect | Scored | Correct | | English version, Portuguese version |
| 29 | Image Processing | Visual | Indirect | Invalid | | | English version, Portuguese version |
| 30 | Compilers | Text | Indirect | Scored | Correct | | English version, Portuguese version |
| 31 | Theory of Computing | Visual | Indirect | Scored | Incorrect | Question Interpretation / Non-Compliance with Guidelines, Logical Reasoning / Incorrect Algorithmic Reasoning, Logical Reasoning / Incorrect Multi-step Reasoning, Visual Acuity / Misidentification of Visual Elements | English version, Portuguese version |
| 32 | Algorithms | Text | Indirect | Not scored | Correct | | English version, Portuguese version |
| 33 | Programming | Text | Direct | Invalid | | | English version, Portuguese version |
| 34 | Graph Theory | Visual | Direct | Scored | Incorrect | Visual Acuity / Misidentification of Visual Elements | English version, Portuguese version |
| 35 | Distributed Systems | Text | Indirect | Scored | Correct | | English version, Portuguese version |
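
For readers who want a quick summary of the multiple-choice table, the sketch below tallies questions by status and model accuracy. It is only an illustration: the `ROWS` list is a hand transcription of the Status and Model Accuracy columns above, and the names `ROWS` and `tally` are hypothetical, not part of this repository or of the paper's analysis.

```python
from collections import Counter

# (question number, status, model accuracy), transcribed by hand from the
# multiple-choice table. Invalid questions have no accuracy entry, so they
# are recorded with None. This list is an assumption-free copy of the table,
# but the variable and function names are illustrative only.
ROWS = [
    (9, "Scored", "Incorrect"),
    (10, "Scored", "Correct"),
    (11, "Scored", "Correct"),
    (12, "Not scored", "Incorrect"),
    (13, "Not scored", "Incorrect"),
    (14, "Scored", "Incorrect"),
    (15, "Scored", "Correct"),
    (16, "Scored", "Correct"),
    (17, "Not scored", "Correct"),
    (18, "Scored", "Incorrect"),
    (19, "Scored", "Correct"),
    (20, "Scored", "Correct"),
    (21, "Not scored", "Incorrect"),
    (22, "Scored", "Correct"),
    (23, "Scored", "Correct"),
    (24, "Scored", "Correct"),
    (25, "Not scored", "Incorrect"),
    (26, "Scored", "Incorrect"),
    (27, "Scored", "Correct"),
    (28, "Scored", "Correct"),
    (29, "Invalid", None),
    (30, "Scored", "Correct"),
    (31, "Scored", "Incorrect"),
    (32, "Not scored", "Correct"),
    (33, "Invalid", None),
    (34, "Scored", "Incorrect"),
    (35, "Scored", "Correct"),
]


def tally(rows):
    """Count how many questions fall into each (status, accuracy) pair."""
    return Counter((status, accuracy) for _, status, accuracy in rows)


counts = tally(ROWS)
scored_total = sum(n for (status, _), n in counts.items() if status == "Scored")
scored_correct = counts[("Scored", "Correct")]

# Prints: Scored questions answered correctly: 13/19
print(f"Scored questions answered correctly: {scored_correct}/{scored_total}")
```

Keeping the status and accuracy as a pair makes it easy to report the scored, not scored, and invalid questions separately rather than folding them into a single figure.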

The following materials are also available:
