This repository contains supplementary materials for the study reported in the paper "Evaluating ChatGPT-4 Vision on Brazil's National Undergraduate Computer Science Exam", which has been accepted for publication in ACM Transactions on Computing Education.
The two tables below give an overview of the open questions and the multiple-choice questions of the ENADE 2021 Bachelor in Computer Science exam. Each table lists every question's subject, modality, reasoning strategy, and scoring status, along with the accuracy of ChatGPT-4 Vision and the main challenges and errors it encountered when answering. Follow the links in the rightmost column of each table to view the model's full conversations and, when available, the expert assessments for each question in English and Portuguese.
Open Questions
# | Subject | Modality | Reasoning Strategy | Status | Model Accuracy | Model Challenges / Error Categories | Model Conversations and Expert Assessments |
---|---|---|---|---|---|---|---|
03 | Theory of Computing | Visual | Direct | Scored | Partly correct | Logical Reasoning / Incorrect Multi-step Reasoning, Visual Acuity / Misidentification of Visual Elements | English version, Portuguese version |
04 | Computer Architecture | Visual | Direct | Scored | Partly correct | Visual Acuity / Lack of Domain-specific Visual Output | English version, Portuguese version |
05 | Algorithms | Visual | Direct | Scored | Partly correct | Logical Reasoning / Incorrect Algorithmic Reasoning | English version, Portuguese version |
Multiple-choice Questions
# | Subject | Modality | Reasoning Strategy | Status | Model Accuracy | Model Challenges / Error Categories | Model Conversations and Expert Assessments |
---|---|---|---|---|---|---|---|
09 | Operating Systems | Text | Direct | Scored | Incorrect | Logical Reasoning / Insufficient Domain Knowledge | English version, Portuguese version |
10 | Programming | Text | Direct | Scored | Correct | | English version, Portuguese version |
11 | Artificial Intelligence | Visual | Indirect | Scored | Correct | | English version, Portuguese version |
12 | Computers in Society | Text | Indirect | Not scored | Incorrect | Logical Reasoning / Evaluation Leniency, Logical Reasoning / Incorrect Multi-step Reasoning | English version, Portuguese version |
13 | Software Engineering | Text | Direct | Not scored | Incorrect | Question Interpretation / Inconsistent Responses, Logical Reasoning / Insufficient Domain Knowledge | English version, Portuguese version |
14 | Computer Architecture | Text | Indirect | Scored | Incorrect | Logical Reasoning / Evaluation Leniency, Logical Reasoning / Insufficient Domain Knowledge | English version, Portuguese version |
15 | Software Engineering | Text | Indirect | Scored | Correct | | English version, Portuguese version |
16 | Computer Architecture | Visual | Direct | Scored | Correct | | English version, Portuguese version |
17 | Operating Systems | Visual | Indirect | Not scored | Correct | | English version, Portuguese version |
18 | Artificial Intelligence | Text | Indirect | Scored | Incorrect | Question Interpretation / Inconsistent Responses, Logical Reasoning / Incorrect Multi-step Reasoning | English version, Portuguese version |
19 | Human-Computer Interaction | Text | Indirect | Scored | Correct | | English version, Portuguese version |
20 | Programming | Text | Indirect | Scored | Correct | | English version, Portuguese version |
21 | Computer Networks | Visual | Direct | Not scored | Incorrect | Logical Reasoning / Evaluation Leniency | English version, Portuguese version |
22 | Information Systems | Visual | Direct | Scored | Correct | | English version, Portuguese version |
23 | Programming | Visual | Direct | Scored | Correct | | English version, Portuguese version |
24 | Computer Security | Text | Indirect | Scored | Correct | | English version, Portuguese version |
25 | Distributed Systems | Text | Indirect | Not scored | Incorrect | Question Interpretation / Inconsistent Responses, Logical Reasoning / Incorrect Multi-step Reasoning | English version, Portuguese version |
26 | Web Development | Text | Indirect | Scored | Incorrect | Logical Reasoning / Insufficient Domain Knowledge, Logical Reasoning / Incorrect Multi-step Reasoning | English version, Portuguese version |
27 | Performance Analysis | Visual | Direct | Scored | Correct | | English version, Portuguese version |
28 | Computer Architecture | Visual | Indirect | Scored | Correct | | English version, Portuguese version |
29 | Image Processing | Visual | Indirect | Invalid | | | English version, Portuguese version |
30 | Compilers | Text | Indirect | Scored | Correct | | English version, Portuguese version |
31 | Theory of Computing | Visual | Indirect | Scored | Incorrect | Question Interpretation / Non-Compliance with Guidelines, Logical Reasoning / Incorrect Algorithmic Reasoning, Logical Reasoning / Incorrect Multi-step Reasoning, Visual Acuity / Misidentification of Visual Elements | English version, Portuguese version |
32 | Algorithms | Text | Indirect | Not scored | Correct | | English version, Portuguese version |
33 | Programming | Text | Direct | Invalid | | | English version, Portuguese version |
34 | Graph Theory | Visual | Direct | Scored | Incorrect | Visual Acuity / Misidentification of Visual Elements | English version, Portuguese version |
35 | Distributed Systems | Text | Indirect | Scored | Correct | | English version, Portuguese version |
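
For readers who want to tally the results programmatically, the following is a minimal Python sketch; it is not part of the study's materials. It assumes each row follows the eight-column layout of the table header, with an empty cell where no accuracy or error category is recorded. The sample rows are copied from the multiple-choice table, and the names `COLUMNS`, `parse_rows`, and `accuracy_by_modality` are illustrative only.

```python
from collections import Counter

# Sample rows copied from the multiple-choice table, written in the
# eight-column layout of the header (assumption: empty cells are kept).
ROWS = """\
09 | Operating Systems | Text | Direct | Scored | Incorrect | Logical Reasoning / Insufficient Domain Knowledge | English version, Portuguese version |
10 | Programming | Text | Direct | Scored | Correct | | English version, Portuguese version |
11 | Artificial Intelligence | Visual | Indirect | Scored | Correct | | English version, Portuguese version |
21 | Computer Networks | Visual | Direct | Not scored | Incorrect | Logical Reasoning / Evaluation Leniency | English version, Portuguese version |
29 | Image Processing | Visual | Indirect | Invalid | | | English version, Portuguese version |
"""

# Hypothetical short names for the eight table columns.
COLUMNS = ["number", "subject", "modality", "strategy",
           "status", "accuracy", "errors", "links"]


def parse_rows(text):
    """Turn pipe-delimited table rows into a list of dicts keyed by COLUMNS."""
    records = []
    for line in text.strip().splitlines():
        line = line.strip()
        if line.endswith("|"):
            line = line[:-1]  # drop the single trailing pipe that closes the row
        cells = [cell.strip() for cell in line.split("|")]
        if len(cells) != len(COLUMNS):
            raise ValueError(f"expected {len(COLUMNS)} cells, got {len(cells)}: {line!r}")
        record = dict(zip(COLUMNS, cells))
        # Error categories are comma-separated; an empty cell means none listed.
        record["errors"] = [e.strip() for e in record["errors"].split(",") if e.strip()]
        records.append(record)
    return records


def accuracy_by_modality(records):
    """Count (modality, accuracy) pairs, skipping rows with no accuracy (invalid questions)."""
    return Counter((r["modality"], r["accuracy"]) for r in records if r["accuracy"])


if __name__ == "__main__":
    for (modality, accuracy), count in sorted(accuracy_by_modality(parse_rows(ROWS)).items()):
        print(f"{modality:6} {accuracy:10} {count}")
```

The same functions can be applied to the full table once every row is in the eight-column form; invalid questions are skipped because they carry no accuracy value.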
The following materials are also available:
- ChatGPT-4 Vision and Turbo prompt templates [ English version ] [ Portuguese version ]
- ENADE 2021 Bachelor in Computer Science exam document [ Portuguese version ]
- ENADE 2021 Bachelor in Computer Science exam response standard (for open questions) [ Portuguese version ]
- ENADE 2021 Bachelor in Computer Science exam answer key (for multiple-choice questions) [ Portuguese version ]
- ENADE 2021 Bachelor in Computer Science exam results and statistics [ Portuguese version (external link) ]