Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stack buffer overflow error found in void Assembly::readPDB(std::string filename, ...) in src/assembly.cpp #1208

Closed
Arpan3323 opened this issue Nov 26, 2024 · 2 comments

Comments

@Arpan3323
Copy link
Contributor

Hi, I am reporting a stack-buffer-overflow error found via testing the implementation of void Assembly::readPDB(std::string filename, bool use_segid_instead_of_chainid, bool do_sort). The error is caused by operations performed on line 220, reading 600 chars into a 100-char buffer which is defined at line 210.

relion/src/assembly.cpp

Lines 199 to 221 in f2e59d6

void Assembly::readPDB(std::string filename, bool use_segid_instead_of_chainid, bool do_sort)
{
// Clear existing object
clear();
std::ifstream fh(filename.c_str(), std::ios_base::in);
if (fh.fail())
REPORT_ERROR( (std::string) "Assembly::read: File " + filename + " does not exists" );
char line[100];
bool is_sorted = true;
int old_resnum = -1;
long int mol_id = -1;
long int res_id = -1;
std::string molname, alt_molname, old_molname="";
fh.seekg(0);
// Loop over all lines
while (fh.getline (line, 600))
{

Testing with a valid PDB file, such as this: pdb1hho.txt does not cause the problem as each line in this specific file is 80 chars maximum. But if we add a single line (anywhere in the input PDB file) that is more than 99 characters we get a stack-buffer-overflow.

For example: this file: pdb1hho_invalid.txt results in a stack-buffer-overflow because of the way data is read into char line[100]; currently.

Example to reproduce:

#include <cstdint>
#include <cstring>
#include <string>

#include "src/assembly.h"

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) 
{
    if(Size < 4) { return 0; }
    std::string name((reinterpret_cast<const char*>(Data)), Size);
    Assembly asmb(name);
    asmb.readPDB("pdb1hho_invalid.ent");
    return 0;
}

Crash report:

INFO: Running with entropic power schedule (0xFF, 100).
INFO: Seed: 3481883330
INFO: Loaded 1 modules   (3161 inline 8-bit counters): 3161 [0x564ece349da0, 0x564ece34a9f9), 
INFO: Loaded 1 PC tables (3161 PCs): 3161 [0x564ece34aa00,0x564ece356f90), 
INFO: -max_len is not provided; libFuzzer will not generate inputs larger than 4096 bytes
INFO: A corpus is not provided, starting from an empty corpus
#2      INITED cov: 2 ft: 2 corp: 1/1b exec/s: 0 rss: 34Mb
=================================================================
==39729==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7f3b5d600414 at pc 0x564ece2079cf bp 0x7fffad7de210 sp 0x7fffad7dd9d8
READ of size 101 at 0x7f3b5d600414 thread T0
    #0 0x564ece2079ce in strlen (assemblyFuzzer_readPDB+0x989ce) (BuildId: 43e4295b6d775d5dc60bada8894bab55ce85b87f)
    #1 0x564ece2e7a19 in std::char_traits<char>::length(char const*) /usr/bin/../lib/gcc/x86_64-linux-gnu/13/../../../../include/c++/13/bits/char_traits.h:399:9
    #2 0x564ece2e7a19 in std::basic_string_view<char, std::char_traits<char>>::basic_string_view(char const*) /usr/bin/../lib/gcc/x86_64-linux-gnu/13/../../../../include/c++/13/string_view:141:16
    #3 0x564ece2e7a19 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>::basic_string<char [100], void>(char const (&) [100], unsigned long, unsigned long, std::allocator<char> const&) /usr/bin/../lib/gcc/x86_64-linux-gnu/13/../../../../include/c++/13/bits/basic_string.h:785:35
    #4 0x564ece2e423c in Assembly::readPDB(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>, bool, bool) build/_deps/relion-src/src/assembly.cpp:223:15
    #5 0x564ece2f7df6 in LLVMFuzzerTestOneInput RelionFuzzers/fuzz_assembly/target_a/fuzzer.cpp:12:10
    #6 0x564ece1d6e64 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) (assemblyFuzzer_readPDB+0x67e64) (BuildId: 43e4295b6d775d5dc60bada8894bab55ce85b87f)
    #7 0x564ece1d6559 in fuzzer::Fuzzer::RunOne(unsigned char const*, unsigned long, bool, fuzzer::InputInfo*, bool, bool*) (assemblyFuzzer_readPDB+0x67559) (BuildId: 43e4295b6d775d5dc60bada8894bab55ce85b87f)
    #8 0x564ece1d7d45 in fuzzer::Fuzzer::MutateAndTestOne() (assemblyFuzzer_readPDB+0x68d45) (BuildId: 43e4295b6d775d5dc60bada8894bab55ce85b87f)
    #9 0x564ece1d88a5 in fuzzer::Fuzzer::Loop(std::vector<fuzzer::SizedFile, std::allocator<fuzzer::SizedFile>>&) (assemblyFuzzer_readPDB+0x698a5) (BuildId: 43e4295b6d775d5dc60bada8894bab55ce85b87f)
    #10 0x564ece1c5b7f in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) (assemblyFuzzer_readPDB+0x56b7f) (BuildId: 43e4295b6d775d5dc60bada8894bab55ce85b87f)
    #11 0x564ece1f0206 in main (assemblyFuzzer_readPDB+0x81206) (BuildId: 43e4295b6d775d5dc60bada8894bab55ce85b87f)
    #12 0x7f3b5eebe1c9  (/lib/x86_64-linux-gnu/libc.so.6+0x2a1c9) (BuildId: 6d64b17fbac799e68da7ebd9985ddf9b5cb375e6)
    #13 0x7f3b5eebe28a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2a28a) (BuildId: 6d64b17fbac799e68da7ebd9985ddf9b5cb375e6)
    #14 0x564ece1bab64 in _start (assemblyFuzzer_readPDB+0x4bb64) (BuildId: 43e4295b6d775d5dc60bada8894bab55ce85b87f)

Address 0x7f3b5d600414 is located in stack of thread T0 at offset 1044 in frame
    #0 0x564ece2e363f in Assembly::readPDB(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>, bool, bool) build/_deps/relion-src/src/assembly.cpp:200

  This frame has 44 object(s):
    [32, 552) 'fh' (line 205)
    [688, 720) 'ref.tmp' (line 208)
    [752, 784) 'ref.tmp17' (line 208)
    [816, 848) 'ref.tmp18' (line 208)
    [880, 912) 'ref.tmp40' (line 208)
    [944, 1044) 'line' (line 210)
    [1088, 1120) 'molname' (line 215) <== Memory access at offset 1044 partially underflows this variable
    [1152, 1184) 'alt_molname' (line 215)
    [1216, 1248) 'old_molname' (line 215)
    [1280, 1312) 'record' (line 223)
    [1344, 1376) 'str' (line 251)
    [1408, 1440) 'ref.tmp164' (line 252)
    [1472, 1504) 'ref.tmp171' (line 252)
    [1536, 1542) 'snum' (line 254)
    [1568, 1573) 'atomname' (line 254)
    [1600, 1602) 'altLoc' (line 254)
    [1616, 1620) 'resname' (line 254)
    [1632, 1634) 'chainID' (line 254)
    [1648, 1652) 'resnum' (line 255)
    [1664, 1666) 'insertion_residue_code' (line 256)
    [1680, 1684) 'x' (line 257)
    [1696, 1700) 'y' (line 257)
    [1712, 1716) 'z' (line 257)
    [1728, 1732) 'occupancy' (line 257)
    [1744, 1748) 'bfactor' (line 257)
    [1760, 1765) 'segID' (line 258)
    [1792, 1795) 'element' (line 258)
    [1808, 1811) 'charge' (line 258)
    [1824, 1856) 'str370' (line 303)
    [1888, 1920) 'ref.tmp384' (line 304)
    [1952, 1984) 'ref.tmp391' (line 304)
    [2016, 2048) 'ref.tmp424' (line 309)
    [2080, 2112) 'ref.tmp436' (line 309)
    [2144, 2176) 'str_chainID' (line 312)
    [2208, 2240) 'str_segID' (line 313)
    [2272, 2304) 'str_atomname' (line 314)
    [2336, 2368) 'str_resname' (line 315)
    [2400, 2432) 'agg.tmp572'
    [2464, 2496) 'agg.tmp578'
    [2528, 2560) 'agg.tmp630'
    [2592, 2624) 'agg.tmp723'
    [2656, 2728) 'atom' (line 379)
    [2768, 2800) 'agg.tmp752'
    [2832, 2856) 'ref.tmp770' (line 380)
SUMMARY: AddressSanitizer: stack-buffer-overflow (assemblyFuzzer_readPDB+0x989ce) (BuildId: 43e4295b6d775d5dc60bada8894bab55ce85b87f) in strlen
Shadow bytes around the buggy address:
  0x7f3b5d600180: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x7f3b5d600200: 00 00 00 00 00 f2 f2 f2 f2 f2 f2 f2 f2 f2 f2 f2
  0x7f3b5d600280: f2 f2 f2 f2 f2 f2 f8 f8 f8 f8 f2 f2 f2 f2 f8 f8
  0x7f3b5d600300: f8 f8 f2 f2 f2 f2 f8 f8 f8 f8 f2 f2 f2 f2 f8 f8
  0x7f3b5d600380: f8 f8 f2 f2 f2 f2 00 00 00 00 00 00 00 00 00 00
=>0x7f3b5d600400: 00 00[04]f2 f2 f2 f2 f2 00 00 00 00 f2 f2 f2 f2
  0x7f3b5d600480: 00 00 00 00 f2 f2 f2 f2 00 00 00 00 f2 f2 f2 f2
  0x7f3b5d600500: 00 00 00 00 f2 f2 f2 f2 f8 f8 f8 f8 f2 f2 f2 f2
  0x7f3b5d600580: f8 f8 f8 f8 f2 f2 f2 f2 f8 f8 f8 f8 f2 f2 f2 f2
  0x7f3b5d600600: f8 f2 f2 f2 f8 f2 f2 f2 f8 f2 f8 f2 f8 f2 f8 f2
  0x7f3b5d600680: f8 f2 f8 f2 f8 f2 f8 f2 f8 f2 f8 f2 f8 f2 f2 f2
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==39729==ABORTING

Proposed solution:

changing line 220: while (fh.getline (line, 600)) to while (fh.getline (line, 100)) should fix the issue without changing the expected behavior of the function.

Please let me know if I am mistaken or if you have any questions :)

Sincerely,
Arpan Srivastava

@biochem-fan
Copy link
Member

Thanks again for your report.

changing line 220: while (fh.getline (line, 600)) to while (fh.getline (line, 100)) should fix the issue without changing the expected behavior of the function.

Yes, I agree. Please send a pull request to the ver4.0 branch, which I will propagate to ver5.0 as well.

The PDB specification says one line of a valid PDB file is up to 80 characters. So 100 should be more than enough.

Every PDB file is presented in a number of lines. Each line in the PDB entry file consists of 80 columns. The last character in each PDB entry should be an end-of-line indicator.
https://www.wwpdb.org/documentation/file-format-content/format33/sect1.html

biochem-fan added a commit that referenced this issue Nov 26, 2024
fix stack buffer overflow error in Assembly::readPDB as discussed in #1208.
@biochem-fan
Copy link
Member

Fixed by the PR #1209.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants