
Performs a static analysis of a file containing C source code, identifying the location in memory of the variables defined within the code


AB20CS/MemoryModelAnalyzer


UNIX Memory Model Analyzer

Approach

Overview

The output is generated by reading off the entries in a series of structs that are filled up as the program reads through the source code. Below is a summary of the structs and their fields:

| struct | Fields |
| --- | --- |
| Stats | name, number of lines and number of functions in the source code being read |
| FunctionNode | name of the function, number of lines and variables in the function |
| MemNode | name, scope, type and size of the variable |

The program uses one Stats struct, one linked list of FunctionNode nodes and four linked lists of MemNode nodes (one list for each segment of memory: read-only data, static data, heap and stack).
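As an illustration of the layout summarized above, here is a minimal sketch of the three structs and a list-append helper. The field names, the NAME_LEN limit, the choice to keep size as text, and the helpers new_mem_node and append_mem_node are all assumptions made for this example; the actual definitions in SimpleMemModAnalyzer.c may differ.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define NAME_LEN 64 /* assumed maximum length of any stored text field */

/* One instance per analyzed file. */
typedef struct Stats {
    char name[NAME_LEN]; /* name of the source file */
    int num_lines;
    int num_functions;
} Stats;

/* One node per function found in the source file. */
typedef struct FunctionNode {
    char name[NAME_LEN];
    int num_lines;
    int num_vars;
    struct FunctionNode *next;
} FunctionNode;

/* One node per variable; each memory segment has its own list. */
typedef struct MemNode {
    char name[NAME_LEN];
    char scope[NAME_LEN]; /* "global" or the enclosing function's name */
    char type[NAME_LEN];
    char size[NAME_LEN];  /* kept as text (assumption) so an expression can be stored */
    struct MemNode *next;
} MemNode;

/* Copies src into a NAME_LEN buffer, truncating if needed. */
static void copy_field(char *dst, const char *src) {
    strncpy(dst, src, NAME_LEN - 1);
    dst[NAME_LEN - 1] = '\0';
}

/* Hypothetical constructor: returns a heap-allocated node, or NULL on failure. */
MemNode *new_mem_node(const char *name, const char *scope,
                      const char *type, const char *size) {
    MemNode *node = malloc(sizeof *node);
    if (node == NULL) return NULL;
    copy_field(node->name, name);
    copy_field(node->scope, scope);
    copy_field(node->type, type);
    copy_field(node->size, size);
    node->next = NULL;
    return node;
}

/* Hypothetical helper: appends node at the tail and returns the list head. */
MemNode *append_mem_node(MemNode *head, MemNode *node) {
    assert(node != NULL);
    if (head == NULL) return node;
    MemNode *cur = head;
    while (cur->next != NULL) cur = cur->next;
    cur->next = node;
    return head;
}
```

Appending at the tail keeps the variables in the order they were read, which is the order the sample outputs in the test cases list them in.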

Parsing Details

Deciding if a function is being read:

  • A boolean flag records whether a function is currently being read. While the flag is true, no line is treated as a function header.
  • The boolean flag is set to true when a function header is read.
  • In order to decide if a function header was read, the following criteria were checked:
    • the line does not contain ;
    • the line must contain (
    • the first word in the line must be an accepted variable type
  • The end of the function is detected when a line whose only non-whitespace character is } is read.

Deciding where in memory a variable is stored:

  • All parameters passed into functions are read by tokenizing the function header using commas as the delimiter. All of these variables are stored on the stack.
  • To determine if a variable is declared in a non-function-header line, the program checks if the line begins with an accepted variable type and ends with a semicolon.
    • If the line begins with const or static, the word after these keywords is used as the "first word".
  • If the static identifier is used or if the variable is read while a function is not being read, the variable is stored in the static data segment.
  • If malloc is a substring of the line, the pointer variable declared is stored on the stack and the space allocated is stored on the heap.
  • If =" or = " is a substring of the line, a string literal is initialized. The pointer is stored on the stack and the string is stored in the read-only (RO) data segment.
  • If there are square brackets in the line, an array is declared (will be stored on the stack). The size reserved by the array is decided in one of two ways:
    • If there is a numerical value between the square brackets, the numerical value indicates the number of elements. The size reserved is the number of elements multiplied by the size of a single element (determined by the array type).
    • If there is a curly bracket in the line, the array elements are initialized in the form type arr[] = {_, _, ... , _};. In this case, the number of elements is one more than the number of commas within the curly brackets.
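The two array-sizing rules above can be sketched as follows. The function names array_num_elements and array_size_bytes are hypothetical, and the fallback order (try the number between the brackets first, then count commas) is an assumption made for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns the number of array elements declared on line,
 * or -1 if neither rule applies. */
int array_num_elements(const char *line) {
    assert(line != NULL);
    const char *open = strchr(line, '[');
    if (open == NULL) return -1;
    const char *close = strchr(open, ']');
    if (close == NULL) return -1;

    /* Rule 1: a numerical value between the square brackets. */
    char *end;
    long n = strtol(open + 1, &end, 10);
    if (end != open + 1 && n > 0) {
        while (*end == ' ' || *end == '\t') end++;
        if (end == close) return (int)n;
    }

    /* Rule 2: an initializer list; elements = commas + 1.
     * Note that an empty list {} would also be counted as one element. */
    const char *lbrace = strchr(close, '{');
    if (lbrace == NULL) return -1;
    const char *rbrace = strchr(lbrace, '}');
    if (rbrace == NULL) return -1;
    int commas = 0;
    for (const char *p = lbrace; p < rbrace; p++) {
        if (*p == ',') commas++;
    }
    return commas + 1;
}

/* Size reserved = number of elements * size of a single element. */
size_t array_size_bytes(int num_elements, size_t element_size) {
    assert(num_elements >= 0);
    return (size_t)num_elements * element_size;
}
```

For example, float x[5]; gives 5 elements, and with a 4-byte float that is 20 bytes, matching the x entry in Test Case #2.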

Miscellaneous:

  • Any text after // on a line is ignored: strtok is used to keep only the text before it.
  • Any whitespace characters (i.e., tabs, spaces) at the beginning of each line are ignored when parsing the source code.
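A minimal sketch of both preprocessing steps is shown below. It swaps in strstr rather than strtok: strtok treats its delimiter argument as a set of single characters, so a delimiter of "//" would also split at a lone / such as a division. The function name strip_comment_and_indent is hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Truncates line in place at the first "//" and returns a pointer to its
 * first non-whitespace character. A "//" inside a string literal is also
 * treated as a comment start, which this sketch does not try to handle. */
char *strip_comment_and_indent(char *line) {
    assert(line != NULL);
    char *comment = strstr(line, "//");
    if (comment != NULL) *comment = '\0';         /* drop the comment text */
    while (isspace((unsigned char)*line)) line++; /* skip tabs and spaces */
    return line;
}
```

Because the line is modified in place, the caller should pass a writable buffer (for instance the one filled by fgets), not a string literal.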

Function Overview

Primary Functions

| Function | Description |
| --- | --- |
| `bool isFunctionHeader(char *line, char **types, int num_types)` | Returns true if and only if line is a function header, where types is an array of all valid types and num_types is the size of that array |
| `FunctionNode *initFunction(char *header, FunctionNode *func_head, MemNode *stack_head)` | Initializes a FunctionNode for the current function, whose header is header, and inserts the new node into the linked list whose head is func_head |
| `bool isVar(char *line, char **types, int num_types, FunctionNode *curr_func, MemNode *ro_head, MemNode *static_head, MemNode *heap_head, MemNode *stack_head)` | Returns true if and only if line contains a variable declaration; when it does, inserts the variable into the appropriate linked list. types is an array of all valid types and num_types is the size of that array |
| `int readFile(Stats *stats, int argc, char **argv, FunctionNode *func_head, MemNode *ro_head, MemNode *static_head, MemNode *heap_head, MemNode *stack_head)` | Reads the source file line by line. Returns 0 if and only if the user indicated a valid file |
| `void printOutput(Stats *stats, FunctionNode *func_head, MemNode *ro_head, MemNode *static_head, MemNode *heap_head, MemNode *stack_head)` | Prints the output from the information read from the source file |
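To illustrate the kind of decision isVar has to make, the placement rules from Parsing Details can be condensed into one classifier. This is a sketch under assumptions, not the actual implementation: the Segment enum, the HAS macro and segments_for_declaration are hypothetical names, the line is assumed to have its leading whitespace and comments already removed, and the order in which the rules are tested is a guess.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

typedef enum { SEG_RODATA, SEG_STATIC, SEG_HEAP, SEG_STACK } Segment;

/* Bit flag per segment, since one declaration can add entries to two lists. */
#define HAS(seg) (1u << (seg))

/* Returns the set of segments that receive an entry for this declaration. */
unsigned segments_for_declaration(const char *line, bool in_function) {
    assert(line != NULL);

    /* static keyword, or declared outside any function: static data. */
    if (strncmp(line, "static ", 7) == 0 || !in_function) {
        return HAS(SEG_STATIC);
    }

    /* Otherwise the variable itself (scalar, pointer or array) is on the stack. */
    unsigned result = HAS(SEG_STACK);

    /* malloc on the line: the allocated space is recorded on the heap. */
    if (strstr(line, "malloc") != NULL) result |= HAS(SEG_HEAP);

    /* =" or = " on the line: the string literal is recorded in RO data. */
    if (strstr(line, "=\"") != NULL || strstr(line, "= \"") != NULL) {
        result |= HAS(SEG_RODATA);
    }
    return result;
}
```

Returning a bit set rather than a single segment reflects the sample outputs, where a malloc or string-literal line produces both a stack entry and a heap or ROData entry.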

Helper Functions

| Function | Description |
| --- | --- |
| `int getSize(char *type, int num_elements)` | Returns the size of memory occupied by num_elements elements of type type |
| `void insertFuncNode(FunctionNode *func_head, FunctionNode *node)` | Inserts FunctionNode node into the linked list whose head is func_head |
| `void insertMemNode(MemNode *func_head, MemNode *node)` | Inserts MemNode node into the linked list whose head is func_head |
| `void deleteFunctionList(FunctionNode *head)` | Deletes the FunctionNode linked list whose head is head |
| `void deleteMemList(MemNode *head)` | Deletes the MemNode linked list whose head is head |
| `bool isWhitespace(char *str)` | Returns true if and only if str is whitespace |
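As a rough idea of what a size lookup like getSize might involve, the sketch below maps a type string to a byte count. The name size_of_type, the set of recognized base types and the use of sizeof on the machine running the analyzer are assumptions; the sample outputs (int 4, float 4, pointers 8) are consistent with a typical 64-bit platform, but this README does not say how the sizes are obtained.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns the bytes occupied by num_elements elements of the given type,
 * or -1 if the base type is not recognized. */
int size_of_type(const char *type, int num_elements) {
    assert(type != NULL && num_elements >= 0);
    size_t unit;
    if (strchr(type, '*') != NULL) {
        unit = sizeof(void *); /* any pointer type, e.g. "char **" */
    } else if (strstr(type, "char") != NULL) {
        unit = sizeof(char);
    } else if (strstr(type, "float") != NULL) {
        unit = sizeof(float);
    } else if (strstr(type, "double") != NULL) {
        unit = sizeof(double);
    } else if (strstr(type, "int") != NULL) {
        unit = sizeof(int);
    } else {
        return -1; /* unknown base type */
    }
    return (int)unit * num_elements;
}
```

Checking for * first means qualifiers such as const in "const char *" do not affect the result.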

Running the Program

  1. Navigate to the directory (i.e., cd) in which SimpleMemModAnalyzer.c is saved on your machine.
  2. In the terminal, enter gcc SimpleMemModAnalyzer.c -o SimpleMemModAnalyzer to compile the program.
  3. To execute the program, enter ./SimpleMemModAnalyzer file.c into the terminal, where file is the name of the file with the C source code.

Test Cases

Test Case #1

  • Contains global variables (part of static data)
  • All variables in functions should go to the stack
  • Empty heap and RO data
  • Comments are present (should be ignored by program)

Source Code:

// evil global variables
int evil_glob_var_1;
float evil_glob_var_2;

void fun1(int x)
{
  int y;
  int z;
  printf("%d \n", x+y+z); 
}

int fun2(float z)
{
   float x;
   return (int)(z+x);
}

int main(int argc, char** argv)
{
  int w;

  fun1(w);
  fun2();

  return 0;
}

Output:

>>> Memory Model Layout <<<
***  exec // text ***
   prog1.c

### ROData ###       scope  type  size

### static data ###
   evil_glob_var_1   global   int   4
   evil_glob_var_2   global   float   4

### heap ###

####################
### unused space ###
####################

### stack ###
   x   fun1   int   4
   y   fun1   int   4
   z   fun1   int   4
   z   fun2   float   4
   x   fun2   float   4
   argc   main   int   4
   argv   main   char**   8
   w   main   int   4

**** STATS ****
  - Total number of lines in the file: 26
  - Total number of functions: 3
    fun1, fun2, main
  - Total number of lines per functions:
    fun1: 3
    fun2: 2
    main: 6
  - Total number of variables per function:
    fun1: 3
    fun2: 2
    main: 3
//////////////////////////////

Test Case #2

  • Has variables that are present on stack
  • Has dynamically-allocated memory (using malloc)
    • Both in a line in which the pointer variable is declared as well as in a line in which it is not declared
  • Has an array
  • Comments are present (should be ignored by program)

Source Code:

/* prog2.c */


void f1(int **i, const char *x)
{
 i = malloc(sizeof(int));
 int *h = malloc(sizeof(char **));
}


int f2()
{

  float x[5];
}

int main(int argc, char **argv)
{
  int i;
  int j;

  i = f2();
  f1(j);
}

Output:

***  exec // text ***
   prog2.c

### ROData ###       scope  type  size

### static data ###

### heap ###
   i   f1   int    4
   h   f1   int   8

####################
### unused space ###
####################

### stack ###
   i   f1   int **   8
   x   f1   const char *   8
   h   f1   int*   8
   x   f2   float[]   20
   argc   main   int   4
   argv   main   char **   8
   i   main   int   4
   j   main   int   4

**** STATS ****
  - Total number of lines in the file: 24
  - Total number of functions: 3
    f1, f2, main
  - Total number of lines per functions:
    f1: 2
    f2: 2
    main: 5
  - Total number of variables per function:
    f1: 3
    f2: 1
    main: 4
//////////////////////////////

Test Case #3

  • Has variables that are present on stack
  • Has global variables and variables declared as static (both are stored in static data)

Source Code:

/* prog3.c */

int x = 10;
int y;
 
int f(int p, int q)
{
    static int j = 5;

    x = 5;
    return p * q + j;
}
 
int main()
{
   int i = x;
 
   y = f(i, i);
   return 0;
}

Output:

>>> Memory Model Layout <<<
***  exec // text ***
   prog3.c

### ROData ###       scope  type  size

### static data ###
   x   global   int   4
   y   global   int   4
   j   f   static int   4

### heap ###

####################
### unused space ###
####################

### stack ###
   p   f   int   4
   q   f   int   4
   i   main   int   4

**** STATS ****
  - Total number of lines in the file: 20
  - Total number of functions: 2
    f, main
  - Total number of lines per functions:
    f: 4
    main: 4
  - Total number of variables per function:
    f: 3
    main: 1
//////////////////////////////

Test Case #4

  • Has variables that are present on stack
  • Has arrays that are declared but not initialized
  • Has functions whose return type is a pointer
  • Has a malloc call whose argument is neither an integer nor of the form sizeof(T), where T is a type

Source Code:

/* string_tools.c */

char *concat_wrong(const char *s1, const char *s2) 
{
  char result[70];

  strcpy (result, s1);
  strcat (result, s2);

  return result;
}


char *concat(const char *s1, const char *s2)
{
  char *result;

  result = malloc(strlen(s1) + strlen(s2) + 1);
  if (result == NULL) {
      printf ("Error: malloc failed\n");
      exit(1);
  }

  strcpy (result, s1);
  strcat (result, s2);

  return result;
}

Output:

>>> Memory Model Layout <<<
***  exec // text ***
   string_tools.c

### ROData ###       scope  type  size

### static data ###

### heap ###
   result   concat   char   strlen(s1) + strlen(s2) + 1

####################
### unused space ###
####################

### stack ###
   s1   concat_wrong   const char *   8
   s2   concat_wrong   const char *   8
   result   concat_wrong   char[]   70
   s1   concat   const char *   8
   s2   concat   const char *   8
   result   concat   char*   8

**** STATS ****
  - Total number of lines in the file: 28
  - Total number of functions: 2
    concat_wrong, concat
  - Total number of lines per functions:
    concat_wrong: 6
    concat: 12
  - Total number of variables per function:
    concat_wrong: 3
    concat: 3
//////////////////////////////

Test Case #5

  • Has variables that are present on stack
  • Has variables of the same type listed in a row, separated by commas
  • Has a for loop
  • Has an array initialized
  • Has a string literal

Source Code:

/* sumArray.c */

int sum(int *a, int size) {
   int i, s = 0;

   for(i = 0; i < size; i++) 
       s += a[i];

   return s;
}


int main()
{
  int N = 5;
  int i[N] = {10, 9, 8, 7, 6};
  char string[1024] = "Hello World";

  printf("sum is %d\n", sum(i,N));

  return 0;
}

Output:

>>> Memory Model Layout <<<
***  exec // text ***
   sumArray.c

### ROData ###       scope  type  size
   string   main   char[]   1024

### static data ###

### heap ###

####################
### unused space ###
####################

### stack ###
   a   sum   int *   8
   size   sum   int   4
   i   sum   int   4
   N   main   int   4
   i   main   int[]   20
   string   main   char[]   1024

**** STATS ****
  - Total number of lines in the file: 22
  - Total number of functions: 2
    sum, main
  - Total number of lines per functions:
    sum: 6
    main: 7
  - Total number of variables per function:
    sum: 3
    main: 3
//////////////////////////////
