-
Notifications
You must be signed in to change notification settings - Fork 2
Database Schema
This is the database schema that is created when the scm2pgsql project has been run. Below is a list of tables and information regarding the records that are stored in each along with their description.
##Commits This table stores the information for a commit such as commit id, author, email, comment, commit date and brand id. Please note git stores commits chronologically but this order is not necessary correct. The commit should be viewed in a parent and child relation ship as stored in Commit_family table. Columns are as follows.
id | commit_id | author | author_email | comments | commit_date | branch_id |
integer not null auto_increment | character varying(255) | character varying(255) | character varying(255) | text | timestamp with time zone (ex. "2012-05-07 09:33:49-07") | character varying(255) |
##File_diffs This table stores the differences of a file between 2 commits. First commit has an empty old_commit_id. Each entry represents a diff in a file which contains the raw text, the start character, end character and diff type.
file_id | new_commit_id | old_commit_id | diff_text | char_start | char_end | diff_type |
character varying(255) | character varying(255) | character varying(255) | text | integer | integer | character varying(30) |
###Notes
- Char_start and Char_end of DIFF_MODIFYDELETE are relative to OLD FILE, and DIFF_MODIFYINSERT are relative to NEW FILE.
- Char_end is offset by 1. For example, char_end = 10 means the last character of the diff is at index 9.
- OLD FILE = EQUAL + DIFF_MODIFYDELETE
- NEW FILE = EQUAL + DIFF_MODIFYINSERT
- EQUAL can be calculated if we have either OLD FILE or NEW FILE. Since we have DIFF_MODIFYDELETE and MODIFYINSERT, one can figure out the NEW FILE from an OLD FILE and vice versa.
Diff type could be one of the following:
- DIFF_ADD - file didn't exist in the old commit and was added in the new commit. A pair of commits should only have 1 Add entry.
- DIFF_DELETE - file existed in the old commit and was deleted in the new commit. The file_id is the file name in the old commit. A pair of commits should only have 1 pair of Delete entry.
- DIFF_MODIFYINSERT - file exists in the old commit and was inserted a block of codes in the new commit. A pair of commits can have multiple modifyinsert entries.
- DIFF_MODIFYDELETE - file exists in the old commit and was deleted a block of codes in the new commit. A pair of commits can have multiple modifydelete entries.
- DIFF_COPY - file was copied to different place. Never run into this case.
- DIFF_RENAME - file was renamed to new file name. Currently JGit sees a rename file as a DIFF_DELETE and DIFF_ADD instead of DIFF_RENAME. This type is here just in case JGit decides to handle rename differently.
- DIFF_UNKNOWN - error while reading the file diff, unknown type
##Branches This table stores the branch id and all of its commit id. A commit can appear in multiple branches. The first commit always has a NULL branch (just the way JGit handle it).
branch_id | branch_name | commit_id |
character varying(255) | character varying(100) | character varying(255) |
##Commit_family This table stores the whole commit graphs. Each entry show a parent and child relationship between two commits. A Parent can have multiple children and a child can have multiple Parents.
parent | child |
character varying(255). Not null | character varying(255). Not null |
##Owners This table stores the JGit Blame for a file in a given commit. The ownership gets updated only when a file is changed in a commit. The ownership does not store the ownership of a file for every single commit.
commit_id | source_commit_id | owner_id | file_id | char_start | char_end | change_type |
character varying(255). Not null | character varying(255). Not null | character varying(255). Not null | character varying(255). Not null | integer. Not null | integer. Not null | character varying(12). Not null |
##Networks This table stores the network generated from the callgraph analyzer.
new_commit_id | old_commit_id | network_id |
character varying(255) | character varying(255) | integer. Not null, auto increment |
###Notes
- Indexes: "networks_pkey" PRIMARY KEY, btree (network_id) Referenced by: TABLE "edges" CONSTRAINT "edges_network_id_fkey" FOREIGN KEY (network_id) REFERENCES networks(network_id) ON DELETE CASCADE TABLE "nodes" CONSTRAINT "nodes_network_id_fkey" FOREIGN KEY (network_id) REFERENCES networks(network_id) ON DELETE CASCADE
##Nodes This table stores the node for Gephi graph visualizer.
id | label | network_id |
character varying(255). Not null | character varying(255) | integer |
###Notes
- Foreign-key constraints: "nodes_network_id_fkey" FOREIGN KEY (network_id) REFERENCES networks(network_id) ON DELETE CASCADE
##Edges This table stores the Edges for Gephi graph visualizer.
source | target | weight | is_fuzzy | network_id |
character varying(255) | character varying(255) | real | boolean | integer |
###Notes
- Foreign-key constraints: "edges_network_id_fkey" FOREIGN KEY (network_id) REFERENCES networks(network_id) ON DELETE CASCADE