Authors:
- Despotidou Kalliopi Christina (Kallistina) (sdi2000045)
- Lionis Emmanouil Georgios (Akis) (sdi1900103)
This project takes records, saves them in a data.db file, and stores them with either a heap-file or a hash-table implementation. You can also build a secondary hash table that saves its index in index.db. Blocks and data are allocated so that anyone can later read the entries back from data.db or index.db and print them all. We have also gathered statistics for several variations.
When we run ht_main (the primary hash-table implementation), we observe that while records with integer ids from 1 to n (n = RECORDS_NUM) are inserted linearly, they are evenly distributed across the hash-table blocks. Because of this even fill, the buckets cannot avoid overflow, so new blocks are created.
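A minimal sketch of why consecutive ids spread evenly, assuming a modulo-style hash over 10 buckets; the constants and names below are illustrative, not the project's actual code:

```c
/* Sketch: consecutive integer ids cycle through the buckets of a
   modulo hash, so every bucket fills at the same rate. The bucket
   count and record count are illustrative assumptions. */
#include <stdio.h>

#define BUCKETS 10

static unsigned int hash_id(int id) {
    return (unsigned int)id % BUCKETS;
}

int main(void) {
    int counts[BUCKETS] = {0};
    for (int id = 1; id <= 200; id++)   /* e.g. RECORDS_NUM = 200 */
        counts[hash_id(id)]++;
    for (int b = 0; b < BUCKETS; b++)
        printf("Bucket %d: %d records\n", b, counts[b]);
    return 0;
}
```

With 5 records per block, 20 records per bucket give 4 blocks per bucket (1 primary + 3 overflow), which matches the statistics below.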
This file has 40 blocks
Bucket 0 Max Records: 5
Bucket 0 Average Records: 5.000000
Bucket 0 Min Records: 5
~~~
Bucket 1 Max Records: 5
Bucket 1 Average Records: 5.000000
Bucket 1 Min Records: 5
~~~
Bucket 2 Max Records: 5
Bucket 2 Average Records: 5.000000
Bucket 2 Min Records: 5
~~~
Bucket 3 Max Records: 5
Bucket 3 Average Records: 5.000000
Bucket 3 Min Records: 5
~~~
Bucket 4 Max Records: 5
Bucket 4 Average Records: 5.000000
Bucket 4 Min Records: 5
~~~
Bucket 5 Max Records: 5
Bucket 5 Average Records: 5.000000
Bucket 5 Min Records: 5
~~~
Bucket 6 Max Records: 5
Bucket 6 Average Records: 5.000000
Bucket 6 Min Records: 5
~~~
Bucket 7 Max Records: 5
Bucket 7 Average Records: 5.000000
Bucket 7 Min Records: 5
~~~
Bucket 8 Max Records: 5
Bucket 8 Average Records: 5.000000
Bucket 8 Min Records: 5
~~~
Bucket 9 Max Records: 5
Bucket 9 Average Records: 5.000000
Bucket 9 Min Records: 5
~~~
Average blocks per bucket: 4.000000
Buckets with overflow blocks: 10
Bucket 0 Overflow Blocks: 3
Bucket 1 Overflow Blocks: 3
Bucket 2 Overflow Blocks: 3
Bucket 3 Overflow Blocks: 3
Bucket 4 Overflow Blocks: 3
Bucket 5 Overflow Blocks: 3
Bucket 6 Overflow Blocks: 3
Bucket 7 Overflow Blocks: 3
Bucket 8 Overflow Blocks: 3
Bucket 9 Overflow Blocks: 3
When we run sht_main (the secondary hash-table implementation), we observe that the info entries for records with integer ids from 1 to n (n = MAX_RECORD) are NOT inserted linearly. Because the hashValue function distributes the info data according to the record's name, the entries are NOT evenly spread across the secondary hash-table blocks, and some buckets remain empty. On one hand, this makes it easier to find the block we are searching for, and the sht uses less memory than it otherwise would. On the other hand, many info entries land in the same bucket, so many new blocks are created in that one bucket while other buckets stay empty; compared to the primary hash table, the worst-case complexity of O(n) may then occur.

Also, two or more records may share the same name, so we cannot tell them apart by name. As a result, the getAllEntries function may print the same record two or more times. When we search the secondary hash table for a given name, each matching entry gives us only a (name, blockId) info pair and nothing else about the record. We then visit the block with id = blockId in the primary hash table, look through that block's record table for records with the given name, and print every match (possibly 2 or more). Then we go back to the secondary hash table and check whether any other info entry in the info table has the given name; if so, we repeat the same steps in the primary hash table. Because we would need the record ids to know which records were already printed, we eventually print them again.
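A hedged sketch of a name-based hash such as the secondary table's hashValue could use; djb2 is shown as one common choice, not necessarily the project's actual function:

```c
/* Sketch: hashing by name (djb2 here, as an illustrative choice)
   clusters entries unevenly across buckets, and equal names always
   collide into the same bucket. */
#include <stdio.h>

#define BUCKETS 10

static unsigned long hash_name(const char *name) {
    unsigned long h = 5381;
    for (const char *p = name; *p; p++)
        h = h * 33 + (unsigned char)*p;
    return h % BUCKETS;
}

int main(void) {
    const char *names[] = {"Maria", "Yannis", "Giorgos", "Maria"};
    for (int i = 0; i < 4; i++)
        printf("%-8s -> bucket %lu\n", names[i], hash_name(names[i]));
    return 0;
}
```

The two records named Maria land in the same bucket and are indistinguishable by name alone, which is exactly the duplicate-printing situation described above.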
This file has 11 blocks
Bucket 0 Max INFO: 12
Bucket 0 Average INFO: 12.000000
Bucket 0 Min INFO: 12
~~~
Bucket 1 Max INFO: 24
Bucket 1 Average INFO: 21.666667
Bucket 1 Min INFO: 17
~~~
Bucket 2 Max INFO: 0
Bucket 2 Average INFO: 0.000000
Bucket 2 Min INFO: 0
~~~
Bucket 3 Max INFO: 12
Bucket 3 Average INFO: 12.000000
Bucket 3 Min INFO: 12
~~~
Bucket 4 Max INFO: 16
Bucket 4 Average INFO: 16.000000
Bucket 4 Min INFO: 16
~~~
Bucket 5 Max INFO: 21
Bucket 5 Average INFO: 16.500000
Bucket 5 Min INFO: 12
~~~
Bucket 6 Max INFO: 22
Bucket 6 Average INFO: 22.000000
Bucket 6 Min INFO: 22
~~~
Bucket 7 Max INFO: 0
Bucket 7 Average INFO: 0.000000
Bucket 7 Min INFO: 0
~~~
Bucket 8 Max INFO: 24
Bucket 8 Average INFO: 20.000000
Bucket 8 Min INFO: 16
~~~
Bucket 9 Max INFO: 0
Bucket 9 Average INFO: 0.000000
Bucket 9 Min INFO: 0
~~~
Average blocks per bucket: 1.100000
Buckets with overflow blocks: 3
Bucket 1 Overflow Blocks: 2
Bucket 5 Overflow Blocks: 1
Bucket 8 Overflow Blocks: 1
There are several ways to run the code: either build it and then run the executable, or build and run with a single command.
To build, run make <Algorithm>
where Algorithm is hp, ht, or sht. In the build folder you will find the corresponding _main executable, which you can run.
For example: make sht
To build and run the program immediately, run make <Algorithm>-run
where Algorithm is again hp, ht, or sht.
For example: make sht-run
Running make clean
will delete all the .o and .d files, along with data.db and index.db.
There are 8 different code files: hp_file, hp_main, ht_table, ht_main, sht_table, sht_main, record, and bf_main. We also have 5 header files for the functions: bf, hp_file, ht_table, sht_table, and record.
hp_main creates the file, opens it, inserts some random records, prints the ones with a specific id, and closes the file.
hp_file has the functions of the heap implementation. The heap has the hp_info, which stores the file descriptor, the max records in the first block, and the max records per block. It also has the hp_block_info, which stores the record count and the id of the next block.
In the create, we allocate the first block, so we also create the first hp_block_info and the hp_info.
In the open, we retrieve the hp_info.
In the insert we:
- take the record
- find the last block
- check if there is room
- if there is room, insert it there
- if there is not, create a new block, insert into the new block, and link the previous block to the new one
Important: insertion starts from the first block, the one that already contains the hp_info and the hp_block_info, so as to save memory space.
In the get all entries, we go through all the blocks and check every record. Because we want to print all the records with the given id, we must pass through every block and every record.
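A minimal in-memory model of the insert steps and the full scan just described; the Block struct, capacities, and names are illustrative assumptions (the real code works on BF blocks in data.db, not on malloc'd nodes):

```c
#include <stdio.h>
#include <stdlib.h>

#define MAX_RECORDS 5

typedef struct Block {
    int records[MAX_RECORDS];   /* simplified records: just an id    */
    int count;                  /* like hp_block_info's record count */
    struct Block *next;         /* like the next-block id field      */
} Block;

/* Insert: find the last block; append if it has room, otherwise
   allocate a new block and link it to the chain. */
static void hp_insert_model(Block *first, int id) {
    Block *last = first;
    while (last->next) last = last->next;
    if (last->count == MAX_RECORDS) {
        last->next = calloc(1, sizeof(Block));
        last = last->next;
    }
    last->records[last->count++] = id;
}

/* Get all entries: every block and every record must be visited,
   so the cost is linear in the number of blocks. */
static void hp_get_all_entries_model(Block *first, int id) {
    for (Block *b = first; b; b = b->next)
        for (int i = 0; i < b->count; i++)
            if (b->records[i] == id)
                printf("found record %d\n", id);
}

int main(void) {
    Block *first = calloc(1, sizeof(Block)); /* in the real layout this
                                                block also holds hp_info */
    for (int id = 1; id <= 12; id++)
        hp_insert_model(first, id);
    hp_get_all_entries_model(7);
    return 0;
}
```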
In the close, we close the file and free the hp_info.
ht_main creates the file, opens it, inserts some random records, prints all the records with a specific id, closes the file, and computes the statistics of the hash table.
The hash statistics function prints
- the count of blocks for the data.db file,
- the minimum, average and maximum count of records for every bucket of this file,
- the average number of blocks for every bucket
- the number of buckets with overflow blocks and the number of overflow blocks for each such bucket
ht_table has the functions of the primary hash-table implementation.
The hash table has the ht_info, which stores the file descriptor, the number of buckets of this hash table, the position of the hash table, the max records in the first block of the first bucket, and the max records per block.
It also has the ht_block_info, which stores the record count and the id of the next block.
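Illustrative shapes for these two structs, with assumed field names that follow the description above rather than the project's actual headers:

```c
/* Assumed field names; the real ht_info / ht_block_info may differ. */
typedef struct {
    int fileDesc;        /* file descriptor of the open data.db       */
    int numBuckets;      /* number of buckets in the hash table       */
    int tablePosition;   /* where the hash table itself is stored     */
    int maxRecordsFirst; /* capacity of the first block, which also
                            holds the metadata                        */
    int maxRecords;      /* capacity of every other block             */
} ht_info;

typedef struct {
    int recordCount;     /* how many records this block holds         */
    int nextBlock;       /* id of the next block in the bucket's
                            chain, or -1 if this is the last one      */
} ht_block_info;
```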
In the create function, we allocate the first block, so we also create the first ht_block_info and the ht_info.
In the open function, we retrieve the ht_info.
In the insert we:
- take the record
- hash the record id to find the appropriate bucket and its last block
- check if there is room
- if there is room, insert it there
- if there is not, create a new block, insert into the new block, and link the previous block to the new one
Important: insertion starts from the first block of the first bucket, the one that already contains the ht_info and the ht_block_info, so as to save memory space.
In the get all entries function, we hash the value with the function we created to find the right bucket. Then we go through this bucket's blocks and check whether they contain the record with the given value. If so, we print it.
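A minimal in-memory model of this insert and lookup, assuming a modulo hash and the illustrative capacities from before (the real code chains BF blocks inside data.db):

```c
#include <stdio.h>
#include <stdlib.h>

#define BUCKETS     10
#define MAX_RECORDS 5

typedef struct Block {
    int records[MAX_RECORDS];
    int count;                /* like ht_block_info's record count */
    struct Block *next;       /* overflow chain within one bucket  */
} Block;

static Block *table[BUCKETS]; /* bucket -> first block */

static void ht_insert_model(int id) {
    Block **slot = &table[id % BUCKETS];      /* hash the record id  */
    while (*slot && (*slot)->count == MAX_RECORDS)
        slot = &(*slot)->next;                /* walk to a block with room */
    if (!*slot)
        *slot = calloc(1, sizeof(Block));     /* new (overflow) block */
    (*slot)->records[(*slot)->count++] = id;
}

static void ht_get_all_entries_model(int id) {
    /* Only the hashed bucket's chain is scanned, not the whole file. */
    for (Block *b = table[id % BUCKETS]; b; b = b->next)
        for (int i = 0; i < b->count; i++)
            if (b->records[i] == id)
                printf("found record %d\n", id);
}

int main(void) {
    for (int id = 1; id <= 200; id++)  /* e.g. RECORDS_NUM = 200 */
        ht_insert_model(id);
    ht_get_all_entries_model(42);
    return 0;
}
```

Unlike the heap scan, only one bucket's chain is read, which is the point of the hash table.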
In the close function, we close the file and free the ht_info.
sht_main creates the file, opens it, inserts some random records into the hash table and then into the secondary hash table, prints all the records while searching for a specific name, closes the file, and computes the statistics of the secondary hash table.
The hash statistics function prints
- the count of blocks for the index.db file,
- the minimum, average, and maximum count of info entries (name, blockId) for every bucket of this file's secondary hash table,
- the average number of blocks for every bucket
- the number of buckets with overflow blocks and the number of overflow blocks for each such bucket
sht_table has the functions of the secondary hash-table implementation.
We created a new struct, SHT_node_info, which stores the name of one record and the id of the ht hash-table block in which the whole record is stored.
The sht has the sht_info, which stores the file descriptor, the number of buckets of this hash table, the position of the hash table, the max records in the first block of the first bucket, and the max records per block.
It also has the sht_block_info, which stores the record count and the id of the next block.
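An illustrative shape for the new struct; the name length and field names are assumptions:

```c
#define MAX_NAME 15            /* assumed name length                */

typedef struct {
    char name[MAX_NAME];       /* the record's name                  */
    int  blockId;              /* id of the primary (ht) block that
                                  holds the full record              */
} SHT_node_info;
```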
In the create, we allocate the first block, so we also create the first sht_block_info and the sht_info.
In the open, we retrieve the sht_info.
In the insert we:
- take the record
- hash the record's name to find the appropriate bucket and its last block
- check if there is room
- if there is room, insert it there
- if there is not, create a new block, insert into the new block, and link the previous block to the new one
Important: insertion starts from the first block of the first bucket, the one that already contains the sht_info and the sht_block_info, so as to save memory space.
In the get all entries function, we hash the given name with the hash function we created to find the right bucket. Then we go through this bucket's blocks and check whether they contain an sht_node_info with the given name. If so, we visit the ht hash-table block whose id is paired with this name in the sht_node_info. We then search that block's records, and if we find a record with the given name, we print it; otherwise we print that an error occurred and return -1.
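A simplified model of this two-step lookup, including the duplicate-printing case described earlier; all data, sizes, and names here are illustrative assumptions:

```c
#include <stdio.h>
#include <string.h>

typedef struct { char name[15]; int blockId; } NodeInfo;
typedef struct { int id; char name[15]; } Record;

/* Primary (ht) blocks, as data.db would hold them: block 0 happens
   to contain two records with the same name. */
static Record primary[2][2] = {
    { {1, "Maria"},  {3, "Maria"}   },  /* block 0 */
    { {2, "Yannis"}, {4, "Giorgos"} },  /* block 1 */
};

/* One secondary bucket, as index.db would hold it: one (name, blockId)
   pair per inserted record, so "Maria" appears twice. */
static NodeInfo bucket[] = { {"Maria", 0}, {"Maria", 0} };

static void sht_get_all_entries_model(const char *name) {
    for (int i = 0; i < 2; i++) {
        if (strcmp(bucket[i].name, name) != 0)
            continue;
        /* Visit the primary block and print every matching record.
           The pair carries no record id, so we cannot tell that the
           second pair re-prints what the first already printed. */
        for (int j = 0; j < 2; j++) {
            Record *r = &primary[bucket[i].blockId][j];
            if (strcmp(r->name, name) == 0)
                printf("id=%d name=%s\n", r->id, r->name);
        }
    }
}

int main(void) {
    sht_get_all_entries_model("Maria"); /* prints ids 1 and 3 twice */
    return 0;
}
```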
In the close, we close the file and free the sht_info.
bf_main is an example of how the blocks can be handled.
record has the essential functions for the record files.