Data Type | Description | Default | Size |
---|---|---|---|
boolean |
true or false |
false |
1 bit |
char |
Unicode character | \u0000 |
16 bits |
byte |
twos complement integer | 0 |
8 bits |
short |
twos complement integer | 0 |
16 bits |
int |
twos complement integer | 0 |
32 bits |
long |
twos complement integer | 0 |
64 bits |
float |
IEEE 754 floating point | 0.0 |
32 bits |
double |
IEEE 754 floating point | 0.0 |
64 bits |
- Stores data elements based on a sequential (usually zero-based) index.
- Optimal for indexing. Bad for searching, inserting and deleting.
- Linear Arrays, or one dimensional arrays, are the most basic.
- Two Dimensional Arrays have
x
&y
indices like a grid or nested arrays. - Dynamic Arrays are like one dimensional arrays but have reserved space for additional elements.
- If a dynamic array is full, it copies its contents to a larger array (usually 2x in size).
- Access:
O(1)
(unsorted & sorted) - Search:
O(n)
(unsorted),O(log n)
(sorted) - Insertion:
O(1)
(without shifting / unsorted),O(n)
(with shifting / sorted) - Deletion:
O(1)
(without shifting / unsorted),O(n)
(with shifting / sorted) - Insertion at End of Dynamic Array:
O(1)
amortized
Array
: Oracle Docs, TutorialsPoint.ArrayList
: Oracle Docs, TutorialsPoint.Vector
: Similar toArrayList
but synchronized. Oracle Docs.String
: Oracle Docs.StringBuilder
: Oracle Docs.
- Stores data using nodes that have a datum and pointers to other nodes.
- Designed to optimize insertion and deletion during iteration. Slow at indexing and searching.
- Singly Linked Lists have nodes that reference the next node.
- Doubly Linked Lists have nodes that also reference the previous node.
- Circularly Linked Lists are linked lists whose tail reference the head.
- Stacks are commonly implemented using linked lists.
- Stacks are last in, first out (LIFO) data structures.
push()
&pop()
. - Implemented with a linked list where the head is the only place for insertion and removal.
- Stacks are last in, first out (LIFO) data structures.
- Queues are commonly implemented using linked lists.
- Queues are first in, first out (FIFO) data structures.
enqueue()
&dequeue()
. - Implemented with a linked list that only removes from the head and adds to the tail.
- Queues are first in, first out (FIFO) data structures.
- Double-Ended Queues or Deques are also commonly implemented using linked lists.
- Deques allow elements to be added and removed from either the head or tail of the queue.
- Access:
O(n)
(unsorted & sorted) - Search:
O(n)
(unsorted & sorted) - Insertion:
O(1)
(unsorted),O(n)
(sorted) - Deletion:
O(1)
(unsorted & sorted)
- Java Implementation of Linked List
- Java Implementation of Stack
- Java Implementation of Queue
LinkedList
: Oracle Docs, TutorialsPoint.Iterator
: TutorialsPoint.Stack
: Oracle Docs.Queue
: Oracle Docs.Deque
: Oracle Docs.
- Stores data as key-value pairs in a direct access table.
- Designed to optimize searching, insertion and deletion.
- Hash Functions accept a key (from an arbitrarily sized dataset) and map it to an output i.e. hash code (from a fixed sized dataset). This hash code is then mapped to an index for storage.
- This is known as hashing, whose motivation is to assign a unique index to every possible key.
- This is done because the actual key space may be too large while only a fraction of those keys may appear.
- A good hash function must:
- return a value within the hash table range.
- achieve an even distribution of indices from the keys that actually occur.
- be easy and quick to compute.
- Hash Collisions occur when a hash function returns the same hash code for two distinct keys or two different hash codes are mapped to the same index.
- Most hash functions have this problem.
- Closed Address Hashing - Chain together the keys that generate the same index in a linked list (
O(n)
) or binary search tree (O(log n)
).- Load Factor is
n/h
, wheren
is the number of records andh
is the number of hash cells.
- Load Factor is
- Open Address Hashing - Store all the elements in the hash table and handle hash collisions by rehashing i.e. look for an alternative slot.
- Linear Probing - Insert the colliding record in the next slot recursively. However, this can result in long runs of occupied slots.
- Quadratic Probing/Double Hashing - Use two hash functions to hash the key twice in case of a collision.
- Using open address hashing requires a hash cell to be marked as obsolete when a record is deleted to avoid stopping search.
- Hashes are important for associative arrays (i.e. key-value pairs) and database indexing.
- Search: Average Case:
O(1)
, Worst Case:O(n)
- Insertion: Average Case:
O(1)
, Worst Case:O(n)
- Deletion: Average Case:
O(1)
, Worst Case:O(n)
HashMap
:<K, V>
, no guarantee about iteration order,O(1)
. Oracle Docs, TutorialsPoint.TreeMap
:<K, V>
, iteration according to natural order of keys or externally suppliedComparator
,O(log n)
. Oracle Docs.LinkedHashMap
:<K, V>
, iteration according to insertion order,O(1)
. Oracle Docs.HashSet
:<T>
, no guarantee about iteration order,O(1)
. Oracle Docs, TutorialsPoint.TreeSet
:<T>
, iteration according to natural order of keys or externally suppliedComparator
,O(log n)
. Oracle Docs.LinkedHashSet
:<T>
, iteration according to insertion order,O(1)
. Oracle Docs.
- A tree is a data structure composed of nodes.
- Each tree has one root node that has zero or more child nodes and each of these child nodes has zero or more child nodes.
- Nodes without children are called leaves.
- A tree cannot contain cycles and nodes may not have links back to their parent nodes.
- Is a tree data structure where every node has at most two children.
- There is one left and one right child node.
- A complete binary tree is one where every level of the tree is fully filled, except for perhaps the last level. To the extent that the last level is filled, it is filled from left to right.
- A full binary tree is one where every node has either zero or two children.
- A perfect binary tree is one that is both full and complete. Perfect binary trees must have exactly
2^k - 1
nodes, wherek
is the number of levels. - A degenerate tree is an unbalanced tree, which if entirely one-sided is essentially a linked list.
- Used to implement binary search trees & binary heaps.
- Is a variant of an n-ary tree in which characters are stored at each node. Each path down the tree represents a word.
*
nodes (ornull
nodes) are used to indicate complete words.- A node in a trie could have anywhere from 1 through
ALPHABET_SIZE + 1
children (or 0 throughALPHABET_SIZE
children if a boolean value is used to indicate*
nodes). - Tries are usually used to store words for quick prefix lookups. While a hash table can quickly look up whether a string is a valid word, it cannot quickly verify if a string is a prefix of any valid words.
- Validate Prefix:
O(n)
(wheren
is length of prefix)
- A segment tree is a tree data structure for storing intervals or segments.
- It allows querying which of the stored segments contain a given element.
- It is usually a static structure i.e. it cannot be modified easily once it's built.
- A segment tree for a set
I
ofn
intervals usesO(n log n)
storage and can be built inO(n log n)
time. - Segment trees support searching for all the intervals that contain a query point in
O(log n + k)
,k
being the number of retrieved intervals or segments.
- A linked list based data structure where links (edges) can exist between any two nodes (vertices).
- Edges may be directed/undirected and weighted/unweighted. The graph may be cyclic.
- LL-based Implementation:
- A 1-D array is used to represent the vertices.
- A linked list or array is used for each vertex
v
, which contains the vertices that are adjacent tov
(adjacency list). - Vertices adjacent to a vertex can be found quickly.
- In an undirected graph, edges will be stored twice.
- Array-based Implementation:
- A 1-D array is used to represent the vertices.
- A 2-D boolean array (adjacency matrix) is used to represent the edges.
- Connectivity between two vertices can be verified quickly.
- In an undirected graph, the adjacency matrix will be symmetrical.
- An adjacency matrix is usually better for storing dense graphs while an adjacency list is better for storing sparse graphs.
- Traversal in an adjacency list implementation is more efficient than an adjacency matrix implementation since it is not necessary to iterate through all the nodes to find a node's neighbors.
- Two nodes are adjacent if they are connected by an edge.
- The degree of a node
i
is the number of nodes adjacent toi
. - A path is a sequence of edges that connect two nodes.
- A connected graph is a graph where a path exists between any two nodes.
- A complete graph is a graph where every vertex is directly connected to every other vertex.
- A clique is a complete subgraph.
- Traversal is similar to trees but vertices must be marked as visited (due to multiple possible paths) and all adjacent vertices must be visited (as opposed to just two in binary trees).
- Add Vertex: Adjacency List:
O(1)
, Adjacency Matrix:O(|V|^2)
- Add Edge: Adjacency List:
O(1)
, Adjacency Matrix:O(1)
- Remove Vertex: Adjacency List:
O(|V| + |E|)
, Adjacency Matrix:O(|V|^2)
- Remove Edge: Adjacency List:
O(|E|)
, Adjacency Matrix:O(1)
- Query for Adjacency: Adjacency List:
O(|V|)
, Adjacency Matrix:O(1)
- Storage Size: Adjacency List:
O(|V| + |E|)
, Adjacency Matrix:O(|V|^2)