
What is the capacity of a Slab?

A slab has a capacity of 2^31-1 * the capacity of its storage, up to 2^63-1 (Long.MAX_VALUE).

So for example, if the configured storage is a ByteBufferStorage then the slab capacity will be 2^31-1 * 2^31-1 = 2^62-2^32+1.

If the storage is a [DirectMemoryStorage](https://github.com/langera/slab/blob/master/src/main/java/com/yahoo/slab/storage/DirectMemoryStorage.java) (with a max capacity of 2^63-1 bytes) then the slab's max capacity will also be 2^63-1.
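
The numbers can be sanity-checked with plain arithmetic (a standalone sketch, not code from the Slab project):

```java
import java.math.BigInteger;

public class SlabCapacityMath {
    public static void main(String[] args) {
        BigInteger slots   = BigInteger.valueOf(Integer.MAX_VALUE); // 2^31-1
        BigInteger longMax = BigInteger.valueOf(Long.MAX_VALUE);    // 2^63-1 absolute ceiling

        // ByteBufferStorage: the storage capacity is itself capped at 2^31-1
        System.out.println(slots.multiply(slots));
        // prints 4611686014132420609 = 2^62-2^32+1

        // DirectMemoryStorage: storage capacity up to 2^63-1, so the slab hits the ceiling
        System.out.println(slots.multiply(longMax).min(longMax));
        // prints 9223372036854775807 = 2^63-1
    }
}
```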

Max capacity of the provided storages:

| Storage Type | Max Capacity (bytes) |
| --- | --- |
| DirectMemoryStorage | 2^63-1 |
| UnsafeByteArrayStorage | 2^31-1 |
| ByteBufferStorage (Direct) | 2^31-1 |
| ByteBufferStorage (Heap) | 2^31-1 |

Why limit the objects being stored to a "simple flat POJO"?

The limitation of the objects being stored to a "simple flat POJO" allows us to effectively manage the slab memory without having to handle complex defragmentation and compaction scenarios. Note that when offloading objects to a different storage mechanism, this limitation is imposed by the situation anyway, and object references must be handled as a special case regardless of the tool used to offload. This gives us a slab that can efficiently store POJOs whose number of instances is very dynamic in real-world applications, i.e. additions and removals happen constantly and at arbitrary points in the slab.

One such real-world example is the block information in the Hadoop HDFS Namenode.
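
For illustration, such a flat POJO could look like the class below (the names are hypothetical and loosely modelled on block information; they are not taken from the project or from HDFS source):

```java
// A "simple flat POJO": primitive fields only, no references to other objects.
// Every instance has the same fixed, flat layout, so it can be stored at a fixed
// offset in the slab and moved during compaction without chasing references.
public class BlockInfo {
    private long blockId;
    private long numBytes;
    private long generationStamp;

    // plain getters/setters over the primitive fields ...
}

// NOT a simple flat POJO: these fields are references to other heap objects,
// so the instance has no fixed flat layout and would need special handling.
class NotFlat {
    private String name;
    private long[] replicaIds;
}
```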

Why not use a java.nio.ByteBuffer?

java.nio.ByteBuffer is optimised for use as a buffer, not as a Collection. It does not support efficient removal of arbitrary sections of the buffer, does not consider fragmentation, and its compact() operation assumes that everything already read can be discarded. It also forces its own specific serialization mechanism for primitive values.
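
For example, compact() only preserves the bytes between the current position and the limit, copying them to the front of the buffer; anything already read is gone, and there is no way to drop just an arbitrary slice in the middle:

```java
import java.nio.ByteBuffer;

public class CompactSemantics {
    public static void main(String[] args) {
        ByteBuffer buffer = ByteBuffer.allocate(32);
        buffer.putLong(1L).putLong(2L).putLong(3L);

        buffer.flip();     // switch to reading: position=0, limit=24
        buffer.getLong();  // consume the first value (1L)

        buffer.compact();  // copies only the unread bytes (2L, 3L) to the start;
                           // the consumed 1L is discarded for good
        buffer.flip();
        System.out.println(buffer.getLong()); // 2
        System.out.println(buffer.getLong()); // 3
    }
}
```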

ByteBuffer is also limited in size to 2^31-1, which is just not enough in many real big data applications. The exception to this, java.nio.MappedByteBuffer, maps its buffer to a file and can hold much more data, but because the storage is backed by a file its latency is just horrendous.

The Slab project does have a ByteBufferStorage which uses ByteBuffers as the back-end storage for the slab. Our performance tests (see StoragePerfTest) and others (see [here](http://mechanical-sympathy.blogspot.co.uk/2012/07/native-cc-like-performance-for-java.html)) showed that it is outperformed by sun.misc.Unsafe.
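
For reference, the general pattern an Unsafe-backed storage relies on looks roughly like this (illustration only, not the project's DirectMemoryStorage code; obtaining sun.misc.Unsafe reflectively is not a public, supported API):

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

public class UnsafeSketch {
    public static void main(String[] args) throws Exception {
        // Grab the Unsafe singleton reflectively (illustration only).
        Field f = Unsafe.class.getDeclaredField("theUnsafe");
        f.setAccessible(true);
        Unsafe unsafe = (Unsafe) f.get(null);

        long capacityInBytes = 1L << 20;
        long base = unsafe.allocateMemory(capacityInBytes); // off-heap, addressed by long
        try {
            // Raw primitive access at arbitrary long offsets - no 2^31-1 index limit
            // and no buffer bookkeeping on every access.
            unsafe.putLong(base, 42L);
            unsafe.putLong(base + 8, 43L);
            System.out.println(unsafe.getLong(base));     // 42
            System.out.println(unsafe.getLong(base + 8)); // 43
        } finally {
            unsafe.freeMemory(base);
        }
    }
}
```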

Why not make Slab implement java.util.Collection? java.util.List?

The basic use case for a Slab is when you are willing to trade off object lookup performance for a reduction in memory consumption. This means you are probably using a lot of memory and need to support more than Integer.MAX_VALUE instances.

Therefore the Slab API uses long as the key to its objects, which makes it impossible to fit it into the regular java.util.Collection family. It does, however, implement java.lang.Iterable.
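
A hypothetical, minimal view of such a long-keyed API (this is not the project's actual interface; only add, get, compact and the long key are taken from this page, while remove and size are added purely for illustration):

```java
// Why a long key cannot fit java.util.List: List.get(int) and size() are bound to int.
interface LongKeyedSlab<T> extends Iterable<T> {
    long add(T instance);  // returns a long key/address into the slab storage
    T get(long key);       // lookup by long key, not by int index
    void remove(long key); // frees the slot; may leave a hole until compaction
    void compact();        // moves objects to close holes left by removals
    long size();           // can exceed Integer.MAX_VALUE
}
```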

Thread-Safety?

Slab is not thread-safe. This allows maximum performance if you only access it from a single thread and therefore don't need to pay any thread-safety penalty.

In a multi-threaded environment, access to the Slab state is done by the users of the stored objects (for example when calling add or get) and possibly by a separate caller which calls compact. We assume the application code can protect the user calls, and the Slab API offers event hooks for the compaction operation which allow efficient concurrency control by limiting access to the slab only at the point of a single move of an object inside the slab storage.
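
One possible way to arrange that, sketched with a plain lock (the wrapper class is hypothetical and the real event-hook API is not shown; only the idea of locking around a single move comes from the design described above):

```java
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

// Hypothetical guard for multi-threaded use: user calls (add/get) and each single
// object move performed during compaction share one lock, so the slab is never
// locked for a whole compaction run, only for the duration of one move at a time.
public class GuardedSlabAccess {
    private final ReentrantLock lock = new ReentrantLock();

    // Wrap application calls such as slab.add(...) or slab.get(key).
    public <R> R userCall(Supplier<R> call) {
        lock.lock();
        try {
            return call.get();
        } finally {
            lock.unlock();
        }
    }

    // Intended to be invoked from a compaction event hook, once per object moved.
    public void aroundSingleMove(Runnable moveOneObject) {
        lock.lock();
        try {
            moveOneObject.run();
        } finally {
            lock.unlock();
        }
    }
}
```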