Threads
Threads, like processes, are managed by the OS. Threads share the address space of their parent process. A process spawns threads through OS-provided functionality (e.g. CreateThread or pthread_create).
On a single processor, multithreading generally occurs by time-division multiplexing (as in multitasking): the processor switches between different threads. This context switching generally happens frequently enough that the user perceives the threads or tasks as running at the same time. On a multiprocessor (including multi-core systems), the threads or tasks actually run at the same time, with each processor or core running a particular thread or task.
Many modern operating systems directly support both [time-sliced](http://en.wikipedia.org/wiki/Preemption_(computing\)#Time_slice) and multiprocessor threading with a [process scheduler](http://en.wikipedia.org/wiki/Scheduling_(computing\)).
Like processes, threads are subject to preemptive multitasking, and the OS decides when they are preempted. A thread cannot prevent its own preemption, but a mutex can keep other threads out of a critical section for as long as the thread holds the lock. Windows also offers critical section objects through the two functions EnterCriticalSection and LeaveCriticalSection.
Communication between threads is far simpler than inter-process communication. This is mainly due to the shared memory, but also because the strict security constraints the OS puts on processes do not exist between threads of the same process.
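To illustrate how little ceremony shared memory needs, here is a minimal sketch using std::thread from the C++ standard library. The function name sumOnWorkerThread is hypothetical and not part of the original text. The worker writes its result straight into a variable owned by the calling thread.

```cpp
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical helper: sums the values on a worker thread and hands the
// result back through a variable that both threads can see.
long sumOnWorkerThread(const std::vector<int>& values)
{
    long result = 0;  // lives on the caller's stack, visible to the worker

    std::thread worker([&values, &result]() {
        for (int v : values)
            result += v;  // the worker writes directly into shared memory
    });

    // join() waits for the worker and makes its writes visible here, so this
    // simple hand-off needs no additional locking.
    worker.join();
    assert(!worker.joinable());  // the thread has finished and been joined

    return result;
}
```

Calling sumOnWorkerThread with a vector holding 1, 2, 3 and 4 returns 10. Doing the same between two processes would require a pipe, a socket or an explicitly created shared memory segment.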
Threads are generally said to be "lightweight", but that is relative. Threads have to support the execution of native code, so the OS has to provide a decent-sized stack, usually measured in megabytes. On Windows, the [default stack reservation size](http://msdn.microsoft.com/en-us/library/windows/desktop/ms686774(v=vs.85\).aspx) used by the linker is 1 MB. On Linux, the typical default thread stack size is between 2 MB and 10 MB. This means that on Linux, creating 1000 threads reserves roughly 2 GB to 10 GB of virtual address space before the threads do any actual work. Physical memory is only committed for the stack pages a thread actually touches, but the address space is claimed up front.
You can easily determine the default stack size on Linux by running the following:
    $ ulimit -a | grep stack
    stack size (kbytes, -s) 8192
The above output was from Ubuntu 11 x86. We can also test this with some code:
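    #include <pthread.h>
    #include <unistd.h>
    #include <vector>

    // Assumed thread body (not shown in the original): it only sleeps, so the
    // threads stay alive long enough to be inspected with pmap.
    void* doNothing2(void*)
    {
        sleep(10);
        return 0;
    }

    void makeTenThreads()
    {
        std::vector<pthread_t> threads;

        // Start ten threads with default attributes (default stack size).
        for (int i = 0; i < 10; i++)
        {
            threads.push_back(pthread_t(0));
            pthread_create(&threads.back(), 0, &doNothing2, 0);
        }

        // Wait for every thread to finish.
        std::vector<pthread_t>::iterator itr = threads.begin();
        while (itr != threads.end())
        {
            pthread_join(*itr, 0);
            ++itr;
        }

        sleep(11);
        threads.clear();
    }

    int main()
    {
        makeTenThreads();
        sleep(10);
    }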
Running pmap -x 1234 (where 1234 is the PID) while the threads are alive shows 10 x 8192K blocks mapped, because we created 10 threads and each of them got 8 MB reserved for its stack.
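The limit that ulimit reports can also be read from inside a program. The sketch below uses the POSIX getrlimit call with RLIMIT_STACK. The function name stackLimitKilobytes is hypothetical, and it is an assumption that your C library uses this soft limit as the default thread stack size (glibc on Linux typically does).

```cpp
#include <cassert>
#include <sys/resource.h>

// Returns the soft stack-size limit in kilobytes, the same number that
// `ulimit -s` prints. Returns -1 on error and 0 if the limit is unlimited.
long stackLimitKilobytes()
{
    struct rlimit limit;
    if (getrlimit(RLIMIT_STACK, &limit) != 0)
        return -1;  // the query itself failed

    if (limit.rlim_cur == RLIM_INFINITY)
        return 0;   // no fixed limit is configured

    return static_cast<long>(limit.rlim_cur / 1024);
}
```

On a machine configured like the Ubuntu system above, this would be expected to return 8192.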
The default thread stack size varies by OS, and you can set it yourself. On Linux, you call pthread_attr_setstacksize on the attributes object passed to pthread_create; on Windows, it can be specified through the dwStackSize parameter of CreateThread.
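A minimal sketch of the Linux variant follows. The names smallStackWorker and runThreadWithStackSize are hypothetical; the pthread calls are the standard POSIX ones. Error handling is reduced to returning the first non-zero error code.

```cpp
#include <cassert>
#include <cstddef>
#include <pthread.h>

// Hypothetical thread body: does nothing and exits immediately.
static void* smallStackWorker(void*)
{
    return nullptr;
}

// Creates and joins one thread with an explicitly requested stack size.
// Returns 0 on success or the pthread error code of the first failing call.
int runThreadWithStackSize(std::size_t stackBytes)
{
    assert(stackBytes > 0);

    pthread_attr_t attr;
    int rc = pthread_attr_init(&attr);
    if (rc != 0)
        return rc;

    // Fails with EINVAL if stackBytes is below PTHREAD_STACK_MIN.
    rc = pthread_attr_setstacksize(&attr, stackBytes);
    if (rc == 0)
    {
        pthread_t thread;
        rc = pthread_create(&thread, &attr, &smallStackWorker, nullptr);
        if (rc == 0)
            rc = pthread_join(thread, nullptr);
    }

    pthread_attr_destroy(&attr);
    return rc;
}
```

For example, runThreadWithStackSize(256 * 1024) requests a 256 KB stack instead of the multi-megabyte default.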
The number of threads a process can create is limited by the available virtual memory and depends on the default stack size. On 32-bit Windows, where a process has 2 GB of user address space by default, giving every thread 1 MB of stack space allows a maximum of 2,048 threads. If you reduce the default stack size, you can create more threads. These details vary greatly depending on the platform and libraries.
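The relationship between address space, stack size and thread count is a plain division. The sketch below (the function name is hypothetical) computes only a theoretical upper bound; real limits are lower because code, heap and libraries also occupy address space.

```cpp
#include <cassert>
#include <cstdint>

// Upper bound on the number of threads whose stacks fit into the given
// amount of user address space. Ignores everything else that needs room.
std::uint64_t maxThreadsByAddressSpace(std::uint64_t addressSpaceBytes,
                                       std::uint64_t stackBytes)
{
    assert(stackBytes > 0);
    return addressSpaceBytes / stackBytes;
}
```

With 2 GB of user address space and 1 MB stacks, maxThreadsByAddressSpace(2ULL << 30, 1ULL << 20) yields 2048. Halving the stack size to 512 KB doubles the bound to 4096.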
Reducing the thread stack size will not reduce overhead in terms of CPU or performance. Your only limit in this respect is the total available virtual address space given to threads on your platform. Generally you should not change the stack size, because you can't really compute how much you need for an arbitrary thread, as it totally depends on the code run. You would have to analyze the entire code and the resulting disassembly executed by the thread to know how much stack size to use. This is non-trivial.
Threads are "lightweight processes", not "lightweight" in themselves as some may claim. They require fewer resources to create and to context switch than processes do, but it is still not cheap to have many of them running at the same time.
A thread context switch still involves saving and restoring the program counter, CPU registers, and other OS bookkeeping data. Unlike a process switch, however, a switch between threads of the same process does not require the MMU to change address spaces, because those threads share the same memory. Context switching with threads is less of a problem unless you have many threads.
Since memory/data is shared among threads in the same process, applications frequently need to deal with race conditions, so [Thread Synchronization](http://en.wikipedia.org/wiki/Synchronization_(computer_science\)#Thread_or_process_synchronization) is needed. Typical synchronization mechanisms include locks, mutexes, monitors and semaphores. These are concurrency constructs used to ensure that two threads do not access the same shared data at the same time, thus achieving correctness.
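As a minimal sketch of such a construct, the example below protects a shared counter with std::mutex and std::lock_guard from the C++ standard library. The function name countWithMutex is hypothetical. Without the lock, the increments from different threads could interleave and some updates would be lost.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical demo: several threads increment one shared counter.
long countWithMutex(int threadCount, int incrementsPerThread)
{
    assert(threadCount >= 0 && incrementsPerThread >= 0);

    long counter = 0;         // shared data
    std::mutex counterMutex;  // guards every access to counter

    std::vector<std::thread> threads;
    for (int t = 0; t < threadCount; ++t)
    {
        threads.emplace_back([&]() {
            for (int i = 0; i < incrementsPerThread; ++i)
            {
                // Only one thread at a time gets past this line.
                std::lock_guard<std::mutex> lock(counterMutex);
                ++counter;
            }
        });
    }

    for (std::thread& thread : threads)
        thread.join();

    return counter;
}
```

With the mutex in place, countWithMutex(4, 10000) always returns 40000.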
Programming with threads involves hazards such as race conditions, deadlocks and livelocks. This is often said to be one of the bad things about threads, along with the overhead they bring.
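A classic deadlock arises when two threads take the same two locks in opposite order. One way to avoid it, sketched here with a hypothetical Account type, is std::lock from the C++ standard library, which acquires several mutexes using a deadlock-avoidance algorithm. This is only one of several possible strategies; a fixed global lock order is another.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical shared object with its own lock.
struct Account
{
    std::mutex mutex;
    long balance = 0;
};

// Moves money between two accounts while holding both locks.
void transfer(Account& from, Account& to, long amount)
{
    assert(&from != &to);

    // Locking from.mutex and then to.mutex by hand could deadlock against a
    // concurrent transfer in the opposite direction. std::lock avoids that.
    std::lock(from.mutex, to.mutex);
    std::lock_guard<std::mutex> lockFrom(from.mutex, std::adopt_lock);
    std::lock_guard<std::mutex> lockTo(to.mutex, std::adopt_lock);

    from.balance -= amount;
    to.balance += amount;
}

// Runs transfers in both directions at once and returns the total balance.
long runOpposingTransfers(int rounds)
{
    Account a;
    Account b;
    a.balance = 1000;
    b.balance = 1000;

    std::thread first([&]() {
        for (int i = 0; i < rounds; ++i)
            transfer(a, b, 1);
    });
    std::thread second([&]() {
        for (int i = 0; i < rounds; ++i)
            transfer(b, a, 1);
    });

    first.join();
    second.join();
    return a.balance + b.balance;
}
```

The total returned by runOpposingTransfers stays at 2000 regardless of how the two threads interleave, and the program does not hang.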
Threads are good for concurrent CPU-heavy work across 1-n processors and cores within a single application/process, because the OS scheduler spreads them across the available cores and CPUs. For IO-heavy work, threads are a poor fit, because that usually involves spawning many threads for short periods of time. Threads make sense when:
- You need plenty of CPU.
- You keep the threads running for a longer time.
- You do not spawn and kill and spawn and kill threads too often.
- You do not need many threads.
An example of a good use for a thread could be a game. Running e.g. AI logic on a separate thread makes sense, because the thread is spawned once and kept alive for a long time. AI logic also requires plenty of CPU, which makes threads well suited to it.
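The "spawn once, keep alive" pattern could look roughly like the sketch below. AiWorker and runAiUntil are hypothetical names, and the loop body is a placeholder for real AI work. The thread is created in the constructor and only joined when the worker is stopped.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical stand-in for a game's AI system: one thread that is created
// once and keeps working until it is asked to stop.
class AiWorker
{
public:
    AiWorker() : running_(true), ticks_(0), thread_(&AiWorker::run, this) {}
    ~AiWorker() { stop(); }

    // Asks the loop to end and waits for the thread to finish.
    void stop()
    {
        running_ = false;
        if (thread_.joinable())
            thread_.join();
    }

    long ticks() const { return ticks_.load(); }

private:
    void run()
    {
        while (running_)
        {
            ++ticks_;                   // placeholder for CPU-heavy AI work
            std::this_thread::yield();  // let other threads make progress
        }
    }

    std::atomic<bool> running_;
    std::atomic<long> ticks_;
    std::thread thread_;  // declared last so the flags exist before it starts
};

// Lets the worker run until it has completed at least minTicks iterations.
long runAiUntil(long minTicks)
{
    assert(minTicks >= 0);

    AiWorker worker;
    while (worker.ticks() < minTicks)
        std::this_thread::yield();

    worker.stop();
    return worker.ticks();
}
```

The cost of creating the thread is paid once, no matter how many iterations the loop runs afterwards.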
Short-lived, frequently spawned threads make little sense. Building a chat web application with 1000 concurrently active chatters is an example of when not to use a thread per chatter. Memory usage would be high, and context switching would take too much time relative to the actual application work. Creating and killing threads that often has an unacceptably high overhead. A chat requires more IO than CPU work, so threads do not suit that situation.