Definition
A version of the program that has multiple tasks in progress at the same time
Example: execute queries one at a time, but issue I/O requests against different files/disks simultaneously
Could read from several index files at once, processing the I/O results as they arrive
Challenge: correctly and efficiently managing access to shared resources across multiple, possibly simultaneous, tasks
The OS context switches between threads/processes
This is not parallelism! Parallelism is when multiple CPUs work simultaneously on one job
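The index-file example above might be sketched as follows in Python. The file names and the helpers `read_index`, `read_all`, and `demo` are hypothetical, and a thread pool is only one of several ways to keep multiple I/O requests in flight. Treat it as an illustration under those assumptions.

```python
import tempfile
from concurrent.futures import ThreadPoolExecutor, as_completed
from pathlib import Path


def read_index(path):
    """Blocking read of one (hypothetical) index file."""
    return path.name, path.read_bytes()


def read_all(paths):
    """Issue all reads at once and handle each result as it arrives."""
    sizes = {}
    with ThreadPoolExecutor(max_workers=len(paths)) as pool:
        futures = [pool.submit(read_index, p) for p in paths]
        # as_completed yields futures in completion order, not submission order.
        for future in as_completed(futures):
            name, data = future.result()
            sizes[name] = len(data)
    return sizes


def demo():
    # Stand-in index files, created only for this illustration.
    with tempfile.TemporaryDirectory() as tmp:
        paths = []
        for i in range(3):
            p = Path(tmp) / f"index_{i}.dat"
            p.write_bytes(b"x" * (i + 1) * 10)
            paths.append(p)
        return read_all(paths)


if __name__ == "__main__":
    print(demo())
```

The reads overlap because each thread spends most of its time waiting on I/O. Whether any two threads ever run at the same instant depends on the hardware and the runtime, which is the distinction between concurrency and parallelism drawn above.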
Why Threads?
- Advantages
  - You mostly write sequential-looking code
  - Threads can run in parallel if you have multiple CPUs/cores
- Disadvantages
  - If threads share data, you need locks or other synchronization
    - Very bug-prone and difficult to debug
  - Threads can introduce overhead
    - Lock contention, context-switch overhead, and other issues
  - Need language support for threads
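A minimal sketch of the locking requirement, using Python's standard `threading` module. The `Counter` class and the `run` helper are hypothetical names chosen for this example. The update `self.value += 1` is a read-modify-write that is not guaranteed to be atomic, so the lock ensures that only one thread performs it at a time.

```python
import threading


class Counter:
    """A counter shared by several threads (hypothetical example type)."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Without the lock, two threads could read the same old value
        # and one of the two updates would be lost.
        with self._lock:
            self.value += 1


def run(num_threads=4, increments=10_000):
    counter = Counter()

    def worker():
        for _ in range(increments):
            counter.increment()

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value


if __name__ == "__main__":
    print(run())
```

Every call to `increment` now pays for acquiring the lock, and threads that arrive while it is held must wait. That is the lock-contention overhead listed among the disadvantages.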
Why Processes?
- Advantages
  - No shared memory between processes, so no locking of in-memory data is needed
  - No need for language support; the OS provides “fork”
- Disadvantages
  - More overhead than threads during creation and context switching
  - Cannot easily share memory between processes; they typically communicate through the file system
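The following sketch illustrates both points, assuming a Unix-like system where `os.fork` is available. The function name `fork_demo` and the message format are made up for this example. The child changes a variable that the parent never sees change, and the only thing the parent receives is what the child writes to a temporary file.

```python
import os
import tempfile


def fork_demo():
    """Return (count as seen by the parent, message the child left in a file)."""
    state = {"count": 0}
    fd, path = tempfile.mkstemp()
    os.close(fd)

    pid = os.fork()
    if pid == 0:
        # Child process: it works on its own copy of the parent's memory.
        try:
            state["count"] += 1
            with open(path, "w") as f:
                f.write(f"child count={state['count']}")
        finally:
            os._exit(0)  # exit immediately without running the parent's cleanup

    # Parent process: wait for the child, then read what it wrote.
    os.waitpid(pid, 0)
    with open(path) as f:
        message = f.read()
    os.remove(path)
    return state["count"], message


if __name__ == "__main__":
    print(fork_demo())
```

The parent's `state["count"]` is still 0 after the child exits, because the increment happened in the child's separate address space. The result reaches the parent only through the file.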