Multiprocessing vs Multithreading: Understanding Concurrency and Parallelism
Explore the Differences Between Multiprocessing and Multithreading for Better Performance

In this "Go Deep with Golang" series, we have nearly covered all the fundamental topics necessary for a solid understanding of the language. Before we proceed to the next topic, which is Goroutines, it is crucial to first grasp the concepts of multiprocessing and multithreading. Understanding these concepts will provide a strong foundation for learning about Goroutines.
When we talk about performance in programming, terms like concurrency, parallelism, multithreading, and multiprocessing often come up together. They sound similar, but they describe different ways of doing multiple things at once.
We’ll break down these concepts in depth—how they work, how they differ, and when to use each.
The Core Idea: Doing Multiple Things at Once
Before diving into threads and processes, let's start with the core concept of doing more than one thing at the same time.
There are two main strategies to achieve this:
Concurrency:
Structuring your program so it can handle multiple tasks seemingly at the same time.
Parallelism:
Actually executing multiple tasks simultaneously using multiple CPUs or cores.
These two often overlap but are not the same thing.
Concurrency vs Parallelism: The Foundation
Concurrency
Concurrency means handling multiple tasks in overlapping time periods, not necessarily simultaneously.
Imagine a download manager that’s responsible for downloading five large files.
If your internet connection can only handle one download at a time, the manager might download a chunk from file 1, then switch to file 2, then file 3, and so on—cycling rapidly between them.
To you, it feels like all files are downloading at once, but in reality, the system is just switching quickly between tasks.
Concurrency is about efficient task-switching.
Even if only one task runs at any given moment, all of them make progress over time.
In programming, concurrency is especially useful for I/O bound operations like making API calls, reading files, and waiting for network responses, where tasks spend most of their time waiting rather than computing.
Parallelism
Parallelism means doing multiple things truly at the same time.
Now, imagine that same download manager, but you have multiple network connections or threads, each downloading a file simultaneously.
In this case, all files are actively downloading in parallel, and there’s no switching involved.
Parallelism is about simultaneous execution using multiple cores or processors.
Parallelism is ideal for CPU-bound workloads, such as heavy computations like image rendering, large data processing, or complex simulations.
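To make this concrete in Go terms, here is a sketch of a CPU-bound sum split across the machine's cores. This is illustrative, not a library API: `parallelSum` and its chunking scheme are made up for the example.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// parallelSum splits a CPU-bound sum across as many goroutines as the
// machine has cores (runtime.NumCPU). On a multi-core CPU the chunks
// genuinely execute at the same time.
func parallelSum(nums []int) int {
	workers := runtime.NumCPU()
	if workers > len(nums) {
		workers = len(nums)
	}
	if workers == 0 {
		return 0
	}
	chunk := (len(nums) + workers - 1) / workers
	partial := make([]int, workers) // one slot per worker: no shared writes
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			lo, hi := id*chunk, (id+1)*chunk
			if hi > len(nums) {
				hi = len(nums)
			}
			if lo > hi {
				lo = hi
			}
			for _, v := range nums[lo:hi] {
				partial[id] += v
			}
		}(w)
	}
	wg.Wait()
	total := 0
	for _, p := range partial {
		total += p
	}
	return total
}

func main() {
	nums := make([]int, 1000)
	for i := range nums {
		nums[i] = i + 1
	}
	fmt.Println(parallelSum(nums)) // 1 + 2 + ... + 1000 = 500500
}
```

Each worker writes only to its own slot of `partial`, so no synchronization is needed until the final merge.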

The image above illustrates this with a kitchen analogy: in concurrency, one chef (one core) switches between tasks, while in parallelism, three different chefs (three cores) work on different tasks at the same time.
What is Multithreading?
A thread is the smallest unit of execution within a process. By default, a program runs as a single main thread, executing code sequentially.
Using multithreading, we can create multiple threads that share the same memory space and work concurrently.
Example
Let’s say you’re building a web server in Go:
One thread handles incoming HTTP requests.
Another logs data to a file.
A third manages background jobs.
Each thread runs independently, yet all share access to the same data and memory.
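In Go, which this series is building toward, each of those "threads" would typically be a goroutine. Here is a minimal sketch of the shared-memory idea; the task names are hypothetical and the real work is stubbed out:

```go
package main

import (
	"fmt"
	"sync"
)

// runTasks starts one goroutine per task. All goroutines share the
// same memory space, so they can append to the same results slice,
// guarded by a mutex because shared access needs synchronization.
func runTasks(tasks []string) []string {
	var (
		mu      sync.Mutex
		wg      sync.WaitGroup
		results []string
	)
	for _, t := range tasks {
		wg.Add(1)
		go func(name string) {
			defer wg.Done()
			// ...real work would go here (serve a request, write a log line)...
			mu.Lock()
			results = append(results, name+": done")
			mu.Unlock()
		}(t)
	}
	wg.Wait()
	return results
}

func main() {
	done := runTasks([]string{
		"handle HTTP request",
		"log data to file",
		"run background job",
	})
	fmt.Println(len(done), "tasks finished")
}
```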
Advantages of Multithreading
Threads are lightweight and fast to create.
They share memory, making communication simple.
Perfect for I/O-bound tasks like handling multiple web requests.
Drawbacks of Multithreading
Shared memory can cause race conditions (two threads updating the same variable).
Requires synchronization mechanisms (mutexes, locks).
In some languages like Python, the Global Interpreter Lock limits true parallelism with threads.
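To make the shared-memory drawback concrete, here is a sketch in Go: two goroutines increment one shared counter. `counter++` is really read, add, write, so if the mutex were removed the goroutines could interleave those steps and lose updates (`go run -race` would flag it as a data race):

```go
package main

import (
	"fmt"
	"sync"
)

// sharedCounter has two goroutines update the same variable. The
// mutex makes the read-modify-write of counter++ atomic; without it,
// increments from the two goroutines could be lost.
func sharedCounter(perGoroutine int) int {
	var (
		mu      sync.Mutex
		wg      sync.WaitGroup
		counter int
	)
	for g := 0; g < 2; g++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < perGoroutine; i++ {
				mu.Lock() // only one goroutine inside at a time
				counter++
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	return counter
}

func main() {
	fmt.Println(sharedCounter(100000)) // always 200000 with the mutex
}
```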
What is Multiprocessing?
Multiprocessing takes a different approach; it runs multiple processes, each with its own memory space (and, in interpreted languages like Python, its own interpreter instance). Processes don’t share memory, so they’re fully isolated and can run on different CPU cores.
Example:
When exporting a video in tools like Premiere Pro, Final Cut, or FFmpeg, the renderer splits the video into chunks and processes them simultaneously on multiple CPU cores.
Why multiprocessing?
Each chunk is CPU-heavy and independent. More cores = faster rendering.
Advantages of Multiprocessing
True parallel execution across multiple cores.
Each process has isolated memory, so shared-memory race conditions can’t occur.
Great for CPU-intensive workloads.
Drawbacks
More overhead for creating and managing processes.
Harder to share data (requires inter-process communication).
Inter-process communication is slower than thread communication.
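Go itself leans on goroutines rather than multiprocessing, but you can still launch a fully isolated process with the standard `os/exec` package. In this sketch the child is just `echo`, standing in for real work, and its stdout pipe is the inter-process communication:

```go
package main

import (
	"fmt"
	"os/exec"
)

// runChild launches a separate OS process (its own PID and memory
// space) and captures its stdout, a minimal form of IPC.
func runChild(name string, args ...string) (string, error) {
	out, err := exec.Command(name, args...).Output()
	return string(out), err
}

func main() {
	out, err := runChild("echo", "hello from a child process")
	if err != nil {
		fmt.Println("failed to start child:", err)
		return
	}
	fmt.Print(out)
}
```

Note the trade-off from the list above: starting a process and talking over a pipe is far more expensive than sharing a variable between threads.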
How Many Threads Can You Spawn?
This is one of the most misunderstood topics.
Developers try creating 1,000 threads, then 10,000 threads, and then wonder why the system slows down.
Let's break it down.
Thread Creation Depends on 2 Things
Memory
Each thread needs its own stack.
Default stack size:
Linux: 8 MB
Windows: 1 MB
Java: 1 MB
Python: 8 MB
Go goroutines: 2 KB (initial stack that grows as needed; goroutines are user-space constructs, not OS threads)
OS Thread Limit
Linux allows tens of thousands of threads
Windows/macOS allow fewer (2000 - 10000)
Max Threads Formula
Max Threads = Total Memory / Thread Stack Size
Example:
If you have 32 GB RAM and threads require 8 MB stack
32768 MB / 8 MB = 4,096 threads
So theoretically you could spawn about 4,000 threads.
But you should not.
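As a sanity check, here is the formula above in code. This is a theoretical ceiling only: the OS, the heap, and every other process also need memory.

```go
package main

import "fmt"

// maxThreads applies the formula: total memory / per-thread stack size.
// Both arguments are in MB.
func maxThreads(totalMemMB, stackMB int) int {
	return totalMemMB / stackMB
}

func main() {
	fmt.Println(maxThreads(32*1024, 8)) // 32 GB / 8 MB = 4096
}
```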
Why You Shouldn’t Spawn Thousands of Threads
Context switching
Cache invalidation
CPU time slice overhead
Kernel thrashing
Beyond a certain point, more threads lead to worse performance.
How Many Threads Should You Spawn?
CPU-Bound Tasks
Example
Image processing
Compression
Encryption
Encoding
Optimal threads = number of CPU cores
If you have 8 cores → 8 threads.
More threads won’t increase speed.
I/O Bound Tasks
Example
API calls
DB queries
File reading
Network operations
Threads spend most of the time waiting, so you can use many more.
Formula:
Optimal threads = Cores * (1 + (IO wait / compute time))
Example:
Task waits 90% of the time
8 cores * (1 + 0.9/0.1) = 80 threads
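The same sizing formula as code; the function name and the fraction-based inputs are just this article's convention, not a standard API.

```go
package main

import (
	"fmt"
	"math"
)

// optimalThreads applies: cores * (1 + ioWait/compute), where ioWait
// and compute are the fractions of time a task spends waiting vs
// actually computing.
func optimalThreads(cores int, ioWait, compute float64) int {
	return int(math.Round(float64(cores) * (1 + ioWait/compute)))
}

func main() {
	fmt.Println(optimalThreads(8, 0.9, 0.1)) // 8 * (1 + 9) = 80
}
```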
Real Example: Web Scraper
Scraping 5000 URLs
CPU usage is low
Network wait is high
In this scenario, you can safely create 100-300 threads, or in Go, you can create 10,000+ goroutines.
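Here is a sketch of that scraper in Go, with a buffered channel acting as a semaphore to cap concurrency. The fetch is simulated with a sleep; a real scraper would call `http.Get` instead.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// fetchPage is a stand-in for a real network request: the goroutine
// just waits, the way a scraper waits on the network.
func fetchPage(url string) {
	time.Sleep(time.Millisecond)
}

// scrape starts one goroutine per URL, but the buffered channel caps
// how many fetches are in flight at once.
func scrape(urls []string, maxConcurrent int) int {
	sem := make(chan struct{}, maxConcurrent)
	var (
		wg      sync.WaitGroup
		mu      sync.Mutex
		fetched int
	)
	for _, u := range urls {
		wg.Add(1)
		go func(url string) {
			defer wg.Done()
			sem <- struct{}{}        // acquire one of maxConcurrent slots
			defer func() { <-sem }() // release it when done
			fetchPage(url)
			mu.Lock()
			fetched++
			mu.Unlock()
		}(u)
	}
	wg.Wait()
	return fetched
}

func main() {
	urls := make([]string, 500)
	for i := range urls {
		urls[i] = fmt.Sprintf("https://example.com/page/%d", i)
	}
	fmt.Println("fetched:", scrape(urls, 100))
}
```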
Let's explore multithreading further with a real-world scenario: how does the system perform when reading 100 files with a single thread versus multiple threads?
Let’s say you write a program that needs to read 100 files from disk.
Reading a file is usually I/O-bound because:
The thread asks the OS: “Please fetch file contents.”
The OS works with storage hardware.
The thread waits for disk I/O to finish.
So the real difference between single-threading and multithreading becomes HUGE.
Using Single Thread
In a single-threaded program:
The thread starts reading File 1
It waits for the disk to respond
When done, it moves to File 2
Waits again
File 3
Waits again
This is repeated for all 100 files. Only one file is being processed at any moment.

In the diagram above, using a single-thread approach, the thread is blocked and cannot do anything else during the wait time. Files are processed one after another.
This is simple but slow.
Using Multithreading
Now suppose we create 10 threads to read 100 files.
Each thread gets 10 files, so instead of reading the files one by one, the reads overlap:

In this setup, while Thread 1 waits for File 1, Thread 2 reads File 2, Thread 3 reads File 3, and so on.
So 10 files are fetched in parallel at the I/O level, even on a single core, because a thread blocked on disk I/O does not block the others.
How They Look in Real Performance Terms
Single Thread
Total time = (Time to read 1 file) * 100
10 Threads
Total time = (Time to read 10 files in parallel batches)
If reading 1 file takes 30ms:
Single thread → 100 × 30ms = 3,000 ms
10 threads → (100/10) × 30 ms = 300 ms
Notice the performance improvement: multithreading finishes the job in one-tenth of the time.
When One Thread Finishes Early but Other Threads Are Still Working
In a real-world multithreaded system, it’s very common for one thread to complete its work earlier than the others. This situation has an official name:
👉 It’s called Load Imbalance.
This happens when different threads do not get an equal amount of work, causing:
Some threads to finish early.
Some threads to keep working.
CPU cores to remain partially idle.
Total execution time to be determined by the slowest thread.
But what happens next depends on the threading model your program uses. Here’s how it plays out in practice.
Static Assignment → Idle Threads (Bad Scenario)
In simple or naive multithreading designs, each thread gets a predefined chunk of work.
Example:
Thread 1 → Files 1-25
Thread 2 → Files 26-50
Thread 3 → Files 51-75
Thread 4 → Files 76-100
If Thread 1 finishes early, it has nothing else to do.
This is called Thread Idle Time
The root cause is Load Imbalance
The CPU core remains underutilized
The program finishes when the slowest thread completes.
This is why static splitting is rarely used today for large workloads.
Dynamic Task Queue → Automatic Load Balancing (Good Scenario)
Modern thread pools do not pre-assign tasks. Instead, all tasks are placed in a shared global queue.
Each thread:
Takes a task
Executes it
Returns for more
👉 It simply grabs the next available task.
This technique is called Dynamic Load Balancing, or the Work Queue model.
Result:
No idle threads
All threads stay active
Tasks finish at roughly the same time
Total execution time is much shorter
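Here is a sketch of the shared-queue model in Go, using a channel as the global task queue: every worker simply pulls the next available task, so none sits idle while tasks remain.

```go
package main

import (
	"fmt"
	"sync"
)

// runQueue feeds all tasks into one shared channel. Workers that
// happen to get quick tasks just come back for more, which balances
// the load automatically.
func runQueue(tasks []int, workers int) []int {
	queue := make(chan int)
	results := make(chan int, len(tasks))
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for t := range queue { // grab the next available task
				results <- t * t // squaring stands in for real work
			}
		}()
	}
	for _, t := range tasks {
		queue <- t
	}
	close(queue) // workers exit once the queue drains
	wg.Wait()
	close(results)
	out := make([]int, 0, len(tasks))
	for r := range results {
		out = append(out, r)
	}
	return out
}

func main() {
	done := runQueue([]int{1, 2, 3, 4, 5, 6}, 3)
	fmt.Println(len(done), "tasks processed")
}
```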
Work-Stealing Scheduler → Smart Redistribution (Best Scenario)
Go, Rust, Java ForkJoinPool, and modern runtimes use an even more advanced strategy:
Work Stealing.
Each thread has its own deque of tasks.
If a thread finishes early:
👉 It “steals” remaining tasks from other threads.
This eliminates load imbalance almost entirely.
Benefits:
Highly efficient
Minimizes idle time
Perfect for large, uneven workloads
Powers Go’s goroutine scheduler.
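For intuition, here is a toy version of work stealing, heavily simplified compared with Go's real scheduler: each worker owns a local queue, and a worker whose queue runs dry steals from the others' queues instead of going idle.

```go
package main

import (
	"fmt"
	"sync"
)

// deque is a mutex-guarded task list: the owner pops from one end,
// thieves steal from the other.
type deque struct {
	mu    sync.Mutex
	tasks []int
}

func (d *deque) pop() (int, bool) {
	d.mu.Lock()
	defer d.mu.Unlock()
	if len(d.tasks) == 0 {
		return 0, false
	}
	t := d.tasks[len(d.tasks)-1]
	d.tasks = d.tasks[:len(d.tasks)-1]
	return t, true
}

func (d *deque) steal() (int, bool) {
	d.mu.Lock()
	defer d.mu.Unlock()
	if len(d.tasks) == 0 {
		return 0, false
	}
	t := d.tasks[0]
	d.tasks = d.tasks[1:]
	return t, true
}

// runStealing gives each worker a pre-assigned chunk (deliberately
// imbalanced in main), then lets idle workers steal leftovers.
func runStealing(chunks [][]int, workers int) int {
	queues := make([]*deque, workers)
	for i := range queues {
		queues[i] = &deque{}
		if i < len(chunks) {
			queues[i].tasks = append(queues[i].tasks, chunks[i]...)
		}
	}
	var (
		mu    sync.Mutex
		total int
		wg    sync.WaitGroup
	)
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			for {
				t, ok := queues[id].pop()
				for v := 0; v < workers && !ok; v++ {
					if v != id { // local queue empty: try to steal
						t, ok = queues[v].steal()
					}
				}
				if !ok {
					return // nothing left anywhere
				}
				mu.Lock()
				total += t // summing stands in for real work
				mu.Unlock()
			}
		}(w)
	}
	wg.Wait()
	return total
}

func main() {
	// Worker 0 gets three tasks, worker 1 gets one, workers 2-3 get
	// none; the idle workers steal instead of sitting around.
	chunks := [][]int{{1, 2, 3}, {10}, {}, {}}
	fmt.Println("sum of all tasks:", runStealing(chunks, 4))
}
```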
Conclusion
Concurrency is about handling multiple tasks at once, while parallelism is about executing them simultaneously. Multithreading is ideal for I/O-heavy tasks, whereas multiprocessing is best for CPU-heavy workloads that benefit from true parallel execution.
Modern runtimes use dynamic load balancing and work-stealing to avoid load imbalances, ensuring that threads don’t sit idle when others are still working. Choosing between threads, processes, or asynchronous patterns depends entirely on your workload, not on which technique is faster.



