Goroutines, wait in line!

If you want to bound concurrent goroutines, check out this post for executable examples, from the worker pool pattern to semaphores, and a comparison between them!

Jason Lui
Xendit Engineering

--

My teammate Wisma wrote a blog post explaining how she managed to survey her friends’ food preferences with concurrency in Go, and it was a wonderful idea! The core of the solution, where she “messages her friends at the same time”, looks like this:
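The original snippet isn’t reproduced here, but the pattern is one goroutine per friend. A minimal sketch of it, where the friend names and the placeholder isDurianHater logic are my stand-ins (only the function name comes from the post):

```go
package main

import (
	"fmt"
	"sync"
)

// isDurianHater stands in for "messaging a friend and waiting for a reply";
// the even-length check is placeholder logic, not the original implementation.
func isDurianHater(friend string) bool {
	return len(friend)%2 == 0
}

func main() {
	friends := []string{"Ava", "Ben", "Cleo", "Dan"}

	var wg sync.WaitGroup
	for _, friend := range friends {
		wg.Add(1)
		// One goroutine per friend: "message everyone at the same time".
		go func(f string) {
			defer wg.Done()
			fmt.Println(f, "hates durian:", isDurianHater(f))
		}(friend)
	}
	wg.Wait() // block until every friend has "replied"
}
```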

Find the full runnable code here: https://go.dev/play/p/DsUVKClMmWl.

What if your messages are not free, though?

Wisma had 1000 free SMS messages, but what if you are not as lucky and are very popular (having thousands of friends)? 😆

With the following, I’m printing the number of goroutines every millisecond.

Feel free to change the friends slice and run it here! You’d see that the number of goroutines grows as friends does. (Note: this version of the code needs six goroutines: four friends, each requiring one goroutine, plus one main goroutine and one goroutine to print the number of goroutines (profiling), 4 + 1 + 1 = 6.)

In actual software building, this translates to “what if your resources are limited?”. Perhaps each piece of work needs a large amount of memory and you don’t want that to grow unbounded, or perhaps your downstream service can’t handle so many concurrent requests.

This is when we need to “throttle” goroutines.

Worker pool

One way to deal with this is to utilize a pool of workers. See the code below with comments:
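Since the original block isn’t included here, this is a sketch of the pool: numGoroutines is the constant mentioned later in the post; the jobs channel, friend names, and placeholder isDurianHater logic are my assumptions.

```go
package main

import (
	"fmt"
	"sync"
)

const numGoroutines = 2 // the fixed number of workers in the pool

// isDurianHater stands in for messaging one friend; placeholder logic.
func isDurianHater(friend string) bool { return len(friend)%2 == 0 }

func main() {
	friends := []string{"Ava", "Ben", "Cleo", "Dan"}

	jobs := make(chan string)

	// One goroutine distributes the work to the pool.
	go func() {
		for _, f := range friends {
			jobs <- f
		}
		close(jobs) // no more friends to message
	}()

	var wg sync.WaitGroup
	for i := 0; i < numGoroutines; i++ {
		wg.Add(1)
		// Each worker drains jobs until the channel is closed,
		// so goroutines never exceed the pool size.
		go func() {
			defer wg.Done()
			for f := range jobs {
				fmt.Println(f, "hates durian:", isDurianHater(f))
			}
		}()
	}
	wg.Wait()
}
```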

If you alter the friends slice and run the full code here, you’d see that no matter how many friends you have, the number of goroutines won’t exceed a certain threshold. (Note: this version of the code needs five goroutines: one main, one profiling, two workers, and one distributing the work, 1 + 1 + 2 + 1 = 5. In other words, the number of goroutines is 3 + numGoroutines, the constant.)

Semaphore

Things look cool now, but as Bryan Mills shared at GopherCon 2018, the worker pool approach has several problems (thanks, John Mcguire, for introducing me to this talk!):

  1. it can be tedious to keep track of the worker states (in this toy example, I didn’t care because the workers aren’t long-running);
  2. the worker goroutines can sit idle, potentially for a long time or even forever if things go wrong, and that’s hard to debug.

Hence, it’s better to use some form of semaphore to control the number of goroutines.

Naive Implementation — Bounded Channels

The code below utilizes a bounded channel to implement a “semaphore”.
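A sketch of the bounded-channel semaphore: a buffered channel of capacity numGoroutines, where sending acquires a slot and receiving releases it. As before, the friend names and placeholder isDurianHater logic are my assumptions:

```go
package main

import (
	"fmt"
	"sync"
)

const numGoroutines = 2 // at most this many friends are messaged at once

// isDurianHater stands in for messaging one friend; placeholder logic.
func isDurianHater(friend string) bool { return len(friend)%2 == 0 }

func main() {
	friends := []string{"Ava", "Ben", "Cleo", "Dan"}

	// The buffered channel is the semaphore: its capacity bounds concurrency.
	sem := make(chan struct{}, numGoroutines)
	var wg sync.WaitGroup

	for _, friend := range friends {
		sem <- struct{}{} // acquire: blocks while numGoroutines are in flight
		wg.Add(1)
		go func(f string) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot when done
			fmt.Println(f, "hates durian:", isDurianHater(f))
		}(friend)
	}
	wg.Wait()
}
```

Unlike the worker pool, goroutines here are spawned per job and exit as soon as their job is done; only their number in flight is capped.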

Run the above at https://go.dev/play/p/8fykf0JB3_N! (Note: this version of the code also needs five goroutines: one main, one profiling, and one distributing work, which spawns two worker goroutines, 1 + 1 + 1 + 2 = 5. In other words, the number of goroutines is also 3 + numGoroutines, the constant.)

My other colleague Sean Yu has a brilliant post explaining this approach in detail, please feel free to read it for more information!

x/sync/semaphore

golang.org/x/sync has been my favorite extended module so far, and it has a semaphore package! With that, the code becomes

The full code can be found at https://go.dev/play/p/wfDKMsuF0F7! (Yes, yes, this looks awkward because cancellation is more natural with contexts, but please bear with me for this toy example 😂) The number of goroutines is also five because the code is basically identical to the previous example, except for the implementation of the semaphore.

I think this is preferable over bounded channels because:

  1. it respects context cancellation out of the box;
  2. it supports “weighted” workers;
  3. it supports TryAcquire as well, which does not block and could be useful in certain cases;
  4. in the benchmarks at https://cs.opensource.google/go/x/sync/+/refs/tags/v0.1.0:semaphore/semaphore_bench_test.go, it’s faster than bounded channels. Note that this should be taken with a pinch of salt, because: 1. the cases are run with only one goroutine, so the semaphore is uncontended; 2. in actual work, the time spent acquiring semaphores should be negligible (otherwise, please don’t do it concurrently), and both approaches take a comparable amount of time.

Phew! It’s no trivial work to throw a party! 😂

Important Note

Please note that even though we have a stopCh, this implementation is still a bit “leaky”, because the goroutines only exit after isDurianHater() is done. If that takes a long time, the server might still run out of memory. The recommended way to enable early termination is to signal the cancellation but still wait for the goroutines to complete. One way to do this is through a context.Context.

In the following snippet, I make the isDurianHater method accept a context.Context. (Only because context.Context is ubiquitous; a “stopCh” and a “context.Context” have a similar effect in this context.)

Run the full code at https://go.dev/play/p/kNLHXGuMEro! (Notice that all the friend implementations listen to ctx.Done(), a channel that is closed when ctx is cancelled. Cancelling a context doesn’t magically kill the goroutine; it signals the end of work. Many “context-aware” packages follow similar patterns, like database/sql and net/http. You can also read the official blog post about context for more information.)

Recap

If the number of goroutines needs to be limited, you can:

  • implement a worker pool pattern
  • but its issues make semaphores more favorable ✅

To implement semaphores, you can:

  • use a bounded channel ✅
  • or golang.org/x/sync/semaphore, which respects context cancellation ✅

To terminate early, signal the workers, but wait for them to actually return!
