Chapter 4: Asynchronous Control Flow Patterns with Callbacks
Summary
This chapter tackles the most common challenge in Node.js programming: managing
asynchronous control flow using callbacks. It starts with the web spider example
-- a CLI app that downloads web pages -- and shows how naive nesting leads to
callback hell (the pyramid of doom). The spider function has four nested levels:
fs.access -> superagent.get -> mkdirp -> fs.writeFile. The chapter then
introduces callback discipline (early return, named functions, modularization)
to tame the nesting -- the refactored spider extracts saveFile() and download()
as separate functions.
The core of the chapter presents three fundamental async patterns:
sequential execution (tasks one after another, including recursive iteration
over collections), parallel execution (all tasks at once with a completion
counter, which introduces the risk of race conditions), and limited parallel
execution (bounded concurrency via a TaskQueue class). The chapter concludes
with the async library, which provides battle-tested helper functions for
these patterns.
Key Concepts
The Difficulties of Async Programming
Asynchronous code is hard because you cannot rely on the familiar sequential flow of synchronous programming. Closures and in-place callback definitions lead to deeply nested, pyramid-shaped code:
// The web spider -- callback hell (pyramid of doom)
// Four nested levels: fs.access -> superagent.get -> mkdirp -> fs.writeFile
import fs from 'fs';
import path from 'path';
import superagent from 'superagent';
import mkdirp from 'mkdirp'; // classic callback-style mkdirp
// urlToFilename() is a small helper (defined elsewhere in the example)
// that maps a URL to a local filename.

function spider(url, callback) {
  const filename = urlToFilename(url);
  fs.access(filename, err => { // Level 1: check if file exists
    if (!err) return callback(null, filename, false);
    superagent.get(url).end((err, res) => { // Level 2: download the page
      if (err) return callback(err);
      mkdirp(path.dirname(filename), err => { // Level 3: create directories
        if (err) return callback(err);
        fs.writeFile(filename, res.text, err => { // Level 4: save to disk
          if (err) return callback(err);
          callback(null, filename, true);
        });
      });
    });
  });
}
Why Callback Hell is Dangerous
- Code shifts to the right, becoming unreadable
- Variable names clash or get reused across scopes
- Error handling is duplicated or forgotten
- Refactoring is risky because of deep closure dependencies
Callback Discipline
Three rules to tame callback hell without external libraries:
The Three Rules
- Exit early -- use return callback(err) to stop execution on error
- Named functions -- extract callbacks into named functions for readability
- Modularize -- split logic into small, reusable, single-purpose functions
Early Return Pattern
// BAD: no return -- code continues after callback(err)
function doSomething(input, callback) {
  asyncOp(input, (err, result) => {
    if (err) {
      callback(err);
      // BUG: execution continues here!
    }
    // This runs even when there was an error
    processResult(result, callback);
  });
}

// GOOD: return prevents further execution
function doSomething(input, callback) {
  asyncOp(input, (err, result) => {
    if (err) {
      return callback(err); // exit immediately
    }
    processResult(result, callback);
  });
}
Applying Callback Discipline (Refactored Spider)
// AFTER: modularized with named functions
// saveFile() and download() extracted as separate, testable functions
function spider(url, callback) {
  const filename = urlToFilename(url);
  fs.access(filename, err => {
    if (!err) return callback(null, filename, false);
    download(url, filename, err => {
      if (err) return callback(err);
      callback(null, filename, true);
    });
  });
}

function download(url, filename, callback) {
  superagent.get(url).end((err, res) => {
    if (err) return callback(err);
    saveFile(filename, res.text, callback);
  });
}

function saveFile(filename, content, callback) {
  mkdirp(path.dirname(filename), err => {
    if (err) return callback(err);
    fs.writeFile(filename, content, callback);
  });
}
Sequential Execution Pattern
Execute a set of async tasks one after another, where each task starts only after the previous one completes.
Known Set of Tasks
function sequentialKnown(callback) {
  task1((err, result1) => {
    if (err) return callback(err);
    task2(result1, (err, result2) => {
      if (err) return callback(err);
      task3(result2, (err, result3) => {
        if (err) return callback(err);
        callback(null, result3);
      });
    });
  });
}
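A runnable sketch of the pattern, using invented timer-based tasks (task1/2/3 and their delays are made up) and repeating sequentialKnown() so the snippet stands alone:

```javascript
// Invented timer-based tasks: each records its position in `log`, so the
// strict 1 -> 2 -> 3 ordering is observable even though task1 is slowest.
const log = [];
let finalResult;

function task1(cb) { setTimeout(() => { log.push(1); cb(null, 1); }, 30); }
function task2(r, cb) { setTimeout(() => { log.push(2); cb(null, r + 1); }, 10); }
function task3(r, cb) { setTimeout(() => { log.push(3); cb(null, r + 1); }, 20); }

// Same shape as sequentialKnown() above
function sequentialKnown(callback) {
  task1((err, result1) => {
    if (err) return callback(err);
    task2(result1, (err, result2) => {
      if (err) return callback(err);
      task3(result2, (err, result3) => {
        if (err) return callback(err);
        callback(null, result3);
      });
    });
  });
}

sequentialKnown((err, result) => {
  finalResult = result;
  console.log(log, result); // [ 1, 2, 3 ] 3
});
```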
Sequential Iteration (Dynamic Collection)
function iterateSeries(collection, iteratorFn, finalCallback) {
  function iterate(index) {
    if (index === collection.length) {
      return finalCallback();
    }
    iteratorFn(collection[index], err => {
      if (err) return finalCallback(err);
      iterate(index + 1); // recursion, not a loop
    });
  }
  iterate(0);
}

// Usage with the spider -- download links sequentially.
// spider() here is the extended version that also takes a nesting level
// to bound the depth of the crawl.
function spiderLinks(currentUrl, body, nesting, callback) {
  const links = getPageLinks(currentUrl, body);
  iterateSeries(links, (link, cb) => {
    spider(link, nesting - 1, cb);
  }, callback);
}
Why Recursion, Not a Loop
A for loop fires all iterations immediately because it does not wait
for async callbacks. Recursion ensures each step starts only after the
previous callback fires.
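A minimal demonstration of the difference, with an invented processItem() task that completes after a fixed delay:

```javascript
// processItem() stands in for any async operation with a fixed 10 ms delay.
function processItem(item, cb) {
  setTimeout(() => cb(null, item), 10);
}

const order = [];

// A for loop schedules ALL operations immediately -- they run concurrently
for (const item of ['a', 'b', 'c']) {
  processItem(item, () => order.push(`loop:${item}`));
}

// Recursion starts each operation only after the previous one completes
function iterate(items, index, done) {
  if (index === items.length) return done();
  processItem(items[index], () => {
    order.push(`rec:${items[index]}`);
    iterate(items, index + 1, done);
  });
}

iterate(['a', 'b', 'c'], 0, () => {
  // All three loop callbacks fired in the first 10 ms tick; the
  // recursive ones arrive one at a time afterwards.
  console.log(order);
});
```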
Parallel Execution Pattern
Start all tasks at once. Track completions with a counter.
function parallel(tasks, finalCallback) {
  let completed = 0;
  const results = [];
  tasks.forEach((task, index) => {
    task((err, result) => {
      if (err) return finalCallback(err);
      results[index] = result;
      if (++completed === tasks.length) {
        finalCallback(null, results);
      }
    });
  });
}
// Caveat: if more than one task fails, finalCallback fires once per error;
// a production version should guard against invoking it multiple times.

// Spider v3: download all links in parallel
function spiderLinks(currentUrl, body, nesting, callback) {
  const links = getPageLinks(currentUrl, body);
  if (links.length === 0) return process.nextTick(callback);
  let completed = 0;
  links.forEach(link => {
    spider(link, nesting - 1, err => {
      if (err) return callback(err);
      if (++completed === links.length) {
        callback();
      }
    });
  });
}
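A usage sketch for parallel() (repeated here so the snippet runs standalone); the three timer-based tasks are invented for illustration:

```javascript
// Same shape as the parallel() helper above
function parallel(tasks, finalCallback) {
  let completed = 0;
  const results = [];
  tasks.forEach((task, index) => {
    task((err, result) => {
      if (err) return finalCallback(err);
      results[index] = result;
      if (++completed === tasks.length) {
        finalCallback(null, results);
      }
    });
  });
}

let collected;
const tasks = [
  cb => setTimeout(() => cb(null, 'one'), 30),
  cb => setTimeout(() => cb(null, 'two'), 10),
  cb => setTimeout(() => cb(null, 'three'), 20)
];

parallel(tasks, (err, results) => {
  // Results are stored by task index, not by completion order
  collected = results;
  console.log(results); // [ 'one', 'two', 'three' ]
});
```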
Race Conditions
Even on a single thread, parallel async tasks can race on shared state.
In the parallel spider, two tasks both call fs.access on the same URL.
Both see the file doesn't exist, both start downloading the same page.
The check-then-act pattern is not atomic across async boundaries.
// Race condition in parallel spider:
// Task A: fs.access(file) -> NO (doesn't exist) -> start download
// Task B: fs.access(file) -> NO (doesn't exist) -> start download (same URL!)
// Task A: fs.writeFile(file) -> saved
// Task B: fs.writeFile(file) -> overwrites A's result
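The chapter's fix is to make the check-then-act step synchronous by tracking in-flight URLs in a Set. A runnable sketch (downloadFake() is an invented stand-in for the real fs.access/download logic):

```javascript
// Because the Set check happens synchronously, before any async boundary,
// two tasks can no longer both pass the "not yet downloaded" test.
const spidering = new Set();
let downloads = 0;

function downloadFake(url, cb) {
  setTimeout(() => { downloads++; cb(null); }, 10);
}

function spiderGuarded(url, callback) {
  if (spidering.has(url)) {
    return process.nextTick(callback); // already being processed -- skip
  }
  spidering.add(url);
  downloadFake(url, callback);
}

// Two "parallel" tasks race on the same URL; only one download happens
spiderGuarded('http://example.com', () => {});
spiderGuarded('http://example.com', () => {});
```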
Limited Parallel Execution
Run at most N tasks concurrently. When one finishes, start the next from the queue.
function parallelLimit(tasks, concurrency, finalCallback) {
  let completed = 0;
  let running = 0;
  let index = 0;

  function next() {
    while (running < concurrency && index < tasks.length) {
      const task = tasks[index++];
      running++;
      task(err => {
        if (err) return finalCallback(err);
        running--;
        completed++;
        if (completed === tasks.length) {
          return finalCallback();
        }
        next();
      });
    }
  }
  next();
}
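A sketch that makes the bound observable (parallelLimit() repeated so the snippet runs standalone; the timer tasks and peak-concurrency tracking are invented):

```javascript
// Same shape as parallelLimit() above
function parallelLimit(tasks, concurrency, finalCallback) {
  let completed = 0;
  let running = 0;
  let index = 0;
  function next() {
    while (running < concurrency && index < tasks.length) {
      const task = tasks[index++];
      running++;
      task(err => {
        if (err) return finalCallback(err);
        running--;
        completed++;
        if (completed === tasks.length) {
          return finalCallback();
        }
        next();
      });
    }
  }
  next();
}

let active = 0;
let peak = 0;
const tasks = Array.from({ length: 6 }, () => cb => {
  active++;
  peak = Math.max(peak, active); // record the highest observed concurrency
  setTimeout(() => { active--; cb(null); }, 10);
});

parallelLimit(tasks, 2, () => {
  console.log(`peak concurrency: ${peak}`); // never exceeds the limit of 2
});
```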
The TaskQueue Class
A reusable class for globally limiting concurrency:
class TaskQueue {
  constructor(concurrency) {
    this.concurrency = concurrency;
    this.running = 0;
    this.queue = [];
  }

  pushTask(task) {
    this.queue.push(task);
    process.nextTick(() => this.next());
    return this;
  }

  next() {
    while (this.running < this.concurrency && this.queue.length > 0) {
      const task = this.queue.shift();
      this.running++;
      task(err => {
        // errors are ignored in this minimal version; a fuller
        // implementation would surface them (e.g. via an error event)
        this.running--;
        this.next();
      });
    }
  }
}

// Spider v4: limited concurrency
const downloadQueue = new TaskQueue(2); // max 2 concurrent downloads

function spiderTask(url, nesting, callback) {
  downloadQueue.pushTask(done => {
    spider(url, nesting, err => {
      if (err) console.error(err);
      done();
    });
  });
}
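The same bound can be verified for TaskQueue (class repeated so the snippet runs standalone; the timer tasks and peak tracking are invented):

```javascript
// Same shape as the TaskQueue class above
class TaskQueue {
  constructor(concurrency) {
    this.concurrency = concurrency;
    this.running = 0;
    this.queue = [];
  }

  pushTask(task) {
    this.queue.push(task);
    process.nextTick(() => this.next());
    return this;
  }

  next() {
    while (this.running < this.concurrency && this.queue.length > 0) {
      const task = this.queue.shift();
      this.running++;
      task(() => {
        this.running--;
        this.next();
      });
    }
  }
}

const queue = new TaskQueue(2);
let active = 0;
let peak = 0;
let finished = 0;

for (let i = 0; i < 5; i++) {
  queue.pushTask(done => {
    active++;
    peak = Math.max(peak, active); // highest observed concurrency
    setTimeout(() => { active--; finished++; done(); }, 10);
  });
}
```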
Why process.nextTick in pushTask?
Deferring next() with process.nextTick allows all synchronous
pushTask() calls to queue their tasks before any start executing.
Without it, the first task would start immediately, potentially altering
the queue state while other tasks are still being added.
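A tiny illustration of that deferral (the event log is invented):

```javascript
const events = [];

// Even though nextTick is called first, its callback is deferred until
// the currently executing synchronous code has finished.
process.nextTick(() => events.push('tick'));
events.push('sync');
```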
The async Library
The async npm package provides helper functions for all these patterns:
import async from 'async';
// Sequential execution
async.series([task1, task2, task3], (err, results) => { });
// Parallel execution
async.parallel([task1, task2, task3], (err, results) => { });
// Sequential iteration
async.eachSeries(urls, (url, cb) => download(url, cb), (err) => { });
// Limited parallel iteration
async.eachLimit(urls, 3, (url, cb) => download(url, cb), (err) => { });
// Task queue with concurrency
const q = async.queue((task, cb) => processTask(task, cb), 2);
q.push(tasks);
q.drain(() => console.log('All done'));
Pattern Summary Table
| Pattern | When to Use | Key Mechanism | Spider Version |
|---|---|---|---|
| Sequential | Tasks depend on previous result | Nesting / recursion | v2 (link-by-link) |
| Parallel | Tasks are independent | Completion counter | v3 (all links at once) |
| Limited parallel | Independent tasks but resource constraints | Queue + running counter | v4 (max N downloads) |
Connections
- Previous: Chapter 3 -- Callbacks and Events
- Next: Chapter 5 -- Asynchronous Control Flow Patterns with Promises and Async/Await
- Foundation: Chapter 1 -- The reactor pattern drives the event loop that executes these callbacks
- The patterns here (sequential, parallel, limited) are reimplemented with Promises in Chapter 5