Write you a webpack for great good

Module bundlers have become a staple of the JavaScript world. Today, I’d like to briefly go over what a module bundler is, dig into how one really works, and share the story of building one myself.

What is a module bundler?

Module systems structure a large code base into units called modules. For a long time JavaScript had no module system, but after Node.js adopted a module system called CommonJS in 2009, modular programming became common practice. ES modules, a newer module system, were then added to the JavaScript specification in ECMAScript 2015. It’s hard to imagine writing JavaScript without modules nowadays.

// Example of ES Module
import "otherModule";
import { someFunc } from "otherModule";
export const x = 10;

However, it takes time for web browsers to support specification changes. Some browsers do not yet support ES modules; to use ES modules in those browsers, we need to convert them into plain JavaScript that the browsers can run, with the module dependencies resolved in advance. This conversion is what module bundlers do.

What does a module bundler do?

As we just saw, module bundlers identify the dependencies between modules and convert the modules into plain JavaScript. A module bundler reads a module and converts it into a function that web browsers can understand. The function takes as parameters a function for importing other modules and an object that holds the values the module exports. Suppose we have the module shown below.

import { x } from "otherModule";

export const y = x + 10;

A module bundler converts the code above into a function that can be understood by browsers, as shown below.

function (require, module, exports) {
  var { x } = require("otherModule");
  exports.y = x + 10;
}

Some of you may have noticed that the body of the function is in CommonJS format. There are already many handy tools that convert ES modules into CommonJS. After converting each module into a function, a module bundler combines the converted functions so that references between modules work. These two steps, converting and combining, make up module bundling and are the main tasks of a module bundler.
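To make the wrapper concrete, here is a minimal sketch that calls such a function by hand. The `Exports`, `RequireFn` and `ModuleFn` types, the stubbed `fakeRequire`, and the `runWrapped` helper are names made up for this illustration; a real bundler generates the wrapper and supplies its own require function.

```typescript
// Hypothetical types for this sketch; they are not part of any library.
type Exports = Record<string, unknown>;
type RequireFn = (name: string) => Exports;
type ModuleFn = (
  require: RequireFn,
  module: { exports: Exports },
  exports: Exports
) => void;

// The wrapper from above, written as a function expression so it can be stored.
const wrapped: ModuleFn = function (require, module, exports) {
  const { x } = require("otherModule") as { x: number };
  exports.y = x + 10;
};

// A stand-in for the bundler's require: it only knows "otherModule".
const fakeRequire: RequireFn = (name) => {
  if (name === "otherModule") return { x: 5 };
  throw new Error(`Unknown module: ${name}`);
};

// Creates a fresh module object, runs the wrapper, and returns what it exported.
function runWrapped(fn: ModuleFn, req: RequireFn): Exports {
  const mod = { exports: {} as Exports };
  fn(req, mod, mod.exports);
  return mod.exports;
}

const result = runWrapped(wrapped, fakeRequire);
console.log(result.y); // 15
```

The wrapper never touches the file system or the network; everything it needs arrives through its parameters, which is what lets a bundler wire modules together inside a single file.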

Making a module bundler

What I cannot create, I do not understand

Richard Feynman

I hope you now have the basic idea behind module bundlers, but to fully understand them, the best thing to do is to make one yourself. So I made a simple module bundler myself, Tinypack. Tinypack is a TypeScript bundler; TypeScript uses JavaScript’s module system, so Tinypack’s operations are identical to those of JavaScript bundlers. I’ll go over my implementation step by step.

Semantic Analysis

Before converting a module, the first thing to do is type checking. This step is needed only because Tinypack is a TypeScript bundler; JavaScript bundlers do not need it. To type-check, we use the TypeScript Compiler API.

let errors = ts.getPreEmitDiagnostics(
  ts.createProgram([entry], { ... })
)

Converting source code into module objects

We convert TypeScript (or JavaScript) source code into modules, starting from the entry; we process the entry first and then process dependencies in the order they were identified. Every time we convert a module, we assign a unique module ID to it. For the entry file we assign 0 as its module ID, push the file to the conversion queue, and start converting. The first task of conversion is to parse the source file into an AST (abstract syntax tree) to identify dependencies. As in semantic analysis, we use the TypeScript Compiler API for parsing.

let source = ts.createSourceFile(file, content, ...)

We loop through the AST and check for import statements. Each import statement found is a dependency. Assign a unique module ID per dependency, save the path and the module ID in the dependency map, and add the module into the conversion queue.

source.forEachChild(node => {
  if (node.kind === ts.SyntaxKind.ImportDeclaration) {
    let importDecl = node as ts.ImportDeclaration;

    // module specifier should be a string literal
    let moduleSpecifier = importDecl.moduleSpecifier.getText(source);
    let dep = JSON.parse(moduleSpecifier) as string;

    ...

    queue.push(depPath);
    deps.set(dep, depID);
  }
});

Once the dependency check is done, we use the TypeScript Compiler API to transpile the source file into CommonJS format.

let transpiled = ts.transpileModule(content, {
    compilerOptions: {
      module: ts.ModuleKind.CommonJS,
      ...
    }
  }).outputText;

At this point, we have a module ID, a dependency map, and the transpiled code. Put these three items into an object, and we are done with module conversion for a given file.

return { id, deps, transpiled };

Repeat the conversion until the queue is empty, so that every file in the dependency graph is converted.
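The loop described above can be sketched without the TypeScript Compiler API. This is a simplified illustration, not Tinypack’s actual code: the in-memory `sources` table, the regex-based `findImports`, and the `convertAll` name are assumptions made for the example. The regex stands in for the AST walk, and the sketch keeps the source text instead of transpiling it.

```typescript
// Hypothetical in-memory "file system" so the sketch runs without disk access.
const sources: Record<string, string> = {
  entry: `import { a } from "a"; import { b } from "b";`,
  a: `import { b } from "b"; export const a = 1;`,
  b: `export const b = 2;`,
};

interface ModuleObject {
  id: number;
  deps: Map<string, number>; // module name -> module ID
  content: string;           // stands in for the transpiled code
}

// Naive stand-in for AST-based dependency detection: matches `from "..."`.
function findImports(content: string): string[] {
  const found: string[] = [];
  const pattern = /from\s+"([^"]+)"/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(content)) !== null) {
    found.push(match[1]);
  }
  return found;
}

function convertAll(entry: string): ModuleObject[] {
  let nextID = 0;
  // The entry gets module ID 0 and is the first item in the queue.
  const queue: { file: string; id: number }[] = [{ file: entry, id: nextID++ }];
  const converted: ModuleObject[] = [];

  while (queue.length > 0) {
    const { file, id } = queue.shift()!;
    const content = sources[file];
    const deps = new Map<string, number>();
    for (const dep of findImports(content)) {
      const depID = nextID++;              // a fresh ID for every import found
      deps.set(dep, depID);
      queue.push({ file: dep, id: depID }); // convert the dependency later
    }
    converted.push({ id, deps, content });
  }
  return converted;
}

const converted = convertAll("entry");
console.log(converted.map((m) => m.id)); // [0, 1, 2, 3]
```

Note that `b` is converted twice here, once for each importer. That is the duplication problem the “Preventing duplicated modules” section deals with.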

Code generation/bundling

Combine all the module objects created in the previous process to make a bundle file to run on a web browser. Each module contains the following items:

  • Module ID
  • Dependency map (Module name: Module ID)
  • Transpiled code

With this information we generate code for each module, as in the following example, which looks very similar to the function we saw in the “What does a module bundler do?” section. Next to each function we place an object holding the dependency information taken from the dependency map. It is used to map a module path passed to require inside the function to the actual module ID.

var modules = {
  0: [ // ← Module ID
    function(require, module, exports) {
      ... // ← Transpiled code
    },
    {
      "otherModule": 1, // ← Dependency map
      ...
    }
  ],
  1: ...
}

Next, we write a function that runs the generated module code, as shown below. It builds a require function (`localRequire`) that uses the dependency map (`mod[1]`) to translate a module path into a module ID, and passes that function to the module function (`mod[0]`). Running the module function stores the exported values in `module.exports`, which we return as is.

function executeModule(id) {
  var mod = modules[id];
  var localRequire = function (path) {
    return executeModule(mod[1][path]);
  };

  var module = { exports: {} };
  mod[0](localRequire, module, module.exports);
  return module.exports;
}

Lastly, we add a line to run the entry module, and that’s it!

executeModule(0); // 0 implies the entry module.

Once we generate code, the basic implementation of a module bundler is complete.
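To see the pieces fit together, here is a rough sketch of a generator that turns module objects into a bundle string and then runs the result. The `generateBundle` name and the demo modules are hypothetical, and the hand-written CommonJS strings stand in for real transpiler output. One deliberate difference from the snippets above: the sketch wraps the runtime in an immediately invoked function and returns the entry module’s exports, which is an assumption made here so the result is easy to inspect.

```typescript
interface ModuleObject {
  id: number;
  deps: Map<string, number>; // module name -> module ID
  transpiled: string;        // CommonJS-format code
}

// Turns module objects into one JavaScript expression (an IIFE) as a string.
function generateBundle(mods: ModuleObject[]): string {
  const entries = mods.map((m) => {
    const depMap: Record<string, number> = {};
    m.deps.forEach((depID, name) => {
      depMap[name] = depID;
    });
    return (
      `${m.id}: [\n` +
      `function (require, module, exports) {\n${m.transpiled}\n},\n` +
      `${JSON.stringify(depMap)}\n]`
    );
  });

  return `(function (modules) {
  function executeModule(id) {
    var mod = modules[id];
    var localRequire = function (path) {
      return executeModule(mod[1][path]);
    };
    var module = { exports: {} };
    mod[0](localRequire, module, module.exports);
    return module.exports;
  }
  return executeModule(0); // 0 is the entry module
})({
${entries.join(",\n")}
})`;
}

// Hand-written CommonJS code standing in for real transpiler output.
const demoModules: ModuleObject[] = [
  {
    id: 0,
    deps: new Map([["./other", 1]]),
    transpiled: `exports.y = require("./other").x + 10;`,
  },
  { id: 1, deps: new Map(), transpiled: `exports.x = 5;` },
];

const bundle = generateBundle(demoModules);
// Evaluate the generated code and read what the entry module exported.
const entryExports = new Function(`return ${bundle};`)() as { y: number };
console.log(entryExports.y); // 15
```

In a real bundler the string would be written to a file and loaded with a script tag rather than evaluated with `new Function`.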

Advanced features

We have completed the basic implementation of a module bundler, and it produces code that runs in web browsers. But well-known module bundlers like webpack provide more features than this. I’ve implemented some of these additional features in Tinypack as well.

Preventing duplicated modules

With the implementation so far, if a module is imported by several modules, each importer treats it as a different module. Not only do we end up with duplicated code, but the module’s context is not shared between importers. To solve both issues, we resolve file paths to absolute paths and store each path with its module ID in a map (fileModuleIdMap). Before converting a dependency, we look up its module ID by absolute path, and only if no ID exists do we assign a new one (from the moduleID counter) and add the module to the queue.

// Resolve relative to the importing file's directory, not the file itself
let depPath = path.resolve(path.dirname(file), dep);
let depID = fileModuleIdMap.get(depPath);
if (depID === undefined) {
  depID = ++moduleID;
  fileModuleIdMap.set(depPath, depID);
  files.push(depPath);
}

Resolving circular dependency

A circular dependency means that the dependency graph contains a cycle; some call it a cyclic dependency. Suppose module x imports module y and module y imports module x. If we don’t handle this, modules keep being added to the conversion queue and conversion never ends. The solution is the same as for duplicates: we don’t convert a module that has already been processed, and only reuse its module ID. One additional condition must hold, though: a module must have its module ID assigned before it is processed. In our example, we need to know the module ID of module y before we build the dependency map for module x. This is why the previous code example assigns and saves the module ID first. Note that this resolves circular dependencies only at conversion time, not at execution time.
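For the execution side, one common approach (the one Node.js’s CommonJS loader takes) is to cache each module’s `module` object before running it, so that a require closing a cycle receives the partially filled exports instead of running the module again. The following is a sketch of that idea applied to the `executeModule` function from earlier. `createRuntime`, `ModuleTable`, and the two demo modules are hypothetical names, and nothing here is claimed to be how Tinypack handles it.

```typescript
type Exports = Record<string, unknown>;
type RequireFn = (path: string) => Exports;
type ModuleFn = (
  require: RequireFn,
  module: { exports: Exports },
  exports: Exports
) => void;
type ModuleTable = Record<number, [ModuleFn, Record<string, number>]>;

function createRuntime(modules: ModuleTable): (id: number) => Exports {
  const cache: Record<number, { exports: Exports }> = {};

  function executeModule(id: number): Exports {
    const cached = cache[id];
    // Already started (or finished): hand back its exports, possibly incomplete.
    if (cached !== undefined) return cached.exports;

    const mod = modules[id];
    const module = { exports: {} as Exports };
    cache[id] = module; // register BEFORE running, so cycles terminate
    mod[0]((path) => executeModule(mod[1][path]), module, module.exports);
    return module.exports;
  }

  return executeModule;
}

// Module 0 ("x") requires module 1 ("y"), and "y" requires "x" back.
const cyclic: ModuleTable = {
  0: [
    (require, _module, exports) => {
      exports.name = "x";
      exports.partner = require("./y").name;
    },
    { "./y": 1 },
  ],
  1: [
    (require, _module, exports) => {
      exports.name = "y";
      exports.sawX = require("./x").name;
    },
    { "./x": 0 },
  ],
};

const run = createRuntime(cyclic);
const xExports = run(0);
console.log(xExports.partner); // "y"
```

The cache also means each module runs only once, so its state is shared by every importer. The caveat is that a module inside a cycle may observe exports that are not complete yet: here module y sees `name` on x only because x sets it before calling require.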

Importing npm modules

npm, the package manager for Node.js, is often used for front-end JavaScript as well. In fact, npm is now used more for front-end code than for server-side code.


The npm Registry now distributes more front-end code than server-side JavaScript and is used to distribute every popular web development framework.

Source: The npm Blog

I suspect that many of you will want to bundle npm packages, so let’s see what that takes.

First, to resolve dependencies, we need to apply the rules used in Node.js:

  • If a module name starts with ., the module is a local module.
  • If a module name does not start with ., the module is an npm module.

These rules can be written in code as follows.

let depPath: string;
if (dep.startsWith(".")) {
  depPath = localModulePath(dep, file);
} else {
  depPath = npmModulePath(dep, file);
}
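The body of `npmModulePath` is not shown above. As a rough idea of what such a lookup involves, Node.js searches for a package in a `node_modules` directory next to the importing file and then in each parent directory up to the root. The sketch below lists those candidate locations and picks the first one that exists. It handles POSIX-style paths only, takes the existence check as a parameter so it runs without a file system, and uses the hypothetical names `nodeModulesCandidates` and `findPackageRoot`.

```typescript
// Lists the directories where a Node.js-style lookup would search for `pkg`,
// starting next to the importing file and walking up to the root.
function nodeModulesCandidates(pkg: string, importerFile: string): string[] {
  const parts = importerFile.split("/").slice(0, -1); // drop the file name
  const candidates: string[] = [];
  for (let i = parts.length; i >= 1; i--) {
    const dir = parts.slice(0, i).join("/");
    candidates.push(`${dir}/node_modules/${pkg}`);
  }
  return candidates;
}

// Returns the first candidate that exists according to `isDirectory`.
function findPackageRoot(
  pkg: string,
  importerFile: string,
  isDirectory: (path: string) => boolean
): string | undefined {
  return nodeModulesCandidates(pkg, importerFile).find(isDirectory);
}

// Pretend file system for the demo: one installed package.
const installed = new Set(["/app/node_modules/lodash"]);
const pkgRootFound = findPackageRoot(
  "lodash",
  "/app/src/ui/button.ts",
  (p) => installed.has(p)
);
console.log(pkgRootFound); // "/app/node_modules/lodash"
```

Passing the existence check in as a function keeps the lookup logic testable; a real implementation would back it with the file system.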

We also need to check the module and main fields in the npm module’s package.json file, as shown below.

let packageJSONPath = resolve(pkgRoot, "package.json");
if (isFile(packageJSONPath)) {
  let main: string =
    require(packageJSONPath).module ||
    require(packageJSONPath).main;
  if (main) {
    return resolve(pkgRoot, main);
  }
}

What else? We need to handle imports without a file extension, fall back to the index.js file, and so on; these tasks apply to both local modules and npm modules.
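A minimal sketch of those fallbacks might look like the following. The extension list, its order, and the `resolveFile` name are assumptions for illustration, and the file check is passed in as a function so the example runs without touching the disk.

```typescript
// Tries the path as given, then with each extension, then as a directory
// containing an index file. Returns the first candidate that is a file.
function resolveFile(
  base: string,
  isFile: (path: string) => boolean
): string | undefined {
  const extensions = [".ts", ".tsx", ".js"]; // assumed list and order
  const candidates = [
    base,
    ...extensions.map((ext) => base + ext),
    ...extensions.map((ext) => `${base}/index${ext}`),
  ];
  return candidates.find(isFile);
}

// Pretend file system for the demo.
const knownFiles = new Set(["/app/src/util.ts", "/app/src/ui/index.js"]);
const isKnownFile = (p: string): boolean => knownFiles.has(p);

console.log(resolveFile("/app/src/util", isKnownFile));    // "/app/src/util.ts"
console.log(resolveFile("/app/src/ui", isKnownFile));      // "/app/src/ui/index.js"
console.log(resolveFile("/app/src/missing", isKnownFile)); // undefined
```

The order of the candidates decides which file wins when several exist, so a real bundler has to pick that order deliberately.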

Conclusion

It was fun making a module bundler myself to understand the concept of module bundling. I’ve covered only a small part of what module bundlers do; they provide many more features, such as code splitting and dead code elimination. Have a go at making one yourself, too!

The bundler I made, Tinypack, is available on GitHub, so come and have a look (and drop a ⭐️!). By the way, Minipack was the motivation for making Tinypack; Minipack is quite interesting and the code is easy to understand, so do check it out!
