Web Workers 101 🕸⚙️: The Hidden Power of Web Workers in Microsoft Teams App

Learn how to leverage web workers by offloading heavy tasks, with a real-world example

This blog shows how you can use web workers to improve user experience and performance. By the end of it, you will have a solid grasp of web workers.

Web Workers 🕸⚙️

One of JavaScript's core traits is its single-threaded nature. Any piece of code that takes a long time to run can block the UI and make the user experience janky. Web Workers save us from this kind of problem.

"Web Workers allow you to do things like fire up long-running scripts to handle computationally intensive tasks, but without blocking the UI or other scripts to handle user interactions" ~ Google Web Dev

Blocking the Main Thread ⛔🧵

// Simulate a long-running synchronous operation
function simulateHeavyWork(duration) {
  const startTime = Date.now();
  while (Date.now() - startTime < duration) {}
}

// Perform the heavy work in the main thread
console.log('Starting heavy work...');
simulateHeavyWork(5000); // Simulate 5 seconds of heavy work
console.log('Heavy work completed.');

This code will block ⛔ your browser for 5 seconds ⌚. I ran this code right in my browser dev tools console. Code is here

I thought that making the above simulateHeavyWork() function async would solve this issue. BUT WAIT! It didn't.

async function simulateHeavyWork(duration) {
  const startTime = Date.now();
  while (Date.now() - startTime < duration) {}
}

console.log('Starting heavy work...');
simulateHeavyWork(5000) // Simulate 5 seconds of heavy work
  .then(() => {
    console.log('Heavy work completed.');
  });

Even though the function above is declared async, its body still runs synchronously until it reaches an await. There is no await here, so the synchronous while loop blocks the main thread exactly as before.
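To make that concrete, here is a tiny sketch (looksAsync and the order array are names I made up for illustration). It records the order in which things run when an async function has no await inside it.

```javascript
const order = [];

async function looksAsync() {
  // No await inside, so this body runs synchronously, start to finish.
  order.push('inside looksAsync');
}

// The .then() callback is queued as a microtask and runs only after
// the current synchronous code has finished.
looksAsync().then(() => order.push('then callback'));
order.push('after the call');

console.log(order.join(' -> ')); // inside looksAsync -> after the call
```

The function body runs before the line after the call, which is why a busy loop inside it still freezes the page.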

So I made some additional changes to my simulateHeavyWork() function, and it truly works now 😎. Each loop iteration awaits a short timer, which hands control back to the event loop.

async function simulateHeavyWork(duration) {
  const startTime = Date.now();
  while (Date.now() - startTime < duration) {
    await new Promise(resolve => setTimeout(resolve, 1)); // Introduce a non-blocking delay
  }
}

// Example usage
console.log('Starting heavy work...');
simulateHeavyWork(5000) // Simulate 5 seconds of heavy work
  .then(() => {
    console.log('Heavy work completed.');
  });
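The loop above only waits on timers, so it does not show real computation being split up. As a rough sketch of the same idea applied to actual work, the snippet below sums a large array in slices and yields to the event loop between slices. The names yieldToEventLoop and sumInChunks, and the chunk size, are my own assumptions for illustration, not tuned values.

```javascript
// Hand control back to the event loop so rendering and input handlers can run.
function yieldToEventLoop() {
  return new Promise((resolve) => setTimeout(resolve, 0));
}

// Sum a large array slice by slice, yielding after each slice.
// chunkSize is an illustrative default, not a tuned value.
async function sumInChunks(numbers, chunkSize = 10000) {
  let total = 0;
  for (let start = 0; start < numbers.length; start += chunkSize) {
    const end = Math.min(start + chunkSize, numbers.length);
    for (let i = start; i < end; i++) {
      total += numbers[i];
    }
    await yieldToEventLoop();
  }
  return total;
}

// Example usage: the numbers 1..100000
const numbers = Array.from({ length: 100000 }, (_, i) => i + 1);
sumInChunks(numbers).then((total) => {
  console.log(total); // 5000050000
});
```

The work still happens on the main thread, just in small pieces. If a single slice is itself too slow, that is a hint the work belongs in a Web Worker.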

Blocking the Main Thread (More Examples): ⛔🧵

Another potential example of blocking the main thread is writing a big, complex object to IndexedDB. The IndexedDB API itself is asynchronous, so the disk write does not block. The catch is that add() first has to serialize (structured-clone) the value on the calling thread, and for a large object that synchronous step can still stall the main thread.

fetch('https://jsonplaceholder.typicode.com/todos')
  .then(response => response.json())
  .then(data => {
    console.log('API response received. Starting IndexedDB write...');

    // `db` is assumed to be an already-open IDBDatabase with a 'todos' store.
    // The write itself is asynchronous, but add() must first clone `data`
    // on this thread, which can stall the UI if the data is big and complex.
    const transaction = db.transaction('todos', 'readwrite');
    const store = transaction.objectStore('todos');
    store.add(data);

    // The write has only finished once the transaction completes
    transaction.oncomplete = () => console.log('IndexedDB write completed.');
  })
  .catch(error => {
    console.error('Error:', error);
  });

Let me show you one more example. The first code snippet blocks the main thread UI and the second one does not.

Code Snippet: Block the Main Thread after making a fetch request

// Simulate a long-running synchronous operation
function simulateHeavyWork(duration) {
  const startTime = Date.now();
  while (Date.now() - startTime < duration) {}
}

// Make a Fetch API request with a long-running synchronous operation
fetch('https://jsonplaceholder.typicode.com/posts/1')
  .then(response => response.json())
  .then(data => {
    console.log('API response received. Starting heavy work...');
    simulateHeavyWork(5000); // Simulate 5 seconds of heavy work
    console.log('Heavy work completed.');
  })
  .catch(error => {
    console.error('Error:', error);
  });

console.log('Fetch API call initiated.');

Code Snippet: Non-Blocking the Main Thread after making a fetch request

// Simulate a delay in processing the API response without blocking the UI
async function simulateHeavyWork() {
  const startTime = Date.now();
  while (Date.now() - startTime < 5000) {
    await new Promise(resolve => setTimeout(resolve, 1));
  }
}

// Make a Fetch API request and simulate delayed processing
fetch('https://jsonplaceholder.typicode.com/posts/1')
  .then(response => response.json())
  .then(async data => {
    console.log('API response received and starting processing...');
    // Await it, otherwise the next log prints before the work has finished
    await simulateHeavyWork(); // Yields to the event loop, so the UI is not blocked
    console.log('API processing completed.');
  })
  .catch(error => {
    console.error('Error:', error);
  });

console.log('Fetch API call initiated.');

In the second code snippet, I made the simulateHeavyWork() function asynchronous, so it no longer blocks the UI.

Observations 🔍

  1. Now from the above examples, it is clear that any long-running 🕑 synchronous task can block the main thread UI.

  2. It matters how you write your asynchronous code ✅. It should be truly asynchronous, because certain synchronous operations within asynchronous code can still block the main thread ⛔🧵.

Good Questions to Ask ❓🙋‍♀️

  1. Now, the question is: can a long-running 🕑 asynchronous task block the main thread UI?

  2. When to use Web Workers vs. When to use Asynchronous code?

Answers to the above questions 🗣️

  1. No, a long-running asynchronous task will not block the main thread, as long as it spends its time waiting (on a timer or the network, for example) rather than computing. It still depends on how you write your code, so analyze what you write when in doubt. What if there are too many of them? Homework for you guys 🧚‍♂️

     // Simulate a long-running sequential task asynchronously
     async function simulateLongRunningTask() {
       console.log('Task started...');
    
       await performSubTask('Subtask 1', 3000); // Simulate subtask 1 taking 3 seconds
       await performSubTask('Subtask 2', 5000); // Simulate subtask 2 taking 5 seconds
    
       console.log('Task completed.');
     }
    
     // Simulate a subtask asynchronously
     async function performSubTask(subtaskName, duration) {
       console.log(`[${subtaskName}] Started...`);
       await new Promise(resolve => setTimeout(resolve, duration));
       console.log(`[${subtaskName}] Completed.`);
     }
    
     // Example usage
     console.log('Main thread started.');
    
     // Start the long-running task
     simulateLongRunningTask().then(() => {
       console.log('Long running task has finished.');
     });
    
     console.log('Main thread continues...');
    
  2. Well, if you are looking for true parallelism, Web Workers are the answer. Asynchronous code is concurrent, not parallel: it still runs on the main thread. Measure, measure, measure 📏: profile the piece of code in question and then decide what to use. Do not use Web Workers everywhere just because they provide parallelism.
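As a minimal sketch of the "measure first" advice, the snippet below times a synchronous function with performance.now() and compares the result against a budget. The helper names (measure, shouldConsiderWorker, sumTo) are hypothetical. The 50 ms budget is an assumption borrowed from the common "long task" threshold, so pick whatever fits your app.

```javascript
// Time a synchronous function using the high-resolution clock.
function measure(label, fn) {
  const start = performance.now();
  const result = fn();
  const elapsedMs = performance.now() - start;
  return { label, result, elapsedMs };
}

// Assumed budget: work longer than ~50 ms is commonly treated as a "long task".
const LONG_TASK_BUDGET_MS = 50;

function shouldConsiderWorker(elapsedMs, budgetMs = LONG_TASK_BUDGET_MS) {
  return elapsedMs > budgetMs;
}

// Illustrative workload: sum the integers 1..n.
function sumTo(n) {
  let total = 0;
  for (let i = 1; i <= n; i++) total += i;
  return total;
}

const report = measure('sumTo', () => sumTo(1000000));
console.log(
  `${report.label}: ${report.elapsedMs.toFixed(2)} ms`,
  shouldConsiderWorker(report.elapsedMs) ? '(consider a worker)' : '(fine on the main thread)'
);
```

For real pages, the Performance panel in your browser dev tools gives a fuller picture than a one-off timer like this.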

Let's write some Web Worker Code

Let's try to move simulateHeavyWork() to a worker 🕸⚙️.

index.html code below

<!DOCTYPE html>
<html>
  <head>
    <title>Parcel Sandbox</title>
    <meta charset="UTF-8" />
  </head>

  <body>
    <div id="app"></div>
    <div id="result"></div>
    <button id="startButton">Perform Heavy Task</button>
    <script src="./index.mjs" type="module"></script>
  </body>
</html>

index.mjs code is below

import './styles.css';

document.getElementById('app').innerHTML = `
<h1>A simple Web Worker Example</h1>
<div>
simulateHeavyWork() Function will be executed by Web Worker IF YOU CLICK ON BELOW BUTTON
</div>
`;
const startButton = document.getElementById('startButton');
const resultElement = document.getElementById('result');
const duration = 5000;

if (window.Worker) {
  console.info('Worker Supported');
  const worker = new Worker(new URL('worker.js', import.meta.url));

  startButton.addEventListener('click', () => {
    worker.postMessage(duration);
  });

  worker.onmessage = (event) => {
    resultElement.textContent = `Result: ${event.data}`;
  };
} else {
  console.error('Worker Not Supported');
}

worker.js code is below

onmessage = (event) => {
  const duration = event.data;
  simulateHeavyWork(duration);
  postMessage('Complete Running Web Worker');
};

function simulateHeavyWork(duration) {
  const startTime = Date.now();
  while (Date.now() - startTime < duration) {}
}

The above code does not block the main thread anymore. Code is here

How Microsoft Teams App uses Web Worker 🕸⚙️

In a blog post, Microsoft discussed how their Client Data Layer now runs in a Web Worker. Below is a small excerpt from the blog post.

Client Data Layer
One of JavaScript's core is its single-threaded nature. To overcome this limitation, we implemented a solution by moving the data management to a separate worker, known as the client data layer. This enabled data fetch, data storage, data compliance operations, push notifications, and offline functionality to run in parallel threads, without adding contention to the main user interface thread. ~ Microsoft Tech Team

MS new architecture

In the above picture, you can see how the Client Data Layer interacts with IndexedDB and other MS services, and performs data compliance, data transformation, etc.

We are going to implement a simple Client Data Layer 😀 which performs the same things.

Client Data Layer Implementation 🧑‍💼

We will try to implement a simple client data layer web worker which has the following responsibilities.

  1. Handle Data Fetching, Data Transformation, Data Compliance

  2. Handle Read/Write to IndexedDB

This will give us confidence that whenever you write a web worker, it will be smooth as butter 🧈. Code is here

Let me show you a diagram that will present an overview of the implementation

Let's start writing code

clientDataLayerWorker.js : Worker

import { CLIENT_DATA_LAYER_ACTION } from './constant';

// Handle postMessage request from Main Thread
self.addEventListener('message', (event) => {
  const { requestInfo, action } = event.data;

  switch (action) {
    case CLIENT_DATA_LAYER_ACTION.FETCH: {
      const fetchPromise = fetchData(requestInfo.url, requestInfo.options);
      Promise.resolve(fetchPromise)
        .catch((e) => console.error(e))
        .then((response) => {
          // Send message to the main thread with the response and action
          // (response is undefined if the fetch failed)
          self.postMessage({
            response,
            action
          });
        });
      break; // Without this, execution falls through to the default case
    }

    default:
      console.error(`Wrong ${action} Action Type`);
  }
});

async function fetchData(url, options) {
  const response = await fetch(url, options);
  const jsonData = await response.json();
  return jsonData;
}

In the above code, self refers to the worker's global scope, because the window object does not exist inside a worker. You can also use globalThis, which is easier because you do not have to remember whether the code is running in Node.js, a browser, or a worker.

So we attach a message event listener and listen for requests from the Main Thread🧡. Once a request is received, it will process and send back the output (could be anything, like Fibonacci, prime number, network response) to the Main Thread 🧡
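The './constant' module imported by the worker is not shown in this post. A minimal sketch of what it could contain is below. The string values and the isKnownAction() helper are my assumptions, and only the two action names come from the code in this post.

```javascript
// constant.js (hypothetical contents, the real file may differ)
// Object.freeze() guards against accidentally mutating the action names.
const CLIENT_DATA_LAYER_ACTION = Object.freeze({
  FETCH: 'FETCH',
  IDB_QUERY: 'IDB_QUERY',
});

// Hypothetical helper: lets the worker reject unknown actions early.
function isKnownAction(action) {
  return Object.values(CLIENT_DATA_LAYER_ACTION).includes(action);
}

// In the real module these would be exported, for example:
// export { CLIENT_DATA_LAYER_ACTION, isKnownAction };

console.log(isKnownAction('FETCH')); // true
console.log(isKnownAction('DELETE_EVERYTHING')); // false
```

Plain strings work well here because the action travels through postMessage(), which can only carry cloneable values.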

customFetch.js : Main Thread

import { CLIENT_DATA_LAYER_ACTION } from './constant';

const worker = new Worker(
  new URL('clientDataLayerWorker.js', import.meta.url),
  { type: 'module' }
);

export async function customFetch(url, options) {
  return new Promise((resolve, reject) => {
    // Listen for response from Worker
    worker.addEventListener('message', function handleMessage(event) {
      const { action } = event.data;
      if (action === CLIENT_DATA_LAYER_ACTION.FETCH) {
        // Return Response
        resolve(event.data.response);
      }
    });

    // Sending Request to Worker
    worker.postMessage({
      action: CLIENT_DATA_LAYER_ACTION.FETCH,
      requestInfo: { url, options },
    });
  });
}

We define a function called customFetch() which will be used in the app to fetch the data from the server.

This is how we can use it.

async function fetchPost() {
  const res1 = await customFetch('https://swapi.dev/api/planets/3/');
  const res2 = await customFetch('https://swapi.dev/api/planets/4/');

  console.log({ res1, res2 }); // Works fine

  const res3 =  await Promise.all([
    customFetch('https://swapi.dev/api/planets/1/'),
    customFetch('https://swapi.dev/api/planets/2/'),
  ]);

  console.log(res3); // Does not work

  return [res1,res2,...res3]
}

If you look carefully, the res3 output is wrong 🥵. It seems our code does NOT WORK ❌ with Promise.all([customFetch(), customFetch()]). See the diagram below.

Let me show you a diagram which will explain the root cause of this problem. This is very important ❗

You can see that when parallel/concurrent requests are sent by Promise.all([customFetch(), customFetch()]), each customFetch() adds a message 📨 event listener 📣 (which listens for a response from the Worker). Now there are two message event listeners in the Main Thread.

Whenever the Worker sends a Fetch Response, both of the event listeners receive it.

The catch here is that whichever request the Worker completes first is sent to the Main Thread, and both event listeners receive that same Fetch Response (since both of them are listening). Each listener resolves its own promise with it, so we see duplicate data 😆.
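Here is a small sketch that reproduces the problem without a real Worker. FakeWorker and naiveFetch are stand-ins I made up for illustration: FakeWorker answers each request after a delay, and naiveFetch mirrors the first version of customFetch().

```javascript
// Stand-in for a Worker (hypothetical): replies to each request after a delay.
class FakeWorker extends EventTarget {
  postMessage({ action, requestInfo }) {
    setTimeout(() => {
      const event = new Event('message');
      event.data = { action, response: `data for ${requestInfo.url}` };
      this.dispatchEvent(event);
    }, requestInfo.delay);
  }
}

const worker = new FakeWorker();

// Same shape as the first customFetch(): every call adds its own listener,
// and each listener accepts ANY response with the FETCH action.
function naiveFetch(url, delay) {
  return new Promise((resolve) => {
    worker.addEventListener('message', (event) => {
      if (event.data.action === 'FETCH') resolve(event.data.response);
    });
    worker.postMessage({ action: 'FETCH', requestInfo: { url, delay } });
  });
}

// The second request finishes first, and BOTH promises resolve with its data.
const duplicated = Promise.all([
  naiveFetch('/planets/1', 30),
  naiveFetch('/planets/2', 10),
]);

duplicated.then((results) => {
  console.log(results); // [ 'data for /planets/2', 'data for /planets/2' ]
});
```

A promise can only settle once, so when the slower response finally arrives, both promises have already resolved and it is silently ignored.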

Promise.all() and Web Workers

To address the issue outlined above, the event listener should exclusively accept the Fetch Response originating from the Request that initiated it, as demonstrated by the following example

We need to change our code in customFetch.js

import { CLIENT_DATA_LAYER_ACTION } from './constant';

const worker = new Worker(
  new URL('clientDataLayerWorker.js', import.meta.url),
  { type: 'module' }
);

export async function customFetch(url, options) {
  return new Promise((resolve, reject) => {
    // Create a unique id
    const id = Date.now() + Math.random();

    worker.addEventListener('message', function handleMessage(event) {
      // Does this unique id belong to me ? 
      if (id === event.data.id) {
        worker.removeEventListener('message', handleMessage);
        resolve(event.data.response);
      }
    });

    worker.postMessage({
      action: CLIENT_DATA_LAYER_ACTION.FETCH,
      requestInfo: { url, options },
      id,
    });
  });
}

Just by adding a unique ID for each customFetch() request, the event listener knows which response it has to resolve and which one to ignore.

Inside clientDataLayerWorker.js we should send back the unique ID

// ......Rest of the code
// Remember to read id from the message:
// const { requestInfo, action, id } = event.data;
Promise.resolve(fetchPromise)
  .catch((e) => console.error(e))
  .then((response) => {
    // console.info('Worker: Fetch', response);
    self.postMessage({
      response,
      action,
      // Send back the unique ID
      id,
    });
  });
// ......Rest of the code
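One possible variation on this idea, shown only as a sketch: register a single listener and keep the pending requests in a Map keyed by id, so listeners do not pile up and failures can reject the promise. createWorkerClient() is a hypothetical helper. It also assumes the worker replies with an error field when something goes wrong, which the worker code in this post does not do yet.

```javascript
// Hypothetical helper: wraps any Worker-like object
// (something with postMessage() and addEventListener()).
function createWorkerClient(worker) {
  const pending = new Map(); // id -> { resolve, reject }
  let nextId = 0;

  // One shared listener routes each response to the promise that asked for it.
  worker.addEventListener('message', (event) => {
    const { id, response, error } = event.data;
    const entry = pending.get(id);
    if (!entry) return; // Unknown or already-settled id
    pending.delete(id);
    if (error) {
      entry.reject(new Error(error)); // Assumes the worker sends { id, error }
    } else {
      entry.resolve(response);
    }
  });

  // Returns a promise that settles when the matching response arrives.
  return function request(action, requestInfo) {
    return new Promise((resolve, reject) => {
      const id = ++nextId; // A simple counter is unique per client
      pending.set(id, { resolve, reject });
      worker.postMessage({ action, requestInfo, id });
    });
  };
}
```

With this in place, customFetch(url, options) could be written as request(CLIENT_DATA_LAYER_ACTION.FETCH, { url, options }), where request is the function returned by createWorkerClient(worker).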

IndexedDB 📇 and Web Worker

We will start writing code for IndexedDB. I am going to use Dexie.js to query IndexedDB.

  1. Create an IndexedDB database in the Main Thread. Go to DevTools -> Application -> IndexedDB and you can see the created database. Code is here

import { Dexie } from 'dexie';

const dbName = 'MyDatabase';
function makeDatabaseReady() {
  const idbInstance = InitIDB(dbName);
  createSchema(idbInstance, { friends: '++id, name, age' });
  addFriends(idbInstance);
}

makeDatabaseReady();

function InitIDB(dbName) {
  try {
    return new Dexie(dbName);
  } catch (error) {
    console.error(error);
  }
}

function createSchema(db, schemaDefinition) {
  try {
    return db.version(1).stores(schemaDefinition);
  } catch (error) {
    console.log(error);
  }
}

  2. Send a signal (postMessage()) from the Main Thread to the Web Worker. The Web Worker queries IndexedDB and sends back the response.

async function getFriends() {
  const friends = await customIDBQuery(dbName, 1, schemaDefs[1], [
    {
      path: ['friends', 'bulkGet'],
      type: 'APPLY',
      argsList: [[1, 2]],
    },
  ]);

  return friends;
}

async function putFriends() {
  const friendID = await customIDBQuery(dbName, 1, schemaDefs[1], [
    {
      path: ['friends', 'put'],
      type: 'APPLY',
      argsList: [{ id: 5, name: 'Gaurav', age: 50 }],
    },
  ]);

  return friendID;
}

I am going to use customIDBQuery() to query IndexedDB. Notice 👀 the parameters for customIDBQuery(). We will talk later about this unusual way of sending the parameters.

Note: I have renamed customFetch.js to clientDataLayerInterface.js. All the logic to interact with the Web Worker is here.

Inside my clientDataLayerInterface.js file I am going to add one more function called customIDBQuery()

export function customIDBQuery(dbName, version, schemaDef, queries) {
  return new Promise((resolve, reject) => {
    const id = Date.now() + Math.random();
    worker.addEventListener('message', function handleMessage(event) {
      if (id === event.data.id) {

        worker.removeEventListener('message', handleMessage);
        resolve(event.data.response);
      }
    });

    // Only objects that can be structured-cloned can be sent.
    worker.postMessage({
      action: CLIENT_DATA_LAYER_ACTION.IDB_QUERY,
      requestInfo: {
        dbName,
        version,
        schemaDef,
        queries,
      },
      id,
    });
  });
}

Whenever we call postMessage(), the structured clone algorithm runs behind the scenes. It is responsible for creating deep copies of objects and sending them to the destination (a worker or the main thread).

So the parameters you send in postMessage() must be cloneable, otherwise it will throw a DataCloneError ❌.

Therefore we can't send a database instance, because it carries methods and internal state that cannot be cloned. We need to send the parameters in such a way that the Web Worker can figure out which query to execute.
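You can try the cloning rules yourself with the global structuredClone() function, which applies the same algorithm that postMessage() relies on. This is just a sketch: the query object mirrors the one used in this post, and the object holding a function is a made-up example of something that cannot be cloned.

```javascript
// A plain description of the query: arrays, strings and numbers clone fine.
const query = { path: ['friends', 'bulkGet'], type: 'APPLY', argsList: [[1, 2]] };
const copy = structuredClone(query);

console.log(copy !== query); // true (it is a new object)
console.log(copy.path !== query.path); // true (nested values are copied too)
console.log(copy.argsList[0][1]); // 2

// Functions (and therefore objects with methods) cannot be cloned.
let cloneError = null;
try {
  structuredClone({ run: () => 42 });
} catch (error) {
  cloneError = error;
}

console.log(cloneError.name); // DataCloneError
```

This is why the query is described with a path and an argument list instead of passing the database object or a callback.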

Inside my clientDataLayerWorker.js file I am going to add logic to handle IndexedDB queries

self.addEventListener('message', (event) => {
  const { requestInfo, action, id } = event.data;

  switch (action) {
    // .... Rest of the code
    case CLIENT_DATA_LAYER_ACTION.IDB_QUERY:
      const queryIDBPromise = queryIDB(
        requestInfo.dbName,
        requestInfo.version,
        requestInfo.schemaDef,
        requestInfo.queries
      );
      Promise.resolve(queryIDBPromise)
        .catch((e) => console.error(e))
        .then((response) => {
          console.log(response, id, action);
          self.postMessage({
            response,
            action,
            id,
          });
        });
      break;
    default:
      console.error(`Wrong ${action} Action Type`);
  }
});

async function queryIDB(dbName, version, schemaDef, queries) {
  // Dexie must be imported at the top of the worker file:
  // import { Dexie } from 'dexie';
  const { path, type, argsList } = queries[0]; // Only the first query is executed
  const originalDB = new Dexie(dbName);
  originalDB.version(version).stores(schemaDef);
  // You can think about closing this connection
  const query = await executePath(originalDB, path, type, argsList)();
  return query;
}

// UTILS FUNC

// Build IndexedDB query from the path and argsList
function executePath(instance, path, type, argsList) {
  let currentInstance = instance;

  for (const prop of path) {
    if (prop in currentInstance) {
      const methodOrProperty = currentInstance[prop];

      if (type === 'APPLY' && typeof methodOrProperty === 'function') {
        currentInstance = methodOrProperty.bind(currentInstance, ...argsList);
      } else {
        currentInstance = methodOrProperty;
      }
    } else {
      throw new Error(`Invalid path prop: ${prop}`);
    }
  }

  return currentInstance;
}

I know the above executePath() might be a bit complex, but its sole purpose is to build the IndexedDB query. It walks the path one property at a time and, when it reaches a function, binds it to its owner with the given arguments. Instead of using executePath() you can also use the Proxy pattern to build a more robust query. That's homework for you 🧚‍♂️. Let me know if you want a new blog on the Proxy pattern.
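To see what executePath() actually returns, here is a sketch that runs it against a plain object instead of a real Dexie database. fakeDb and its rows are made up for illustration. The executePath() function is repeated unchanged so the snippet runs on its own.

```javascript
// Same executePath() as above, repeated so this sketch is self-contained.
function executePath(instance, path, type, argsList) {
  let currentInstance = instance;

  for (const prop of path) {
    if (prop in currentInstance) {
      const methodOrProperty = currentInstance[prop];

      if (type === 'APPLY' && typeof methodOrProperty === 'function') {
        currentInstance = methodOrProperty.bind(currentInstance, ...argsList);
      } else {
        currentInstance = methodOrProperty;
      }
    } else {
      throw new Error(`Invalid path prop: ${prop}`);
    }
  }

  return currentInstance;
}

// Hypothetical stand-in for a database with one "friends" table.
const fakeDb = {
  friends: {
    rows: new Map([
      [1, { id: 1, name: 'Asha' }],
      [2, { id: 2, name: 'Ben' }],
    ]),
    // Uses `this`, so it only works if executePath() binds it to the table.
    bulkGet(ids) {
      return Promise.resolve(ids.map((id) => this.rows.get(id)));
    },
  },
};

// path ['friends', 'bulkGet'] + argsList [[1, 2]] becomes fakeDb.friends.bulkGet([1, 2])
const runQuery = executePath(fakeDb, ['friends', 'bulkGet'], 'APPLY', [[1, 2]]);

runQuery().then((friends) => {
  console.log(friends.map((friend) => friend.name)); // [ 'Asha', 'Ben' ]
});
```

Note that executePath() returns a bound function and does not call it, which is why queryIDB() adds the trailing () before awaiting.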

I am going to end it here 🔚🔜

End Notes

I hope you enjoyed the blog, see you soon.
