Accessing your local machine from a Docker container

A small entry about Docker. I am using Docker mostly in development, with one Docker container per web application I am running. It gives me the advantage of having different instances of Apache running that are isolated from each other. A while ago, I wanted to perform an HTTP request from inside a Docker container back to my development machine — I could not. I was hitting a 503.

Docker’s localhost is not my machine’s localhost. After some research, I found that using the DNS name docker.for.mac.host.internal was the way to access my developer machine. It works on macOS.
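
As an illustration, here is a minimal sketch of what a request from code running inside the container back to the host could look like. The port and route are assumptions for the example; only the docker.for.mac.host.internal host name comes from the tip above.

import axios from "axios";

// Reach a service listening on the host machine (macOS, Docker for Mac).
// Port 3000 and the /health route are hypothetical values for this sketch.
async function callHostFromContainer(): Promise<void> {
    const response = await axios.get("http://docker.for.mac.host.internal:3000/health");
    console.log(response.status);
}

callHostFromContainer();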

How to set up a TypeScript, NodeJS, Express Apollo Server for easy debugging with VsCode

There are a lot of keywords in the title, but this is not clickbait: we will set up, without too much burden, a simple configuration that allows Visual Studio Code (VsCode) to hook into a GraphQL installation. The idea is that every time a TypeScript file is saved, the file is automatically transpiled into JavaScript and Node reboots. The solution I propose can do the transpilation, the copy of the GraphQL schemas and the restart of NodeJS in under 2 seconds.

NPM

The first step is to get some NPM packages. The first one is named concurrently, which allows a single NPM command to execute multiple commands. This is required to have TypeScript in watch mode, to have a file watcher for the GraphQL schemas, and to restart Node if either of the previous two changes. The second is the package cpx, which can watch files and copy them when something changes. The third is TypeScript, which will watch all TypeScript files for changes and build them into the output folder. The fourth package is nodemon, which monitors file changes. If a file changes, it restarts Node.

"concurrently": "^4.1.0",
"cpx": "^1.5.0",
"typescript": "^3.2.2",
"nodemon": "^1.18.8"

Then a few NPM scripts are required.

"dev": "concurrently \"tsc -w\" \"npm run watchgraphql\" \"nodemon build/dist/index.js\"",
"debug": "concurrently \"tsc -w\" \"npm run watchgraphql\" \"nodemon --inspect build/dist/index.js\"",
"watchgraphql": "cpx 'src/graphql/schemas/**/*.graphql' build/dist/graphql/schemas/ -w -v"

There are two main scripts: dev and debug. I mostly run the second one because it does the same as the first one with the addition of opening a port for VsCode to connect to and debug the NodeJS (Express) server. What it does is start concurrently TypeScript in watch mode (-w), run watchgraphql and run nodemon to watch every file (the produced JavaScript and the GraphQL schema files). The GraphQL schemas have their own extension “.graphql” and are not moved like TypeScript files during the transpilation. Hence, a separate process is required to move the files when we edit them.

Visual Studio Code

Finally, within Visual Studio Code you need to create a debug launch configuration. The creation occurs under the fourth button of the menu, the one with a bug. It is possible to select “Add Configuration” in the dropdown to create a new debugging configuration. Here is the one I am using:

{
    "version": "0.2.0",
    "configurations": [
        {
            "name": "Attach to node",
            "type": "node",
            "request": "attach",
            "restart": true,
            "port": 9229
        }
    ]
}

It will attach to an existing instance, which means that debugging the startup of the NodeJS server does not work. You would need to change the setup to avoid using nodemon and have the VsCode debugger start the NodeJS server itself. It is not a use case that I need, hence I do not have it configured.

Once you have these scripts in place and the VsCode configuration saved, you can click the play button or press F5 to start debugging. It takes about 1-2 seconds to hook into the process. Any breakpoint you have set will stop the process and give you time to explore the variables. If you make a change during the debug session, NodeJS will restart and the debugging will stop and restart as well.

HTTP Request Debugging

I am using Axios, but other libraries also allow shimming a proxy to inspect HTTP requests and responses. It is very valuable when debugging an Apollo GraphQL server because you cannot rely on Chrome’s network tab the way you would with a web application. A trick with Axios is to set a proxy on the AxiosRequestConfig that points to your computer.

config.proxy = {
    host: "127.0.0.1",
    port: 5555
};

Then, running a tool like Postman as a proxy on the specified port is enough to receive every request with the proper HTTP headers in place.
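
To put the snippet above in context, here is a minimal sketch of a full request routed through the local inspection proxy. The endpoint and method are illustrative assumptions; only the proxy settings come from the article.

import axios, { AxiosRequestConfig } from "axios";

// Hypothetical outgoing request from the server, routed through a local
// proxy (e.g. Postman listening on port 5555) so it can be inspected.
const config: AxiosRequestConfig = {
    url: "https://api.example.com/data", // placeholder endpoint
    method: "get",
    proxy: {
        host: "127.0.0.1",
        port: 5555
    }
};

axios.request(config).then(response => console.log(response.status));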

Summary

I am far from being an expert in developing server-side applications with NodeJS. I was impressed by the quick result. Within an hour, I had a development environment that was efficient for developing and debugging. The experience of debugging directly in TypeScript is awesome, and seeing every request and response flow through the proxy tool is priceless when it is time to understand what is coming in and out of the GraphQL resolvers.


Burning the last GIT commit into your telemetry/log

I enjoy knowing exactly what happens in the systems that I am actively working on and that I need to maintain. One way to ease the process is to know precisely the version of the system when an error occurs. There are many ways to proceed, like having a sequential number that increases, or having a version number (major, minor, patch). I found that the easiest way is to leverage the GIT hash. The reason is that not only does it point me to a unique place in the life of the code, but it also removes all the manual incrementation that a version number requires, or having to use/build something that increments a number for me.

The problem with the GIT hash is that you cannot rely on it locally. The reason is that every change you make must be committed and pushed, hence the hash will always be at least one commit behind the latest code. The idea is to inject the hash at build time in the continuous integration (CI) pipeline. This way, the CI is always running on the latest code (or a specific branch) and knows which code is being compiled, and thus, without having to save anything, it can inject the hash.

At the moment, I am working with Jenkins and React using react-scripts-ts. I only had to change the build command to inject the result of a Git command into a React environment variable.

"build": "REACT_APP_VERSION=$(git rev-parse --short HEAD) react-scripts-ts build",

In the code, I can get the version by using the process environment.

const applicationVersion = process.env.REACT_APP_VERSION;

The code is minimal and leverages the Git system and an environment variable that can be read easily inside a React application. There is no mechanism to maintain, and the hash is a source of truth. When a bug occurs, it is easy to set up the development environment at the exact commit and to use the rest of the logs to find out how the user reached the exception.
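
As a small illustration, here is a sketch of how the injected hash could be burned into a log entry. The logError helper and the payload shape are my own assumptions, not code from the article.

// The short Git hash injected at build time, e.g. "a1b2c3d".
const applicationVersion = process.env.REACT_APP_VERSION;

// Hypothetical helper that stamps every logged error with the build version.
function logError(error: Error): void {
    console.error({
        version: applicationVersion,
        message: error.message,
        stack: error.stack
    });
}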

Telemetry as a Centerpiece of Your Software

Since my arrival at Netflix, I have been spending all my time working on the new Partner Portal of Netflix Open Connect. The website is private, so do not worry if you cannot find a way to access its content. I built the new portal with a few key architectural concepts as the foundation, and one of them is telemetry. In this article, I will explain what it consists of and why it plays a crucial role in the maintainability of the system, as well as how to iterate smartly.

Telemetry is about gathering insight into your system. The most basic telemetry is a simple log that adds an entry to a system when an error occurs. However, a good telemetry strategy reaches way beyond capturing faulty operations. Telemetry is about collecting behaviors of the users, behaviors of the system, misbehavior of the correctly programmed path, and performance by defining scenarios. The goal of investing time in a telemetry system is to raise awareness of what is going on on the client machine, as if you were behind the user’s back. Once the telemetry system is in place, you must be able to know what the user did. You can see telemetry as having someone dropping breadcrumbs everywhere.

A majority of systems collect errors and unhandled errors. Logging errors is crucial to clarify which ones occur in order to fix them. However, without a good telemetry system, it can be challenging to know how to reproduce them. Recording which pages the user visited with a very accurate timestamp, as well as with which query string, on which browser and from which link, is important. If you are using a framework like React and Redux, knowing which action was called, which middleware executed code and fetched data, as well as the timing of each of these steps, is necessary. Once the data is in your system, you can extract different views. You can extract all errors by time and divide them by category of error; you can see error trends going up and down when releasing a new piece of code.
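
As a rough illustration of recording Redux actions and their timing, here is a small middleware sketch. It is my own example under assumptions: the trackEvent function stands in for whatever telemetry call your system exposes.

import { Middleware } from "redux";

// Assumed to exist somewhere in the telemetry layer; not code from the article.
declare function trackEvent(name: string, data?: Record<string, unknown>): void;

// Records every dispatched action and how long the reducers took to process it.
export const telemetryMiddleware: Middleware = () => (next) => (action) => {
    const start = Date.now();
    const result = next(action); // let the reducers and downstream middlewares run
    trackEvent("reduxAction", {
        type: (action as { type: string }).type,
        durationMs: Date.now() - start
    });
    return result;
};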

Handling errors is one perspective, but knowing how long a user waited to fetch data is just as important. Knowing the key percentiles (5th, 25th, 50th, 75th, 95th, 99th) of your scenarios indicates how the user perceives your software. Decisions about which part needs improvement can be taken with certitude because they are backed by real data from users who consume your system. It is easier to justify engineering time to improve code that hinders the experience of your customers when you have hard data. Collecting data about scenarios is a source of feature popularity as well. Aggregating the count of a specific scenario by user can indicate whether a feature is worth staying in the system or should be promoted to be easier to discover. The conclusions drawn from telemetry values are subjective most of the time, but they are less opinionated than a raw gut feeling. Always keep in mind that a value may hide an undiscovered reality. For example, a feature may be popular but users hate using it — they just do not have any other alternative.
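
For concreteness, here is a tiny helper of my own making that shows how a percentile can be read from the collected durations of a scenario using the nearest-rank method; the sample values are invented.

// Nearest-rank percentile over durations (milliseconds) collected for one scenario.
// The input array must be sorted ascending; p is between 0 and 100.
function percentile(sortedDurationsMs: number[], p: number): number {
    const index = Math.ceil((p / 100) * sortedDurationsMs.length) - 1;
    return sortedDurationsMs[Math.max(0, index)];
}

const durations = [120, 150, 180, 210, 260, 320, 400, 650, 900, 1400];
console.log(percentile(durations, 50)); // typical wait
console.log(percentile(durations, 95)); // what the slowest users experience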

There are many kinds of telemetry, and when I unfolded my plan to collect them I created a TypeScript (client-side) library that is very thin, with 4 access points. The first one is named “trackError”. Its specialty is to track errors and exceptions. It is as simple as having an error name that allows grouping the errors easily (this is possible with handled errors caught in try-catch blocks) and contains the stack trace. The second one is “trackScenario”, which starts collecting the time from the start to the end. This function returns a “Scenario” object which can be ended but also has the capability of adding markers. Each marker lives within the scenario and allows fine-grained sub-steps. The goal is to easily identify what inside a scenario causes slowness. The third access point is trackEvent, which takes an event name and a second parameter that contains an unstructured object. It allows collecting information about a user’s behavior. For example, when a user sorts a list, there is an event “sortGrid” with a data object that has fields indicating which grid, the direction of the sort, which field is being sorted, etc. With the data of the event, we can generate many reports of how users are using each grid, or more generically which field, etc. Finally, it is possible to “trackTrace”, which allows specifying, with several trace levels (error, warning, info, verbose), information about the system. The library is thin, simple to use and has basic functionality like always sending the GIT hash of the code within the library, always sending the navigation information (browser info), having the user’s unique identifier, etc. It does not do much more. In fact, one more thing: it batches the telemetry and sends it periodically to avoid hammering the backend. The backend is a simple REST API that takes a collection of telemetry messages and stores them in an Elasticsearch persistence.
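
To make the shape of those four access points concrete, here is a minimal sketch of what the surface of such a library could look like. The names trackError, trackScenario, trackEvent and trackTrace come from the description above; the signatures and types are my assumptions.

export interface Scenario {
    addMarker(name: string): void; // fine-grained sub-step inside the scenario
    end(): void;                   // stops the timer and queues the measurement
}

export type TraceLevel = "error" | "warning" | "info" | "verbose";

export interface TelemetryClient {
    trackError(errorName: string, error: Error): void;
    trackScenario(scenarioName: string): Scenario;
    trackEvent(eventName: string, data?: Record<string, unknown>): void;
    trackTrace(level: TraceLevel, message: string): void;
}

// Hypothetical usage when a user sorts a grid:
declare const telemetry: TelemetryClient;
telemetry.trackEvent("sortGrid", { grid: "partners", field: "name", direction: "asc" });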

A key aspect, like many software architecture and process decisions, is to start right from the beginning. There are hundreds of usages of telemetry in the system at the moment, and it was not a burden to add them. The reason is that they were added continually during the creation of the website. Similar to writing unit tests, it is not a chore if you do not have to write them all at once. While coding the features, I had some reluctance about a few decisions, and I also had some ideas that were not unanimous.

The aftereffect of having all the data about the user soothes many hot topics by providing a reality check about how users are really using the system. Even when performing a thorough user session to ask how they are using the system, there is nothing like real data. For example, I was able to conclude that some users try to sort an empty grid of data. While this might not be the discovery of the century, I believe it is a great example of a behavior that no user would have raised. Another beneficial aspect is monitoring the errors and exceptions and fixing them before users report them. In the last month, I have fixed many (minor) errors and less than 15% were raised or identified by a user. When an error occurs, there is no need to consult the user — which is often hard since they can be remote, around the world. My daily routine is to sort all errors by count per day and see which ones are rising. I take the top ones, search for a user who had the issue, and look at the user’s breadcrumbs to see how to reproduce it locally on my developer machine. I fix the bug and push it to production. The next day, I look back at the telemetry to see if the count is going down. A proactive bug-fixing approach is a great sensation. You feel much less like you are putting out fires, which allows you to fix issues properly. Finally, with the telemetry system in place, when my day-to-day job is getting tiresome or I have a slump in productivity, I take the opportunity to start querying the telemetry data and breaking it into several dimensions, with the goal of shedding some light on how the system is really used and how it can be improved to provide the best user experience possible.

This article was not technical. I will follow up in a few days with more detail about how to implement telemetry with TypeScript in a React and Redux system.

Using TypeScript and React to create a Chrome Extension Developer Tool

I recently started a side project that I am dogfooding at Netflix. It is a library that handles HTTP requests by acting as a proxy to cache requests. The goal is to avoid redundant parallel calls and to avoid requesting data that is still valid, while being as simple as possible. It has few configurations, just enough to be able to customize the level of cache per request if required. The goal of this article is not to sell the library but to explain how I created a Chrome extension that listens to every action and collects insight to help the developer understand what is going on underneath the UI. This article will explain how I used TypeScript and React to build the UI. For information about how the communication is executed behind the scenes, from the library to Chrome, you should refer to this previous article.


Here is an image of the extension at the time I wrote this article.

A Chrome extension for the developer tools requires a panel, which is an HTML file. React with Create-React-App generates a static HTML file that bootstraps React. There is a flavor of create-react-app with TypeScript that works similarly, but with TypeScript. In both cases, it generates a build in a folder that can be published as a Chrome extension.

The build folder content can be copied and pasted into your distribution folder along with the manifest.json, contentScript.js and background.js files that were discussed in the article about communication between your code and a Chrome extension.

What is very interesting is that you can develop your Chrome extension without being inside the developer tools. By staying outside, you increase your development velocity because you do not need to build, which uses Webpack — this is slow. Staying inside also requires closing and reopening the extension, which in the end consumes time for every little change. Instead, you can mock the data and leverage the hot-reload mechanism of create-react-app by starting the server (npm run start) and running the Chrome extension as an independent website until you are ready to test the full-fledged extension with communication coming from outside your extension.

Running the website with create-react-app is a matter of running a single command (start); however, you need to indicate to the panel’s code that you do not expect to receive messages from Chrome’s runtime. I handle the two modes by passing an environment variable on the command line. In the package.json file I added the following script:

"start": "REACT_APP_RUNENV=web react-scripts-ts start",

Inside the React app.tsx file, I added a check that decides whether to subscribe to Chrome’s runtime message listener or to be injected with fake data for the purpose of web development.

if (process.env.REACT_APP_RUNENV === "web") {
    // Inject fake data in the state
} else {
    // Subscribe to postMessage event
}

Finally, using TypeScript and React is a great combination. It clarifies the message that is expected at every point in the communication, and it also simplifies the code by removing any potential confusion about what is required. Also, React is great in terms of simplifying the UI and the state. While the Data Access Gateway Chrome extension is small and does not use Redux or another state management library, it can leverage React’s state at the app.tsx level. It means that saving and loading the user’s data is a matter of simply dumping the state into Chrome’s local storage and restoring it — that is it. Nothing more.

public persistState(): void {
  const state = this.state;
  chrome.storage.local.set({ [SAVE_KEY]: state });
}
public loadState(): void {
  chrome.storage.local.get([SAVE_KEY], (result) => {
    if (result !== undefined && result[SAVE_KEY] !== undefined) {
      const state = result[SAVE_KEY] as AppState;
      this.setState(state);
    }
  });
}

To summarize, a Chrome extension can be developed with any front-end framework. The key is to bring the build result along with the required files and make sure you reference the generated index in the manifest.json. React works well, not least because it generates the entry point for you as a simple HTML file, which is the format required by a Chrome extension. TypeScript is not a hurdle because the files generated by the build are JavaScript, hence no difference. React and TypeScript are a great combination. With the ability to develop the extension outside of Chrome’s extension environment, you can gain velocity and rapidly have a product in a shape that can be used by your users.

How to communicate from your website to a Chrome Extension

Passing a message from a website to a Chrome Extension is not a routine job. Not only is communication between a specific piece of code in the browser and a specific browser extension unusual, it is also confusing because of the different types of extensions. In this article, I’ll focus on an extension that goes into the Chrome Developer Tools. Similar to the “Elements” or “Network” tab, the extension will have its own tab that will be populated by the website. To be more accurate, it could be any website using a specific library.

The illustration shows the concept of what is happening. The reality is a little bit more complicated. There are more communication boundaries required, which can be confusing at first. The documentation is great but it lacks guidance for a first-time user. The following illustration shows what happens in terms of communication, and with that in mind, the flow should be easier to understand.

The part of your library that sends information to your extension is very simple. It consists of using “window.postMessage” to send an object. The extension will read and parse your payload depending on the source. For my library and extension, named Data Access Gateway, I decided to use the source name “dataaccessgateway-agent”. The name could be anything. Keep in mind that later, you will reuse the name in the extension code to verify that the message is coming from your source.

window.postMessage({
    source: "dataaccessgateway-agent",
    payload: requestInfo
}, "*");

The payload may be anything you want, but make sure it remains a plain object that is not constructed (with “new”). For example, if your payload has a date, make sure it is not in the payload as an actual Date object but in a more primitive form (string or number). Otherwise, you will receive an exception.
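
Here is a small sketch of what that looks like in practice; the requestInfo fields are invented for illustration, while the source name is the one used above.

// Per the note above, keep dates as primitives so nothing constructed travels in the payload.
const requestInfo = {
    id: "fetch-users",                  // hypothetical request identifier
    startedAt: new Date().toISOString() // string instead of a Date instance
};

window.postMessage({
    source: "dataaccessgateway-agent",
    payload: requestInfo
}, "*");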

The next step is to configure the manifest file for the extension. The critical detail is to specify two JavaScript files: the background script and the content script. The former runs regardless of which website is active. It runs in the background of the Chrome Extension from when the extension is loaded until it is unloaded. The latter is a script that the extension injects into the webpage. The injection can be targeted at a specific page or run on all webpages. In my case, the extension must receive a message from a library, hence I do not know which website might use it and I allow the injection on every page. Because of this requirement to be available on every page, the security and the communication are more involved than what most of the basic documentation covers.

{
    "name": "Data Access Gateway Developer Tool",
    "version": "1.0",
    "description": "Data Access Gateway Developer Tool that allows getting insight about how the data is retrieved",
    "manifest_version": 2,
    "permissions": [
        "storage",
        "http://*/*",
        "https://*/*",
        "<all_urls>"
    ],
    "background": {
        "scripts": [
            "background.js"
        ],
        "persistent": false
    },
    "icons": {
        "16": "images/dagdl16.png",
        "32": "images/dagdl32.png"
    },
    "minimum_chrome_version": "50.0",
    "devtools_page": "index.html",
    "content_security_policy": "script-src 'self' 'unsafe-eval'; object-src 'self'",
    "content_scripts": [
        {
            "matches": [
                "<all_urls>"
            ],
            "js": [
                "contentScript.js"
            ],
            "run_at": "document_start"
        }
    ]
}

The manifest file asks for permissions and specifies “index.html”, which is the file loaded when the Chrome Developer Tools panel is open. We will come back to the HTML file later. The important parts are background.js and contentScript.js. Both files can be renamed as you wish. Before moving on, it is important to understand that the communication flows in this particular order: postMessage -> contentScript.js -> background.js -> dev tools HTML page. The core of the code will be in the HTML page and the rest is just a recipe that must be followed to be compliant with Chrome’s security.

The contentScript.js is the file injected into the webpage. The sole purpose of this file is to listen for messages passed by “window.postMessage”, to check the payload, make sure it is one we are interested in, and pass it along to Chrome’s runtime. The following code registers a “message” listener when the webpage loads. The script captures the “postMessage” calls and checks the source. When it is the agent name defined in the previous step, we invoke sendMessage from Chrome’s runtime. The invocation passes the message to the background.js file.

window.addEventListener("message", (event) => {
    // Only accept messages from the same frame
    if (event.source !== window) {
        return;
    }

    var message = event.data;

    // Only accept messages that we know are ours
    if (typeof message !== "object" || message === null || !!message.source && message.source !== "dataaccessgateway-agent") {
        return;
    }
    chrome.runtime.sendMessage(message);
});

The next step is to listen to Chrome’s runtime messages in the background script. More code is required. There is a collection of tab ports that handles the multiple-tab situation and knows where each message should go. There are two listeners: one handles incoming messages from the content script and dispatches them to the proper tab’s port; the other handles a new connection from the developer tools, registering the tab’s port and unregistering it on disconnect.

let tabPorts: { [tabId: string]: chrome.runtime.Port } = {};
chrome.runtime.onMessage.addListener((message, sender) => {
    const port = sender.tab && sender.tab.id !== undefined && tabPorts[sender.tab.id];
    if (port) {
        port.postMessage(message);
    }
    return true;
});

chrome.runtime.onConnect.addListener((port: chrome.runtime.Port) => {
    let tabId: any;
    port.onMessage.addListener(message => {
        if (message.name === "init") { // set in devtools.ts
            if (!tabId) {
                // this is a first message from devtools so let's set the tabId-port mapping
                tabId = message.tabId;
                tabPorts[tabId] = port;
            }
        }
    });
    port.onDisconnect.addListener(() => {
        delete tabPorts[tabId];
    });
});

The port.postMessage call sends the payload one last time. This time, it will be within reach of your Chrome Developer Tools extension. You may remember that in the manifest file we also specified an HTML file. This file can reference a JavaScript file that will listen to the messages from the background.js script. I am developing the Data Access Gateway Chrome Extension with React, so the index.html starts index.jsx, which attaches app.jsx, which has the listener in its constructor.

this.port = chrome.runtime.connect({
    name: "panel"
});

this.port.postMessage({
    name: "init",
    tabId: chrome.devtools.inspectedWindow.tabId
});

this.port.onMessage.addListener((message: Message) => {
    if (message.source === "dataaccessgateway-agent") {
        // Do what you want with the message object
        // E.g. this.setState(newState);
    }
});

chrome.devtools.panels.create(
    "Data Access Gateway",
    "images/dagdl32.png",
    "index.html"
);

There are still quite a few lines of code before actually doing something in the Chrome extension. The first step is to connect to the tab (port). Then, we initialize the communication by sending a post message with the inspected tab’s id. Next, on the connected port, we start listening to incoming messages. Finally, we invoke the creation of the panel. As you might have seen, the “addListener” callback is strongly typed with the object I sent from the initial library call — that is right! TypeScript is supported in each of these steps. You can see all the details, in TypeScript, in the GitHub repository of the Data Access Gateway Chrome Extension.

To conclude, the transportation of your object from your library (or website) to the Chrome Developer Tools panel is not straightforward. It requires multiple steps which can fail in several places. A trick I learned while developing the extension is that “console.warn” is supported. You can trace and ensure that the data is passing through as expected. Another debugging trick: you should undock the developer tools (to have them in a separate window), which allows you to press “command+i” on Mac or “F12” on Windows to debug the Chrome developer tools themselves. This is the only way not only to see your tracing but also to set a breakpoint in your code.

NPM locally to increase coding velocity between two projects

One advantage of having a big repository is being able to change a part quickly and see the result right away. Recently, I moved a piece of code into another library. The reason was to reuse the library in several systems. The concept of breaking code apart into different cohesive libraries makes sense; however, it comes with the price that a quick one-line change can become more demanding. I found that it is roughly 5 to 10 times slower. The reason is that a single-line change in the same repository only requires saving the file and the code is ready to use. The same change in another repository requires packaging the source code and fetching the new package. The major problem is that everyone using the package sees the version bump, and the release may not even be ready to share. Ideally, a library is self-contained, and a group of unit tests ensures that the quality is as expected. However, in reality, it appears that a shareable version may require some checks directly in the browser.

NPM provides a solution. The solution is to link the library code locally to a local project. The beauty is that no code modification is required. The solution tells NPM to link locally instead of getting the package from the remote server.

The first step is to have both the project that consumes the library and the library itself locally on your computer. Both projects can reside anywhere.

The second step is to go with your command line to the location of your library’s package.json. NPM has a command called “link”. The execution of the command will tell you whether the creation succeeded or not. If it succeeded, you can use your command prompt to move to the project that consumes the library. Again, at the level of the package.json, the command “npm link” must be executed. The difference is the argument, which needs to specify which library to link. The name of the library is the name specified in the package.json of the library that we linked in the second step. This command succeeds with an output that shows that node_modules points to the local directory.

Finally, once you are done and want to go back to using the published node_modules library, you can unlink. The command to unlink is “npm unlink my-library-name”, where the parameter is the name of the library that was linked. The library can also be unlinked by going back to the library and executing “npm unlink”.

As a recap:

// 1) At the library level
npm link
// 2) At the project level (consumer)
npm link my-library-name
// 3) Stop using the local version
npm unlink my-library-name
// 4) Stop sharing locally my-library-name (must move back to library level)
npm unlink

The technique works with TypeScript. The compilation of the library needs to occur because the link to the library reads the package.json and looks for the “files” property, which mostly points to the build folder.

Before closing on the subject, make sure that both projects are running on the same NodeJS and NPM versions. Each NodeJS version links to a different folder. You can confirm the version in use with “node -v”. Another tip is for people using “create-react-app”. In that case, you may have to close and reopen the development server on each change. The reason is that Webpack does not notice changes in the node_modules folder and will keep serving the same files it had at startup.

Debug TypeScript Unit Test with Jest and VsCode

If you are using create-react-app, or its TypeScript equivalent react-scripts-ts, you will see that the default testing framework is Jest. It is developed by Facebook, like React and Redux. Jest brings many advantages, like being fast. It doesn’t need to load a headless browser, or any browser at all. It is also fast because it can run only the unit tests of changed tests, or the unit tests related to the changed code, instead of running every test. In this article, I’ll guide you through setting up Visual Studio Code to be able to debug directly in TypeScript. I am taking the time to write this because information on the Internet is very slim and the configuration is fragile.

As mentioned, configuring Visual Studio Code with Jest requires subtle details that can break the whole experience. For example, using Node 8.1.4 won’t work, while using Node 8.4 or 8.6 works. Another sensitive area is the configuration of Visual Studio Code itself. It requires some specific configurations, which vary. The following code shows two different launchers that work with Visual Studio Code.

{
    "type": "node",
    "request": "launch",
    "name": "Jest 1",
    "program": "${workspaceRoot}/node_modules/jest/bin/jest",
    "args": [
        "-i"
    ],
    "preLaunchTask": "tsc: build - tsconfig.json",
    "internalConsoleOptions": "openOnSessionStart",
    "console": "integratedTerminal",
    "outFiles": [
        "${workspaceRoot}/build/dist/**/*"
    ],
    "envFile": "${workspaceRoot}/.env"
}

// OR

{
    "name": "Jest 3",
    "type": "node",
    "request": "launch",
    "program": "${workspaceRoot}/node_modules/jest-cli/bin/jest.js",
    "stopOnEntry": false,
    "args": [
        "--runInBand"
    ],
    "cwd": "${workspaceRoot}",
    "preLaunchTask": null,
    "runtimeExecutable": null,
    "runtimeArgs": [
        "--nolazy"
    ],
    "env": {
        "NODE_ENV": "test"
    },
    "console": "integratedTerminal",
    "sourceMaps": true
}

The second one requires jest-cli; the first one does not. To download jest-cli, use NPM.

npm install --save-dev jest-cli

From there, you can run the debug configuration directly inside Visual Studio Code under the debug tab, or hit F5.

Resizing an Image with NodeJs

This is the second post about my project of creating a search tool for local pictures. As mentioned in the first post, this tool needs to use a web service to get information about the pictures. This means we need to upload the image that the Microsoft Cognitive Vision/Face service will analyze, and it returns a JSON object with information about the picture. Like most services, there are some constraints in terms of the minimum and maximum size of what you can upload. Also, even for us, we do not want to send a 25 meg picture when it is not necessary. This article discusses how to resize pictures before sending a request to the web service. This will not only keep us within the range of acceptable values, but also speed up the upload.

I decided to take the arbitrary value of sending pictures with the widest side at 640px. This produces on average a 30kb file, which is tiny but still enough for the cognitive service to give very good results. This value may not be good if you are building something similar where people are far away or if you are not using portrait pictures. In my case, the main subjects are always at close range, hence it is very easy to get detail at that small resolution.

Resizing a file requires a third-party library. This is something easy to find with JavaScript, and NPM has a library named “Sharp” that does it perfectly. The TypeScript definition file is also available, so we are in business!

npm install --save sharp
npm install --save-dev @types/sharp

Before anything, even if this project is just for myself, I defined some configuration variables. Some rigor is required when it’s cheap to do! The first constant is the maximum size we want the output image to have. I chose 640 pixels. The directory name is the constant for the folder where we will save the resized image we will send and where we will later save the JSON file with the analyzed data. We save the resized image because on the website, later, we will use this small image instead of the full-resolution image. The website will be snappy, and since we have the file, why not use this optimization for free. At 30kb for 2000 images, we only use 58 megs. The last constant is the glob pattern to get all underscore JPEG pictures. We will talk about glob very soon.

const maxSize = 640;
const directoryName = "metainfo";
const pathImagesDirectory = path.join(imagesDirectory, "**/_*.+(jpg|JPG)");

The second pre-task is to find the images to resize. Again, this requires a third-party library to simplify our life. We could recursively navigate folders, but it is nicer to have a single glob pattern that handles it.

npm install --save glob
npm install --save-dev @types/glob

From there, we need to import the modules. We will bring in the path and fs modules of NodeJs to be able to build proper paths and to save files on disk.

import * as g from "glob";
import * as path from "path";
import * as sharp from "sharp";
import * as fs from "fs";

The first function that we need to create is the one that returns the list of strings representing the files to resize. This will be all our underscore, aka best, pictures. We want to be sure that we can re-run this function multiple times, thus we need to ignore the output folder where we save the resized images. This function returns the list in a promise fashion because the glob library is asynchronous. Here is the first version, which calls the module’s “Glob” function and adds everything into an array while printing each file to the console for debugging purposes.

function getImageToAnalyze(): Promise<string[]> {
    const fullPathFiles: string[] = [];
    const promise = new Promise<string[]>((resolve, reject) => {
        const glob = new g.Glob(pathImagesDirectory, { ignore: "**/" + directoryName + "/**" } as g.IOptions, (err: Error, matches: string[]) => {
            matches.forEach((file: string) => {
                console.log(file);
                fullPathFiles.push(file);
            });
            resolve(fullPathFiles);
        });
    });
    return promise;
}

This can be simplified by resolving directly with the matches string array and returning the promise instead of using a variable. In the end, if you are not debugging, you can use:

function getImageToAnalyze(): Promise<string[]> {
    return new Promise<string[]>((resolve, reject) => {
        const glob = new g.Glob(pathImagesDirectory, { ignore: "**/" + directoryName + "/**" } as g.IOptions, (err: Error, matches: string[]) => {
            resolve(matches);
        });
    });
}

As mentioned, the quality of this code is average. In reality, some love is missing around the error scenarios. Right now, if something goes wrong, the promise rejection simply bubbles up.

At this point, we can call the method with:

console.log("Step 1 : Getting images to analyze " + pathImagesDirectory);
getImageToAnalyze()
    .then((fullPathFiles: string[]) => {
        console.log("Step 2 : Resize " + fullPathFiles.length + " files");
        return resize(fullPathFiles);
    })

The code inside the “then” is executed if the promise resolves successfully. It starts resizing the list of pictures by passing the list into the function that we will create in an instant.

The resize function is not the one that does the actual resizing. It calls the function that does the resize only if the picture has not been resized yet. This is great if something fails and you need to re-run. The resize function checks in the “metainfo” folder, where we output the resized pictures, and only resizes a picture if it is not already present. In both cases, this function returns a promise. The type of the promise is a list of IImage.

export interface IImage {
    thumbnailPath: string;
    originalFullPathImage: string;
}

This type holds the full path of the thumbnail “resized” picture and of the original picture. When the picture has already been resized, we just create an instance; when we do not have the image, we create it and then return a new instance. This method waits for all resizes to occur before resolving. This is the reason for the Promise.all. We are doing so just to have a clear cut before moving to the next step, and since we are launching multiple resizes in parallel, we wait for them all to be done before analyzing.

function resize(fullPathFiles: string[]): Promise<IImage[]> {
    const listPromises: Array<Promise<IImage>> = [];
    const promise = new Promise<IImage[]>((resolve, reject) => {
        for (const imagePathFile of fullPathFiles) {
            const thumb = getThumbnailPathAndFileName(imagePathFile);
            if (fs.existsSync(thumb)) {
                listPromises.push(Promise.resolve({ thumbnailPath: thumb, originalFullPathImage: imagePathFile } as IImage));
            } else {
                listPromises.push(resizeImage(imagePathFile));
            }
        }
        Promise.all(listPromises)
            .then((value: IImage[]) => resolve(value));
    });
    return promise;
}

This function uses a helper to get the thumbnail path and look up whether it has already been created or not. This helper calls another one, and both of these methods have the same goal of providing a path. The first one, getThumbnailPathAndFileName, takes the original full-quality picture path and returns the full path of where the resized thumbnail is stored. The second one is a function that will be reused on several occasions, and it returns the metainfo directory. This is where the resized pictures are stored, but also where the JSON files with the analytic data are saved.

function getThumbnailPathAndFileName(imageFullPath: string): string {
    const dir = getMetainfoDirectoryPath(imageFullPath);
    const imageFilename = path.parse(imageFullPath);
    const thumbnail = path.join(dir, imageFilename.base);
    return thumbnail;
}

function getMetainfoDirectoryPath(imageFullPath: string): string {
    const onlyPath = path.dirname(imageFullPath);
    const imageFilename = path.parse(imageFullPath);
    const thumbnail = path.join(onlyPath, "/" + directoryName + "/");
    return thumbnail;
}

The last method is the actual resize logic. The first line of the method creates a “sharp” object for the desired picture. Then we invoke the “metadata” method, which gives us access to the image information. We need this to get the actual width and height, determine the wider side, and find the resize ratio. Once we know the height and the width of the thumbnail, we need to create the destination folder before saving. Finally, we call the “resize” method with the calculated height and width. The “webp” method is the one that chooses the output format. From there, we could generate a buffered image and use a stream to handle it in memory, or store it on disk like we do with the “toFile” method. This returns a promise that we use to build and return the IImage.

function resizeImage(imageToProceed: string): Promise<IImage> {
    const sharpFile = sharp(imageToProceed);
    return sharpFile.metadata()
        .then((metadata: sharp.Metadata) => {
            const actualWidth = metadata.width;
            const actualHeight = metadata.height;
            let ratio = 1;
            if (actualWidth > actualHeight) {
                ratio = actualWidth / maxSize;
            } else {
                ratio = actualHeight / maxSize;
            }
            const newHeight = Math.round(actualHeight / ratio);
            const newWidth = Math.round(actualWidth / ratio);
            const thumbnailPath = getThumbnailPathAndFileName(imageToProceed);
            // Create directory thumbnail first
            const dir = getMetainfoDirectoryPath(imageToProceed);
            if (!fs.existsSync(dir)) {
                fs.mkdirSync(dir);
            }

            return sharpFile
                .resize(newWidth, newHeight)
                .webp()
                .toFile(thumbnailPath)
                .then((image: sharp.OutputInfo) => {
                    return { thumbnailPath: thumbnailPath, originalFullPathImage: imageToProceed } as IImage;
                });
        }, (reason: any) => {
            console.error(reason);
        });
}

This concludes the resize part of the project. It’s not as straightforward as it may seem, but nothing is rocket science either. This code could be optimized to start resizing without first checking whether all the images are present or not. Some refactoring could be done around the ratio logic within the promise callback of sharp’s metadata method. We could also optimize the write to remain in memory, and hence avoid reloading the thumbnail from disk by working on the memory buffer. The last optimization wasn’t done because I wanted every step to be re-executable from whatever state it was stopped in, and I didn’t want to bring in more logic to reload into memory what was already generated. That said, it could be done. The full project is available on GitHub: https://github.com/MrDesjardins/CognitiveImagesCollection

Create a Local Search Tool for Pictures in NodeJs

I recently searched for a specific picture of my daughter on my local drive with some difficulty. First, I take a lot of pictures and it was hard to find. Second, I have some good pictures and some average ones, but I keep them all, hence I have thousands and thousands of pictures that are not easy to search. However, since 2003 I have had a systematic way to store my pictures: a main folder that contains one folder per year, and every year has many folders, one per event. The event folder always has the format “yyyy-mm-dd-EventDescriptionIn2words”. I also have the habit of prefixing the best pictures with an underscore inside these folders. Still, the picture names are always the sequential number of my camera and they are not consecutive in time. There is no way I can search for “Alicia happy in red dress during summer 2015”, for example.

Here comes the idea that I started a few weeks ago: having a training set of pictures that will serve as a base for the system to figure out who is in my pictures, and having a service that analyzes what is in each picture. On top of the data, a simple website lets me query the database of pictures and returns the best matches with a link to the actual full-quality picture. Before going any further, a word of caution: the idea of this project is not to develop something that will scale, or stellar code, hence the quality of the code is very average, but it is a workable solution. Everything is developed with NodeJs, TypeScript, Microsoft Cognitive Api and MongoDb, and it doesn’t have any unit tests. I may refactor this project someday, but for the moment, let’s just get our heads around how to do it.

I’ll write several posts around this project. In fact, at the moment I am writing this article, I am only halfway through the first phase, which is analyzing a small subset of my pictures. This article will serve more as a description of what will be built.

The first thing we need to do is read a sample of all the images. For me, instead of scanning and analyzing my whole hard drive for pictures, I will analyze only pictures within a specific range of dates. At this date, I have 34,000 pictures taken since 2009 (since I met my wife), and in this population 2,000 have been identified with an underscore, which means that I really like them. For the purpose of having a smaller search set and not having to analyze for too long, I will only use the pictures with an underscore. Second, in these pictures, I can say that roughly 75% of the people are my wife, my daughter or me. Hence, I will only try to identify these people and mark others as “unknown”. Third, I want to be able to know the emotion and what is going on in the picture. This will require a third-party service, and I will use the Microsoft Azure Cognitive API. I’ll go into more detail about the API in a later article.

Once the pictures are analyzed, the data will be stored in MongoDB, which is a JSON-based storage. This is great because the result of all the analysis will be in JSON format. It will allow us to query the content to get results to display on the website. To simplify this project, I will mark the first milestone as scanning the pictures and creating one JSON file per underscore file inside a “metainfo” folder. The second milestone will be to hydrate MongoDB, and the third one to create a simple web application that will communicate with MongoDB and display the results.
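
To give an idea of the first milestone, here is a rough sketch of what one of those “metainfo” JSON files could contain, expressed as a TypeScript interface. The field names are my assumptions based on the description above, not the actual project code.

// Hypothetical shape of one metainfo JSON file produced per underscored picture.
interface PictureMetaInfo {
    originalFullPathImage: string; // full-quality picture on disk
    thumbnailPath: string;         // resized copy sent to the cognitive service
    people: string[];              // e.g. ["wife", "daughter", "me", "unknown"]
    emotions: string[];            // returned by the cognitive service
    description: string[];         // what is going on in the picture
    takenDate?: string;            // derived from the "yyyy-mm-dd-Event" folder name
}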

I’ll stop here for the moment. You can find the source code and the progress of this project in this GitHub repository: https://github.com/MrDesjardins/CognitiveImagesCollection