The Best Development Browser Resolution

An interesting challenge with web development is the unlimited combination of widths and heights at which people can consume an application. I develop on a wide screen most of the time, and once in a while directly on my MacBook Pro, which has a Retina resolution. In all cases, I have Chrome's developer tools open, which allow changing the viewport of the web application to the exact pixel dimensions desired. The question remains: which resolution is the best?

The answer depends on your application. For websites that have millions of users, it is harder to converge on a single combination, or even a couple of combinations, of width and height. However, most people develop web applications that are used internally or by similar groups of individuals, which usually leads to a handful of resolutions.

At Netflix, I am working on the Open Connect Partner Portal, which thousands of partners consult, as well as many Netflix employees. The application is built with React and Redux, and I capture the width and height of the browser in every action. I also collect a lot more information, one piece of which is whether the user is a Netflix employee or not. That distinction may or may not be relevant, but I wanted to confirm it. The reason is that I get more direct feedback from internal employees than from partners around the world, and I wanted to confirm that they are viewing the same web application.
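As an illustrative sketch of how a Redux application can capture the viewport in every action (the names and shapes here are my invention, not the portal's actual code), a middleware can stamp each dispatched action with the current browser size before forwarding it to a telemetry sink:

```typescript
// Hypothetical sketch: a Redux-style middleware that attaches the browser's
// viewport size to every dispatched action before logging it.
interface Action { type: string; payload?: unknown; }
type Next = (action: Action) => Action;

interface ViewportProvider { width(): number; height(): number; }

function viewportTelemetryMiddleware(
    viewport: ViewportProvider,
    log: (entry: { action: string; width: number; height: number }) => void
) {
    return (next: Next) => (action: Action): Action => {
        // Record the resolution at the moment the action is dispatched.
        log({ action: action.type, width: viewport.width(), height: viewport.height() });
        return next(action);
    };
}
```

In a browser, `viewport` would read `window.innerWidth` and `window.innerHeight`; injecting it as a parameter keeps the sketch testable outside a browser.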

I created a heat map with Kibana, which is a simple user interface for visualizing Elasticsearch data. I created different buckets of resolutions and normalized every employee's data to get a proportional view. The first heat map covers Netflix employees. The application was mostly used around 1600×750.
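The bucketing itself can be as simple as rounding each dimension down to a fixed grid before counting; here is a sketch (the 100px step is my arbitrary choice, not necessarily the one used for the actual heat map):

```typescript
// Hypothetical sketch: group raw viewport sizes into coarse buckets so a
// heat map shows clusters instead of thousands of unique resolutions.
function bucketResolution(width: number, height: number, step: number = 100): string {
    const w = Math.floor(width / step) * step;
    const h = Math.floor(height / step) * step;
    return `${w}x${h}`;
}

// Count how many samples fall into each bucket.
function countBuckets(samples: Array<{ w: number; h: number }>): Map<string, number> {
    const counts = new Map<string, number>();
    for (const s of samples) {
        const key = bucketResolution(s.w, s.h);
        counts.set(key, (counts.get(key) ?? 0) + 1);
    }
    return counts;
}
```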

Heat map of resolutions used by Netflix employees on the Open Connect Partner Portal

The data shows a different picture for partners: the height is smaller by about 250 pixels, and the width falls into two distinct categories, 1300px and 1900px.

Heat map of resolutions used by non-Netflix employees on the Open Connect Partner Portal

Another view is the average width and height per day for all employees and non-employees. The following graph shows the average screen width in purple: the darker line is non-employees, the lighter one employees. Below is the height: the blue line is the employee average and the light green is the non-employee average.

Average width and height of employees vs non-employees

I was curious how biased I was while developing the application. Because I gather a lot of telemetry, I was able to plot my own resolution over time during development, in parallel with how the users were using the application. The following graph shows my average and the users' average for the last 10 weeks.

Me vs Users

The Y-axis is the number of pixels. The two green lines represent the width: the darker line is me, the lighter green is the users. The purple lines are the height: again, the darker color is me and the lighter is the users. In the last three weeks I was aware of my users' resolutions, hence I was consciously developing against the most popular categories discovered.

The best resolution for the Netflix Open Connect Partner Portal is different from the best resolution for your application. While it is crucial to develop for all resolutions, it is realistic to define a few presets to be efficient. The exercise showed me that I have a tendency to use and build the application with more height than my users have. I realized that a few buttons were below the fold, making them harder to find for several of our external users.

Adding Telemetry to your Web Application

I wrote several months ago about having telemetry as a centerpiece of your application. In this article, I'll describe more technically how it works.

First and foremost, the application needs a thin layer to send the telemetry to the backend. I wrote a small internal library that handles basic telemetry needs. For example, if we want to collect a user behavior (clicking, hovering, scrolling, etc.), we use the trackEvent function, which enters a telemetry entry into the system. All telemetry events come with a set of information for free. I collect information about the user's browser, the user's identity, which organization the user belongs to, the navigator width/height, whether the user is in a specific state (e.g. the size of the organization the user belongs to), as well as where the user is in the single-page application, and more. For each event, it is possible to add a custom payload. For example, it is possible to add the time a popup stays open, in milliseconds.

// Track how long the help content was consumed, plus a custom payload
Log.trackEvent(this.props.name, {
    timeConsumedInMs: diffMs,
    HelpAnchorId: this.props.id,
    ...this.props.telemetryPayload
});

This layer has many functions, from trackError to trackPage or trackScenario, which injects itself into Chrome's performance tool for a neat integration showing when code starts and ends (with marks in between).
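Here is a minimal sketch of what such a trackScenario helper could look like (the real internal library differs; what I am confident about is that the standard User Timing API, `performance.mark` and `performance.measure`, is what produces the entries Chrome's Performance tab displays):

```typescript
// Illustrative sketch of a trackScenario helper. It records its own timings
// and, when the User Timing API is available, also emits performance
// marks/measures so Chrome's Performance tab can display them.
class Scenario {
    private markers: Array<{ name: string; at: number }> = [];
    private startTime: number = Date.now();

    constructor(private name: string) {
        if (typeof performance !== "undefined") performance.mark(`${this.name}:start`);
    }

    mark(markerName: string): void {
        this.markers.push({ name: markerName, at: Date.now() - this.startTime });
        if (typeof performance !== "undefined") performance.mark(`${this.name}:${markerName}`);
    }

    end(): { name: string; durationMs: number; markers: Array<{ name: string; at: number }> } {
        if (typeof performance !== "undefined") {
            performance.mark(`${this.name}:end`);
            performance.measure(this.name, `${this.name}:start`, `${this.name}:end`);
        }
        return { name: this.name, durationMs: Date.now() - this.startTime, markers: this.markers };
    }
}

function trackScenario(name: string): Scenario {
    return new Scenario(name);
}
```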

Chrome Performance Tool Integration with Telemetry Track Scenario

Second, we receive a lot of telemetry. Some is useful only at development time; some is worth sending to the backend. The telemetry library lets you specify, with a single option when tracking the information, whether an entry is meant for development only. Chrome's console then shows the information while the application runs. I opted for a mix of colors and indentation to easily distinguish the different telemetry types.
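A sketch of how such routing could work (the flag name and color are made up): telemetry flagged as development-only is printed to the console and dropped, while everything else goes to the send queue.

```typescript
// Hypothetical sketch: route each telemetry entry either to the console
// (development-only) or to the queue destined for the backend.
interface TelemetryEntry { type: string; name: string; devOnly?: boolean; }

function routeTelemetry(
    entry: TelemetryEntry,
    queue: TelemetryEntry[],
    print: (msg: string, style: string) => void
): void {
    if (entry.devOnly) {
        // "%c" styling is what gives each telemetry type its own color in Chrome.
        print(`%c[${entry.type}] ${entry.name}`, "color: teal");
    } else {
        queue.push(entry);
    }
}
```

In the browser, `print` would simply be `console.log`.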

Console with different telemetry collected

The library has many other features, like batching several telemetry events together before sending the information to the backend. Similarly, the library hooks into the browser's beacon feature to send whatever is in its queue when the user leaves the web application.

The information is valuable while developing, but it is very valuable once in production. The collection of events and data gives a real picture of the user. In one case, I realized that people at Netflix use the system quite differently than people outside the company. We collect the “role” of the Netflix employee and whether our partners are from a small or big organization. A mundane example: I realized that the average viewport of our users makes some grids barely visible without scrolling for people from a particular group. In another case, a feature assumed to be important ended up being used only by a very small set of specific users.

There are many ways to analyze all the collected information, and it requires a thorough exercise to reach the right conclusions. However, having the information ends up improving a web application less subjectively. It allows evaluating how far off our assumptions are. I personally enjoy writing down my hypotheses in the specification document. Not only does it allow me to put the right telemetry in place, it also allows me to look back after a few weeks, see whether I was right, and adjust my point of view for any bias or assumption I had. Telemetry improves the user experience, and it improves the developer's sense for creating future user interfaces that do not only look good but are useful.

Dragging a DOM Element Visually Properly

A few weeks ago, I started using a web application (Trello) to manage the priorities of my side projects. The main action is moving cards between columns, and I realized that the drag animation I built when I worked at Microsoft on Visual Studio Team Services does a far better job than Trello's.

My Visual Studio Team Services Animation

You can compare with the following example from Trello and probably see the difference right away.

Trello board

Trello has the idea of showing that something is in motion, which is great. However, the animation I created feels more natural. The concept is as if the mouse cursor were a pin on a piece of paper: moving a piece of paper by a pin naturally tilts the paper differently when moving left to right versus right to left. This is what I developed. Trello tilts the card, but in a constant way, always to the right side.

I use a shadow to create depth of field, showing that the card is above the other elements, which remain still. Trello also uses that technique.

However, I also added a CSS scale effect of about 5%, which simulates picking the card up from the board and moving it somewhere else. As in real life, when you pick something up and move it, the perspective changes. Trello does not change the scaling factor, so the card remains the same size. In my view, the lack of scaling removes the realistic aspect of the movement.

Finally, I changed the cursor icon to the move pointer. The move pointer shows the user the potential directions in which the item can be dragged. In VSTS, it was every direction, hence the four-arrow cursor. Trello does not change the cursor. Once again, a small detail.
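The tilt-follows-direction effect can be expressed as a small function mapping horizontal drag movement to a CSS transform. This is a sketch: the angle, factor and scale values are my guesses at plausible numbers, not the exact VSTS values.

```typescript
// Hypothetical sketch: compute a CSS transform for a dragged card.
// Moving right tilts the card clockwise, moving left counter-clockwise,
// as if pinned under the cursor; the slight scale simulates lifting it.
function dragTransform(deltaX: number, maxTiltDeg: number = 6, scale: number = 1.05): string {
    // Clamp so fast drags do not spin the card.
    const tilt = Math.max(-maxTiltDeg, Math.min(maxTiltDeg, deltaX * 0.2));
    return `rotate(${tilt}deg) scale(${scale})`;
}
```

On each mousemove during a drag, `deltaX` would be the horizontal distance moved since the last frame, and the result would be assigned to `element.style.transform`, alongside a `box-shadow` and `cursor: move`.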

In the end, small details matter. The combination of dynamic tilting, scaling, shadow and cursor modifications creates a smooth and snazzy user interface. You could push it further by slightly blurring the background; however, that last detail was removed for performance reasons, and would make total sense without the speed penalty.

Side Panel instead of Dialog

During my time at Microsoft on Visual Studio Online (renamed Visual Studio Team Services), my team had to build the new dashboard. One feature is the configuration, which consists of selecting widgets, each with its own settings, to be part of the dashboard. For example, you could choose a chart with the number of bugs, or a widget that displays the list of open pull requests, etc. Each widget can be updated or removed once added. The initial idea was to use a modal dialog, and the MVP (minimum viable product) was built using this user interface pattern. I was against it, I am still against it, and I changed it.

My issue with dialogs (and modal dialogs) is that, from experience, I know they never end well. It is even worse on the web. First, a dialog often needs to open other popups, resulting in many layers of dialogs. For example, a configuration dialog may open another dialog to select a user from a list of existing users, or a color picker, etc. Second, the goal of a dialog is mainly to display information within the context of the actual page, such as the dashboard to which you want to add a widget. However, these dialogs are oversized and hide the underlying main page: the modal defeats its own purpose. Third, most dialogs do not handle responsiveness well; changing the browser size, or simply being at a small resolution, fails. Fourth, many web pages that use dialogs do not handle scrolling well.

A better pattern is a side panel that can open and close. This is what I ended up building for Visual Studio Team Services, and it worked very well. Configuring or adding a widget was simple and allowed the user to drag and drop the widget to the proper location. On the right side, you select the widget you desire, configure it, and position it, all while keeping the actual dashboard visible, so the user always has in focus what is already in place.
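One reason the panel is simpler to reason about shows up directly in the state model. As a contrived sketch (the type and function names are mine): dialogs can stack, so state must track an ordered list of open layers, while a side panel is a single slot that opens, swaps, or closes.

```typescript
// Contrived sketch: dialogs stack, so state must track an ordered list of
// open layers; a side panel is a single slot that opens, swaps, or closes.
interface DialogState { openDialogs: string[]; } // can grow: dialog over dialog...
interface PanelState { openPanel: string | null; } // exactly zero or one panel

function openPanel(_state: PanelState, panel: string): PanelState {
    // Opening a panel replaces whatever was there: no layer-ordering bugs.
    return { openPanel: panel };
}

function closePanel(_state: PanelState): PanelState {
    return { openPanel: null };
}
```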

Recently, in my work at Netflix, I had to migrate the creation of users from an older system to the new one. Originally, the design used a dialog. The problem is that you cannot copy information from the existing list, nor see whether a user was already created, and it was not mobile friendly (small resolutions). I opted for a side panel. Here are a few of the possible interactions.

Partner Portal User Management Side Panel

Overall, the biggest gain is the reduction of layer-ordering bugs. From Jira to Twitter and other systems that use dialogs, there are always issues. It can be an error message that should be displayed on the main page but ends up half on top of the dialog, or a dialog that opens a dialog that reopens the first dialog, creating a potentially never-ending cycle. Dialogs increase the complexity of the user interface, but also of the state, which can grow exponentially. The simple side-panel pattern reduces the complexity and increases the user's visibility into the main task by not covering information that is valuable to it.

Telemetry as a Centerpiece of Your Software

Since my arrival at Netflix, I have been spending all my time on the new Partner Portal of Netflix Open Connect. The website is private, so do not worry if you cannot find a way to access its content. I built the new portal with a few key architectural concepts as the foundation, and one of them is telemetry. In this article, I will explain what it consists of and why it plays a crucial role in the maintainability of the system, as well as in how to iterate smartly.

Telemetry is about gathering insight into your system. The most basic telemetry is a simple log that adds an entry to a system when an error occurs. However, a good telemetry strategy reaches way beyond capturing faulty operations. Telemetry is about collecting the behaviors of the users, the behaviors of the system, misbehaviors of the correctly programmed path, and performance, by defining scenarios. The goal of investing time in a telemetry system is to raise awareness of what is going on on the client machine, as if you were standing behind the user's back. Once the telemetry system is in place, you must be able to know what the user did. You can think of telemetry as having someone drop breadcrumbs everywhere.

A majority of systems collect errors and unhandled exceptions. Logging errors is crucial for clarifying which ones occur so they can be fixed. However, without a good telemetry system, it can be challenging to know how to reproduce them. Recording which pages the user visited, with an accurate timestamp, with which query string, on which browser, and from which link is important. If you are using a framework like React and Redux, knowing which action was called, which middleware executed code and fetched data, as well as the timing of each of these steps, is necessary. Once the data is in your system, you can extract different views. You can extract all errors by time and divide them by category, and you can see error trends go up and down when releasing a new piece of code.

Handling errors is one perspective, but knowing how long a user waited to fetch data is just as important. Knowing the key percentiles (5th, 25th, 50th, 75th, 95th, 99th) of your scenarios indicates how users perceive your software. Decisions about which parts need improvement can be made with certitude because they are backed by real data from the users who consume your system. It is easier to justify engineering time to improve code that hinders the customer's experience when you have hard data. Collecting scenarios is a source of feature-popularity data as well: the aggregated count of a specific scenario by user can indicate whether a feature is worth keeping in the system or should be promoted to be easier to discover. Conclusions drawn from telemetry values are subjective most of the time, but less opinionated than a raw gut feeling. Always keep in mind that a value may hide an undiscovered reality. For example, a feature may be popular even though users hate using it: they just do not have any alternative.

There are many kinds of telemetry, and when I unfolded my plan to collect them, I created a very thin TypeScript (client-side) library with four access points. The first one is named trackError. Its specialty is tracking errors and exceptions. It is as simple as having an error name that allows easily grouping the errors (possible with handled errors caught in try-catch blocks) and contains the stack trace. The second one is trackScenario, which collects the time from a start point to an end point. This function returns a Scenario object which can be ended but also has the capability of adding markers. Each marker lives within the scenario and allows fine-grained sub-steps. The goal is to easily identify what inside a scenario causes slowness. The third access point is trackEvent, which takes an event name and a second parameter containing an unstructured object. It collects information about a user's behavior. For example, when a user sorts a list, there is a “sortGrid” event with a data object that indicates which grid, the direction of the sort, which field is being sorted, etc. With the event data, we can generate many reports of how users use each grid, or more generically each field, etc. Finally, it is possible to trackTrace, which allows specifying information about the system at several trace levels (error, warning, info, verbose). The library is thin, simple to use, and has basic functionality like always sending the Git hash of the code, always sending the navigation information (browser info), including the user's unique identifier, etc. It does not do much more. In fact, one more thing: it batches the telemetry and sends it periodically to avoid hammering the backend. The backend is a simple REST API that takes a collection of telemetry messages and stores them in Elasticsearch.
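The four access points described above could be summarized in an interface like this. It is a reconstruction from the description, not the actual internal code, and the payload shapes are illustrative; the small in-memory implementation only exists to make the shape concrete.

```typescript
// Sketch reconstructed from the description: the four entry points of the
// telemetry layer.
type TraceLevel = "error" | "warning" | "info" | "verbose";

interface ScenarioHandle {
    addMarker(name: string): void;
    end(): void;
}

interface TelemetryLog {
    trackError(errorName: string, error: Error): void;              // grouped by name, keeps the stack
    trackScenario(name: string): ScenarioHandle;                    // timed start-to-end with markers
    trackEvent(name: string, data?: Record<string, unknown>): void; // user behavior
    trackTrace(level: TraceLevel, message: string): void;           // leveled system traces
}

// Minimal in-memory implementation, just to make the shape concrete.
class InMemoryLog implements TelemetryLog {
    entries: string[] = [];
    trackError(errorName: string, _error: Error): void { this.entries.push(`error:${errorName}`); }
    trackScenario(name: string): ScenarioHandle {
        this.entries.push(`scenario:${name}:start`);
        return {
            addMarker: (m) => this.entries.push(`scenario:${name}:${m}`),
            end: () => this.entries.push(`scenario:${name}:end`),
        };
    }
    trackEvent(name: string, _data?: Record<string, unknown>): void { this.entries.push(`event:${name}`); }
    trackTrace(level: TraceLevel, message: string): void { this.entries.push(`${level}:${message}`); }
}
```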

A key aspect, as with many software architecture and process decisions, is to start right from the beginning. There are hundreds of usages of telemetry in the system at the moment, and it was not a burden to add them. The reason is that they were added continually during the creation of the website. Similar to writing unit tests, it is not a chore if you do not have to write them all at once. While coding the features, I had some reluctance about a few decisions, and I also had some ideas that were not unanimous.

The aftereffect of having all this data about the users soothes many hot topics by providing a reality check on how the users really use the system. Even when performing a thorough user session to ask how they use the system, there is nothing like real data. For example, I was able to conclude that some users try to sort an empty grid of data. While this might not be the discovery of the century, I believe it is a great example of a behavior that no user would have raised. Another benefit is monitoring errors and exceptions and fixing them before users report them. In the last month, I have fixed many (minor) errors, and fewer than 15% were raised or identified by a user. When an error occurs, there is no need to consult the user, which is often hard since they can be remote, around the world. My daily routine is to sort all errors by count per day and see which ones are rising. I take the top ones, search for a user who had the issue, and look at the user's breadcrumbs to see how to reproduce it locally on my development machine. I fix the bug and push it to production. The next day, I look back at the telemetry to see whether the count is dropping. A proactive bug-fixing approach is a great sensation: you feel far less like you are putting out fires, which lets you fix the issue properly. Finally, with the telemetry system in place, when my day-to-day job gets tiresome or I hit a slump in productivity, I take the opportunity to query the telemetry data and break it into several dimensions, with the goal of shedding light on how the system is really used and how it can be improved to provide the best user experience possible.
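The daily triage just described (sort all errors by count per day, watch for risers) boils down to a simple aggregation; a sketch with made-up record shapes:

```typescript
// Hypothetical sketch: count error occurrences by name for a given day and
// sort descending, the first step of the "which error is rising?" routine.
interface ErrorRecord { name: string; day: string; }

function topErrors(records: ErrorRecord[], day: string): Array<[string, number]> {
    const counts = new Map<string, number>();
    for (const r of records) {
        if (r.day !== day) continue;
        counts.set(r.name, (counts.get(r.name) ?? 0) + 1);
    }
    return [...counts.entries()].sort((a, b) => b[1] - a[1]);
}
```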

This article was not technical. I will follow up in a few days with more detail about how to implement telemetry with TypeScript in a React and Redux system.

How to Write an IF Statement that Determines a Value

This is a pretty basic case if you have done some programming. How to write an IF statement is a language-agnostic problem when it comes to assigning one or more variables. There are two patterns that I often see. The first one assigns the variable or property directly.

if (/* whatever */) {
    this.icon = "icon1";
}
else {
    this.icon = "icon2";
}

The second approach sets the value into a temporary, scoped variable and, at the end of the IF, assigns the value to the field/property.

let iconType = "";
if (/* whatever */) {
    iconType = "icon1";
}
else {
    iconType = "icon2";
}
this.icon = iconType;

In these two examples, instead of assigning to this.icon we could be calling this.featureMethod(icon). As above, in the first approach you would see the method call twice, while in the second approach you would assign the value to a variable and call the method once at the end. The first approach is appealing because you do not have to assign a temporary variable. However, it duplicates code, which doesn't seem to bother most people. The real problem is code maintenance. If the method that needs to be invoked changes its signature, you have two places to change instead of one. If the IF grows more conditions (else if), you will have to call the method (or assign the field/property) a few more times instead of keeping a single call. These arguments lean in favor of the second approach, and there is more. The second approach is cleaner in terms of figuring out what is going on. The first approach makes a decision and executes at the same time; if you look at the method, you cannot get a clear view of what is happening. From top to bottom you have multiple sections that do a condition check plus an action. Thus, the second approach is cleaner. We could even break the code into two distinct parts, arrange and act, and refactor the method into two sub-methods: one that determines the values to be used and a second that sets values or calls methods.

I bring up this point because the first approach is often chosen with the argument that it is the same as the second one. The real justification is that the first one takes two fewer lines of code, hence is faster to type, which makes it an easy default choice. If you are using the first approach, I suggest you try the second approach a few times. You will gradually see the benefits as you work with and modify that code in the future.

Here is an example with three temporary variables:

function getMyLink(threshold: number) {
    // Default
    let url: string = "http://aboveHundred.com";
    let className: string = "default";
    let padding: number = 0;
    // Logics
    if (threshold <= 0) {
        url = "http://underOrZero.com";
        className = "dark-theme";
        padding = 100;
    }
    else if (threshold > 0 && threshold < 100) {
        url = "http://betweenZeroAndHundred.com";
        className = "light-theme";
        padding = 200;
    }
    // Assignments
    this.url = url;
    this.className = className;
    this.padding = padding;
}

If the next iteration of changes requires assigning one of the values to a different variable, there is a single place to change. If instead of assigning we need to return something, we also have a single place to change.

function getMyLink(threshold: number) {
    // Default
    let url: string = "http://aboveHundred.com";
    let className: string = "default";
    let padding: number = 0;
    // Logics
    if (threshold <= 0) {
        url = "http://underOrZero.com";
        className = "dark-theme";
        padding = 100;
    }
    else if (threshold > 0 && threshold < 100) {
        url = "http://betweenZeroAndHundred.com";
        className = "light-theme";
        padding = 200;
    }
    // Now we return
    return `<a href="${url}" class="${className}" style="padding:${padding}">Click Here</a>`;
}

In terms of flexibility, you may have to define these variables, but the code is structured to be resistant to future changes. Also, when a function requires many assignments, the method will often be long, which makes it even harder to get an idea of what is going on if assignments are scattered all over the function. I strongly believe that, while assigning a long list of variables can be cumbersome, assigning them directly in several places reduces readability and introduces more errors (like forgetting one assignment in a specific case, which keeps an old value).

There are pros and cons to both, but in my opinion the one I illustrated has more pros than cons.

To recap the advantages of determining values first and then assigning or calling:

  • Removes code duplication
  • Easier refactoring, since only one call site changes
  • Clearer readability of what happens in a method
  • Faster refactoring into smaller methods

The Bad Habit of Hiding Features Behind Context Menus and Double Clicks

Once in a while, I feel that some subjects return to the table wherever I go. One of those subjects is where we should put a button to launch a specific action. While this is a totally valid question, the recurring problem is that at some point in the conversation people focus on the easiest way to do it instead of the best way. Of course, the best way is always subjective, but most people can agree on the wrong way.

Let’s start with some premises. No one can argue that a hidden feature is a good thing. First of all, the name says it: it’s hidden. Users will not find it easily and thus will not use a feature that cost the company money to create. It can be even more drastic than that: people may leave your product because they cannot find how to do a specific action; the software slows the user down and frustrates them. Worse, when evaluating a product, this can be a turn-off, because the user will not even notice that your product has the feature, compared to a competitor that puts the feature right in front of the user. Second, a hidden feature makes occasional users forget about it. Even if it is written in the documentation, the user will forget it and not use it. On the other hand, if the feature is clearly visible in your user interface, there is a better chance of re-learning to use it, because it sits in a natural, visual place.

This leads me to two patterns in web design that are wrong. The first is the right click opening a context menu; the second is the double-click event. Right clicking is something developers hijacked in the late 90s to block people from looking at the source code (HTML, JavaScript) of a website. Some sites displayed an alert window saying that the source was not available. That trend did not last very long, since browsers incorporated more and more developer tools and workarounds were possible. It was also very annoying because no default right-click menu was present; users couldn't right-click and save an image, for example. It has long been a well-known rule not to interfere with the browser's context menu. Users expect it to be consistent across all browsers and all pages. Right now, while reading this text, you can save the HTML, save images, copy text, reload the page, etc. These actions are also available on any website. This is what users expect. Would you be surprised if I told you that the way to start commenting on this blog was to right-click this article and select “Comments”? Well, for me yes; for most people too. This is why, below this article, there is a form with a submit button to send comments. It is clear, obvious and not confusing. However, some people would argue that it takes up space for a feature that is not used much and should therefore be in the context menu. This kind of argument recurs everywhere in the industry, and it is wrong in most cases. The major exception is if you have an online text editor and want specific actions on selected text, for example. But even there, a toolbar should let you do the action. The problem with right clicking is not only that it removes the default right-click actions; it is that when you open a web page, you cannot tell where you can right click to perform an action. Can I right click the article? A paragraph? Specific words? Just the menu? It is a game of trial and error with more losers than winners.

Double clicking on websites also comes from Windows application paradigms, where you can double click a folder to open it, and from very popular software like Outlook. However, double clicking on the web was not well supported until the last few years. While some limited use cases may be okay with double clicking, most scenarios are not. Double clicking shares a problem with the context menu: it is hidden. On this website, can you tell me which HTML element you can double click? Of course, you can double click any word to select it, as expected in any reading or writing software, but other than that? It is impossible to know. Can you double click the “Build Status” to get the full report? Can you double click a user name on Facebook to add that person to your friend list? No and no. In fact, double clicking is even worse than the context menu because, during the trial and error, the action executes immediately; at least with the context menu, you can see the hidden feature before triggering it. It is also worse because double clicking depends on how fast the user clicks. It is not for nothing that you can configure the double-click rate in every PC's settings, yet this is tricky for a user. Even a young software engineer in good health can sometimes miss the right rate and end up clicking twice. Double clicking is sneaky: if you single click twice quickly, you trigger the double-click event; if you single click twice slowly, you trigger the single-click event twice.
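That timing hazard can be made concrete with a small sketch: given the timestamps of two consecutive clicks and a threshold, decide whether they count as one double click or two single clicks. The 500ms default mirrors a typical OS setting, but is my assumption, not a standard.

```typescript
// Illustrative sketch: classify two consecutive clicks. Under the threshold
// they merge into one double click; over it, they are two single clicks.
function classifyClicks(
    firstMs: number,
    secondMs: number,
    thresholdMs: number = 500
): "double-click" | "two-single-clicks" {
    return (secondMs - firstMs) <= thresholdMs ? "double-click" : "two-single-clicks";
}
```

The user has no way to know which side of the threshold they landed on, which is exactly why the pattern is hostile.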

The solution to both of these hidden features is to think about good design. Most of the time, you can create buttons to perform actions. If you have a group of actions, you can create a toolbar, or a button with a dropdown of actions. If you need something big in a very tight space, let a click expand that space to allow the user to do more, and then contract it.