Working From Home Desk

The Reason

Working from home can be challenging for many reasons. My situation is that I have a four-year-old and a one-year-old child. I like working from home, but not being confined alone in a small room. With Covid-19, I have been working from home, and my old Ikea desk was flawed. I had to barricade my surroundings with cardboard boxes to prevent my little crawler from pulling my wires. Also, the older one was never sure when it was fine to come see me.

My Ikea desk before the creation of a new desk

I decided to plan out a prototype of a desk in Sketchup that needed to fulfill a few requirements:

  • Keeping my wires safe and out of sight
  • Having some privacy when in meetings
  • Having a way to work both seated and standing up
  • Being able to conceal my work area in a clean way
  • Cooling down the laptop in some way

The Plan

The idea was to create a desk slightly larger than the Ikea desk I had, hence getting to 4 feet wide, but shallow, at less than 2 feet deep. The front portion would hold my laptop, with an easy way to swap between my work and home laptops, while the back part would hide the second monitor. The second monitor would rise and lower when needed.

Design in Sketchup after about 30 hours

The desk is built for my size. I can sit or stand without having to change the height of the desk. There is a panel in the shape of the number “7” (right side of the previous image) that can sit on top of the desk to completely close it. The same part can be moved to the floor to act as a privacy wall (pictures later). The monitor is about 27 inches wide and is hooked on a VESA stand. The stand rolls on two industrial drawer slides, and a 26-inch actuator, driven by a rocker switch, can move the monitor into any position along its length. This way, it is possible to hide the monitor, raise it to a comfortable position when seated, or raise it high when standing up.

Open Area Challenge

I am working in my living room. I could move my desk into the master bedroom, but I would be alone for many hours every day. Also, during the evening, when I work on my personal projects, I would be far from my wife. The idea of establishing my home office in the open area added some challenges.

The Desk Closed Up

There is a 30-inch area right next to a window in the living room. The spot is strategic since it is hidden at first glance when visitors come into the house, but it also gives a great exterior view. During the evening, I can see my wife and talk to her while she watches television. During the day, I can see the kitchen, the dining room, and some life moving around the house.

Home Office in the Corner of the Living Area

The space is very open and designed to be light in terms of objects. The desk needed to follow that idea; hence, when closed, it does not look busy. While not perfect, the simple rectangle hides papers, the keyboard, the mouse, and the monitors.

Sitting Position

Most of the time, I will be seated. I needed a desk at the right height for me to sit with my feet flat on the ground. I wanted surroundings that hide most of my stuff while working and also reduce the amount of noise. Even in the corner, there was still a way for my baby to crawl under the desk and pull the wires.

Desk in Seated Position

The idea was to have the top of the desk removable to use as a privacy wall. However, to be easy to handle, it could not be too big. I also wanted to work on my feet, hence needed a way to separate the back area, where the monitor resides, from the front, where the working area sits. The following angle shows that the back area’s top board is separated from the front. The monitor is barely noticeable. The back area has two hinges that let the monitor rise.

Desk Environment

The left area of the desk is the main console. The laptop shelf is at the right height for when I am sitting. The desk has enough space on the right to write on paper and enough space to have a glass of water in front of me.

My Main View

The main console has a rocker switch allowing the monitor to move up and down. In the previous picture, the monitor is in the middle of its full extension. The microphone seems to be in the way, but it does not block the view at all.

Main Console

The laptop area has the Scarlett Solo, which manages the audio in and out of my laptop. In the middle is an AC Infinity fan controller. It is connected to two fans and a probe that detects the temperature of the laptop; it adjusts the fan speed to reduce the heat. Lastly, on the right side is the rocker switch that moves the monitor.

The Two Fans and the Temperature Probe

For the curious, the probe reads about 102F with the fan system off while running a couple of Docker containers, 4 VSCode instances, and 30 Chrome tabs. With the fans on, it reaches about 86F, a 16F reduction in heat.

Standing Up Position

The standing-up position is a matter of placing the “7” shape on the track. The laptop can be moved up manually, and the monitor must be raised with the switch prior to closing the desk. While it might not seem logical to hide the switch in the desk, it is the safest way to keep my children from playing with the motor.

Standing Position

This is my first wooden desk and it does not close perfectly, but it is good enough. There is no room for my feet, so in reality I need to have the laptop a little closer to the edge. Nonetheless, it is possible to work standing up. In the picture, you can see a black wire. It is the only wire leaving the desk; it passes through a 1/4-inch hole and is the cord of a power bar.

Privacy Wall – The Box Office

The panel in the shape of a “7” can be turned around to close the office, creating a box.

The Top used as Privacy Wall

In this configuration, the four-year-old knows that she cannot come talk to me. The one-year-old does not see me or any wires. I can have video meetings without a curious face getting in front of the camera, and people can move around without fear of disturbing a recording.

Top View

Satisfaction

For a first creation, with many firsts like staining, cutting plywood, installing an actuator, making pocket holes, etc., I am satisfied. The desk is exactly as I planned.

After using the desk for a couple of days, I can already see some possible improvements. Some concern aesthetics: the stain is not uniform, and the bottom of the privacy wall was not top-coated properly, so the wood is starting to show through. Some concern placement, like the headset, which I had to re-cut afterward because I kept touching it with my left knee. Some are related to fragility: if the back door is not open and the monitor goes up, the motor pushes upward and, instead of opening the door, loosens the screws. Another slight issue is the precision of the cuts; they are not perfectly square, and closing the desk shows the imperfections. The wires are hidden, but more fastening is needed.

Like any process, experience comes into play. I have learned a lot by actually doing, I am glad about my work, and I will be able to use that knowledge in the next project. Now, I am ready to work from home in an environment I like, on a desk I built with my own hands from the design I had in my mind.

Books of 2019

For a few years now (2018, 2017), I have been drawing up a list of the books I read. Last year, I realized that a brief summary along with the book titles would be beneficial for me and for the reader.

Machine Learning

I started the year with a few books and readings on machine learning. I took the class “Machine Learning” at Georgia Tech, one of the required courses to get the machine learning specialization in the master’s degree. The book in question was “Machine Learning” by Tom M. Mitchell. This book is tough to read: it goes deep into the subject and is not written for a beginner in the field. The cost of the book is very high (above $100), but there are second-hand copies for about $20, which is the option I decided to pursue. The book is a must for the master’s degree because of the low quality of explanations in many courses. It is unfortunate but true.

Get it Done

In parallel, I started the audiobook “Get it Done” by Sam Bennett. It is a medium-sized listen of 6 hours and 21 minutes. The book begins with the perspective that everyone is an artist and that fears should be transformed into curiosity. There is the notion of performing something every day for 15 minutes. Procrastination is seen as having three folds. The first is that you do not care, or that the goal is a shadow goal, meaning you feel no joy executing it. The second is that it might not be the right time to accomplish the task. Finally, procrastination might be a sign that the task builds up fears. I could relate to the last point. The solution is to acknowledge the fear and to see the task as a fun experiment where failing is okay. In the end, this attitude lets you go through, and most of the time success will be there. There is an allusion to the fact that small details often appear more prominent than they are; to illustrate the idea, looking back at something that was stressful might surprise you by how much less significant the cause of the fear was in reality.

The book is a series of small tips to change your perspective when motivation becomes the culprit keeping you from moving forward. The propositions range from basic ideas, like breaking a project down into small chunks, to taking the time to ask “does this matter?”. There is the notion of “pure preference”, which is to have only one target; it might sound crazy, but you keep it on the list as something you really would like to do. Comparing ideas as if they were in competition, one against one, and assigning a point to the winner until all of them are compared is another suggestion. Other suggestions are funnier at first. For example, renaming a task into something more amusing makes it less boring when it is time to execute it.

Here is a list of suggestions from the book:

  • Create a “Could do” instead of a “to do” list
  • The way you present or talk about something might not be the right one
  • It is important to develop a daily habit
  • Break down ideas and organize them (shuffle)
  • We have a left and a right brain; we can occupy the part of the brain that wants to do something with a repetitive movement (e.g., walking, taking a shower) so that the right part of the brain can use its imagination to daydream.
  • Keep track of progress; write down all wins and learnings (positive) because we tend to have a bias toward the negative
  • Aim for a grade of ‘C’, not ‘A’; your ‘C’ is probably already more than other people’s definition of a ‘C’.
  • Ship early; once the ship sails, it is easier to stop procrastinating and adding details. The example in the book is around “WordPress version 2”. It reduces the stress of delivering something perfect.
  • Create a list of all that must be done, with the cost in money, the time, and the return on investment, and determine what is really worth it.
  • Instead of over-complicating something, ask around to find someone who already did it.
  • Write a list of all the positive words people give you. Refer to it.
  • Always have deadlines. People tend to execute at the last minute; being on time or early only moves the deadline closer.
  • Write down 30 ways to get what you want. They might not all be possible, but probably one of the items is good.
  • Delegate what can be delegated
  • Set clear boundaries between phone, SMS, email, work, and relaxing time.
  • Un-clutter your life. This section of the book is significant and takes more time than I would expect. There are a few good tricks, but it was not the part that touched me much. One interesting trick was to pretend that someone needs the object: would you be willing to help that person? If you naturally would, without hesitation, then it might be a clue that you can let the object go.
  • Adding the word “sometimes” in front of negative feedback can help to digest the idea.

Finally, the book is good, but the last part, with its tricks about de-cluttering, was not as interesting to me. However, the best advice I read was the following: “There is no shortage of success; there is no need to be jealous or to envy others.”

Influence – The Psychology of Persuasion

This book is from 2016, and its length is above the average of what I like to read, at 10 hours of listening, on the topic of why people say “yes” more easily. The book starts with the notion of making things simple, not simpler.

The book is packed with psychology studies, and right in the first half hour I was hooked. The first piece of advice is to stay simple, as much as possible. However, it does not mean aiming for simpler: you can do something very complex, but you need to present it in a simple manner.

Giving a reason helps tremendously when asking for something. The first chapter talks about certain words that trigger certain behaviors. For example, the word “because” is very effective when requesting something.

The following notion is about “perceptual contrast”, where our decisions are influenced by what came before, even if that former option was not actually accessible. For example, people rating others’ attractiveness rated them lower if they had just been watching a movie featuring pretty people. Or, when weighing two different objects, something light can feel heavy depending on the first object touched. It follows that, when selling, we should sell the main element first, which makes all the accessories sold afterwards look very small. It was proven that people accept these optional add-ons with higher probability in that order than if they were presented first. It also works when announcing something: starting with big news makes all the subsequent information easier to swallow.

The principle of reciprocity is natural for every culture, nation, and time in history: when you do something nice for someone, that person will desire to do something nice in return. One tactic is to give something small (e.g., a flower) and expect the person to later give something of bigger value or importance.

The principle of concession: if someone reduces what they want in a negotiation, you will be prone to reduce as well. This is why people start with a higher bid and go down to what they really want.

The principle of consistency: once someone is involved, they suddenly become a strong defender of the idea, even if a few seconds before taking the decision the person was on the fence. For example, if someone bets on a horse before a race, the person might hesitate, but once the bet is settled, the person becomes a fierce advocate.

We become what we do. E.g., POWs who were treated as collaborators started to behave like collaborators after a while.

People comply more easily once they have one foot in the door. E.g., once someone accepts a small sign placed on their property, they have less resistance to changing the sign into a big one later on.

The strategy of commitment: once you commit to someone, you do not want to break the promise, hence you would do more than initially planned to fulfill the engagement. E.g., advertising a specific toy before Christmas with the plan of having kids ask for it. The manufacturer deliberately under-produces the toy so that parents buy another kind of toy. Then, after Christmas, the advertisement runs again with abundant supply, causing the kid to pressure the parent to fulfill the initial promise, hence getting the toy (in addition to the one from Christmas).

Writing something down increases commitment. Public commitment is stronger than having the decision only in your head. Humans want to look consistent.

The harder it is to join a group, the more the feeling of belonging increases, as well as the solidarity among the people of the group. It ties back to the increase of commitment. E.g., universities where hazing activities are performed to join a student fraternity.

It is more efficient to have a very small reward for a big commitment. Motivation must come from inside the person to remain strong over time. The person must feel a personal responsibility and not be motivated by a reward or a consequence to behave. E.g., the kids who were told that a teacher would be angry if they played with a particular toy: it worked until that person was substituted. More than 70% of the group of children who received the consequence speech played with the toy, while less than 20% of the other group, who were given an explanation without reward or consequences, did.

Social pressure has a big impact in motivating action. The book presents the example of a kid who does not want to swim until he plays with other kids who can do it, and the example of how sects operate. Social proof means that, in doubt, people follow the herd. It is the concept of pluralistic ignorance: a group of strangers is less likely to help because they are unsure of how serious the emergency is, and people in the group do not want to look bad. The trick is to remove uncertainty; when people are uncertain, they copy others.

The halo effect is when one characteristic overshadows all the other traits of someone. A physical trait (how beautiful someone is) is a common one, but being kind is another. Similarity produces liking. A compliment works well, even if fake, even if known to be fake. People like people who have something familiar about them; they will believe a person more if they think alike or have a similar past.

The book briefly discusses the influence of seeing something many times, which alters our judgment, e.g., a picture flashing in sight. There is also guilt by association: e.g., a pretty girl next to a car will transfer the first impression from the pretty girl to the car.

Performing something kind creates the feeling that the person is on the same side and hence will be willing to comply more easily. E.g., if a waiter at a restaurant helps you save some money on an appetizer, his tip might be bigger at the end, or a future, more expensive suggestion has a higher probability of being accepted.

The fear of losing something is bigger than the motivation to gain. However, hope is a bigger motivator than fear.

Scarcity: something that appears rare feels worth more to people. It works with time as well.

The Hundred-Page Machine Learning Book

Another book I read is “The Hundred-Page Machine Learning Book” by Andriy Burkov. I read this paperback as a complement to my first course. The book is short and does not cover every topic as deeply as I was expecting (for example, KNN is one page long), but the first chapters were interesting for me, mostly the notation summary, because some of the topics were far back in my memory.

Sending Telemetry from GraphQL under NodeJS without Spamming

I am running a GraphQL server hosted with NodeJS. Under the hood, it is an Express server with the Apollo middleware. I am collecting different pieces of information along with errors. The way it works is that I have an ElasticSearch server fronted by a simple REST API facade. The REST endpoint accepts a single telemetry payload or a collection of them; the latter is recommended. When I use the REST facade from the web, I collect all the telemetry calls and batch the request every 5 seconds or when the browser sends a beacon (when leaving the page). It reduces the load on the server by limiting the number of HTTP requests. For example, if there are 24 different telemetry entries within a few seconds, it performs a single HTTP request with 24 entries.

Telemetry Information flow
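For the web side described above, here is a minimal sketch of the 5-second/beacon batching. The endpoint URL, payload shape, and names are assumptions for illustration, not the production code.

// Minimal sketch of the browser-side batching (illustrative names and payload shape).
interface TelemetryPayload {
    name: string;
    value: number;
}

const queue: TelemetryPayload[] = [];
const TELEMETRY_URL = "/api/telemetry"; // Assumed REST facade endpoint

export function collect(entry: TelemetryPayload): void {
    queue.push(entry); // Entries accumulate until the next flush
}

function flush(): void {
    if (queue.length === 0) {
        return;
    }
    const batch = queue.splice(0, queue.length); // Copy and empty the queue
    // A single HTTP request carries the whole batch
    void fetch(TELEMETRY_URL, {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify(batch)
    });
}

// Flush every 5 seconds
setInterval(flush, 5000);

// Flush with a beacon when the user leaves the page
window.addEventListener("pagehide", () => {
    if (queue.length > 0) {
        navigator.sendBeacon(TELEMETRY_URL, JSON.stringify(queue));
        queue.length = 0;
    }
});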

Under NodeJS, I could do something similar with a timer, but while reading how the DataLoader library handles batching, I thought I could code a similar pattern. Instead of relying on time, I could batch all the telemetry from a single NodeJS event loop iteration. In the NodeJS world, this is called a “tick”. There are two ways to accomplish the batching, and I leaned toward setImmediate.

NodeJS Event Loop

The idea is that NodeJS runs in an infinite loop. It is possible to tell the system to prepare an execution for later, on the next iteration of the loop, with setImmediate, which executes once the “poll phase” is completed. setImmediate is different from setTimeout because it does not rely on a time threshold. Often, libraries use process.nextTick, which processes the task right after the current operation completes, before the event loop continues. I avoided process.nextTick because in some situations it can cause an infinite loop with recursion. setImmediate is enough to delay the execution. In fact, after using it for more than two weeks, all telemetry collected within a single GraphQL request is batched together, which is perfect in my case.
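As a quick standalone illustration of that difference (not part of the telemetry code), the three callbacks below print in the order: synchronous code, process.nextTick, then setImmediate.

// Ordering demo: process.nextTick runs right after the current operation,
// setImmediate runs on the next loop iteration (check phase, after poll).
setImmediate(() => console.log("3) setImmediate: next event loop iteration"));
process.nextTick(() => console.log("2) nextTick: before the event loop continues"));
console.log("1) synchronous code runs first");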

The code in the NodeJS server is short. There is a class with a boolean field that indicates whether we have batched information. By default, the value is false until we invoke the function that sends the telemetry for the first time. When the flag is true, we keep calling the code that adds each telemetry entry into an array, but we do not call the function performing the HTTP request to the API; we wait until the setImmediate callback executes. When that callback executes and returns with a successful HTTP code, we copy the content of the array, flush the data from the list of telemetry, send the information, and turn the flag back to false, ready for the next round of batching. While the code is sending the telemetries, other telemetries can be collected; the data is added to the array to be sent. In case of failure, the data is added back to the next batch.

public send(data: TelemetryPayload): void {
    const dataToSend = { ...data };
    this.listDataToSend.push(dataToSend);
    if (!this.isTransmittingQueuedPayload) {
        this.isTransmittingQueuedPayload = true;
        setImmediate(() => {
            // Flush routine (illustrative name, sketched below): performs the HTTP request
            // with all of this.listDataToSend and resets the flag
            this.transmitQueuedPayloads();
        });
    }
}
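The snippet above delegates the actual flush to a routine named transmitQueuedPayloads here; the name and the sendHttpBatch helper are illustrative, not the exact production code. A minimal sketch following the description above:

private async transmitQueuedPayloads(): Promise<void> {
    // Copy the queued telemetry and flush the list so new entries keep accumulating separately
    const batch = this.listDataToSend.splice(0, this.listDataToSend.length);
    try {
        await this.sendHttpBatch(batch); // Single HTTP request to the REST facade with all entries
    } catch (error) {
        // On failure, put the entries back so they are part of the next batch
        this.listDataToSend.push(...batch);
    } finally {
        // Ready for the next round of batching
        this.isTransmittingQueuedPayload = false;
    }
}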

Overall, the code is pretty clean and under one hundred lines. It drastically reduces the number of HTTP requests while remaining easy to read once the setImmediate detail is clarified.

My Other GraphQL Blog Posts

Georgia Tech Online Master's Degree Review after 4 Semesters

I joined the Georgia Tech Master of Science in Computer Science at the end of 2017 with the idea of expanding my knowledge in an area that interests me: machine learning. At the time I am writing this blog entry, I have completed four semesters with grades of “A”, and six more remain to complete the program. At the dawn of the Summer 2019 semester, I decided to drop the course I had started two weeks earlier, named Reinforcement Learning. The major reason is the arrival of my second child. Nonetheless, I wanted to give some insight into the online program for those who might be interested. Complementary to my opinion, you can find online the website OMSCentral, where students write reviews of each class. I decided to write, non-anonymously, about my genuine dissatisfaction with the program.

Your Past Drives Your Present

My journey started smoothly: my first two semesters were the database systems concepts and design class and the network security class. I was familiar with most of the topics and took these two classes to refresh my knowledge, to get accustomed to school after a 10-year hiatus, and because Georgia Tech requires two courses with a grade of “B” or higher in the first year.

Why is your past driving your experience at Georgia Tech? Let’s start with my context. I have been working with databases for more than 15 years and with the web for 20 years. While a lot of students were expressing their difficulties with assignments that needed database or web knowledge to accomplish the primary goal, I was fine. However, in hindsight, it should have rung a bell. To be able to spend under 15 hours per week, you need to understand at least half of the topics, or master the assignments’ technologies, before the semester even begins. For example, in the network security class, one assignment used Wireshark, the next one a MySQL/PHP database/website, and another one a Chrome extension, all in less than two months. I am not even talking about the nitty-gritty details of having to know different languages (Java, Python, etc.), libraries (Pandas, Numpy, Matplotlib), and environments (virtual machines, Ubuntu). In the database class, a considerable amount of the final assignment was directed toward building a PHP website, which is not needed to evaluate database knowledge. It was not a predicament for me, but you can foresee that it could be a burden on top of a lot of new theory. The serendipitous knowledge you happen to have gathered plays a significant role in the grade. However, this is not the worst part. The worst is that satellite knowledge plays a vital role in your success. Even fast learners will feel the strain, because while learning the main subject, the numerous homework assignments and projects, on top of the exams, appear as obstacles instead of learning vehicles. The rapid pace impedes a full understanding. The numerous tangential topics, not well woven together, constitute mixed knowledge that looks like a bag of feeble crumbs, ready to be blown away by the next gust of subject. Riding on a slight understanding, moving on, and forgetting what you barely explored to try to catch the next train is the modus operandi.

Instructors Do Not Teach

My third class was machine learning for trading. I learned Python on top of an overview of machine learning techniques. Overall, I liked the class, but the head instructor was mostly absent, and his interactions were acerbic. It is a trend that got worse in the machine learning and reinforcement learning classes, where instructors answer students in a bullying way. There is absolutely no way the words used by the instructors would be accepted in any workplace I have worked in. When students ask questions, you can bet on reading a scathing response. An example of a reply: “You know it in your heart” is a classic from one of the instructors instead of an answer. The responses discourage people who misunderstand a subject and would like further explanation. The public humiliation of receiving a stark reply fosters, in some classes, the propagation of anonymous questions (which the system allows) or, in my case, simply demotivates asking questions at all.

The recorded lessons could be condensed by half or filled with better, more thorough examples. Most of the lessons are full of pleasantries, which pull attention away from the main topic and also increase anxiety, because you want to understand but keep getting an instructor who takes the subject lightly. In some classes, it is a recording of an actual on-campus class at Georgia Tech, made from a laptop. You can imagine the sound and video quality.

Finally, the instructors regularly refer to whitepapers, not as complementary explanation but as a way to avoid providing a thorough explanation in the lesson. I acknowledge that delivering clear explanations requires a considerable amount of preparation, but I expected, by joining a top-8 university in the USA, not to be deflected constantly toward whitepapers. Swimming in new concepts and having someone state that the remainder of the information is in a whitepaper (and insinuate that students have already read it) is a lack of professionalism, and it makes students lose track of the lesson. I found it not educational to stop the lesson and jump into these scholarly papers, which take hours to read, instead of taking a couple more minutes to explain the essential parts.

Interaction on the Student Forum

This is a good segue into the type of questions asked. Most students ask questions in the school’s student forum, called “Piazza.” Most questions are about the assignments, not the lectures or readings. The focus is on where to get points, not on diving into the topics. Because my goal is to learn and not just to get points, I asked several questions in all my classes. Most of the time, other students answered me. The instructors and TAs (teaching assistants) were absent or focused on the assignments. Many TAs couldn’t respond properly, either. As an example, I took the time to screen-grab a portion of a lecture, asking why the instructor mentioned an increment of +1 in a formula, and after more than four messages the TA said I could use anything. But when I challenged him about why the lecture explicitly used +1, or whether I could use any constant, the TA answered that it would be better to have decay. I still couldn’t get an answer about the +1. I understand that the student-to-TA ratio is enormous and that they are overwhelmed with grading assignments, but I was expecting more. I’ll skip over some frustration with the delay in getting grades, which does not help give you a pulse on whether you are working in the expected way. In most courses, you get your grade for the first assignment only once you have submitted your second, hence you cannot really apply the feedback from the former to the latter.

As briefly mentioned in this section, what astonished me the most is that the questions are all related to the assignments, not the lectures, books, or papers to read. My interpretation (also read in some Slack discussions) is that people skip the lectures for the reasons discussed in this blog article: people aim to get the grade and move on. I am on the side that wants to understand and move on.

Grading and Subjectivity

The grading is also subjective and questionable. During my machine learning semester, my grades jumped between 55% and 96% throughout the session with the same amount of thoroughness; the difference was that a different TA was grading me. Also, even when I answered every requirement, there was always room for subjectivity about tangential topics I should have discussed further. I noticed similar grading fuzziness in other classes where the report requirements were unclear. Asking for clarification on an assignment does not result in iterating on the assignment requirements but in many scattered posts in the student forum, which makes it hard to follow. I also found that it penalizes students who are not last-minute: many times, a clarification changed the meaning of something and I had to rewrite. Furthermore, asking for a re-grade is heavily skewed toward making you fail even more. In two of my four classes, there was a blanket statement that a request to re-grade would lead to a penalty if they deemed it not justified.

Time Invested

I finally withdrew from my fifth semester because of the arrival of my new child. On the first day of the semester, my son was three days old. I naively thought I would be able to handle one class while on parental leave. I was wrong.

So far, I have spent roughly 14, 14, 18, and 23 hours per week on network security, database systems concepts and design, machine learning for trading, and machine learning, respectively. However, in this fifth semester, with reinforcement learning, the first two weeks cost me 28 hours each. The difference was that after these 56 hours invested, I was left wondering whether the assignment was really for the class I had studied: a complete disconnect other than a few keywords.

The time invested varies depending on how well you understand the topic, as mentioned earlier, but still, there is something wrong. It is supposed to be possible to marry work and school, yet you have to sacrifice a lot. In the first two weeks, I had to do two homework assignments, start working on the first of three projects, and also listen to several lectures of the class plus the complementary David Silver lectures. Those lectures are highly recommended because the explanations are more detailed; however, they are from another school. I was projecting a considerable investment in time because in reinforcement learning you cannot copy code from other students. You read that right: the machine learning class is basically copying a previous student’s assignment and writing a report, and it is explicitly mentioned to do so in the assignments. I naively started by not doing so and spent way too much time.

Nonetheless, in reinforcement learning, even with all that time spent studying the theory and linking every possible source of information, the reality of misunderstanding struck: the homework and projects were so disconnected that I had little idea what to code in Python. The additional stress of not having any base to start from, or any clues, was un-educational. I agree with struggling to learn and working hard, but not with fermenting in limbo, waiting to salvage pieces of information from a flood of questions asked by different, lost students. The work worth the majority of the points, combined with the growing misunderstanding of the topics and stirred with the lack of sleep, led me to take a break. While the time invested every week varies greatly depending on whether you read and watch all the material, there is also group homework, which can be a big factor increasing the load.

Team Assignments

The value of teamwork is often undervalued; however, it is inadequate for a remote master’s program. The outline is that people are spread across different timezones. The variance between individuals in terms of skills can be considerable. The motivation of each member, between wanting to perform and just wanting a fast degree, can add extra load on your shoulders. And the classic burden of teamwork, where half of the team does everything while the other half just waits, strongly influences the time you invest every week.

The first two courses I took had team work, and I got lucky to have people within more or less 3 hours of my timezone and to find people who were not last-minute. In one of my teams, one guy was almost 12 hours away, which forced me to also communicate with him before my daily work. Nonetheless, juggling family, work, and school is already a challenge; teamwork adds an additional layer of difficulty. It takes the idea of flexibility out of the equation if you are diligent with your assignments.

Flexibility

One aspect of online school is that you control when you study. But the reality is far from flexible. All the courses I have had so far give many homework assignments (up to eight), plus many projects (up to three), plus a mid-term and a final exam. And most of the work requires prior reading; hence, when you schedule everything that needs to be done, you end up on an entirely rigid schedule. The critical path of success is a thin line where shifting by a couple of days is possible but increases the backlog of work later.

Also, most assignments are not in a final state when delivered. Several graded works stayed in draft for several days and were then amended during the short window allowed to do the work. The fluctuation of requirements disrupts people who like to start ahead. I am someone who likes to start and finish as soon as possible, and I often got bitten by assignments that changed throughout the weeks, hence favoring people who work late. The result is that you have barely any room for flexibility. Once again, if you add that you need to coordinate schedules with many teammates, then you are simply in a corset with little room to breathe.

Quality of Content

Concerning the quality of the content, it varies but is mostly poor. You can see for yourself, because most of it is available on YouTube. The problem is that, depending on the instructor, you will get a lot of low-production-quality classes. The audio is often poor, examples are borrowed from books, calculations skip steps, making them hard to follow, and many jokes stretch the weekly lecture up to 3 hours (the recording varies between 45 minutes and 1 hour 30 minutes, but if you take notes it will go up to 3 hours). My main complaint is that in 2019 we have so much good-quality content for free on YouTube that Georgia Tech must level up its production quality with the aim to teach instead of just covering a topic. For example, covering a critical algorithm (I already forgot the name) in a video of less than 5 minutes is blasphemy. Also, there is a lot of variation in the notation used compared to the whitepapers, which might be accurate but is very confusing for a neophyte trying to decipher those whitepapers.

The lectures were not made with students trying to learn in mind. For example, when the instructor writes a formula, post-production often cuts the time spent writing and fast-forwards to the end of it. It forces the student to pause the video to take notes. Thus, the lectures usually take much more time because of all the pausing. In a real class, students can write at the same time as the teacher, or take time to digest the message before moving on; here, that responsibility falls on the student. Furthermore, in a real class, you can raise your hand to ask a question, get unblocked, and keep following. With a recorded lecture, you can only continue watching and ask later in the student forum or search online. The disconnect does not seem harsh for someone who understands, but for someone not getting a specific key point in the theory, it can prevent mastering the rest of the lecture. I am an autodidact and still found myself in a few situations not totally sure about what the instructor meant. If the instructor uses personal jargon, then you are on your own to find out what it really means, because the Internet will not be helpful. Related to the format of the lectures, because they were recorded many years ago without alteration, unclear passages remain unclear. In a traditional lecture, the teacher can learn from students’ questions and clarify the next time the course is given. This natural iterative improvement is not present at Georgia Tech because they do not actively work on the content.

In two classes (network security and machine learning for trading) I had to buy a completely useless book. The former was a hacking guide that had literally one useful page. The latter was a book written by the instructor himself: $60 for a thin hundred pages whose content could fit under 30 pages in letter format. It was a summary of the lectures. In my view, it reflects the quality, in the sense that there was a lack of diligence about how useful the recommendation was, but also an absence of a neutral party to assess self-promoting material that compensates the instructor directly.

In each class, the enormous quantity of different topics could, most of the time, be reduced by half. For example, in machine learning for trading, there is an entire lecture on “options trading,” but it is shallow and has no real use in terms of machine learning. In this same class, there is a whole week on the 2008 crash, which is not related to machine learning. Hence, a couple of weeks of finance could be trimmed to go deeper into the machine learning part. It is similar in the two classes machine learning and reinforcement learning: in the machine learning curriculum, we touch on reinforcement learning, and if you take reinforcement learning, two of the lectures are copy-pasted. Instead, the machine learning class should offer a better explanation of non-reinforcement learning.

Finally, in every class I took, students who completed the course suggested watching David Silver or Andrew Ng. But why? Because Georgia Tech courses are built in a very bumpy way. The lessons start slow, with (too many) jokes, then stay shallow on many topics, then dive very deep into calculus, so much so that you have a hard time grasping concepts because of all the shortcuts taken and references to whitepapers, to finally jump to a totally different aspect of the main class. For example, you can have 2 hours on game theory and then 1 hour on Markov.

Fearful Approach

Not every course, but some, are driven by the fear of expulsion from the program. For example, sharing student notes is forbidden. Even sharing a few lines of code while asking for guidance is forbidden. I understand not sharing a whole project or homework, but there is real learning value in having examples. The paucity of examples provided by professors/instructors in the content leads to a dry land of application. Fear drives students to stop asking questions or helping others. In the end, the only ones who benefit are the TAs, who grade the same assignments every semester because they do not have to change the requirements. Once again, students pay the price: they cannot fully spread their wings to understand and are pushed to stay alone in their corner. That being said, side “Slack” discussions that vanish a few hours later are useful and seem to be one way to avoid the suspicious rules that reign.

TA/Instructors Video Interactions

I was able to attend a single online “live” video interaction with the TAs and instructor. For a program built for “online” education, it is very awkward. These “live” video chats are supposed to be a time where you can ask questions and have a discussion, but they are mostly driven by the questions students ask in the forum (Piazza). My experience is that you cannot rely on them because the time is usually wrong for your life; for example, in one class it was while I was at my job, and in another it was during my commute back home. That being said, they are recorded, so you can listen to them later, with the disadvantage that you cannot ask questions or get clarification. In reality, I only watched a recorded TA/instructor video twice, because they move at a very slow pace. Usually, the modus operandi is that the lead introduces all the TAs, then reads the Piazza questions and tries to explain something. Most TAs do not have a microphone other than their headphones or laptop, hence the quality is low. For me, the worst part is that every question launches a discussion that digresses from the main question. Ten questions can take an hour, and as mentioned earlier, time is scarce.

Am I alone?

In most classes, there are side discussions in a Slack channel where I was able to catch the pulse of panic from most people not understanding what to do. In this last class, I asked the few people who were already doing homework and projects in advance how they were able to be so productive. Their answers were that they already work in the domain, hence understand most of the theory and only need to ace the work, or that they had taken a similar class and could better relate to what was needed. Finally, one answer was that the “grading curve is generous,” meaning I could keep walking head down, get a low score (such as 40%), and still get a “B”. I was astonished by this last answer, but I realized it was true: most people can get an “A” or a “B” with a very poor understanding of the topics taught.

Browse OMSCentral or Reddit, and you can find several people aligning with the overview I am presenting in this blog article. The problem is that there is no direct way to provide impactful feedback. I would have flagged many instructors and TAs as inappropriate during the semesters, but there was only a last-week feedback form, which does not seem to do anything, because many complaints about some classes’ lectures are still the same five years after their inception. Even worse, the professor behind one of the most inappropriate interactions I witnessed at Georgia Tech went on to become a dean. It is clear that the quality of teaching is not a priority.

Why is Georgia Tech not Improving?

Georgia Tech hopefully knows, but it moves slowly, which is paradoxical for a top university in technology. In machine learning, all the assignments mention to “copy” from previous students. It is a shortcut justified by letting students focus on the analysis, but it is a sign that students lack the knowledge to complete the task. Instructors employ different learning shortcuts instead of guiding the student from the ground up. The issue is that the affinity between the classes is not well defined. A well-tailored program in machine learning should guide the student through several logical, ordered classes that go from the basics to something complex. Instead, each class is distinct, without any relation (and can be taken in any order, which is confusing). Hence, each class is shallow and opts for different techniques to compensate, like allowing students to copy, having a big grading curve, or repeating content from class to class.

Georgia Tech’s online system (the student portal) for registering for classes looks like a website you would have visited in the early 2000s. While it might not be a big deal, the scarcity of guidance on which class to take causes headaches. Students must rely on an external source of information built by other students. Again, Georgia Tech is a top university and should have a system that guides users to thrive instead of delegating. I already mentioned that Georgia Tech delegates to whitepapers for depth in concepts, to other online videos to compensate for the poor quality of its own, and to other books for the limited examples provided; now it also delegates curriculum management to students. I’ll cut short the tech-infrastructure pitfalls because this could be a post on its own (they also leaked students’ social security numbers and passwords). There is a saying that you can judge a restaurant’s kitchen by how clean its restroom is. Unfortunately, you can do the same with systems and infrastructure to evaluate the quality of the teaching.

Solutions exist that Georgia Tech could put in place. There should be a process where someone genuinely assesses the quality of each course. Instead, I witness an institution that would rather close its eyes and pat itself on the back. For example, one of the instructors most unsparing in his behavior toward students was recently promoted to dean. He has a history of being a jerk. His contribution to the research community might be stellar, but that is not a license to teach and act poorly. Georgia Tech has a feedback system available only at the end of each semester, and nothing seems to be taken seriously. Instead, an ongoing feedback mechanism should be in place, without fear of retaliation. It should be hosted by a third party and acknowledged by the institution.

I understand that building a course involves a considerable amount of time. However, keeping heads down for many years (sometimes more than five) with courses that barely evolve (lectures are not re-recorded, for example) fosters the system’s inability to evolve and to keep the student as the priority.

What Now?

While I may sound negative about the Georgia Tech Master of Science in Computer Science, it has something good: it really forces you not to give up and to dig until you understand. However, the school is not there to teach you anything. Georgia Tech throws as many topics at the wall as it can, then grades you on how well you catch everything before it falls on the floor.

There is a huge opportunity for other online programs to annihilate Georgia Tech here: build a program where the staff and materials steward students toward understanding with a realistic life-balance approach. For example, the courses machine learning, machine learning for trading, and reinforcement learning have a lot of overlap that could be avoided in favor of better and longer time on each topic. Also, one additional flaw of the courses is the number of examples, which is often zero or a single one taken from a whitepaper or book, which is barely enough to understand. To alleviate these problems, clearer lectures with clear materials, instead of a list of 20+ whitepapers, would help. Creating a set of examples with step-by-step explanations that newcomers could consult and experts could skip is not a genius idea; it is what you get when you try to understand who consumes your service.

Additionally, aiming for a maximum of 10 hours per week for a neophyte, and testing that assumption, would make the program viable for an adult with a family and a daily job. But quality in education requires iterating on the material: re-recording, adjusting assignments, acknowledging feedback and acting on it, improving exams, etc.

I support a program that is not a walk in the park, but there is a limit to how impractical you can be, making people’s lives hard for no beneficial reason. Teaching is the art of making hard topics easier, not trying to make them look harder than they are. Education is being able to put yourself in your students’ shoes and to mold a better future by opening the doors that spark those “Ahhhh” moments. On my side, I believe I can succeed in the reinforcement learning class once my newborn gives me more than four hours of sleep per night. However, in the end, my goal is to “master” machine learning, not to play a game of aiming for the minimum (having my grade curved up) and hanging a degree on my wall. This is where I question Georgia Tech’s motives.

How to Collect HTTP Request Telemetry with GraphQL and RESTDataSource?

I have been working on a GraphQL solution at Netflix for a few months now, and I am always keen to have insight into what is going on. While I created the GraphQL server with NodeJS from the ground up, I had in mind to find a way to give me, the software engineer, some insights by providing direct feedback in the bash console running the server, but also by sending the data to an ElasticSearch cluster. Before rendering or sending the telemetry, we need to collect it. In this article, I’ll describe how I proceeded to get information on the individual HTTP calls that a GraphQL query invokes to retrieve information while resolving a graph of objects.

The solution is oriented toward the RESTDataSource library, but the principle is exactly the same with Axios or any other HTTP request library. The idea is to subscribe to a global hook that is invoked when a request starts and when a response comes back. By having a hook at the beginning and at the end, it is possible to collect the elapsed time without having to code something on every request.

RESTDataSource Override

In the case of RESTDataSource, it is a matter of overriding the willSendRequest function, which takes a request parameter. We will use the request to add a unique identifier into the HTTP headers, which gives the response function a reference to the originator. The second function to override is didReceiveResponse. This one receives both the response and the request.

The willSendRequest function performs three actions. The first is to generate a GUID that serves as a unique identifier; it is added into the HTTP headers. The second is to add an entry into a collection of HTTP requests kept in the GraphQL context. I created a type that keeps track of the elapsed time, the total bytes received, the URL, the query string, the starting time, and the unique request identifier (GUID). The unique identifier is needed by the second function.

export interface HttpStatsEndpoints {
    requestUuid: string;
    url: string;
    urlSearchParams: URLSearchParams;
    elapsedTimeMs: number;
    totalBytes: number;
}
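The willSendRequest override itself is not shown in the post, so here is a minimal sketch based on the description above. The header values, uuidv4, request.params, the currentTimeMs helper, and the context shape are assumptions; the constants mirror the HEADER_REQUEST_UUID and HEADER_START_TIME used in didReceiveResponse below.

// Minimal sketch (not the exact production code): tag each outgoing request with a unique
// identifier and a start time, and register an entry in the GraphQL context.
import { v4 as uuidv4 } from "uuid";
import { RESTDataSource, RequestOptions } from "apollo-datasource-rest";

const HEADER_REQUEST_UUID = "x-request-uuid"; // Assumed header names
const HEADER_START_TIME = "x-request-start-time";

export class MyApiDataSource extends RESTDataSource {
    protected willSendRequest(request: RequestOptions): void {
        const requestUuid = uuidv4(); // 1) Unique identifier for this HTTP call
        request.headers.set(HEADER_REQUEST_UUID, requestUuid);
        request.headers.set(HEADER_START_TIME, String(this.currentTimeMs()));

        // 2) Register an entry in the GraphQL context; didReceiveResponse fills in the rest
        this.context.stats.httpRequests.push({
            requestUuid,
            url: this.baseURL || "", // The full URL could be composed from baseURL and the path
            urlSearchParams: request.params,
            elapsedTimeMs: 0,
            totalBytes: 0
        });
    }

    // Same conversion as convertNodeHrTimeToMs in the snippet below
    private currentTimeMs(): number {
        const [seconds, nanoseconds] = process.hrtime();
        return seconds * 1000 + nanoseconds / 1e6;
    }
}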

The didReceiveResponse function gets the response, but also the request object. Because we have the request, we can peek at the GUID, extract the corresponding entry from the GraphQL context, and subtract the time saved when the request started from the current time. The number of bytes and the elapsed time are saved in the context until read by the GraphQL extension.

public didReceiveResponse(response: Response, request: Request): Promise<any> {
    return super.didReceiveResponse(response, request).then(result => {
        // Retrieve the identifiers added by willSendRequest
        const requestId = request.headers.get(HEADER_REQUEST_UUID);
        const startTime = request.headers.get(HEADER_START_TIME);
        const httpRequest = this.context.stats.httpRequests.find(d => d.requestUuid === requestId);
        if (httpRequest !== undefined && startTime !== null) {
            // Compute the elapsed time and the payload size for this entry
            const totalNanoSecondsElapsed = process.hrtime();
            const totalMilliSecondsElapsed = this.convertNodeHrTimeToMs(totalNanoSecondsElapsed);
            httpRequest.elapsedTimeMs = totalMilliSecondsElapsed - Number(startTime);
            httpRequest.totalBytes = JSON.stringify(result).length;
        }
        return result;
    });
}

GraphQL Extension

At this point, when all requests are completed and GraphQL is ready to send a response back, a custom extension can come into play. I covered the details of a custom GraphQL extension in a previous post about telemetry and how to display it in the console. The idea is the same: this time, we read the GraphQL context and, while looping through the telemetry, display the bytes and time taken for each request.
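As a rough idea of that last step (a sketch only; the previous post covers the real extension wiring), looping over the collected entries could look like this:

// Minimal sketch: read the collected entries from the GraphQL context and print them.
// The context shape matches the HttpStatsEndpoints entries collected above.
function logHttpTelemetry(context: { stats: { httpRequests: HttpStatsEndpoints[] } }): void {
    for (const entry of context.stats.httpRequests) {
        console.log(
            `${entry.url}?${entry.urlSearchParams.toString()} -> ` +
            `${entry.totalBytes} bytes in ${entry.elapsedTimeMs.toFixed(1)} ms`
        );
    }
}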

Here are some of my GraphQL posts from this series

Validating Consumer of GraphQL with Jenkins

I will demonstrate a way to configure a continuous integration environment that gives insight into whether a change on the GraphQL server can negatively impact a consumer. The solution assumes that both code bases are under the same organization, not under the same repository, and that Jenkins is used.

I like to start with an image of what I am describing. So here is the solution.

The diagram shows, on the top right, the GraphQL engineer who pushes a change to the repository. The normal continuous integration (CI) kicks off a build when it receives new code. So far, nothing changes from a typical workflow. However, the third step is where it differs. The idea is to create a new Jenkins job that is independent of both the GraphQL server and the consumer application. The independence of this job keeps both existing builds untouched and the whole ecosystem tidy. The new Jenkins job waits for the GraphQL job to complete; it is possible to configure this option under Jenkins’ Build Triggers.

Jenkins’ Trigger

When the GraphQL server changes, the Jenkins job fetches the consumer code and the GraphQL code. Do not forget we assumed access to both code bases. However, if the GraphQL server is not internal, you can always download the unified GraphQL schema and accomplish the same end result. In fact, the next step is to run the GraphQL server script that builds the latest schema (stitches the schema). Then, it runs a tool that validates the consumer code. The tool looks for gql tags inside TSX and TS files (TypeScript) and analyzes all queries against the schema. If a breaking change occurs, the build fails and an email is sent to the engineers so they can act before it reaches deployment.

The Jenkins “Execute shell” script runs npm install because the validation tool comes from an NPM library. It clones the server repository because we do not have a unified (stitched) schema at hand. Then, it runs the graphql-inspector tool.

npm install
git clone ssh://git@youServer/yourGraphQLServer.git graphql_repo
cd graphql_repo
npm install
npm run generateunifiedschema
cd ..
./node_modules/.bin/graphql-inspector validate ./src/**/*.graphql ./graphql/unifiedSchema.graphql

A special mention: graphql-inspector does not work well (in my experience) at analyzing .ts and .tsx files. However, it worked perfectly when I moved the GraphQL queries into .graphql files.

Most of the time, breaking changes should not occur: GraphQL allows additive changes and slowly deprecating fields instead of working with several versions. Nonetheless, the additional awareness is beneficial, and by having the tooling configured to automatically verify potential breakage, we reduce the stress of introducing consequences in production.

My GraphQL Articles

A Single GraphQL’s Schema for Internal and External Users

Imagine a scenario where you have a GraphQL schema consumed only by engineers of your organization. However, some engineers work on external-facing applications while others work on internal applications. The difference is subtle, but some information must remain confidential. While this might be more restrictive than most GraphQL APIs that are fully public, there is the benefit that the schema is private, and we can leverage that detail.

Internal vs External

In the scenario described, the GraphQL server is deployed into an Amazon VPC (Amazon Virtual Private Cloud). The environment is secured, and authorization is granted only to specific security groups.

Internal vs External Environment

The challenge is not to split the schema definition into an internal and an external one, but to make sure external users cannot craft a request to access unexpected information. Because the GraphQL server is behind the VPC, it is not directly accessible by the user. The external application can communicate with the GraphQL server, and every front-end request to fetch data is proxied by the web server application. The proxy is important: it means that the browser of external users is not directly connected to the GraphQL server. Instead, the browser performs an AJAX call to its respective web server, which, on behalf of the user, conducts the GraphQL query. The proxy is handled with Apache’s ProxyPass instruction.

Internal applications do not have as many constraints, but keeping the same proxying pattern is a good habit. It simplifies CORS because the browser performs HTTP requests to the same server, which only underneath communicates with other servers. It also simplifies security by having a central point of communication (the web server) with the secured backend services.

GraphQL Single Schema

An obvious solution is to have two schemas: one internal and one external. That solution is the only choice if you need to expose the GraphQL schema externally without exposing the definition of internal fields and entities. However, because I had the constraint of not exposing GraphQL outside, I could simplify maintainability by having a single schema. The problem with multiple schemas is that they do not scale very well. First, when adding a field that should also be visible externally, you need to add it twice. Then, when it is time to modify or remove a field, you need to keep the different schemas in synchronization. Any boilerplate like this makes the engineering experience a burden and is prone to errors.

In an ideal world, a single schema exists and we flag the fields or entities that are only available internally. That world can exist with the power of GraphQL directives.

The GraphQL Directive Idea

GraphQL allows enhancing the graph's schema with annotations. Let's start with the end result, which should say more than any explanation.

type MyTypeA {    
    fieldA: EntityTypeA
    fieldB: EntityTypeA @internalData
}

The idea is to add "@internalData" to every field that must be visible only for internal usage. The annotation can mark a field but also a whole type.

type MyTypeA @internalData {    
    fieldA: EntityTypeA
    fieldB: EntityTypeA
}

The idea is to have a single schema that indicates that adding the field to a request will have consequences. Because it is a single graph, the field appears in the interactive GraphQL Playground and is a valid field to request, even externally. However, when the field is invoked, GraphQL reads the directive at runtime and performs some logic. In our case, the logic verifies the source of the request and determines whether the request is internal or not. For an internal request, the data is part of the response. If the source is external, an exception occurs and the field is undefined.

How to build a GraphQL Directive?

The directive has two parts: one in the GraphQL language (design time) and one with the logic to perform at runtime.

In any .graphql file, you need to declare the directive to let GraphQL know about its existence. I created a file with the name of the directive and added this single line. The directive indicates that it can be applied to a type (OBJECT) or to a field (FIELD_DEFINITION). The directive could also have arguments; for example, we could have a more advanced need to specify which role can access which field.

directive @internalData on OBJECT | FIELD_DEFINITION
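For illustration only, a role-aware variant of the declaration could look like the line below; the argument name is an assumption and is not used anywhere else in this article.

directive @internalData(allowedRoles: [String!] = []) on OBJECT | FIELD_DEFINITION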

The second part is to handle the directive. When merging all the resolvers and type definitions, you can also specify the collection of directives. What you need to pass is a key-value pair with the directive name and the class of the directive (not an object). It means that you do not instantiate (new) the class, but only give a reference to the class.

// makeExecutableSchema comes from graphql-tools (also re-exported by apollo-server)
import { makeExecutableSchema } from "graphql-tools";

const schemas = makeExecutableSchema({
    typeDefs: allSchemas,
    resolvers: allResolvers,
    schemaDirectives: {
        internalData: InternalDataDirective,
    },
});

The class must inherit from SchemaDirectiveVisitor. Then, because we specified that the directive can be applied to a field and a type, we need to override two functions: visitFieldDefinition and visitObject.

import { SchemaDirectiveVisitor } from "graphql-tools";
import {
    defaultFieldResolver,
    GraphQLField,
    GraphQLInterfaceType,
    GraphQLObjectType,
    GraphQLResolveInfo
} from "graphql";
// GraphQLCustomResolversContext is the custom resolver context type of this project.

export class InternalDataDirective extends SchemaDirectiveVisitor {
    private static readonly INTERNAL_APP = ["app1", "app2", "app3"];

    public visitObject(object: GraphQLObjectType): GraphQLObjectType | void | null {
        this.ensureFieldsWrapped(object);
    }

    public visitFieldDefinition(
        field: GraphQLField<any, any>,
        details: {
            objectType: GraphQLObjectType | GraphQLInterfaceType;
        }
    ): GraphQLField<any, any> | void | null {
        this.checkField(field);
    }

    private ensureFieldsWrapped(objectType: GraphQLObjectType | GraphQLInterfaceType) {
        if ((objectType as any).__scopeWrapped) {
            return;
        } else {
            (objectType as any).__scopeWrapped = true;
        }
        const fields = objectType.getFields();

        Object.keys(fields).forEach(fieldName => {
            const field = fields[fieldName];
            this.checkField(field);
        });
    }

    private checkField(field: GraphQLField<any, any>): void {
        const { resolve = defaultFieldResolver } = field;
        field.description = `🔐 Internal Field. Only available for: ${InternalDataDirective.INTERNAL_APP.join(", ")}.`;
        field.resolve = async function(
            source: any,
            args: any,
            context: GraphQLCustomResolversContext,
            graphQLResolveInfo: GraphQLResolveInfo
        ) {
            if (
                context.req.appOrigin === undefined ||
                !InternalDataDirective.INTERNAL_APP.includes(context.req.appOrigin)
            ) {
                throw new Error(
                    `The field [${field.name}] has an internal scope and does not allow access for the application [${
                        context.req.appOrigin
                    }]`
                );
            }

            const result = await resolve.apply(this, [source, args, context, graphQLResolveInfo]);
            return result;
        };
    }
}

The directive converges the two entry points (field and object) into a single function. The two visit functions are called once, when the class is instantiated by GraphQL at the startup of the server. It means that you cannot have dynamic logic in the visit functions. The dynamic aspect happens because we wrap the field's resolve function: the actual resolution is still executed, but the code specified in "checkField" also runs at runtime. In the code excerpt, we see that it checks against a list of accepted internal applications. If the field has the directive, the request goes through the directive's resolver, which checks whether the origin is in the list of accepted internal applications. If not, it throws an error.

A little detail: it is possible to inject a description from the directive, which is set at the initialization of the schema. In my case, I specify that the field is private and mention which applications can access it. If a software engineer needs an application to be on the list, it requires a code change. This is not something that happens often, and because a code change is required it involves a pull request where many people will have a look.

Example of how it looks in the GraphQL Interactive Playground. The engineer who builds the query knows that it is an internal field, as well as under which applications the response will return a value

Conclusion

The more I work with different organizations, code bases, and technologies, the more I lean toward simplicity. There are so many changes, so many ways to get very deep into subjects, and so little time. Complex solutions often turn maintainability into a nightmare or make some people very dependent. The directive solution in GraphQL took less than 150 lines of code and scales to the entire graph of objects without depending on a system to manage many schemas. The security of the information is preserved, the engineers who consume the graph are aware both when building the query (description) and when executing it (error), and the engineers building the graph can add the tag to fields or types in a few seconds without having to worry about the details of the implementation.

How to Pass Value to useCallback in React Hooks

UseCallback allows keeping the same reference for a callback function, which has the benefit of only changing when a dependency changes. This is primordial when callbacks are passed down the React component chain, to avoid components re-rendering without an actual reason and causing performance issues. The following snippet of code shows a simple button that, when clicked, invokes an action that sets the name to "test". You can imagine that in a real scenario the string would come from a real source instead of being hardcoded.

<button
  onClick={() => {
    dispatch(AppActions.setName("test"));
  }}
>

The action can often be handled without passing data, or by passing a React property which is then accessible from the handler of the action. However, in some cases the value is not accessible directly from the outer scope of the handler function, which means we need to pass the value by parameter. I am reusing a Code Sandbox, slightly modified to have useCallback with a value passed down. Using useCallback, or simply refactoring the above snippet into a function that is not directly bound to onClick, is similar: we are moving the accessible scope. When the function is inline, it can access anything defined around the button: the React properties, the "map" index if it is inside a loop, and so on. However, extracting the function out requires some minor changes to still have access to the value.

 const setName = useCallback(() => {
    dispatch(AppActions.setName("test"));
 }, []);

A quick change with React Hooks to produce the desired scenario is to use useCallback at the top of the component and reference it directly in the onClick callback.

<button onClick={setName}>Change name</button>

At the moment, it works. However, we are not passing any information. Let’s imagine that we cannot access the data directly from the useCallback, how can we still invoke this one?

const setName = useCallback((event: React.MouseEvent<HTMLButtonElement, MouseEvent>) => {
    dispatch(AppActions.setName("test"));
  }, []);

The idea is to have a callback that returns a function which, in its turn, takes the input event.

<button onClick={setName("Testing Name")}>Change name</button>

The invocation code changes by passing the data. In this example it is a string, but you can imagine passing the index of the map function or data coming from a source inaccessible to the callback function.

  const setName = useCallback(
    (name: string) => (
      event: React.MouseEvent<HTMLButtonElement, MouseEvent>
    ) => {
      dispatch(AppActions.setName(name));
    },
    []
  );

My rule of thumb is that I do not need this convoluted definition if I am directly accessing properties of the component. Otherwise, I pass the data needed. I always define the types, which gives me a quick view of what is passed (the name is a string and the event is a mouse event) without having to rummage through the code. Here is the code sandbox to play with the code of this article.
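To make the pattern concrete, here is a minimal sketch where the value comes from a map loop and is curried into the callback. The component, list, and dispatch function below are hypothetical and only serve the illustration.

import React, { useCallback } from "react";

// Hypothetical list component: each button curries its own name into the shared callback.
export function NameList(props: { names: string[]; dispatch: (name: string) => void }) {
  const setName = useCallback(
    (name: string) => (event: React.MouseEvent<HTMLButtonElement, MouseEvent>) => {
      props.dispatch(name);
    },
    [props.dispatch]
  );

  return (
    <ul>
      {props.names.map((name, index) => (
        <li key={index}>
          <button onClick={setName(name)}>Change name #{index}</button>
        </li>
      ))}
    </ul>
  );
}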

TypeScript Exhaustive Check your Reducer

A few weeks ago, I wrote about how to use the React Hooks useReducer with TypeScript. The natural follow-up for many is to ensure that the whole set of allowed actions is served by the reducer. Not only does it help tidy up the accepted actions when building the reducer, it also helps ensure, during the lifetime of the reducer, that the list of actions remains up-to-date.

If we recall, the reducer takes the state and the action. The action was typed to be one of the functions that are part of AppActions. A utility type was used that allows unioning many sets of actions, though it was not exploited since we were using a single type. Nonetheless, everything was in place to ensure a flexible configuration of actions.

export type AcceptedActions = ActionsUnion<typeof AppActions>;
export function appReducer(
  state: AppReducerState,
  action: AcceptedActions
): AppReducerState {
  switch (action.type) {
    case ACTION_INCREASE_COUNT:
      return {
        ...state,
        clickCount: state.clickCount + 1
      };
    case ACTION_SET_NAME:
      return {
        ...state,
        activeEntity: { ...state.activeEntity, ...{ name: action.payload } }
      };
  }
  return state;
}

While we cannot add arbitrary cases for actions not defined in the AcceptedActions type, the weakness of this code is that we can remove one of the two cases without noticing. Ideally, we would want to ensure that all actions are handled, and in the case where an action is no longer required, that we remove it from the list of actions.

The solution requires only a few lines. First, you may already have the core of the needed logic: an exhaustive check function. I covered the idea of an exhaustive check many months ago in this article. In short, it is a function that should never be reached; when TypeScript finds a logical path that can reach it, the code does not compile.

export function exhaustiveCheck(check: never, throwError: boolean = false): never {
    if (throwError) {
        throw new Error(`ERROR! The value ${JSON.stringify(check)} should be of type never.`);
    }
    return check;
}

Using TypeScript's exhaustive check pattern with a reducer is similar to what we would do to verify that all values of an enum are covered. The code needs a default case that we do not expect the code to fall through to.

The two new lines:

    default:
      exhaustiveCheck(action);

Removing a required action causes TypeScript to fall into the exhaustive check; since the function is marked to accept a never argument, the code no longer compiles.
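Put together, the reducer's switch with the default case in place looks like this (the same reducer as above, with only the default branch added):

export function appReducer(
  state: AppReducerState,
  action: AcceptedActions
): AppReducerState {
  switch (action.type) {
    case ACTION_INCREASE_COUNT:
      return {
        ...state,
        clickCount: state.clickCount + 1
      };
    case ACTION_SET_NAME:
      return {
        ...state,
        activeEntity: { ...state.activeEntity, ...{ name: action.payload } }
      };
    default:
      // If an action is added to AcceptedActions without a case above,
      // `action` is no longer `never` here and the call does not compile.
      exhaustiveCheck(action);
  }
  return state;
}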

TypeScript catching the missing action

I have updated the original code sandbox. Click on reducer.ts and try to remove one of the actions.

In conclusion, the solution might not be ideal for you if you have all your actions in one huge function, or it might not even be possible if you do not group your actions at all. However, grouping actions tidies up your code by giving a better idea of which actions are expected in the different business domains your application handles. It is not much more work, and it self-documents the code. The exhaustive check is an additional step to maintain order.

The authorization and authentication in GraphQL

Implementing a service that covers many systems might be frightening in terms of data exposure. While wandering the Internet for resources about GraphQL and security, we often see cases where security is not paramount; it is easier to ignore security. The reality is that most corporations that want to bring in GraphQL also need to secure the information. This article sheds light on how I approached security in my implementation of GraphQL at Netflix for Open Connect.

Prelude

In an ideal world, the GraphQL server gets the authorization from the corporate security infrastructure, and GraphQL delegates to the actual downstream data source the responsibility of returning an error. For example, if the data source is a REST API, the token of the user is used for the call; the API can return an HTTP 401 and GraphQL will handle the error. However, maybe the GraphQL server exposes some internal services that were secured only by the premise of living inside a VPC (virtual private cloud), meaning that no validation is actually performed. In that case, some custom code is required by the executioner of the service: GraphQL. Another case could be that the security is scoped to a specific group or entity (organization, user, etc.), meaning that you do not want user A to access user B's information. Again, a strong security model would perform the validation at the source (REST, SQL views, etc.), but in the real world it is not always the case. To mitigate the possibility of security issues among the sea of services covered in my scenario, the security was placed in GraphQL. Meanwhile, further modifications to the data sources could be planned and put in place without compromising the delivery of the GraphQL server.

Exploring the possibilities

One strength of GraphQL is its flexibility. That flexibility remains true for security and opens many doors as to where to secure the service. As mentioned earlier, the NodeJS server that hosts Apollo is behind Apache. The reason is that at Netflix we have many tools integrated within Apache to authenticate the user, like single sign-on, OpenID Connect, and OAuth 2.0. The Apache module is great for authentication but not for authorization: it checks that the user is allowed to access GraphQL, but does not discriminate on which information the user can consult.

Flow of the request from user to services that contain the information

Apache gives information about the user and provides additional HTTP headers to NodeJS. The first stop is a custom NodeJS Express middleware. The middleware is a piece of code executed on each request. It checks if the user is a Netflix employee with a specific grant; if that is the case, it marks a particular field in an enhanced request object to signal the user as "authorized for full access." The idea is to avoid future validations that can be costly in performance. This shortcut works well because the information from the Apache module can be trusted; it might not work well in your organization, so do your homework. The next stop is the GraphQL context. In my case, I have no validation to do at that level because I did the check in the NodeJS Express middleware. However, if you are not using NodeJS, it would be the place to do HTTP request checks. Still, I instantiate at that level a security object that contains functions checking particular lists of objects. The lists of objects are the specific ids of secured entities that the user has access to. The context performs a query on specific backend services to fetch which object ids the user can access, and that list goes into the request. The idea is to have, before the user reaches the resolver, a well-defined list of authorized entities and ids.
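Here is a minimal sketch of that first stop. The header name, grant value, and flag are assumptions made for the illustration; the real middleware reads whatever Apache actually forwards.

import { NextFunction, Request, Response } from "express";

// Sketch of the "full access" shortcut; the header and grant names are assumed.
export function fullAccessMiddleware(req: Request, res: Response, next: NextFunction): void {
    const grants = (req.header("x-user-grants") ?? "").split(",");
    // Enhanced request object: the GraphQL context and resolvers read this flag
    // instead of re-validating the user for every field.
    (req as Request & { isFullAccess?: boolean }).isFullAccess = grants.includes("graphql-full-access");
    next();
}

The middleware is then registered with app.use(fullAccessMiddleware) before the GraphQL endpoint.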

It is possible to perform checks at the resolver level, but the problem is that if the query does not request the field containing the ids the user can access, the value is not available. For example, if a user can only access the data of the organization that he or she belongs to, and the user requests the organization by id for its name, then you can block the request. But if the user requests a sub-entity, for example a contact, and then deeper in the query's tree the name of the organization without the organization id, the resolver cannot check whether the organization's data belongs to the authorized ids or not.

Finally, the place I found best to handle authorization is at the data loader level, where every HTTP request to a service is performed. Upon reception of the service's response, the data is examined to check whether the payload contains information from the entities we want to secure. If the response contains an entity that does not belong to the user, an exception is thrown and GraphQL bubbles the exception up to the resolver that initiated the request. GraphQL handles the exception properly and your message is given to the user.
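As a rough sketch of that idea, assuming the context carries the pre-fetched set of authorized organization ids; the names and the service URL below are hypothetical, and fetch is assumed to be available (Node 18+ or a polyfill).

// Hypothetical data loader check: the exception is thrown before the data ever reaches the resolver.
interface Contact {
    id: string;
    name: string;
    organizationId: string;
}

async function loadContact(
    contactId: string,
    context: { authorizedOrganizationIds: Set<string> }
): Promise<Contact> {
    const response = await fetch(`https://internal-service.example/contacts/${contactId}`);
    const contact = (await response.json()) as Contact;
    if (!context.authorizedOrganizationIds.has(contact.organizationId)) {
        // GraphQL bubbles this up to the resolver and returns the message to the user.
        throw new Error(`Access denied to the organization [${contact.organizationId}]`);
    }
    return contact;
}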

Conclusion

The solution is not ideal because it requires, per service, an additional call to get the list of entities and ids. I opted to have GraphQL cache all the entity ids per user in a server cache (not a request cache) for a few minutes. The solution has a flaw: the request is still performed to the service. The reason is the lack of visibility on entity B from entity A before getting the data. Nonetheless, it remains secure because the response does not go beyond NodeJS, and it is not cached either. These loops are required because of a weakness at the leaf of the architecture: the service that has access to the data. As a reminder, even if you are building an internal service that is secured by the network, it is always better not to rely on that infrastructure and to perform checks at the database level. The future is never known: infrastructure changes, security changes, potential consumers of the information evolve, and we never know when something will be exposed. For future resiliency and for an optimal defense: always authorize at the source.
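A minimal sketch of that per-user server cache, assuming an expiry of a few minutes; the structure and names are illustrative, not the actual implementation.

// Hypothetical per-user cache of authorized ids with a short time-to-live.
const TTL_MS = 5 * 60 * 1000; // "a few minutes", value assumed

interface CachedAuthorization {
    authorizedOrganizationIds: Set<string>;
    expiresAt: number;
}

const authorizationCache = new Map<string, CachedAuthorization>();

// Hypothetical call to the backend service that knows which entities the user can access.
declare function fetchAuthorizedIdsFromService(userId: string): Promise<Set<string>>;

async function getAuthorizedIds(userId: string): Promise<Set<string>> {
    const cached = authorizationCache.get(userId);
    if (cached !== undefined && cached.expiresAt > Date.now()) {
        return cached.authorizedOrganizationIds;
    }
    const ids = await fetchAuthorizedIdsFromService(userId);
    authorizationCache.set(userId, { authorizedOrganizationIds: ids, expiresAt: Date.now() + TTL_MS });
    return ids;
}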

My Other GraphQL Articles