I recently searched my local drive for a specific picture of my daughter and ran into a few issues. First, I take a lot of pictures, which makes any single one hard to find. Second, I keep both the good pictures and the average ones, so I have thousands and thousands of pictures that are not easy to browse. That said, since 2003 I have had a systematic way to store my pictures: a main folder that contains one folder per year, and each year contains many folders, one per event. The event folders always follow the format “yyyy-mm-dd-EventDescriptionIn2words”. Inside these folders, I also have the habit of prefixing the best pictures with an underscore. Still, the file names are just the sequential numbers from my camera, and they are not even consecutive in time. There is no way I can search for “Alicia happy in a red dress during summer 2015”, for example.
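To make the folder convention concrete, here is a minimal sketch of how an event folder name could be parsed into a date and a description. The function name `parseEventFolder` is my own illustration, not code from the project:

```typescript
// Parses an event folder name of the form "yyyy-mm-dd-EventDescriptionIn2words"
// into a date and a description. Returns null for folders that do not match.
interface EventFolder {
  date: Date;
  description: string;
}

function parseEventFolder(folderName: string): EventFolder | null {
  const match = /^(\d{4})-(\d{2})-(\d{2})-(.+)$/.exec(folderName);
  if (match === null) {
    return null; // not an event folder
  }
  const [, year, month, day, description] = match;
  return {
    date: new Date(Number(year), Number(month) - 1, Number(day)),
    description,
  };
}
```

For example, `parseEventFolder("2015-07-18-BeachDay")` would give back the date July 18, 2015 and the description `"BeachDay"`.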
Here comes the idea I started a few weeks ago: build a training set of pictures that will serve as a base for the system to figure out who is in each picture, and use a service that analyzes what is in the picture. On top of that data, a simple website will let me query the database of pictures and return the best matches, each with a link to the actual full-quality picture. Before going any further, a word of caution: the goal of this project is not to build something that scales or to write stellar code, so the quality of the code is very average, but it is a workable solution. Everything is developed with NodeJs, TypeScript, Microsoft Cognitive Api, and MongoDb, and it doesn’t have any unit tests. I may refactor this project someday, but for the moment, let’s just get our head around how to do it.
I’ll write several posts about this project. In fact, at the moment I am writing this article, I am only halfway through the first phase, which is analyzing a small subset of my pictures. This article will serve mostly as a description of what will be built.
The first thing we need to do is read a sample of all the images. Instead of scanning and analyzing my whole hard drive for pictures, I will analyze only pictures within a specific range of dates. As of today, I have 34,000 pictures taken since 2009 (since I met my wife), and in this population 2,000 have been marked with an underscore, which means I really like them. To keep the search set small and avoid analyzing for too long, I will only use the pictures with an underscore. Second, in these pictures, I can say that roughly 75% of the people are my wife, my daughter, or me. Hence, I will only try to identify these three people and mark everyone else as “unknown”. Third, I want to know the emotion and what is going on in each picture. This requires a third-party service, and I will use the Microsoft Azure Cognitive API. I’ll go into more detail about the API in a later article.
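The selection step described above can be sketched as a small walk over the folder structure: visit each year folder, keep only event folders whose date falls in the range, and collect the underscore-prefixed files. This is a rough sketch under my stated assumptions about the layout (`root/yyyy/yyyy-mm-dd-Event/_IMG1234.jpg`); the function name `collectUnderscorePictures` is mine, not from the project:

```typescript
import * as fs from "fs";
import * as path from "path";

// Collects the full paths of underscore-prefixed pictures whose event
// folder date falls between `from` and `to` (inclusive).
function collectUnderscorePictures(root: string, from: Date, to: Date): string[] {
  const results: string[] = [];
  for (const yearFolder of fs.readdirSync(root)) {
    const yearPath = path.join(root, yearFolder);
    if (!fs.statSync(yearPath).isDirectory()) {
      continue;
    }
    for (const eventFolder of fs.readdirSync(yearPath)) {
      // Event folders start with "yyyy-mm-dd-"
      const match = /^(\d{4})-(\d{2})-(\d{2})-/.exec(eventFolder);
      if (match === null) {
        continue;
      }
      const eventDate = new Date(Number(match[1]), Number(match[2]) - 1, Number(match[3]));
      if (eventDate < from || eventDate > to) {
        continue;
      }
      const eventPath = path.join(yearPath, eventFolder);
      for (const file of fs.readdirSync(eventPath)) {
        // The underscore prefix marks the pictures I really like
        if (file.startsWith("_")) {
          results.push(path.join(eventPath, file));
        }
      }
    }
  }
  return results;
}
```

A real version would probably want to be asynchronous and filter by image extension, but this synchronous walk is enough to build the 2,000-picture working set.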
Once the pictures are analyzed, the data will be stored in MongoDB, which is a JSON-based storage. This is great because the result of all the analysis will be in JSON format, and it will allow us to query the content to get results to display on the website. To simplify this project, the first milestone will be to scan the pictures and create one JSON file per underscore file inside a “metainfo” folder. The second milestone will be to hydrate MongoDB, and the third one to create a simple web application that queries and displays the results from MongoDB.
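The first milestone could look something like the sketch below: one JSON file per analyzed picture, written into the “metainfo” folder. The metadata shape here is an assumption of mine for illustration; the real fields will come from the Microsoft Azure Cognitive API results, and the function name `writeMetaInfo` is not from the project:

```typescript
import * as fs from "fs";
import * as path from "path";

// Assumed shape of the per-picture analysis result (illustrative only)
interface PictureMetadata {
  originalPath: string; // full path to the underscore picture
  people: string[];     // e.g. ["wife", "daughter", "unknown"]
  emotions: string[];   // e.g. ["happiness"]
  tags: string[];       // e.g. ["dress", "outdoor"]
}

// Writes one JSON file per picture into the metainfo folder and
// returns the path of the file that was written.
function writeMetaInfo(metaFolder: string, metadata: PictureMetadata): string {
  fs.mkdirSync(metaFolder, { recursive: true });
  // Name the JSON file after the picture file so the two stay linked
  const fileName = path.basename(metadata.originalPath) + ".json";
  const target = path.join(metaFolder, fileName);
  fs.writeFileSync(target, JSON.stringify(metadata, null, 2), "utf8");
  return target;
}
```

Keeping the intermediate results as plain JSON files makes the second milestone trivial: hydrating MongoDB becomes a matter of reading each file and inserting the parsed document into a collection.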
I’ll stop here for the moment. You can find the source code and follow the progress of this project in this GitHub repository: https://github.com/MrDesjardins/CognitiveImagesCollection