Similarity and Dissimilarity – in Data Science and in Life
A Letter from Dr. Ashirbani Saha
Dr. Saha is the first holder of the BRIGHT Run Breast Cancer Learning Health System Chair, a permanent research position established by the BRIGHT Run in partnership with McMaster University.
Hello BRIGHT Run Family,
Hope you had a peaceful holiday with your friends and family.
As for me, I stayed in Hamilton. I connected virtually with some friends I hadn’t talked to in a while and spent time with my family, which includes my family in India as well.
This seasonal holiday in Hamilton (and in other parts of Canada) reminds me of a particular holiday season in West Bengal, my home state in India (a state there is similar to a province in Canada). During that season, sometime in autumn, we celebrate the festival of ‘Durga Puja’ with a lot of pomp and cheer and, of course, with food! The feeling of festivity that permeates these two regions during the holidays is similar.
That brings us to the concept of similarity. In the world of data science, similarity or dissimilarity plays a very important role when comparing datapoints.
As an example, you can consider the globe as a collection of locations, each represented by three variables: latitude, longitude, and altitude (height above sea level). Two locations (or datapoints) on the globe can be compared using a measure of dissimilarity, for example the distance (a mathematical function) between those points. A point in my data-world might need to be expressed by more than three variables. In addition, I might need a different function, one specific or relevant to the problem, to calculate the dissimilarity/similarity between such points.
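For readers who like to see such ideas in code, here is a minimal sketch in Python of two dissimilarity functions. It is only an illustration under stated assumptions: the function names are hypothetical, the Earth is treated as a perfect sphere, altitude is ignored, and the coordinates for Hamilton and Kolkata are approximate values chosen for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumes a spherical Earth)


def great_circle_km(lat1, lon1, lat2, lon2):
    """Dissimilarity between two locations: haversine distance in km.

    Inputs are in degrees. Altitude is ignored in this sketch.
    """
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    # Clamp to 1.0 to guard against tiny floating-point overshoot.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def euclidean(a, b):
    """Dissimilarity between two datapoints with any number of variables."""
    if len(a) != len(b):
        raise ValueError("datapoints must have the same number of variables")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Approximate coordinates, for illustration only.
hamilton = (43.26, -79.87)
kolkata = (22.57, 88.36)
print(round(great_circle_km(*hamilton, *kolkata)), "km apart (approx.)")

# Two made-up datapoints described by five variables each.
print(euclidean((1, 0, 2, 5, 3), (2, 0, 1, 5, 1)))
```

A smaller value means the two points are more similar. Which function is appropriate depends on what the variables represent, which is exactly why the choice of function matters.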
What is important here is the quantification of similarity/dissimilarity for the data-related problem at hand. Many a time, it takes a great deal of scientific and technological rigour to quantify similarity, e.g., in Google image search, which can retrieve images similar to the ones you have.
As for the new year, I hope it goes very differently (hence, dissimilar) from the past two years in terms of the pandemic effects that we have all been experiencing. I hope it gets closer (hence, similar) to our happiest and most blissful moments of the past. Of course, we do not mind dissimilarity if we are happier than ever before, right?
Happy New Year 2022!