How normal am I?

Experience how «artificial intelligence» judges your face


Sources

Language Use

This project purposefully avoids using the term «Artificial Intelligence», since there is nothing intelligent about these systems. I prefer to use the terms «machine learning» or «statistics on steroids», but I've settled on «algorithms» here.

Machine Learning models

Almost all the machine learning models used were downloaded «pre-trained» from open source projects I found on Github. This was done to make a point: we often say we should improve biased and/or error-prone machine learning models, but the reality is that most organisations don't train their own models. Instead, they use third parties that supply machine learning services, and it's in the interest of these parties to keep their systems as generic and «one size fits all» as possible. And then you have the parties who just implement whatever they can get their hands on, and hope nobody asks difficult questions.

The beauty scoring model was found on Github (this or this one). The models that predict age, gender and facial expression/emotion are part of FaceApiJS, which forms the backbone of this project. Do note that its developer doesn't fully divulge which photos the models were trained on. Also, FaceApiJS is bad at detecting «Asian guys».

I actually trained the BMI prediction algorithm myself because I couldn't find any existing models that were small enough to use online. I downloaded all the BMI prediction projects I could find on Github, and was astonished to find some of them came with vast troves of photographs. I felt dirty using them, but also felt that revealing what was going on was more important. I've documented some of the dodgy things I discovered along the way in this blogpost.

Videos, screenshots and other visual material

The videos were mainly constructed out of public domain source material from pexels.com and Pixabay.com. To the photographers who so kindly shared their work: thank you all for your wonderful generosity!

Other direct sources were used under the artistic, journalistic and educational copyright exception.

Specific sources

Attractiveness

  • TikTok (front page, recorded with a screen recorder). Their practice of showing content from beautiful people is explained here. While I couldn't confirm whether TikTok does this algorithmically, Tinder has actually been on the record about this practice.
  • This academic research was the source of the beauty judgement interface. Another source was the SCUT-FBP dataset.

Age

  • Innovatrix offers surveillance systems that analyse the demographics of store visitors. The picture was downloaded from their website.
  • Tinder (front page, screenshot).

Gender

  • Check out this article if you want to understand why an algorithm that tries to sort people into just two categories can upset some people.

Body Mass Index (BMI)

  • The BMI prediction project that was created by researchers who work at Google can be found here (not anymore, they deleted it!). Although all signs pointed to this project being part of Google's practice, I sent an email to the makers to verify this, and got a quick response that it was a personal project. I then changed the video to say «researchers who work at Google» instead of «Google's health lab in India».
  • For more juicy details I refer you to the in-depth blogpost mentioned earlier.

Life expectancy

  • There are a lot of projects on Github that explore this idea, some of them as part of (Kaggle) contests set up by the insurance industry. After exploring how these worked, I created my own wishy-washy implementation. There is no machine learning involved here: it's just a lookup in a table of life expectancies per country and the average BMI in that country, followed by a calculation of how BMI might affect that. There is very little merit in this calculation, but then again that doesn't seem to be a requirement for calling yourself a data scientist.
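The lookup-and-adjust logic described above can be sketched in a few lines of Python. All numbers and the per-BMI-point penalty below are made up for illustration; the project's real tables and adjustment are not reproduced here:

```python
# Hypothetical example values, not the project's real data.
LIFE_EXPECTANCY = {"NL": 82.0, "US": 79.0}  # average life expectancy per country, in years
AVERAGE_BMI     = {"NL": 25.4, "US": 28.8}  # average BMI per country

def estimate_life_expectancy(country, bmi, years_per_bmi_point=0.3):
    """Naive estimate: start from the country's average life expectancy
    and shift it by how far the person's BMI is from the national average."""
    base = LIFE_EXPECTANCY[country]
    delta = bmi - AVERAGE_BMI[country]
    return base - years_per_bmi_point * delta

print(estimate_life_expectancy("NL", 25.4))  # at the national average: 82.0
```

A person exactly at the national average BMI simply gets the national average life expectancy back, which shows how little the «prediction» actually adds.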

Closer / Face print

  • Deepcamai.com is no longer online; perhaps it was removed after the scandal around Rite Aid. It can still be found using the Internet Archive's Wayback Machine. The Chinese parent company Deepcam has a Chinese website. There is also an Australian website. Wait, it turns out the USA company has rebranded itself as PDActive. The website is virtually identical to the old deepcamai.com website.
  • The visual effect that shows how the video feed is first turned into a mathematical representation based on contrast is built using a HOG descriptor. The code was modified to work in the browser.
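For the curious, the core of a HOG descriptor (per-cell histograms of gradient orientations, weighted by gradient magnitude) can be sketched in plain numpy. This is a simplified illustration, not the project's actual browser code, and it omits the block normalisation a full HOG pipeline would add:

```python
import numpy as np

def hog_cell_histograms(img, cell=8, bins=9):
    """Per-cell gradient-orientation histograms: the core of HOG."""
    gy, gx = np.gradient(img.astype(float))       # contrast: vertical/horizontal gradients
    mag = np.hypot(gx, gy)                        # gradient magnitude
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180    # unsigned orientation, 0..180 degrees
    h, w = img.shape
    ch, cw = h // cell, w // cell
    bin_idx = np.minimum((ang / (180 / bins)).astype(int), bins - 1)
    hist = np.zeros((ch, cw, bins))
    for i in range(ch):                           # accumulate magnitude per orientation bin
        for j in range(cw):
            ys = slice(i * cell, (i + 1) * cell)
            xs = slice(j * cell, (j + 1) * cell)
            for b in range(bins):
                hist[i, j, b] = mag[ys, xs][bin_idx[ys, xs] == b].sum()
    return hist

# Toy 32x32 image with a purely horizontal brightness ramp.
img = np.tile(np.arange(32), (32, 1))
h = hog_cell_histograms(img)
print(h.shape)  # (4, 4, 9): 4x4 cells, 9 orientation bins each
```

Each face image is thus reduced to a grid of small orientation histograms, which is the «mathematical representation based on contrast» the visual effect shows.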

Emotions

Surfing behaviour

  • Visual Website Optimizer (video source).
  • HotJar. Learn more about the screen recording feature here.
  • Mouse movement was recorded using a small script called Wix client recorder (MIT license). As always, this runs in your own browser: the recording never leaves your computer, and will be gone as soon as you close the browser window.

Tip: you can protect yourself from these practices by installing browser addons such as uBlock Origin and uMatrix. Also check out Privacy Badger, HTTPS Everywhere, and Decentraleyes.

Conclusion

  • Jon Penney researched how Wikipedia was used after the Snowden revelations, and noticed that pages about sensitive topics such as terrorism were visited less. Later research strengthened the notion that this was caused by self-censorship.
  • You may also enjoy another work I made: SocialCooling.com.

EU funded

This project has received funding from the European Union's Horizon 2020 research and innovation programme, under grant agreement No 786641.