009 For the Love of Algorithms – For the Love of Data

Worst Pun Ever: Today, we are talking about Al… Al Gore… Al Gore Rhythms… Algorithms!

Definition: a step-by-step procedure for solving a problem or accomplishing some end, especially by a computer.[1]
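To make the definition concrete, here is a minimal sketch of one of the oldest known algorithms, Euclid's method for finding the greatest common divisor of two whole numbers. It was not covered in the episode; it is only meant to show what "step-by-step procedure" looks like in code.

```python
def gcd(a: int, b: int) -> int:
    """Greatest common divisor via Euclid's algorithm."""
    # Step 1: if b is zero, a is the answer.
    # Step 2: otherwise replace (a, b) with (b, a mod b) and repeat.
    while b != 0:
        a, b = b, a % b
    return a


print(gcd(48, 18))  # 6
```

The same fixed steps work for any pair of non-negative integers, which is what separates an algorithm from a one-off calculation.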

Inputs:

  • Many algorithms use census data or FICO scores as one of their prime inputs
  • Plus any custom information you give a website
  • Plus any information they glean about you from other sites (when you visit a site with a Facebook Share button, Facebook can track that you’re there)[17]
  • Websites are constantly looking for ways to break our anonymity (for example, browser fingerprinting) so they can track us and serve more relevant or lucrative ads.[5]
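The fingerprinting idea above can be sketched in a few lines. This is a simplified illustration, not how any particular tracker works: the attribute names (`user_agent`, `screen`, `timezone`, `fonts`) and the `fingerprint` helper are assumptions made up for the example. The point is only that a handful of ordinary browser details, combined and hashed, can act as an identifier without any cookie.

```python
import hashlib


def fingerprint(attributes: dict[str, str]) -> str:
    """Combine browser attributes into one stable identifier (hypothetical helper)."""
    # Sort keys so the same attributes always produce the same string.
    canonical = "|".join(f"{key}={attributes[key]}" for key in sorted(attributes))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


# Invented example values; real trackers may collect many more signals.
visitor = {
    "user_agent": "ExampleBrowser/1.0",
    "screen": "1920x1080",
    "timezone": "UTC-5",
    "fonts": "Arial,Helvetica,Times",
}
same_visitor_reordered = dict(reversed(list(visitor.items())))
other_visitor = {**visitor, "screen": "1366x768"}

print(fingerprint(visitor) == fingerprint(same_visitor_reordered))  # True
print(fingerprint(visitor) == fingerprint(other_visitor))  # False
```

The more attributes that go into the hash, the fewer people share your exact combination, which is why clearing cookies alone does not restore anonymity.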

Fun Stuff

  • Chess – algorithms are so good that humans haven’t been able to beat a 4-CPU PC since about 2005.[6]
  • Rubik’s Cube – the machine record is 0.887 seconds vs. just over 5 seconds for a human.[7]
  • Poker – scientists have essentially solved Heads-Up Limit Hold ‘Em, a game with roughly 3.16 × 10^17 possible states. You may be able to win individual games, but it is HIGHLY unlikely that you can win over time.[8]

Machine Learning

  • Can include intentional or unintentional bias.
  • @JonathonMorgan did a post on Medium and a podcast on Partially Derivative about using a machine learning model to find alt-right white supremacists on Twitter and track their degree of radicalization over time. He did this by training a model with their tweets and analyzing their usage of words like “Jewish” vs. more mainstream usage.[3,4]
– From Medium / Jonathon Morgan’s Post

Pricing [2]:

  • Amazon ranks its own listings above competitors’, even when its total price (including shipping for non-Prime customers) is higher; however, it claims its algorithm is customer-centric.
  • The Princeton Review charged between $6,600 and $8,400 for its online course depending on zip code. Prices were higher in zip codes with higher incomes and in some with larger Asian populations.

News/Search:

  • Link analysis, which measures how entities relate to each other, is used by Google’s PageRank, Facebook’s News Feed, and LinkedIn’s job/connection recommendations. It was developed in 1976 and first used by two other search indexes before Google began using it in 1998.[13]
  • However, algorithms cause sites to cater to information similar to your preexisting views, or to what they think you will find interesting, rather than presenting balanced, holistic content.[14]
    • Medium recommends articles based on how long it thinks you will read.
    • Some sites tailor related content, content types (video, etc.), and sharing buttons based on where you enter their site from.
    • These choices and filters can trap you in a content bubble that steers you toward more and more specific, and sometimes extreme, viewpoints.
  • Facebook uses hundreds of features, or input variables, when assigning a relevancy score to posts you see in your News Feed.[15]
  • When you have a Facebook account and you visit a page that has a Like or Share button, Facebook can log your visit and use that to tailor content or ads when you visit its site.[16] See [17] for a relatively up-to-date list of features used in Facebook’s News Feed algorithm (time spent viewing, priority for friends’ posts, likes/reactions, etc. are all key inputs).
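As a rough illustration of the link analysis idea behind PageRank mentioned above, here is a simplified textbook-style sketch. It is not Google's production system. The `pagerank` function, the four-page `web` graph, and the iteration count are assumptions for the example; the damping factor of 0.85 is the commonly cited value. The sketch also assumes every linked page appears as a key in the dictionary.

```python
def pagerank(
    links: dict[str, list[str]], damping: float = 0.85, iterations: int = 50
) -> dict[str, float]:
    """Simplified PageRank by repeated redistribution of rank along links."""
    pages = list(links)
    n = len(pages)
    ranks = {page: 1.0 / n for page in pages}  # start with equal rank
    for _ in range(iterations):
        # Every page gets a small baseline share regardless of links.
        new_ranks = {page: (1.0 - damping) / n for page in pages}
        for page, outgoing in links.items():
            # A page with no outgoing links spreads its rank over all pages.
            targets = outgoing if outgoing else pages
            share = damping * ranks[page] / len(targets)
            for target in targets:
                new_ranks[target] += share
        ranks = new_ranks
    return ranks


# Invented toy web: each key links to the pages in its list.
web = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(web)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(score, 3))
```

In this toy graph, page C ends up ranked highest because three other pages link to it, and page D ends up lowest because nothing links to it. That is the core intuition: a page matters more when pages that matter link to it.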

Serious Consequences

  • Some algorithms for car insurance weight FICO credit scores more heavily than drunk driving convictions.[9,10,11]
  • Cathy O’Neil calls them “Weapons of Math Destruction” (WMDs) if they are widespread, secretive, and have the potential to do great harm.[9,10]
  • Kronos, a big data HR company hired by large firms to screen applicants, employs a personality test as part of its screening of candidates. Some argue that this unfairly excludes candidates from jobs, with no explanation of the reason, in a manner that violates the Americans with Disabilities Act (ADA).[9,10,12]
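To see how the car insurance point above can play out, here is a toy pricing sketch. Every number in it is invented for illustration; none comes from a real insurer or from the sources cited here. The names `BASE_PREMIUM`, `CREDIT_SURCHARGE`, `DUI_SURCHARGE`, and `annual_premium` are likewise hypothetical. It only shows how the choice of weights decides who pays more.

```python
# All weights and dollar amounts below are invented for illustration only.
BASE_PREMIUM = 1000.0
CREDIT_SURCHARGE = {"excellent": 0.0, "good": 0.10, "poor": 0.60}
DUI_SURCHARGE = 0.40


def annual_premium(credit_tier: str, has_dui: bool) -> float:
    """Price a policy from a credit tier and a drunk driving flag."""
    multiplier = 1.0 + CREDIT_SURCHARGE[credit_tier]
    if has_dui:
        multiplier += DUI_SURCHARGE
    return round(BASE_PREMIUM * multiplier, 2)


safe_driver_poor_credit = annual_premium("poor", has_dui=False)
dui_driver_excellent_credit = annual_premium("excellent", has_dui=True)
print(safe_driver_poor_credit, dui_driver_excellent_credit)
```

With these made-up weights, the driver with a clean record and poor credit pays more than the driver with a drunk driving conviction and excellent credit. Because the weights are hidden from customers, neither driver can tell why their quote is what it is.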

Theme Music: Algorithm of Desire by Measles Mumbs Rubella, courtesy of FreeMusicArchive.

Sources:

  1. http://www.merriam-webster.com/dictionary/algorithm
  2. https://www.propublica.org/article/breaking-the-black-box-when-algorithms-decide-what-you-pay
  3. http://partiallyderivative.com/podcast/2016/09/27/s2e14-the-model-is-racist
  4. https://medium.com/@jonathonmorgan/the-radical-right-and-the-threat-of-violence-f66288ac8c4#.kssqef9jz
  5. http://fivethirtyeight.com/features/internet-tracking-has-moved-beyond-cookies/
  6. http://www.extremetech.com/extreme/196554-a-new-computer-chess-champion-is-crowned-and-the-continued-demise-of-human-grandmasters
  7. http://gizmodo.com/in-just-0-887-seconds-another-machine-has-already-shatt-1758009774
  8. http://bigthink.com/ideafeed/computer-scientists-create-unbeatable-poker-playing-computer
  9. http://fivethirtyeight.com/features/whos-accountable-when-an-algorithm-makes-a-bad-decision/
  10. https://weaponsofmathdestructionbook.com/
  11. https://www.wired.com/2016/10/big-data-algorithms-manipulating-us/
  12. https://www.theguardian.com/science/2016/sep/01/how-algorithms-rule-our-working-lives
  13. https://medium.com/@_marcos_otero/the-real-10-algorithms-that-dominate-our-world-e95fa9f16c04#.8mczwtxzt
  14. http://www.cjr.org/news_literacy/algorithms_filter_bubble.php
  15. http://www.slate.com/articles/technology/cover_story/2016/01/how_facebook_s_news_feed_algorithm_works.html
  16. https://www.technologyreview.com/s/541351/facebooks-like-buttons-will-soon-track-your-web-browsing-to-target-ads/
  17. https://blog.bufferapp.com/facebook-news-feed-algorithm