Ads and Data Analysis

Whether or not our devices are listening to us is a pretty big concern, and ads are usually offered as evidence. Warrantless wiretapping is as old as the Patriot Act, and always-on devices can catch private details, but that’s a sticky topic I do not want to address. I did once have a coworker, though, who swore by VPNs for everything and almost refused to give his fingerprint scan to punch in at work. He was afraid that someday the cops might use that fingerprint to track him down. While that is theoretically possible, there would likely be other ways to catch him.

We tend to overestimate the capabilities of algorithms and technology. One common fallacy is that if a computer is powerful enough, no amount of data is too large for it to handle quickly. While strong hardware is certainly needed to get through large amounts of data, optimized algorithms are required as well. Even supercomputers can take an unreasonable amount of time searching through large amounts of data with the wrong algorithm. Furthermore, if the data is not preprocessed, the findings could be complete gibberish.
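To make the point about algorithms concrete, here is a minimal sketch that counts how many comparisons two search strategies need on the same sorted list. The function names and the list of one million integers are my own assumptions for illustration, not anything a real ad platform uses.

```python
def linear_search_steps(items, target):
    """Count comparisons made by a front-to-back scan."""
    steps = 0
    for value in items:
        steps += 1
        if value == target:
            break
    return steps


def binary_search_steps(sorted_items, target):
    """Count comparisons made by binary search on a sorted list."""
    steps = 0
    lo, hi = 0, len(sorted_items) - 1
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if sorted_items[mid] == target:
            break
        if sorted_items[mid] < target:
            lo = mid + 1   # target can only be in the upper half
        else:
            hi = mid - 1   # target can only be in the lower half
    return steps


# Hypothetical data set: one million sorted integers, searching for the last one.
data = list(range(1_000_000))
target = 999_999

linear = linear_search_steps(data, target)
binary = binary_search_steps(data, target)
print(f"linear scan: {linear} comparisons")
print(f"binary search: {binary} comparisons")
```

The linear scan has to look at every one of the million items, while the binary search needs at most about twenty comparisons. Same hardware, same data, very different cost. Note that the binary search only works because the data was sorted first, which is a small example of why preprocessing matters.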

No doubt you’ve noticed an ad that was relevant to a conversation at some point in your life. But have you noticed all the ads that were irrelevant? I remember one story about a guy who went on a trip to New Zealand, and six months after he returned he was still getting ads relevant to that trip. An algorithm was able to determine his location at one point in time, but was slow to pick up his new location. There are a lot of reasons that can happen, including the cost of retraining models.

The algorithms behind ads are computational and mathematical models meant to find trends in your day-to-day life. Marketers cannot afford to assign someone to watch your every movement. Instead, they try to collect as much data as they can, figure out what’s relevant to them, and use a model to determine how best to catch your attention. Sometimes it’s creepy how effective that can be, and often it’s laughable or annoying how ineffective it can be. Unfortunately, correlations are not always obvious either.

In his book The Rise of Big Data Policing, Andrew Ferguson mentions a correlation that people buy Pop-Tarts before major storms. To anyone able to think critically, that’s a silly correlation that tells us nothing at all. However, computers are not capable of critical thinking as we are, and they just regurgitate their findings. Whether or not that could change is probably a philosophical topic, and according to an attractive woman I met once, I’m training Skynet. I thought she would think it was neat that I programmed Descartes’ argument concerning God or something in Prolog. She didn’t. Lesson learned.
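Here is a rough sketch of what "regurgitating findings" looks like in practice. The numbers are completely made up by me for illustration (they are not from Ferguson’s book or any real retailer), and the `pearson` helper is just the textbook correlation formula written out by hand.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Invented toy data for one week: 1 means a storm warning was issued that day.
storm_warning = [0, 0, 1, 0, 1, 1, 0]
# Invented toy data: boxes of Pop-Tarts sold on the same days.
pop_tart_sales = [20, 22, 55, 19, 60, 58, 21]

r = pearson(storm_warning, pop_tart_sales)
print(f"correlation: {r:.3f}")
```

The computer happily reports a correlation very close to 1. It has no idea why the two columns move together, or whether the relationship means anything at all. That judgment is still left to a person.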

Anyway, unless you’re doing something illegal near Alexa, you’re probably okay. Apriori systems do not benefit from tons of information, and computers do have limitations. Marketers can actually reach the largest possible audience with minimal effort by anonymizing your data. I understand that data breaches sometimes happen, and that’s due to primary sources poorly handling sensitive information such as PFI (Personal Financial Information), PII (Personally Identifiable Information), and PHI (Protected Health Information). While these things seem connected, beneath the surface the connection is not nearly as strong as it seems.
