The Ethics of Data Collection and Data Privacy or the Wild West of Consumer Data

Francia Riesco
May 29, 2022

Like many people, I wake up in the morning, pick up my cellphone, and open Twitter to check whether anything out of the ordinary happened while I was sleeping. Afterward, I open my computer and check my email to make sure there are no last-minute requests that I need to complete that morning. Since the pandemic, I have worked from home, so I don’t commute, and I have more time to be on the internet than ever before. During the day, I switch between a laptop, an iPad, and a cellphone, and I quickly notice that each device knows what I was searching for, reading, or looking at on the previous device. I know why: I browse using Google products on all three devices with the same account. My Google account knows where I am, how long I travel, and many more things that I am not even aware of. This level of tracking and cross-referencing between devices is not new, but it was not always this invasive. The collection of user data has evolved over the years, triangulating music preferences, favorite colors, and food choices with our location and political views, among other things. For that reason, consumers, tech companies, governments, and data scientists need to stand and work together so that data collection and data mining can improve products and services without violating anyone’s privacy.

A Brief Story About Tracking on the Internet

The tracking of consumers on the web started in the 90s, when the internet became popular in many households. At that point, the newborn eCommerce websites needed to distinguish between consumers and identify what they were looking for on their sites. That is how HTTP cookies were created. This simple technology allowed websites to assign a unique cookie to each customer in order to store their session and know who they were and what they were browsing. There was nothing fancy about HTTP cookies; they were issued by each site and could not be shared between sites. But that was not the end of the story; tech companies identified the millions of possibilities for creating a personalized experience based on user habits across different sites (The Evolution of the Internet, Lab, n.d.). As a software developer in the early 2000s, I saw and coded this kind of ad. For example, a third-party company wanted its ads on our site. Therefore, we passed basic information about our customers to its ad system, such as gender, age, and location, to name a few tags; in return, the ad system showed an ad targeted at that specific customer group. Similarly, we used the same ad services to create targeted ads that brought traffic to our site. As simple as it started, this evolved into permanent tracking of every action of the customer on the internet, and then into a deeper invasion in the name of a personalized internet experience.
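For readers who have never seen one, here is a minimal sketch of the first-party session cookie described above, using only Python's standard library. The cookie name `session_id` and both helper functions are my own illustrative assumptions, not code from any real site.

```python
from http.cookies import SimpleCookie
import uuid


def issue_session_cookie() -> str:
    """Build a Set-Cookie header value carrying a new random session ID.

    The cookie name "session_id" is a hypothetical choice for this sketch.
    """
    cookie = SimpleCookie()
    cookie["session_id"] = uuid.uuid4().hex   # unique value per visitor
    cookie["session_id"]["path"] = "/"        # valid for the whole site
    cookie["session_id"]["httponly"] = True   # hidden from page scripts
    return cookie["session_id"].OutputString()


def read_session_id(cookie_header: str):
    """Extract the session ID from a Cookie request header, or None."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("session_id")
    return morsel.value if morsel is not None else None
```

The site sends the result of `issue_session_cookie()` once in a `Set-Cookie` response header, and the browser returns the value on every later request to that same site, which is how the site recognizes a returning visitor. Because the browser only sends the cookie back to the site that issued it, this mechanism alone does not share anything between sites.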

The Loss of Our Innocence and Our Privacy

Everything was fine when personalized advertising started, but the technology grew. The digital footprint that consumers leave on the internet multiplied with the arrival of new services and the way users spread their activity across multiple devices. Free services, online games, and social media became popular and are used by almost everybody, from teens to grandparents, and all these services need to monetize their products. To handle the immense volume of data being created, the technology for collecting, transferring, and storing this information became faster, more accurate, and more sophisticated. All these investments needed to bring in revenue, so companies started to develop more complex algorithms, not only for targeted ads but also for political purposes, among other invasive targeting approaches. At this point, collecting consumers’ browsing activity was no longer sufficient. Therefore, tech companies moved to more intrusive techniques, using audio devices and recorded conversations to build targeting features. As if that were not enough, the companies that saw value in collecting customer data to provide a personalized experience decided to sell this information and generate extra income from what they had collected. With privacy laws either missing or full of loopholes, customers were left helpless in this wild west of personal data collection.

The Champions of Data Privacy

In this rampage of data collection and privacy loopholes, consumers’ data can be collected and passed between multiple companies without the user’s consent. Companies that originally collect data from their consumers are able to sell it to third-party companies that do not even operate under the same terms and conditions that the consumers agreed to in the beginning. Although these practices are questionable, consumers are not always protected. Sadly, protection depends on where the customers live and where the data is collected. This means customers’ data fall under different laws, and each company has different privacy terms and conditions. Consequently, the ethics of data handling rest on the good intentions of the people who do the data collection and data mining.

Let’s start with the privacy laws that protect consumers in the US. For instance, there is no standardized national law that forces companies to notify their clients of a data breach. Also, companies can sell their customers’ private data to third parties, and these companies can resell the data again without asking for consent or even informing you. Additionally, the US has several data privacy laws, each covering a different type of data, such as the Health Insurance Portability and Accountability Act (HIPAA) for patient and doctor confidentiality and the Fair Credit Reporting Act (FCRA), among others. That patchwork is why the US has several loopholes that allow companies to use their customers’ data without scruple. In contrast, the European Union has a robust privacy law called the General Data Protection Regulation (GDPR). The GDPR forces companies to obtain formal permission before sharing any information about their clients, and individuals have the right to request access to and deletion of their data (The State of Consumer Data Privacy Laws in the US, 2021).

With all the ambiguities in data protection laws, data engineers, data scientists, and software engineers have a moral and ethical duty to advocate for better laws that protect customers and their privacy, and to ensure that the data we gather is appropriately handled. To illustrate, any company that collects and manages users’ information should explicitly request consent to store and use personal information. All data collected should be protected both in transit and at rest. This means end-to-end encryption, secure storage, and limited access to reduce the risk of a data breach. If the data is sold to third parties, it should already be stripped of any personal information that would allow the user to be identified. We need to stand up and be the champions for data privacy and promote laws that protect users’ privacy. This will allow the future of data science to be accurate, safe, and trustworthy.
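To make the idea of cleaning data before it leaves the company more concrete, here is a minimal sketch in Python. The field names (`name`, `email`, `user_id`, `age`) and the helper functions are hypothetical, chosen only for illustration. Note also that this is pseudonymization, not full anonymization: the remaining attributes can still re-identify a person when combined with other datasets, so a real pipeline would need a proper privacy review.

```python
import hashlib
import hmac

# Hypothetical names of fields that identify a person directly.
DIRECT_IDENTIFIERS = {"name", "email", "phone", "address"}


def pseudonymize_id(user_id: str, secret_key: bytes) -> str:
    """Replace an ID with a keyed hash; without the key it cannot be recomputed."""
    return hmac.new(secret_key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()


def age_band(age: int) -> str:
    """Generalize an exact age into a ten-year band, e.g. 34 becomes '30-39'."""
    low = (age // 10) * 10
    return f"{low}-{low + 9}"


def scrub_record(record: dict, secret_key: bytes) -> dict:
    """Return a copy of the record that is safer to share with a third party."""
    # Drop the fields that identify the user directly.
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    # Keep records linkable to each other without exposing the real ID.
    if "user_id" in cleaned:
        cleaned["user_id"] = pseudonymize_id(str(cleaned["user_id"]), secret_key)
    # Coarsen quasi-identifiers instead of passing exact values.
    if "age" in cleaned:
        cleaned["age"] = age_band(int(cleaned["age"]))
    return cleaned
```

A record such as `{"user_id": "u1", "name": "Ana", "email": "ana@example.org", "age": 34, "favorite_genre": "jazz"}` would leave the company with no name or email, an age band instead of an exact age, and an opaque identifier in place of `u1`. The secret key stays with the original collector, so the buyer cannot reverse the identifier.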

Conclusion

There can be no doubt that data mining for business has changed our consumer experience for the better. Getting a tailored experience for our everyday needs simplifies our daily routine and saves us money and time. But this entails giving up some of our privacy and being subject to deeper surveillance, which must be controlled, limited, and audited. For that reason, consumers, companies, and governments have to work together to protect the data and use it for the welfare of everybody on the internet.

StarTrek Data Meme May 2022

References

Accuracy measures for a forecast model — Accuracy. (n.d.). Retrieved May 9, 2022, from https://pkg.robjhyndman.com/forecast/reference/accuracy.html

Become A Data Privacy Week Champion. (n.d.). Stay Safe Online. Retrieved May 13, 2022, from https://staysafeonline.org/data-privacy-week/become-dpw-champion/

Carpenter, A. (2020, August 10). The Ethics of Data Collection. Medium. https://towardsdatascience.com/the-ethics-of-data-collection-9573dc0ae240

Cate, F. H. (n.d.). Government Data Mining: The Need for a Legal Framework. Harvard Civil Rights, 43, 57.

Cross-site tracking — Read our definition on the tea house. (n.d.). The Tea House by Fifty-Five. Retrieved May 11, 2022, from https://teahouse.fifty-five.com/en/glossary/cross-site-tracking/

Data protection. (n.d.). [Text]. European Commission — European Commission. Retrieved May 13, 2022, from https://ec.europa.eu/info/law/law-topic/data-protection_en

Data Protection and Privacy: 12 Ways to Protect User Data. (n.d.). Cloudian. Retrieved May 12, 2022, from https://cloudian.com/guides/data-protection/data-protection-and-privacy-7-ways-to-protect-user-data/

Data protection in the EU. (n.d.). [Text]. European Commission — European Commission. Retrieved May 13, 2022, from https://ec.europa.eu/info/law/law-topic/data-protection/data-protection-eu_en

DataPlanet — What Are My Ethical and Legal Responsibilities in Using Data and Statistics. (n.d.). Retrieved May 11, 2022, from https://dataplanet.sagepub.com/data-basics/what-are-my-responsibilities-in-using-data-and-statistics

DuckDuckGo. (2022). In Wikipedia. https://en.wikipedia.org/w/index.php?title=DuckDuckGo&oldid=1086836111

Duhigg, C. (2012, February 16). How Companies Learn Your Secrets. The New York Times. https://www.nytimes.com/2012/02/19/magazine/shopping-habits.html

Google won’t read Gmail emails anymore for advertisement — GHacks Tech News. (2017, June 23). GHacks Technology News. https://www.ghacks.net/2017/06/23/google-wont-read-gmail-emails-anymore-for-advertisement/

Google-run’awayer. (2021, November 8). 7 Best Private Search Engines that won’t track you like Google does. Comparitech. https://www.comparitech.com/blog/vpn-privacy/best-private-search-engine/

Haselton, T. (2017, December 6). How to find out what Google knows about you and limit the data it collects. CNBC. https://www.cnbc.com/2017/11/20/what-does-google-know-about-me.html

Helft, M., & Vega, T. (2010, August 30). Retargeting Ads Follow Surfers to Other Sites. The New York Times. https://www.nytimes.com/2010/08/30/technology/30adstalk.html

How to Notify Your Customers of Your Privacy Practices, and What Not to Do. (2017, September 9). Woopra. https://www.woopra.com/blog/how-to-notify-your-customers-of-your-privacy-practices-and-what-not-to-do

Kantor, J. (2014, August 13). Working Anything but 9 to 5. The New York Times. https://www.nytimes.com/interactive/2014/08/13/us/starbucks-workers-scheduling-hours.html

Legal and Ethical Issues in Obtaining and Sharing Information. (n.d.). Morris Manning & Martin, LLP. Retrieved May 13, 2022, from https://www.mmmlaw.com/media/legal-and-ethical-issues-in-obtaining-and-sharing-information/

Nield, D. (n.d.). All the Ways Google Tracks You — And How to Stop It. Wired. Retrieved May 11, 2022, from https://www.wired.com/story/google-tracks-you-privacy/

Optimizing Your Ads for Google’s Mobile Search Pages. (2019, August 8). Metric Theory. https://metrictheory.com/blog/optimizing-your-ads-for-googles-mobile-search-pages/

Schofield, J. (2018, April 19). What’s the best email service that doesn’t scan emails for ad-targeting? The Guardian. https://www.theguardian.com/technology/askjack/2018/apr/19/whats-the-best-email-service-that-doesnt-scan-emails-for-ad-targeting

The Evolution of the Internet, Identity, Privacy and Tracking — How Cookies and Tracking Exploded, and Why We Need New Standards for Consumer Privacy — IAB Tech Lab. (n.d.). Retrieved May 11, 2022, from https://iabtechlab.com/blog/evolution-of-internet-identity-privacy-tracking/

The State of Consumer Data Privacy Laws in the US (And Why It Matters). (2021, September 6). Wirecutter: Reviews for the Real World. https://www.nytimes.com/wirecutter/blog/state-of-privacy-laws-in-us/

Title Capitalization Tool — Capitalize My Title — Title Case Tool. (n.d.). Capitalize My Title. Retrieved May 11, 2022, from https://capitalizemytitle.com/

What is Consumer Privacy and Which Laws Protect It? (n.d.). SearchDataManagement. Retrieved May 13, 2022, from https://www.techtarget.com/searchdatamanagement/definition/consumer-privacy

What is Data in Transit and Data at Rest. (n.d.). Retrieved May 12, 2022, from https://www.quest-technology-group.com/academy/what-is-data-in-transit-vs-data-at-rest

What is End-to-End Encryption. (n.d.). Retrieved May 12, 2022, from https://www.quest-technology-group.com/academy/what-is-end-to-end-encryption

What Is SEO / Search Engine Optimization? (n.d.). Search Engine Land. Retrieved May 12, 2022, from https://searchengineland.com/guide/what-is-seo

What Search Engines Don’t Track You? (2021, October 20). Brave Browser. https://brave.com/learn/no-tracking-search-engine/

(N.d.-a). Retrieved May 11, 2022, from https://www.metarouter.io/blog-posts/the-ethics-of-collecting-consumer-data

Francia Riesco

Software engineer. Interested in Data Science, Cosmology, and Computational Astrophysics. MLA Harvard, PhD. (c) CSU.