Disengagement Data

Data analytics. Data-driven. Big data. Data mining. Data, data, data. It’s the buzzword these days in the publishing industry, and for good reason. All our data is being collected, regardless of whether we’re aware of it or not, whether through the big three (Facebook, Amazon, and Google), loyalty cards at the grocery store, or apps that track our fitness. In the Canadian book market, BookNet helps the industry by giving publishers consumer data, metadata from other publishers, and more. It would be silly for a publisher not to capitalize on this wealth of information to try to sell more books and survive in a tough market.

There’s so much data to scan through and collect. It’s important to identify exactly what would be beneficial for you as a publisher and how you can use that data to improve your services. Personally, if I were a publisher, I would want disengagement data. Specifically, I would want data telling me at which sections of the text readers started to disengage. I think this would be an especially useful tool in educational publishing.

Educational publishers provide students with textbooks, course packs, non-fiction books, educational picture books, etc. If I could get data on when students start to lose focus, skim over passages, get frustrated, or simply lose interest, I could then hopefully make the learning experience much better. The process of taking complex subjects and translating them for a lay audience can be quite challenging. I saw this issue time and time again in my undergraduate lectures. I had super smart professors who were highly specialized in their fields. However, when it came to deconstructing the material and explaining it to students in a simple manner, many of them did not do a good job. We would leave lectures feeling confused and frustrated. We would then turn to textbooks or other reading material that would also fail to help us understand. Sometimes professors can’t be helped. But I think books can be improved, especially because there’s a team of people working on them.

Knowing disengagement data can help publishers, editors, and writers improve their work. In future editions, visuals can be added, paragraphs can be rewritten, chapters can be restructured, and supplemental resources can be offered. This data can also be offered to educators, who can see where students are losing touch and modify lesson plans to address these issues. I’m a big believer that anyone can learn anything if it’s taught properly. Over the past year, I’ve heard many of my peers say they hate numbers or they’re not good at math. I don’t buy it. I think everyone could be good at math. They just need the right learning tools and methods that are suitable for them.

To collect this data in a non-intrusive way, I think the most straightforward approach would be to ask students. When they buy a print or digital textbook, perhaps they could be given the option to highlight or mark up pages or passages that are confusing to them. They could offer suggestions about what else they’d like to see: maybe more definitions, maybe more diagrams. This would also make the learning process more dynamic, instead of running in one direction from teacher or book to student.
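To make the idea of student mark-ups a little more concrete, here is a minimal sketch of how a publisher might tally them. Everything in it is an assumption for illustration: the (student, passage) export format, the passage IDs, the function names, and the 30% threshold are hypothetical and do not come from any real e-book platform.

```python
from collections import Counter


def confusion_rates(marks, readers_per_passage):
    """Return the share of readers who flagged each passage as confusing.

    marks: iterable of (student_id, passage_id) pairs (hypothetical export).
    readers_per_passage: dict mapping passage_id to the number of readers.
    """
    flagged = Counter()
    seen = set()
    for student_id, passage_id in marks:
        # Count each student at most once per passage.
        if (student_id, passage_id) not in seen:
            seen.add((student_id, passage_id))
            flagged[passage_id] += 1
    return {
        passage: flagged[passage] / readers
        for passage, readers in readers_per_passage.items()
        if readers > 0
    }


def most_confusing(rates, threshold=0.3):
    """List passages flagged by at least `threshold` of readers, worst first."""
    hits = [passage for passage, rate in rates.items() if rate >= threshold]
    return sorted(hits, key=lambda passage: rates[passage], reverse=True)
```

An editor could run something like this per edition and start revisions with the passages at the top of the list. It says nothing about why a passage is confusing, which would still need the students' own suggestions.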

The other option would be to tell students that their learning process is being tracked as they go through the book. For example, a digital e-book can inform students at the beginning that their reading process is being monitored and explain why. Students can then choose to opt out. Offering perks (like a $50 Starbucks card) may motivate students to opt in.

Though I make this sound easy, I’m aware of all the challenges that can arise. It’s expensive to collect your own data and to have the tools and means to do so. Understanding exactly why students disengage can be quite challenging. It can be due to personal learning challenges, or it may have to do with their personal history with the topic at hand (maybe they had an awful math teacher who scarred them for life and now they can’t look at a math textbook without puking). The technology might not be there yet either.

Overall, I’m of the opinion that education is the key to most things in life. If there were a way to make teaching tools better, I would jump at the opportunity, while being respectful of people’s privacy and information.


The Social Life of Numbers

Increasingly, data analytics is becoming a major driver in many markets. This is due in large part to the proliferation of available data and the many sophisticated tools that people have developed for analyzing it. Now, more than ever, businesses are able to make informed decisions, and businesses are realizing that ignoring data would be detrimental to their success. Publishing is seeing uptake of this mindset with initiatives such as BookNet, Nielsen BookScan (now The NPD Group), and Bookstat, among others, which track book sales, and with projects that attempt to mine the data of literature at more granular levels, such as plot and sentence structure. Other initiatives are aiming to crack the “blockbuster” code: that is, to scan manuscripts using a sophisticated algorithm to determine whether a given book could be the next big hit.

I support the gathering and usage of data at the point-of-sale level. This data can provide insights about the size and shape of the publishing industry, help publishers manage inventory and distribution, and help predict sales, which can help publishers at numerous stages of the acquisition and production process. I believe that this kind of macro-level data can support the human decision-making process without supplanting it, and it is for this basic reason that I object to the use of algorithms to scan manuscripts. I believe that using data in this way would fundamentally stifle innovation, because an algorithm built using books already published is essentially backward-looking. For the same reason, I also feel that it may be unable to accomplish the task it was designed to do. Blockbusters are so successful partially because they are doing something new or fresh. Readers are intelligent, and they know when they’re being sold something that they’ve seen before.

Where I feel that data could be used more meaningfully and beneficially in publishing is in the area of marketing and social media. Increasingly, it seems that books live or die depending on their author’s social media platform and presence. I believe that this is owing to the ubiquity of social media: people are now able to be connected to almost everyone almost always, which has conditioned them to want this. Consequently, the figure of the author is becoming more and more central to a book’s success.

So, what if there were a way to analyze an author’s social media presence and reach in a streamlined way, and then apply that knowledge to knowledge of the social media market on a large scale, to help construct and plan a social media strategy to gain that author the greatest reach possible? An algorithm could be constructed based on press campaigns for past books and authors, sales data, and social media reach before and after the campaign. Ideally, the algorithm could also look at market distribution to help publishers plan book launch tours based on where receptive audiences (according to interest, affiliation, etc.) cluster.

Essentially, I’m not comfortable using data to help shape the history of literature. I believe that that should be done with the human eye, to allow for and encourage innovation. I do, however, believe that we could be using data in a more meaningful and robust way to help market books once they have been selected for publication.



Yes to Share

Since the rise of the Internet, more and more businesses have been focusing on all the data they can collect and buy in order to generate more profit and attract more customers. The goal of data democratization is to allow anybody within the industry to use data at any time and make decisions without obstacles. Data democratization can be of great use in collectively helping these businesses grow, but in the world we live in, a democracy cannot be attained easily. Each entity looks at data democratization in a different way. Businesses that have a monopoly are less willing to share their data, while small businesses that do not have a monopoly can benefit more from receiving data and are willing to share their own in return. There are pros and cons to data democratization in the publishing industry. Freely sharing data could be beneficial, considering that the people who work in the industry are usually passionate about what they do and are more interested in sharing their projects than in strictly doing business.

In the publishing industry, data is needed now more than ever. About 1 million books are published each year in the US alone, but the sales numbers are unpredictable. Tracking, analyzing, and understanding readers is critical to the survival of the book. We are now witnessing the rise of new startup platforms whose main goal is to collect not only sales data but also data on readers’ habits. If the data from all the publishing houses were combined, it would not only result in better business decisions but could also decrease the “book rejection” percentage.

Data democratization in the publishing industry also means that Amazon should make its data available. Since Amazon is a dominant player in the publishing industry, this cannot be seen as a realistic option for the time being. Not having Amazon’s book-related sales data leaves a huge gap in the data of the publishing industry. But that should not stop the publishing houses and the (online) bookstores from combining their powers and sharing their data. Consolidating the power of the publishing houses and the platforms that collect data within the publishing industry can truly make a difference in the future of the industry, from acquiring authors and titles to publishing the books.

Considering the data-driven era we live in, now is the time for publishing houses to share and combine all their data, not tomorrow. We have numerous authors rising and a huge number of decisions to be made. If the publishing industry follows only its gut and not the data, the sales numbers will remain unpredictable, and the levels of book rejection will stay high.

The perfect world of metadata and all the diverse things it can lead to

Perfect, high quality, complete metadata. Sounds like the modern publisher’s dream. I’ll focus on what my perfect metadata world would look like with a focus on diversity.

Diversity in content being discovered 

I can see complete metadata allowing more diverse content to become searchable and discoverable. If publishers or a metadata “inputter” took the time to put in the correct keywords and tags that are related to and respectful of the text, I think more books and other media could be discovered and accessed. This is obvious and important.

However, I believe that with greater access and discoverability comes greater responsibility. As material becomes highly discoverable and widely spread, there may be cases where a text is misused or not understood in the right context. Some texts circulating outside of a certain community or group of people may not be used as originally intended. If our perfect, high-quality, complete metadata has taken this into consideration, systems would be put in place so that if texts need to be used or read in a certain way, the metadata will tell you. The example that comes to mind is the Traditional Knowledge (TK) Labels. Below is a quote about what they are:

“The TK Labels are a tool for Indigenous communities to add existing local protocols for access and use to recorded cultural heritage that is digitally circulating outside community contexts. The TK Labels offer an educative and informational strategy to help non-community users of this cultural heritage understand its importance and significance to the communities from where it derives and continues to have meaning”

I believe such labeling systems must be incorporated into metadata so that we can prevent books and other media that we’re not familiar with from being misused. Here are some examples of the TK Labels:

Diversity in metadata formats 

Some may say that perfect, complete, high-quality metadata should follow a universal structure. This might be an unpopular opinion, but I don’t know if metadata should be in a universal format. My world would have many different metadata formats. The easiest analogy I can think of to explain my reasoning is the metric system, the imperial system, and various other measuring systems that are not so common. Many people argue for a universal metric system… however, this may not necessarily be a better or more useful solution. My housemate, a math teacher, was telling me about his experiences living in Thailand and learning traditional weaving from a group of Indigenous Thai women. He learned they had their own form of measuring and math that suited their needs and was appropriate for them. It bore no resemblance to the metric or imperial system… which would have been completely useless to them.

I might be taking the analogy too far, but I see the same thing occurring in the publishing world. Depending on a publisher’s content, a universal metadata format (like the subject category schemes THEMA or BISAC) might not work for them. For example, if you’re a publisher focusing your work on a certain group of people or interest, and the differences in content are very clear to you, you may want a metadata system that fits and that can categorize based on the content. These distinctions might be missed or ignored in a more universal system, and your diverse books may be lumped into one group. Perhaps, if a universal system like THEMA can put systems in place to achieve such diversity, then maybe it can work. (Apparently, it is doing so, according to this BookNet article: in April 2018, version 1.3 of THEMA added 260 new subject categories and 150 new qualifiers.)

Diversity in monetization 

I might be stretching it with these “diversity in” headings, but this is the last one, I promise. Another thing I can envision with perfect, complete, and high-quality metadata is the different things within a published work that can then be monetized. For example, this Publishers Weekly article from 2018 states that an Indian publishing service called Lumina Datamatics is working with the scholarly publisher Wiley to “use metadata to string together disparate strands of content to create new assets”. Wiley’s published works each have good metadata attached, so Lumina can easily discover and pull things like visual content (figures, diagrams, graphs) from academic papers to then make available and sell separately. It creates a new source of income for Wiley. Having spent thousands of dollars on Wiley textbooks and resources during my undergrad, I’m a bit bitter about this and not very supportive of Wiley’s new venture… but I can see how this would be a useful thing to do as a small publisher strapped for cash.



I think the world of perfect, high-quality, complete metadata is very enticing and alluring. I believe it’ll lead to a lot of benefits such as diverse material being discovered and new assets being formed. However, it does come with more challenges that need to be considered as we move forward with optimizing and encouraging metadata input.

The 2012 Publishers Perspective article on How to Sell More Books with Metadata argued that “Enhanced metadata can increase discoverability of books and provide marketing information to the entire publishing supply chain.” It is clear that while other sectors of the publishing industry have been making more use of metadata, the book (and e-book) publishing industry has not fully realized its potential. One of the key issues is the lack of a standardized set of metadata. In an ideal world where big publishers, small publishers, and even Amazon could agree on what this standard would be, it would greatly improve the industry and perhaps actually sell more books. Better quality metadata is certainly important in a world where more of our buying habits are moving online.

As Jamie mentioned in his presentation, most publishers don’t have the resources to devote to producing this high-quality metadata. The Scholarly Kitchen frames the use of enhanced metadata as “marketing investment of the digital age”. Framing it in this manner could help publishers allocate money and resources to producing better quality metadata. This is where the integration of an automated program may be beneficial. Perhaps an algorithm could be trained to scan books and gather this information, with publishers reviewing the output to ensure its accuracy. The publishing industry would probably need to rely on a third-party company to integrate such an algorithm into its process.

One of the key things the industry would need to guard against is over-reliance on a single company. In an ideal world, once the metadata fields have been standardized across the industry, the work should go to numerous small tech companies rather than the whole industry relying on one. This would most likely address the issue that we’re facing of companies becoming too large. One of the potential dangers of using a single company for this type of service is that it could become a monopoly and drive prices to a level that smaller publishers cannot afford.

Data Democrazy

In the game of Monopoly, the player who ends up owning the most houses wins, taking all of the opponents’ properties and leaving them in bankruptcy. The real-life version is the same: the dominant companies share the same sin, greed. In business, the main objective is to earn the most money, so it shouldn’t be a surprise when a business wants to be the biggest, wealthiest player by vacuuming up the smaller companies and gaining the most profit. There is a large, growing danger that one day the biggest monopoly will crash and leave the entire market in the dust. What will we do? What will we do when all of our information, fed through the algorithms for the big monopoly’s selfish profit, is gone? I understand why multi-billion-dollar companies want to control the metadata that makes them succeed in their business endeavours. In Joe Karaganis’ article, “The Piracy Wars are Over. Let’s Talk About Data Incumbency,” he shares that

 “The reason for this secrecy isn’t a mystery. It’s a big advantage to know more about your market than your competitors, users, customers, and—ultimately—regulators. Controlling this information raises barriers to competition and makes it easy for anyone sitting on the information-poor side of a negotiation to get taken advantage of without quite being able to say how.”

Essentially, big companies leave us in the dark. All they do is gain, and all we do is lose our information to location services, customer surveys, liking things on Facebook, adding Amazon deals to our wish-lists, scrolling through infinite meme threads, etc. Karaganis continues that “in practice, almost all successful steps toward systemic data disclosure have been linked to regulatory pressure or fears of liability… it took a decade of escalating scandals and congressional threats to push Facebook into data-sharing arrangements with academics.” This left me wondering how much more it would take to democratize metadata. Could there be a world where there are no gatekeepers and everything follows an open-data agenda?

Bernard Marr, in “What is Data Democratization? A Super Simple Explanation And The Key Pros And Cons”, explains that the key benefit of data democratization is that “when you allow data access to any tier of your company, it empowers individuals at all levels of ownership and responsibility to use the data in their decision-making.” It could be a game-changer, where all parties within the economy have equal use of consumers’ information. Can you imagine how the publishing industry would change if everyone had access to Amazon’s data? But I can’t imagine a world where Amazon would ever allow that. And if Amazon were defeated, would another Amazon simply form in its place?

I admire the idea of metadata democratization because it could create a fairer market. It could help smaller companies better understand the value gap within each market and the size and power of each market, specifically benefiting the creative markets. However, I’m not convinced that this is possible in our current market (or a near-future one?). If everyone has a seat at the table, then who is out competing for the food? I don’t believe that any business can survive without a competitor, not even non-profits. Competition is a useful tool for gaining new perspectives and growth. Competition allows for brand authenticity and uniqueness. If everyone is the same, then why would a person choose one over the other? If there is no choice to be made, then there is no data, no business, no market. I don’t know what there is. I don’t believe we will reach a time when there isn’t a big, scary, mysterious Amazon in the picture, but for right now, I believe we can keep the dialogue going and discuss and share new ideas on how to make the playing field a little more fair, but controlled. My idea is to steal a couple of “Get Out of Jail Free” cards and stash them at the bottom of the deck… what’s yours?


An easy way for me to wrap my head around metadata was the hashtag: a style of tagging that rose to popularity while I was a digitally active teenager. The idea, launched into the Twitter ether by former Google developer Chris Messina, would help sort and categorize ideas without the need for any special backend work or coding knowledge. “He chose the # symbol because it was an easy keyboard character to reach on his 2007 Nokia feature phone and other techies were already using it in other internet chat systems”, as explained in this article.

As usually happens when change is introduced, Messina’s new idea got its fair share of hate. He said:

People were like, that’s weird, that’s kind of dumb.

Yet it was an idea that caught on. Now, hashtags are decided, created, and user tested before campaigns are formally launched on social media, because the hashtag is of prime importance to a campaign’s social media success. A very successful example is the recent #metoo hashtag, which gained global reach and is now called the #metoo movement.

Similarly, metadata is easily explained by Edward Nawotka as

All of the information associated with a book or publication that is used to produce, publish, distribute, market, promote and sell the book.

First, in the publishing realm, perfect metadata can better serve niche audiences. In addition to word of mouth, mega-metadata can round up thematic content in one place. Similar keywords would yield consolidated searches, making it more straightforward to discover a particular genre or topic.
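As a rough illustration of how consistent keywords could "round up" related titles, here is a minimal sketch of a keyword index. It assumes a hypothetical catalogue that maps each title to the keywords its publisher entered; the function names and the data shape are mine, not part of any real metadata standard or retailer system.

```python
from collections import defaultdict


def build_keyword_index(books):
    """Map each normalized keyword to the set of titles tagged with it.

    books: dict of title -> list of keywords (hypothetical catalogue data).
    """
    index = defaultdict(set)
    for title, keywords in books.items():
        for keyword in keywords:
            # Normalize so "Climate" and "climate " land in the same bucket.
            index[keyword.strip().lower()].add(title)
    return index


def search(index, *keywords):
    """Return titles tagged with every one of the given keywords, sorted."""
    matches = [index.get(keyword.strip().lower(), set()) for keyword in keywords]
    if not matches:
        return []
    return sorted(set.intersection(*matches))
```

The catch is the same one raised above: the index is only as good as the keywords someone took the time to enter, so a title with missing or sloppy tags simply never shows up.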

Secondly, I think algorithms could improve. Mega-metadata means the algorithm could respond to our queries in an exact way and maybe even give perfect suggestions.

Thirdly, I feel that SEO (Search Engine Optimization) would have to be re-worked or maybe even eradicated since people would be able to find what they wanted with a couple of correct keywords. Maybe there would be a website that has an anthology of all the keywords ever registered! I imagine it would look like Craigslist (hopefully with a less offensive blue). Mega-metadata has the power to make finding/searching more convenient, although it asks for a painstaking categorization and curation of information at the publishers’ end.

I’m not very certain, but I also think that marketing would not be the same as it is today. Book marketers and publicists would have to change tactics to work around equally discoverable titles in a sea of keywords. Since searching for a particular keyword could bring forth all the relevant titles, marketing might have to go through some extra steps to get a particular book noticed. Every title could get the same amount of exposure, and only “fads” would dictate the bestseller lists.

I’m kind of excited for this, since I often fail to find a similarly themed book without going through Reddit (which, for me, is the least credible source). My quest for engrossing content leads me on many online voyages, which cost me time and effort (not to mention being an excellent way to procrastinate).

It is a concept too good to be true, but maybe we will see mega-metadata in a couple of years.

Envisioning better metadata

What might be possible or different if we had perfect, high-quality, complete metadata that was community based? What if we could get the publishers and Amazon to cooperate?

I think achieving perfect metadata would be impossible, but I can envision what it might be like to have a community based around achieving better metadata. To me, libraries and Wikipedia are examples of community-based projects that are dedicated to preserving and sharing knowledge, and I think a similar mission could be pursued to create better databases for books. As Pressbooks points out in their article “What We Talk About When We Talk About Metadata,” metadata is an incredibly valuable resource that can make or break a title (or even a publisher).

In the article, Laura Dawson states that,

“The publisher (and retailer) with the best, most complete metadata offers the greatest chance for consumers to buy books. The publisher with poor metadata risks poor sales–because no one can find these books.”

In this data-driven economy, good metadata is essential for discoverability, and there are likely so many titles that have fallen through the cracks because of poor metadata. Even in the fanfiction community, proper tagging is a must. You want your title sorted with the right trope, for example, to ensure that you reach the right audience. Too many tags can be off-putting, and the wrong tag or “keyword” can prompt ill-will from readers who feel that they have been misled. Without metadata, nobody sees your product through online databases, and you never achieve visibility in the vast sea of other titles or products that are out there.

For independent publishers, libraries, and self-publishers, understanding how to create strong metadata is especially important. If a community were built around creating better metadata on behalf of libraries, for example, then libraries could keep better track of how many books they have, when the sequel to a book they hold will be published, and other details that could prove valuable to readers. Better metadata makes it easier for both librarians and the patrons who use the online databases to determine the availability and quality of a title. Again, this is in accordance with Dawson’s point that now more than ever, readers are looking to see more metadata: “Consumers wanted to know as much about each book as humanly possible. They wanted cover images, robust descriptions, and excerpts.” This is equally true for consumers who are using libraries, and so better and more complete metadata for libraries would have a tremendous impact.

If independent publishers and self-publishers had better metadata, then they could compete with commercial publishers at a much higher level. The ebooks of self-published writers are especially susceptible to being lost in the void that is the internet, or Amazon specifically.

I do think it would be incredibly difficult to persuade publishers and Amazon to contribute, however. Amazon jealously guards all the data they have about their consumers and algorithms. Metadata is no exception to this. As Dawson argues, strong metadata is a competitive advantage, one that Amazon is excelling at, and I cannot imagine that they would forfeit that advantage unless they were legally obliged to do so.

In conclusion, I can indeed imagine what things would be like if we had better and higher-quality metadata, but I think getting bigger publishers and Amazon to fully cooperate would be challenging.

Don’t Mine, It’s Mine

We always underestimate what we have until we lose it. According to Google, my location tracking started in 2016. It is scary to know how much data is collected about you, and how personal information that you once thought nobody knew is all stored somewhere. Data privacy is an issue people are starting to be aware of. A survey conducted in 2016 (see graph) showed that, globally, over 50% of Internet users were somewhat more concerned or much more concerned about their privacy than in 2015. This is understandable, as more companies like Facebook, Google, and Amazon are using and selling our information without our full awareness. Data privacy is a recently identified problem, and action should be taken to solve it before it escalates and a feasible solution becomes even harder to find. I think at this point we should focus on pushing for transparency, as it is unlikely that social media companies will stop collecting our data. If users are at least informed about where their data is going, they can be a bit more in control of it by deciding whether or not to join a website or share their information with it.

The Internet is not what it used to be. In the beginning, we would use it to send and receive information. Privacy was a small concern. Now, Zeynep Tufekci describes the Internet as a surveillance machine. Facebook, one of the main companies that own a lot of user data, collects that data to create a platform for advertisers that generates billions of dollars. Facebook is not open about this aspect of its business and only discusses its intention to connect people around the world. Does this make us as users angry? Yes! Why? For a lot of us, it is not because Facebook has our data. Let’s be honest, we have been suspicious of Facebook for a long time. The problem here is transparency: how does Facebook use our data? Facebook has been selling our data to other organizations like Cambridge Analytica, which used the data for things like the American presidential election without our consent. This made users concerned about what truly happens behind closed doors in companies with access to so much valuable personal information.

Data is a fairly new term that businesses and people have recently been using, but not everyone fully understands it. Those in charge of making laws should be people who are fully aware of how data is collected, how social media platforms work, and how privacy can be breached. A recent example of politicians not being informed on the topics they should be can be seen in Mark Zuckerberg’s hearing in the U.S. When he was questioned by the US Congress, it was obvious from the kinds of questions some members asked that they did not understand how Facebook worked.

One of the business models I personally admire is that of Everlane, a clothing brand. They focus on being transparent in every step of their business, showing the actual cost of each product and their markup compared with other stores. People appreciated it, loved it, and bought their products. Although the Facebook business model cannot be easily changed, maybe transparency can be seen as the first step towards a bigger solution. If users were fully aware of how social media companies process their data and the benefits it has for them, there would not be as much anger and they might be more appreciative. Giving users the opportunity to agree to or opt out of having their data collected and sold in exchange for a benefit (for example, it lets Facebook show you relevant content and the service remains free) would allow people to make informed decisions. If someone did not want to have their data collected, Facebook could provide the option of paying a small monthly fee instead. It is important to remember that when a service is free, it is because the user is the product.

Facebook will not stop collecting data; data is now considered the main reason for business growth. Therefore, instead of being against it, we should appreciate where we are now, and companies should use it to benefit the users. Laws should be implemented not to get rid of companies’ ability to store our data but so that companies are transparent and users are aware of what is being collected and for what purpose. That way, everyone can provide informed consent rather than being in the dark.

Orwell Would Be Proud: Privacy, Corporations and Data Surveillance

What’s the year? 1984. Not quite, it’s 2019, despite the fact that mega-corporation Facebook is running social experiments, the government is listening, and Amazon is watching. Multi-billion dollar corporations and the government are in bed together, and they’re clearly benefiting from each other and all the information they’ve collected on us. We’ve sold our souls (private data) to the Devil (Facebook, Google, Amazon) for eternal euphoria (funny cat videos). But we agreed to it, right? It isn’t spying if we consent to it, whether we’ve read every word of the terms and conditions or not. Maybe sharing your information with one corporation would be better? Let’s combine multiple platforms and just put all the data collection in a one-stop-shop, as Mark Zuckerberg is proposing. You only need one app, one platform, one secure place. You can communicate with your friends and family, make purchases, share images, whatever you like, and it’s all private (right?). Hey, it’s working for China, so why not North America and the rest of the world?

Worst case scenario? We live in an even more Orwellian future than we do now. One single source of information with one single entity in control who is watching us inside and out. Amazon has developed camera technology which they use in their Amazon Go store that can tell the difference between each product in the store and charge the customer accordingly. The fact that these cameras can tell the difference between a soup can and a bag of trail mix isn’t terrifying, but imagine if that technology advances to the point where it can recognize one person from the next. As per usual, Amazon is as opaque as ever about what they plan to do with this technology, and there has been speculation about whether they’ll sell it to other companies or not, even though they claim they have no plans to. Oh, wait! They’re already selling facial recognition technology to law enforcement and the US government. Better yet, it’s not fine-tuned, which leads to more problems than solutions with racial and gender biases. Can you imagine these cameras on every street, watching every move and reporting back to the government (corporations)? Google already knows where you are, but now they’ll be able to see you too.

Best case scenario? We stand up for our right to privacy and put privacy laws like the General Data Protection Regulation in place, which is a decent start to getting these companies to be more transparent. Whether we like what we see when we actually get to see it is another story, but at least we wouldn’t be blindly consenting (which is the biggest paradox) to the kinds of data collection they’re doing and who they’re giving it to. It’s not like all data collection is bad, and it can feed some algorithms (but not all) that help us with discoverability, but we need to take the time to examine the ethics involved in data collection and the predictive analytics and data that result from it. There are concerns of social inequality, discrimination and privacy that data mining brings and that have very real effects outside of the digital world. As a society we need to think more critically about who is controlling the algorithms and the data collection, and what they’re doing with it, because every corporation has its own motives that it is not keen on sharing with us.

🎶Don’t Wanna Be an American Idiot 🎶 (looking at you, Congress)

Overall, I am unsurprised by the lack of data privacy online. I’ve known for a while now that something is tracking what I’m doing as I do it, whether it be Google, Facebook, or Apple. However, it is a bit frightening to see it all laid out in places like Dylan Curran’s Twitter feed and to see how Google Maps tracks our movements throughout the day. What frightens me more than either of these things is what unregulated entities might do with that data on a personal and political scale.

Although I would like to believe the government is attempting to regulate big businesses like Facebook and Google, every day we see that they are focusing on the wrong things. In the Google Congressional Hearing, held on December 11th, 2018, the American Congress had the chance to question Google on how it abuses data privacy and how it handles that data after compiling it. Instead of doing that, however, the members of Congress decided to focus on things that had nothing to do with privacy and everything to do with the more self-explanatory algorithms almost anyone under 50 can understand (Lapowsky, Congress).

Footage of me watching congress date itself to the age of the dinosaurs

This not only proved that Congress is incredibly out of touch (watch this video for evidence: these congresspeople are ridiculously embarrassing) but that the government in general is focused on only the superficial issues surrounding tech giants because they do not understand the more pressing matters. Not to mention, the big companies do not want regulation and we know that big companies have a big stake in government, regardless of what people say.

We’ve seen how companies like Facebook can influence political situations through the 2016 election, with the Cambridge Analytica scandal. But on a more personal note, a lot of these companies gather data about buying habits that can negatively impact people on a day-to-day basis. In this case, I will refer to the experience of Gillian Brockell, a woman who continued to receive ads as though she had given birth to a baby after delivering a stillborn child (Kindelan, Woman).

She posted on Twitter, stating:

“Please, Tech Companies, I implore you: If your algorithms are smart enough to realize that I was pregnant, or that I’ve given birth, then surely they can be smart enough to realize that my baby died, and advertise to me accordingly — or maybe, just maybe, not at all […] We never asked for the pregnancy or parenting ads to be turned on; these tech companies triggered that on their own, based on information we shared. So what I’m asking is that there be similar triggers to turn this stuff off on its own, based on information we’ve shared…” (Kindelan).

This is just the tip of the iceberg on the way that data mining infringes on privacy. Situations like the Google hearing and like Brockell’s situation (in which I doubt much has been done to change the algorithm, despite public outcry) make me doubt that any government backed venture or internal change is likely to happen any time soon. Until then, I’m just going to accept that I have to be careful with my searches and try to limit what I put online.


Works Cited

Kindelan, Katie. “Woman Demands Change from Tech Sites like Facebook, Instagram after Receiving Parenting Ads after Stillbirth.” ABC News. December 13, 2018. Accessed March 13, 2019. https://abcnews.go.com/GMA/Wellness/woman-demands-change-tech-sites-facebook-instagram-receiving/story?id=59799116.

Lapowsky, Issie. “Congress Blew Its Hearing With Google CEO Sundar Pichai.” Wired. December 11, 2018. Accessed March 13, 2019. https://www.wired.com/story/congress-sundar-pichai-google-ceo-hearing/.

All Hands on Deck: Government Intervention in Data Privacy

Capitalism is so embedded in the way in which our modern North American society operates, impacting all of the transactions and interactions that we have with companies. Big corporations worth billions of dollars have such an incredibly strong sway in what happens in the marketplace, that it seems nearly impossible for an individual or small group to lobby and influence how they do business. In order to gain hold of our data privacy and stop the momentum of surveillance capitalism, change will need to happen at the institutional level. We need to get the government involved.

The data privacy issue continues to grow as more and more details come out about the seemingly endless data that is able to be mined about us right down to our exact travel path on a daily basis (plus our search history, files of all kinds from texts, photos and voice messages, and the list goes on). Unfortunately, I am not the slightest bit surprised when confronted with the amount of information that tech giants like Google and Facebook collect about us. The technology that we use in our daily lives (phones, smart watches, apps, social media platforms etc.) is so interconnected, easily trackable and constantly backed up to servers. We appreciate these services when they help us access information that we want to store like our emails and anything we choose to put into the cloud like documents and photos. We also want instant access to the data of our friends and family (and sometimes even strangers) through our social media accounts and we willingly input data into these services on a daily basis. Our input helps these tech companies create ever more robust platforms that continually learn more and more about us.

What we are much less comfortable with is the data that we don’t see and how that data is ultimately being used. For the most part, our data is being used for capital gains. When it comes to data collection, I believe it’s important to remember that we as users are not really the ultimate customers of services like Facebook and Google. Yes, they have to deliver on some promises in order for people to still want to use their services, but ultimately these tech giants are serving the needs of advertisers rather than the readers, browsers and users of their platforms. The bigger they get, the more advertising dollars they can bring in.

The tech giants are out to dominate their industries and claim the lion’s share of their markets and they do so by cashing in on more new tech. Giant corporations scoop up new ways of gathering data and tracking users by investing in their own research and development or by buying smaller tech startups (see a list of acquisitions that Facebook has made here) who have tapped into something of interest. Because of their sheer financial power to dominate over other businesses and bully the market, the government is required to step in. 

It is quite interesting to note that even Mark Zuckerberg himself feels that it’s important for data to be regulated, but the big issue remains: how? There are a few examples of cases where the government has stepped in, such as the California Consumer Privacy Act, which was passed in 2018. The three major tenets are:

1. You will have the right to know what information large corporations are collecting about you.
2. You will have the right to tell a business not to share or sell your personal information.
3. You will have the right to protections against businesses which do not uphold the value of your privacy.

It’s hard to tell presently how well this is working in the state of California, but it shows that passing this type of law is something that people are very interested in doing (even if the big tech giants strongly opposed the bill). But it is these tech giants, and their seemingly unlimited funds, who need to be stopped, and the government can’t let them just throw a bunch of money around to try to stop the regulations.

We still have a lot of work to do in Canada as the Privacy Commissioner stated that they don’t have the funding they need to adequately protect Canadians against privacy issues. We as citizens need to get more involved to keep pushing our law makers. A new privacy law now ensures that Canadian companies have to let their customers know when their data has been leaked, but what recourse do we have once it’s been leaked? That clearly isn’t good enough.

It’s very easy to feel disenfranchised when you see that corporate giants like Amazon are buddies with the government bodies like the Department of Justice for example, but it is still important that we continue to push law makers for better protection. In reference to this Mike Shatzkin article (via hypothes.is), SFU Master of Publishing student Jaiden Dembo stated “If law can be put in place to help these behemoths grow and dominate the market, then the opposite can be true as well.” Though there is a lot of muddy water to sift through when it comes to data protection and change will take time, it’s something that’s worth fighting for.


I have no data to hide, do you?

It shouldn’t be a huge surprise that the internet lacks data privacy, despite the top tech companies saying that they will implement better security and privacy, like Mark Zuckerberg’s new vision of “a privacy-focused messaging and social networking platform where people can communicate securely”, or the US government’s initiative of establishing better antitrust laws, like Elizabeth Warren’s presidential campaign proposal to dismantle the biggest tech companies (Facebook, Google, Apple, Amazon) by forcing them to separate and restricting major mergers. I walked into this idea of data privacy with a popular mindset: I have nothing to hide, so why should I be afraid if someone has the balls to hack and expose me? I still struggle to believe that a place like the internet can be a private place, and can’t help but reflect that as much as we don’t like these big tech companies stealing our data, it is like a paradox. We, as users of the technology, don’t want them stealing our data or sometimes having our data at all, but we still contribute to this big capitalistic system by using their technology. In order to benefit technology as a whole, data is required to make better products for our needs. Could it be for the greater good? I agree that when data is taken from us without our permission, we, as users, can feel a mistrust of the tech company. As Avvai shared in her blog post, “Facebook’s new privacy plan might not actually be helping us out”, it’s not about refusing to use technology at all for the best form of privacy. They can be “really useful tools. We just don’t want it being shared without informed consent.”

Businesses try to gain as much information about us as possible so they can gain the upper hand over their competition and create products that are best tailored to our consumer demands. I feel like a lot of people have been aware of this issue ever since the circulation of government surveillance ideas from George Orwell’s 1984. This leads me to believe that there isn’t such a thing as privacy within a public sphere; there can’t be. If you truly don’t want someone exposing you or knowing something about you, then your best chances are living with a dead person.

I came across this article by Thomson Reuters Foundation that suggests future cities will run on data-driven sustainability. In the article, Toronto is described as a “smart city”, where future developments or enhancements to the city would be made by installing digital systems in public/private spaces to record data on what inhabitants do with their garbage, water, and power. However, in a recent survey from McMaster University, 88% of Canadians state that they are concerned about their privacy, and 23% of them are “extremely concerned.” This makes me reflect that it’s not so much about educating the public on data privacy; a lot of people are more than aware that it is an issue. It’s understanding what we, as tech users, should do to become better equipped with our data and to gain agency and authority to not let big tech companies steal the information without our permission. Tech companies have become so dependent on our data. Could there even be another way around this? Without data, how could we see improvement in any innovative endeavour within the technology in our lives, or in a city we can live in, like Toronto? Geoff Cape from Future Cities Canada shares that “despite the privacy concerns, effective data use is crucial for combatting the environmental challenges cities face and making them better places to live for growing populations.” Tech companies have become so dominant in our society that I’m not convinced a proposal like Elizabeth Warren’s can save us now. We’re in too deep.

If the potential for data privacy breach is the enemy, awareness is your weapon

Undeniably, data privacy has become an increasingly important issue after numerous data breaches throughout the 21st century, the scandal around Trump’s election, and the growing concern surrounding Facebook and its collection and use of personal information. Data is everywhere, yet many of us don’t really know its hidden value! We as a society value personal privacy, but we often fail to think twice about our privacy in an ever-growing digital world. For me growing up, the idea that whatever you put online can never really be deleted has stuck. Whether this was true or not, it framed the way I interacted with the internet and determined what information I put out there myself. The stark Twitter thread by Dylan Curran scarily shows us that, boy oh boy, it’s true.

With almost all of North America actively engaging with the internet and children starting at a very young age, the question becomes why isn’t there greater awareness of data privacy? For me personally, internet safety was not an issue that was fully addressed within my own education experience.  While reviewing the BC educational curriculum, it is unclear whether data privacy is something being taught (or to what extent) in the classrooms. Under digital literacy the following content is the intended teaching outcome:

Internet safety

  • digital self-image, citizenship, relationships, and communication
  • legal and ethical considerations, including creative credit and copyright, and cyberbullying
  • methods for personal media management
  • search techniques, how search results are selected and ranked, and criteria for evaluating search results
  • strategies to identify personal learning networks

As children are engaging with technology at a much earlier age, schools should be doing more to educate students, as they are probably among the most vulnerable and will soon enough be the target market. While the current curriculum does address some of the topics, it might be helpful to have a better understanding of privacy policies and settings that can protect you and your network of friends. As for the rest of us, it would be helpful to become self-aware of data privacy issues. We could probably start by reading the terms and conditions of the sites we are engaging with. It is refreshing to know that it is something that is starting to be discussed on a political scale, with Elizabeth Warren proposing to break up large companies such as Facebook in her presidential campaign and other candidates sharing the same or similar sentiments.

Facebook’s new privacy plan might not actually be helping us out

This week Mark Zuckerberg announced on his blog a new vision for Facebook, social media, and the web. He wants to build a messaging platform that’s privacy-focused. He dives into the six principles he wants to enforce: private interactions, encryption, reducing permanence, safety, interoperability, and secure data storage. He describes this space as a ‘private living room’, in contrast to the ‘town square’ approach to social media.

The Guardian’s response to Zuckerberg’s post explains that this would be done by integrating the messaging systems of Instagram, WhatsApp, and Messenger.

I think there are two problems with this:

  1. Integration
    I can see the appeal of integrating the different messaging systems into an all-in-one platform. You don’t have to waste time checking multiple apps, you won’t have to worry about which platform to message a friend on, etc. Personally, I would find this annoying as I use these apps for various purposes and check them at differing regularity. I don’t necessarily want to be seeing messages all the time from the various different networks of these apps. However, aside from my personal views, I think this new move allows Facebook to be an even more powerful factor in our lives. It wants to curate and shape our living room / private space as well. Not only that, there are still problems that can occur in these so-called “private spaces”. For example, India will be having its general elections this year and they have been dubbed the “WhatsApp elections”. WhatsApp is highly popular in India, and political parties have been recruiting “cell phone volunteers” to create neighborhood WhatsApp groups to spread biased information around. The same issue with Facebook and the spread of misinformation can still occur on private messaging platforms like WhatsApp. According to the news article, “The misuse of WhatsApp has been connected with at least 30 incidents of murder and lynching, for example following the circulation of children abduction rumors.”
  2. Ignoring the original problem 
    In his blog, Zuckerberg starts off his piece by saying:

    “Over the last 15 years, Facebook and Instagram have helped people connect with friends, communities, and interests in the digital equivalent of a town square. But people increasingly also want to connect privately in the digital equivalent of the living room.”

    Sure, we want that! But judging from the reaction to the Cambridge Analytica scandal, what people really want is for their private data not to be sold to advertisers without their informed consent. It seems like Zuckerberg is ignoring the problem (or perhaps just trying to shift our focus) of Facebook’s data-surveillance business model and trying to grow and expand his already massive business by implementing a new platform. Data that was supposed to be only shared with our friends and family and people we chose to be on our Friends List was sold to third-party advertisers. How is creating a private messaging system going to solve that issue?

    Facebook is not getting rid of the newsfeed… which I don’t think people want anyway. I think we still want to share things to a wide range of people. We just don’t want Facebook sharing our private data from our private profiles and from our apps. For example, Sophie shared an article with us about how apps like a menstrual-cycle tracking app and a heart rate app are sharing the data with Facebook who in turn sells this information to advertisers.  We don’t want to stop using these apps – they can be really useful tools. We just don’t want it being shared without informed consent.

Overall, I think Mark Zuckerberg is not addressing the problem the public is criticizing him for and is instead introducing new growth models for Facebook. I’m not sure if we’re gaining anything from this new policy move. Zuckerberg is obviously a smart guy. My personal thoughts are that he’s very aware of our growing fear around sharing information in public spaces now. He might also be forecasting a decline in the use of public spaces like Facebook and Instagram as more and more of the public comes to understand the data privacy issues. Therefore, to keep his business growing, he’s trying to expand his services into the private communication sphere because, until we become telepathic, we’re still very much dependent on communicating with one another through technology.