Yik Yak Auto Downvote Script

For research purposes… ¯\_(ツ)_/¯

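window.scrollBy(0, 10000);

// Resolve after ms milliseconds so the loop can pause between clicks.
function sleep(ms) {
  return new Promise(resolve => setTimeout(resolve, ms));
}

async function downvote() {
  var L = document.getElementsByClassName("downvote");
  for (var i = 0; i < L.length; i++) {
    console.log("Downvoting…");
    L[i].click();
    await sleep(2000);
    // Scroll down so more posts load.
    window.scrollBy(0, 10000);
  }
  // Pause before rescanning so an empty page does not recurse without yielding.
  await sleep(2000);
  downvote();
}
downvote();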

Overwatch

I play the game for the end-game music. True story.

Overwatch OST - Play of the Game     

Installing Hadoop 2.6.X on Raspberry Pi B Raspbian Jessie

Configure Java Environment

With the Raspbian Jessie image, Java comes pre-installed. Verify by typing:

java -version

java version "1.8.0"
Java(TM) SE Runtime Environment (build 1.8.0-b132)
Java HotSpot(TM) Client VM (build 25.0-b70, mixed mode)

Prepare Hadoop User Account and Group

sudo addgroup hadoop
sudo adduser --ingroup hadoop hduser
sudo adduser hduser sudo

Configure SSH

Create an RSA key pair with a blank passphrase so that Hadoop nodes can talk to each other without prompting for a password.

su hduser
mkdir ~/.ssh
ssh-keygen -t rsa -P ""
cat ~/.ssh/id_rsa.pub > ~/.ssh/authorized_keys

Verify that hduser can log in over SSH:

su hduser
ssh localhost

Go back to previous shell (pi/root).

Install Hadoop

Download and install

cd ~/
wget http://apache.cs.utah.edu/hadoop/common/hadoop-2.6.4/hadoop-2.6.4.tar.gz
sudo mkdir -p /opt
sudo tar -xvzf hadoop-2.6.4.tar.gz -C /opt/
cd /opt
sudo mv hadoop-2.6.4 hadoop
sudo chown -R hduser:hadoop hadoop

Configure Environment Variables

This configuration assumes that you are using the pre-installed version of Java in Raspbian Jessie.

Add hadoop to environment variables by adding the following lines to the end of /etc/bash.bashrc:

export JAVA_HOME=$(readlink -f /usr/bin/java | sed "s:bin/java::")
export HADOOP_INSTALL=/opt/hadoop
export PATH=$PATH:$HADOOP_INSTALL/bin

Alternatively, you can add the configuration above to ~/.bashrc in the home directory of hduser.
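To see what the JAVA_HOME line resolves to, the same sed expression can be tried on a sample path. This is only a sketch: the path below is a made-up example, and the real one depends on which JDK the image ships.

```shell
# Hypothetical sample path; the real one is whatever
# `readlink -f /usr/bin/java` prints on your Pi.
sample=/usr/lib/jvm/example-jdk/jre/bin/java

# Same sed expression as in the export line: strip the trailing bin/java.
java_home=$(printf '%s\n' "$sample" | sed "s:bin/java::")
echo "$java_home"
```

The result keeps the trailing slash, which is fine for the scripts that read JAVA_HOME.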

Exit and reopen the hduser shell to verify that the hadoop executable is accessible from outside the /opt/hadoop/bin folder:

exit
su hduser
hadoop version

hduser@node1 /home/hduser $ hadoop version
Hadoop 2.6.4
Subversion https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.2 -r 1503152
Compiled by mattf on Mon Jul 22 15:23:09 PDT 2013
From source with checksum 6923c86528809c4e7e6f493b6b413a9a
This command was run using /opt/hadoop/hadoop-core-2.6.4.jar

Configure Hadoop environment variables

As root/sudo, edit /opt/hadoop/etc/hadoop/hadoop-env.sh, then uncomment and change the following lines:

# The java implementation to use. Required.
export JAVA_HOME=$(readlink -f /usr/bin/java | sed "s:bin/java::")

# The maximum amount of heap to use, in MB. Default is 1000.
export HADOOP_HEAPSIZE=250

# Command specific options appended to HADOOP_OPTS when specified
export HADOOP_DATANODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_DATANODE_OPTS -client"

Also edit yarn-env.sh in the same directory. Uncomment

#export YARN_NODEMANAGER_OPTS

and change it to:

export YARN_NODEMANAGER_OPTS="-client"

Note 1: If you forget to add the -client option to HADOOP_DATANODE_OPTS and/or YARN_NODEMANAGER_OPTS, you will get the following error message in hadoop-hduser-datanode-node1.out:

Error occurred during initialization of VM
Server VM is only supported on ARMv7+ VFP

Note 2: If you run SSH on a different port than 22 then you need to change the following parameter:

# Extra ssh options. Empty by default.
# export HADOOP_SSH_OPTS="-o ConnectTimeout=1 -o SendEnv=HADOOP_CONF_DIR"
export HADOOP_SSH_OPTS="-p <YOUR_PORT>"

Otherwise you will get the error:

connect to host localhost port 22: Address family not supported by protocol

Configure Hadoop

In /opt/hadoop/etc/hadoop edit the following configuration files:

core-site.xml

<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/hdfs/tmp</value>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:54310</value>
  </property>
</configuration>

mapred-site.xml

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:54311</value>
  </property>
</configuration>

hdfs-site.xml

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>

Create HDFS file system

sudo mkdir -p /hdfs/tmp
sudo chown hduser:hadoop /hdfs/tmp
sudo chmod 750 /hdfs/tmp
hadoop namenode -format

Start services

Login as hduser. Run:

/opt/hadoop/sbin/start-dfs.sh
/opt/hadoop/sbin/start-yarn.sh

Run the jps command to check that all services started as expected:

jps

16640 ResourceManager
16832 Jps
16307 NameNode
16550 SecondaryNameNode
16761 NodeManager
16426 DataNode

If you cannot see all of the processes above, review the log files in /opt/hadoop/logs to find the source of the problem.
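Checking the process list by eye gets tedious after a few restarts. A minimal sketch of a helper that reports which expected daemons are absent is shown here. The function name missing_daemons is made up for illustration, and it only assumes that jps prints one "<pid> <name>" pair per line.

```shell
# Hypothetical helper: print each expected daemon name that is absent
# from jps-style output ("<pid> <name>" per line) read from stdin.
missing_daemons() {
  # Keep only the process names (second column).
  running=$(awk '{print $2}')
  for name in "$@"; do
    # -x matches the whole line, so NameNode does not match SecondaryNameNode.
    printf '%s\n' "$running" | grep -qxF "$name" || printf '%s\n' "$name"
  done
}

# Usage (assumes jps is on the PATH):
# jps | missing_daemons NameNode DataNode SecondaryNameNode
```

An empty result means every listed daemon is running. Extend the argument list with whichever other daemons you expect to see.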

Social Web Readings and Commentary

Herein lies a text dump of the reflections I did for my Social Web class. Meh.

==================

I found the article confusing at first because I lacked the context of what MOOs or MUDs were. Once I read up on those, I realized that the article, in summary, describes the shift of decision-making power in the game to the majority, as well as the creation of tools to police its users.

This reminds me of the game Twitch Plays Pokemon, where players could determine what the game character does by pressing key combinations. The program would process each player’s key in sequence. The task then was to see if the internet hivemind could complete the game even with the chaos that ensued. It took some minor updates to the methodology before it was completed.

Considering this in the context of LambdaMOO, it could be said that the viewpoints of the majority can be considered the most amenable in any dispute or decision-making process. This is why the final choice was made to implement a voting scheme in-game to carry out tasks which require administrator rights. The implied notion was that concentrating power in the hands of a few people was not desirable. In addition, the administrators did not want to arbitrate any more disputes.

In sum, making a decision such as kicking out a player is difficult to do in a personal capacity. However, once the majority of people agree with your decision, it becomes easier. And this is what LambdaMOO became.

I find the tables in the paper a challenge to make sense of. However, from the graphs that follow, the main takeaway from the paper is that high editor concentration leads to greater improvements in article quality. As time passes and the article improves, though, more power needs to be relinquished to the masses because, ostensibly, the latter do a better job of maintaining the article. This is new to me because I have always thought more people involved would always lead to better outcomes. I also like the longitudinal study done for the “Music of Italy” article as it shows the activity of certain groups of users during the infancy of the article.

This article was novel because it postulated a model for conflict detection and resolution. I think it is eye-opening to see how researchers think of quantifiable statistics as proxies for issues like maintenance work (like reverts) and conflicts (Controversial Revision Count).

Also, I think their work on the Revert Graph opens up a lot of possibilities for improving Wikipedia moderation tools. For example, such a tool can be run on each article, giving everybody an overview on the possible bias that they might subconsciously take on the article. Since Wikipedia prefers a nuanced perspective in article writing, extra caution might be given to users who fall on extremes of the Revert Graph. Also, an alert can be given to users who have a biased perspective (based on their reverts) before they commit any changes to the article.

==================================================================================

week 5

The Effect of Online Privacy Information on Purchasing Behavior: An Experimental Study

The conclusion of the study is that “people will tend to purchase from merchants that offer more privacy protection and even pay a premium to purchase from such merchants.” However, as seen from Figure 4 on page 15, there is also a substantial number of people who ignore the presence of privacy cues. For the case of sex toys, with privacy information disclosed, more than 30% of participants bought from the lowest priced vendor, comparable to the share that bought from the most privacy-protecting (highest-priced) vendor. This shows that price is still a strong predictor of whether a buyer will purchase from a vendor, regardless of privacy concerns. Finally, I feel that this study is, at best, unrealistic in the real world. P3P is all but forgotten on the internet and only Microsoft Internet Explorer supports it. If you inspect the network data when you surf Facebook, there is a header that says p3p: CP="Facebook does not have a P3P policy. Learn why here: http://fb.me/p3p" – Facebook says that the P3P standard is now out of date.

I regretted the minute I pressed share: A qualitative study of regrets on Facebook.

This study looks into the different types of posts that Facebook users are likely to make and then regret. The main types are posts made in the heat of the moment, posts associated with undesirable activities, and posts that were not intended to be seen by a specific group of people. A useful extension to the study would be to analyze posts made on Facebook that were subsequently deleted. I cannot confirm it, but it is likely that Facebook keeps a copy of all posts made, even those that were subsequently deleted, as some (undisclosed) information is deleted only when one permanently deletes one’s account. From this post history, users can be asked why they deleted certain posts, and more accurate data can be gleaned.

Jason Hong: Smarter Phones

In his talk, Hong surmises that privacy concerns arising from the use of smartphones are mainly due to users not being aware of what smartphone apps are doing with their personal data, such as location and device ID. To dispel such concerns, users should be made more aware of the details each app is requesting. Personally, I feel that a context-sensitive alert about the data each app requests would be beneficial. For example, when downloading Google Maps from the App Store, a prompt that this app would be using location data would not be shown because most users know that this would be needed. However, if a bible app requests location data, a visible prompt would be shown that such location data is probably not needed for proper functioning of the app. A possible way to find out which app requires which kind of data to work properly would be to crowdsource information from users. For example, what percentage of users deny the app’s request for data – if the number is high, it is highly possible that users do not think that such information is necessary.

Making Sense of Privacy and Publicity

In her talk, Boyd points out that privacy is alive in the sense that people do care about it. Privacy is about having control over how information flows. However, companies fallaciously think that people do not really care about privacy because the latter disclose their personal information in the online public sphere, often unintentionally. She then proposes that companies should not egregiously change privacy settings on behalf of users, nor make users opt out of a possibly controversial feature instead of letting them opt in. She also adds that people take the public sphere into account when sharing information. When they feel comfortable, they tend to share more, often unknowingly. The onus is then on the company to ensure they are well-informed.

This brings to mind a feature that Facebook implemented a few years ago with the introduction of the Activity Feed called Beacon. Facebook intended to extend the social graph from the Facebook site to what users do on other sites. When someone gets a movie from Fandango, this activity would pop up on the Facebook feed. There was great controversy when users pointed out it would be problematic when someone buys a book called “Coping with Cancer” and the purchase was posted on Facebook for all to see.

==================================================================================

WK 5 EXPERIENCE

Facebook’s privacy policy is located at https://www.facebook.com/about/privacy and it is a refreshing change from the legalese that other websites have. Of course, Facebook has an official one (https://www.facebook.com/full_data_use_policy) that overrules the more interactive version.

It is grouped into 3 main parts – what information is collected, how Facebook uses this info, as well as how they share such data.

In the first part, the main types of data being collected are listed. The more notable ones are data about the frequency and duration of the user’s activities, which are possibly used to determine who the user is engaging with. This might scare some people who are not used to such a granular level of tracking. According to one website (http://www.huffingtonpost.com/2013/10/30/facebook-track-cursor_n_4178508.html), Facebook is even tracking mouse movements and clicks, in a bid to improve user engagement via A/B testing. Another category of concern is data from third-party partners. Although phrased in a nuanced way, these can include data brokers who seek to link your offline transactional data with those that are conducted online via the myriad Facebook buttons on countless websites these days.

In the second part, they clarify that the information is used to improve their services. The one that is of note here is the part about info being used to show and measure ads and services. They have an entire page dedicated to ad privacy and explanation – https://www.facebook.com/about/ads which is also written for the layman. They also mention cookies, but users might not know what they are. A possible way to improve in this aspect is to call them “cookies or tracking beacons” to make known to the user the intent of said cookies, that is, tracking.

Finally, the longest and most detailed portion is the one on information sharing, and rightly so. It is grouped into two parts – sharing on Facebook and sharing with external parties. For the first part, it covers most of what we know about Facebook sharing. When certain Facebook features are used such as Facebook Login, your public info is shared with them – this includes your username or user ID, your age range and other information you choose to share. They are also allowed to share information with other companies owned by Facebook. The second part covers sharing with third-party partners which include advertising agencies as well as research partners. While the former has language that only “Non-Personally Identifiable Information” will be used for the purpose of advertisement targeting and tracking, the latter does not have such information. In the latter case, such partners “must adhere to strict confidentiality obligations in a way that is consistent with (said) Data Policy”. However, this means that this could be the weakest link in the chain that could deanonymize users. If said vendors or partners do not adhere to Facebook’s policies, private information that could identify users could be leaked out. This is the most seriously lacking part of the whole policy page.

Overall, I think Facebook’s privacy policy is presented in a very succinct and easy-to-understand manner with great use of subheadings to delineate each portion. Links to interactive help tools are provided, allowing users to see what their public profile looks like from a search engine, for example. Some of the information presented by Facebook is intentionally vague, possibly because it might be too technical for users to understand. Of course, a more conspiratorial perspective is that Facebook is hiding something behind vaguely fleshed out terms.

==================================================================================

MIDTERM COMIC 1

One day companies might know you better than you know yourself.

It has already happened – Target is able to identify 25 products that can accurately ascertain if you are pregnant or not – and send you coupons before you even need them.

When you get a membership card from Target, they are not offering you discounts for free – they seek to link your real world purchases (credit cards used) with your online impulse buys. Some even sell or ‘exchange’ your anonymized information with so-called data brokers. However, research has shown that such anonymized info can pretty much identify a specific person with great accuracy.

One day, such tracking will be so pervasive and invasive that we will wonder if there was ever a notion of privacy.

==================================================================================

Week 6

How to Ask for a Favor: A Case Study on the Success of Altruistic Requests

In summary, this paper concludes that showing gratitude, showing intent of reciprocity and an urgent need, as well as a higher status within the community, will increase the probability that one will have one’s request fulfilled. Mood, politeness and similarity were not significant factors.

Before looking into the meat of the paper, I predicted that only gratitude, reciprocity and urgency were contributing factors. I did not find status to be particularly important because status indicators on Reddit require one to click through to the user profile, unlike forums where you can see how many posts or the ‘flair’ the user has. My guess is that because money is involved, people want to see that the account was not created just to get free pizza from people. They would like to see an active user on Reddit – as a proxy for how much contribution said user has made in the community.

Investigating the Appropriateness of Social Network Question Asking as a Resource for Blind Users

I like this idea of crowdsourcing for social good. I downloaded the VizWiz app to see how it worked first hand but could not get it to work as the app kept crashing at certain points. As expected, the anonymity and speed of on-demand services like Amazon Mechanical Turk make it a very compelling option to get crowdsourced services for a fee. I did some research on fraud rates on Amazon MTurk to see how much fraudulent completion of tasks there was. One possible reason why Facebook would not be effective is that Facebook’s EdgeRank selects what is shown on a person’s newsfeed – if a user is not particularly active on Facebook, or posts updates that are not particularly popular, their updates would not be shown on the website as much. In recent months, there has been a shift towards contextual display of updates to people who demonstrate interest in a particular topic. For example, if I post an update about a new Pusheen Box, Facebook would show my post more to people who like Pusheen. Depending on perspective, this might be a good or bad thing for VizWiz – good because the posts would be more targeted to people with the knowledge to answer questions correctly – bad because the posts would see a more limited audience now. Personally, I feel that Facebook would be a less effective medium for VizWiz users because those users do not ask questions which require subject matter experts.

EXPERIENCE

I had some experience with Tomnod – a volunteer crowdsourcing website that takes satellite imagery from DigitalGlobe to find things. One of their more notable campaigns that I signed up for was the search for MH370. They provided satellite imagery of the sea floor for people to identify wreckage. For this task, I decided to try Tomnod again. I started off with the California Valley Fire 2015 campaign, which asks users to look for damaged buildings and impassable roads based on imagery before the fire as well as infrared imagery after the fire. I was pretty frustrated doing it because it seemed they hadn’t improved their infrastructure since I last used it. The site still lagged like crazy and the images refused to preload, making for a very frustrating experience scrolling the huge expanse of land. My browser crashed a few times during this ordeal.

I switched to a different campaign – tagging Indonesia forest fires and burn marks. I spent around 1.5 hours on this campaign. It was easier to use than the previous one, probably because there is only 1 image to load for each tile. In my time there I explored 3455 tiles (~1700 sq km), tagged 245 objects and achieved consensus on 210 of them. Consensus meant the number of people who agreed with me on my tags. The other feature of the site was progress updates disguised as a form of gamification. For example, when I explored 100 Km2, a notification popped up saying “100 Km2 Super Explorer! You have explored through a full one hundred square kilometers!” I feel that the interface could be made better so users don’t feel irked by the clunkiness. Instead of showing the whole map, they could show users one tile at a time. Also, they could potentially highlight areas that other users have tagged to show examples of what is correct or wrong.

2nd Readings

Showing Support for Marriage Equality on Facebook

I read Malcolm Gladwell’s epic paper on slacktivism (Small Change) and his results are particularly relevant. Slacktivism does not require much effort on the part of users, and that is why it is such an easy and popular way to support a cause. It is an effective way to get people to sign up for a cause. Moreover, such slacktivism harnesses so-called weak ties prevalent in social networks – people do not know each other personally on Twitter for example. To bring about real change, strong ties must be formed. This happens when we know someone personally taking part actively in a cause. When we know someone personally taking part, we are more likely to transition from slacktivism to actual activism.

In defense of “Slacktivism”: The Human Rights Campaign Facebook logo as digital activism

Slacktivism can be helpful if the people who ‘passively’ contribute from the comfort of their homes actually believe in it. However, as it is, many people jump on the bandwagon so that they are not left out, or they want to be seen as ‘liberal’ in the eyes of their friends. This isn’t particularly helpful. However, I surmise that companies treat slacktivism as a form of viral marketing – such word of mouth will further their organization’s aims through awareness – whether it makes a real impact there and then does not matter.

===============================================================================

We Are Dynamo: Overcoming Stalling and Friction in Collective Action for Crowd Workers

The paper discusses the idea of galvanizing crowd workers to band together to further their rights, akin to a virtual workers’ union. Similar to how Uber labels its drivers as contractors, online workers have limited recourse. For example, on Mechanical Turk, requesters often mark work as unsatisfactory even though it might be done well, so as to reduce their cost. In their online community, stalling and friction served to undermine their efforts to establish effective collective action. The solutions the authors propose are useful not only in collective action but also serve as good general guidelines for a community.

The Future of Crowd Work

The paper discusses ways in which online crowd task sites could be made better. Many of the problems crowd workers face are that instructions from requesters are not clear, or that they are not paid for tasks which they completed but which the requesters declare to be not up to scratch. The authors propose creating a career ladder for such crowd work but I feel that this is not feasible. The state of micro-tasks is such that they are meant to be low-paying, non-skilled tasks. The fact that a lot of these involve liking Facebook posts, or even clicking on ads, means that such tasks are not very legitimate after all. 3 years ago, I actually did make my own micro-tasks website, modelled after microworkers.com, which aims to help workers out by reducing fraud performed by the requester, by implementing an API that automatically verifies the work done by the worker, rather than leaving the requester to manually approve it.

2nd Readings

The Mirage of the Marketplace

I believe that Uber was so successful simply because it disrupted the taxi industry, which had remained pretty much static for the past few years. Previously, taxis with their medallions were pretty much entrenched in the transportation industry in the US as the medallions gave them a virtual monopoly. Moreover, in other countries, pricing tiers meant that taxis were not available at certain times of the day. For example, in Singapore, taxis would go into hiding 30 mins before the peak hour rates start, so that they can bide their time until those higher rates start. Instead of alleviating traffic jams, taxis increased congestion during those peak hour periods. Uber came in to solve this problem.

Having said this, Uber is simply using technology to facilitate transportation. Much like how eBay and Amazon used technology to get goods delivered from manufacturers to consumers, Uber works on the basis of delivering transport from drivers to riders. The ‘algorithm’ which the article says is a mirage, is simply used to better the rider experience by improving speeds and satisfaction, even though some cars were simulated.

EXPERIENCE

I took a few Uber rides and asked the drivers why they wanted to drive for Uber. One of them said the flexibility appealed to him. He said he was a night person and liked driving passengers at night. He added that he earned more too, from fetching passengers on red-eye flights from PIT. When I asked if he minded not having any employee benefits, he shrugged it off and surmised that there had to be a trade-off. He added that Uber’s technology has indeed made connecting passengers to drivers easier. When asked about the future, he said he had no plans to change jobs unless legislation forced him to do otherwise.

Another time, the Uber driver said that he intended to use his Uber earnings to get a foothold in the real estate business. His brother bought a few houses from his savings, rented them out (to sketchy people month-to-month) and has a stable income now. He said he does freelance web consulting work at home and whenever a potential passenger requests a ride, he takes it up to get extra income. He also added that he does not like the rigidity a 9 to 5 job entails, and that was why he does freelance work – it even pays better. Finally, he said he hoped that Uber and Lyft would instill some kind of community in their drivers – a lounge for other on-demand drivers to connect would be good.

========================================================================

Using social psychology to motivate contributions to online communities.

This reading discusses two experiments about overcoming social loafing – whether framing uniqueness and benefits, or goal setting, are possible solutions. While uniqueness increases contributions, benefits do not, probably because users have subconscious reasons for contributing (i.e. altruism) and feel that being reminded of benefits reduces their proclivity to contribute. I did not expect this result as, from a web marketing perspective, one would remind potential users about the benefits of signing up. According to Kraut, users might have seen those messages as manipulative and acted opposite to their recommendations simply to preserve their autonomy.

The second study showed that specific goals increased user contributions. What I wish they had explored was whether a very low goal would reduce user contribution. Their lowest goal level was the median number – 8 ratings per week. What if they had intentionally set a very low goal of 2 ratings per week? Would it persuade people who just logged in and did not rate because 8 ratings were too effortful? Would it reduce the ratings of very active users? The conclusion of the study was that group work increased contributions (contrary to hypothesis and what we know about the collective effort model) and very high goals reduced the rate of contribution.

Understanding the motivations, participation and performance of open source software developers: A longitudinal study of the apache projects.

The paper talks about whether various factors influence use-value and status motivations, as well as level of contributions to the field of open source software. I was confused as to what use-value motivation actually meant (and still am) while I know status motivation refers to the motivation derived from knowing one’s status can be elevated in the future. The paper states that use value refers to the contribution to solve a problem of personal use benefit. Does this refer to a case where I contribute to a code project because a feature which I personally find useful was not present, hence my use-value motivation increases? In any case, a part which I found interesting was that being paid to contribute increases intrinsic motivation by virtue of increasing status motivation. This runs contrary to what I learnt in social psychology. When people are paid to do things, they attribute their continued action of that task to the money, and not an intrinsic reason. If one is not paid and continues doing the task, there is no extrinsic motivation (money), hence one will (perhaps falsely) attribute an intrinsic one subconsciously – I like doing it. Hence, it is posited that monetary incentive would reduce the motivation to participate.

Evidence-based social design: Mining the social sciences to build online communities.

Although a tad long, the paper was very insightful, with very actionable strategies for motivating contributions online or offline. With the deluge of information online, it is easy to execute some of the ideas in the paper. For example, many marketing funnel pages employ such strategies by stating a limited time period (“you have 3 hours left to sign up”) or showing that other people have joined, or requests that target people’s interests. As a corollary to the previous reading, this paper states that reward increases contribution if it is a sufficiently high amount; it can be done away with if you are given a small contribution. One point to think about is how or why micro tasks websites are so popular given that the pay rate is so low. Do people unknowingly become so absorbed in the tasks that they spend too long on them? Or do they like contributing to the myriad research studies on MTurk?

==========================================================================

Experimental Study of Inequality and Unpredictability in an Artificial Cultural Market

This was an interesting study, but with a few weaknesses. It is strange that their subjects are mainly teenagers and not adults, yet they argue in their paper that even if they tested it with adults, the overall amount of inequality and unpredictability would be similar. It is a bad assumption to make, because it is possible adults are not swayed by weak cues like download counts. This is something to think about. Additionally, in experiment 2, they ordered the songs by popularity of downloads, with less popular songs very far down the page. Did this choice of presentation influence the unpredictability? I am of the opinion that songs that appeared above the fold will be seen more, and thus downloaded more, while those down the page would not be, thus increasing the Gini coefficient. I believe an additional control case ordering songs with least popular songs at the top of the single column page would allow the researchers to control for page layout bias.

Everyone’s an Influencer: Quantifying Influence on Twitter

This paper concludes that a cost-effective marketing strategy could entail normal users sending out sponsored messages, rather than a very popular celebrity. In addition to their current methodology, one could extend this project by considering the speed at which a URL is shared. A link shared 100 times over a few minutes could be considered to have greater influence than one shared 100 times over a few months. Finally, an interesting point to note is that a significant amount of content on Twitter is actually spam (https://www.fastcompany.com/3044485/almost-10-of-twitter-is-spam), with spam accounts retweeting the same set of links over and over again.
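The speed idea can be sketched in a few lines of Python. This is only my own illustration, not something from the paper: the function name, the one-hour minimum window and the timestamps are all assumptions.

```python
from datetime import datetime, timedelta


def shares_per_hour(timestamps, min_window_hours=1.0):
    """Average shares per hour between the first and last share.

    min_window_hours is a hypothetical floor so that a burst lasting a few
    minutes is not divided by a near-zero time span.
    """
    if not timestamps:
        return 0.0
    span_hours = (max(timestamps) - min(timestamps)).total_seconds() / 3600
    return len(timestamps) / max(span_hours, min_window_hours)


start = datetime(2015, 1, 1)
# 100 shares within about ten minutes versus 100 shares spread over 100 days.
burst = [start + timedelta(seconds=6 * i) for i in range(100)]
slow = [start + timedelta(days=i) for i in range(100)]

print(shares_per_hour(burst), shares_per_hour(slow))
```

Both links were shared exactly 100 times, so a count-based influence measure treats them the same, while the rate separates them clearly.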

Reading, Writing, Relationships:

This study was pretty in-depth and dispelled some of the common misconceptions about how Facebook is disrupting traditional communication methods in a bad way. On the contrary, such social networks can complement existing ones. Also, having more weak ties does not crowd out strong ties, so it can be said that having more friends on Facebook does not mean that one knows fewer people closely. The second research conclusion was that direct communication like messages helps increase tie strength. This conforms to what people would hypothesize – pokes and likes are not meaningful forms of expression. The final conclusion was that losing a job increases weak tie interaction and decreases the satisfaction arising from communicating with strong ties, especially if the latter are not in the same pickle as the former.

===========================================================================

1) I asked my friend to make a Wikipedia edit. He protested that Wikipedia is already comprehensive as it is and that any further edits by him would not stand up to scrutiny. I told him to make an edit anyway, and he told me he edited an article to convert an amount expressed in 1920 dollars into a more recent equivalent. I believe a bot came by and wrote on his talk page that his edit had been reverted, telling him to use the sandbox while he worked on citing his source.

2) I asked my friend to work on Galaxy Zoo and she felt frustrated. Some of the photos she was given had too many tiny galaxies, making it hard to classify them properly. I believe this was a problem faced by many people in class as well.

3) I got one of my friends in Singapore to set up an OKCupid profile since he is a bit lacking in the love department. He had reservations – “What if my classmates see me there? They will think I am desperate”. He put a profile photo of a plate of noodles instead of his real self.

4) I got my friend to read the iTunes legalese. He said no thanks. I showed him this link (http://itunestandc.tumblr.com/) that presented the same in the form of a comic. He enjoyed it.

5) I did a twist on the crowd-working one by asking someone to work on 2Captcha, which is a middleware API for defeating captchas by getting lowly paid crowd workers to solve them manually. I got dissed – imagine earning 75 cents for solving 1000 captchas, with a chance of getting carpal tunnel syndrome.

———————————————————-

Dr. Google will see you now

Like Ryan, I believe that search trends on Google can be useful, but we would do well to be careful not to curve-fit the data we get to our hypotheses and biases. One of the more notable projects at Google Research is the Flu Trends project (https://www.google.org/flutrends/about/), which was recently closed. Because flu-like symptoms are so common, the results consistently overpredicted the severity of flu in regions – and too much of that can lead to paranoia. That being said, ‘nowcasting’, as this strategy is being called, does have a place in economics and predictions.

The Wisdom of Crowd

I trade for a living, and I see this phenomenon of crowd wisdom on the first Friday of every month, when the Non-Farm Payrolls (NFP) report comes around. NFP is the biggest news event reflecting the employment health of the US economy. Every month, people post their NFP predictions on Twitter. One of my favourite things to do is to compile a sample of Twitter users’ predictions and take the average. The averages are actually surprisingly accurate. I guess this prediction model could be improved if I chose only US-based Twitter users, who are more likely to be in tune with the ebbs and flows of the US economy. Another interesting example I can think of offhand is the use of satellite images to capture the parking lots of various (strip) malls. The more cars parked in front of a business compared to the previous year, the more likely it is doing well. Hence, investment firms can have better conviction in buying its stock, even before earnings results are released. This shows the power of crowd wisdom.
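The averaging step is simple enough to sketch in Python. The numbers below are invented for illustration (they are not real NFP forecasts), and reporting the median alongside the mean is my own addition, since a single wild guess can drag the mean around.

```python
from statistics import mean, median


def crowd_estimate(predictions):
    """Summarise a sample of individual predictions into crowd estimates."""
    return {"mean": mean(predictions), "median": median(predictions)}


# Hypothetical NFP guesses in thousands of jobs, including one outlier.
predictions = [150, 180, 210, 195, 400, 175]

estimate = crowd_estimate(predictions)
print(estimate)
```

Filtering the sample to US-based accounts, as suggested above, would simply mean passing a smaller list into the same function.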

Mistaken Analysis

Ahhh, I didn’t read this before I made my first article post, but yes – analysis needs to adapt to changing technological trends. For example, big data analysis is used in Singapore to predict traffic patterns and determine the best traffic light timing for each junction, depending on the road and time of day. If we were to use data from 10 years ago wholesale today, it would definitely not be as useful.

Should Reddit be Blamed for the Spreading of a Smear?

I believe that as a platform, Reddit is not much to blame. The discussion that arose from that subreddit is a form of crowdsourced citizen journalism, a field which attracts a lot of controversy in itself, if you look at how people take things into their own hands in China. The thing is, I believe education is the key to making such platforms work. Just as trolling is commonplace online, supposed facts found online should be taken with a pinch of salt. People who are influencers need to be more discerning about the things they disseminate on their social media accounts, lest they spread half-truths and sully their own reputation.

EXPERIENCE

I perused topics from this list – https://en.wikipedia.org/wiki/Wikipedia:List_of_controversial_issues – and settled on the Historicity of Jesus, which is a hotbed of religious controversy as seen on the talk page. A quick glance at the talk page tells me that meaningful contribution to Wikipedia requires a grasp of the jargon that comes with editing – things like WP:SPS, WP:SELFPUB and WP:DUE are thrown about as though they were the norm. Those, by the way, are shortcuts to Wikipedia’s policies on sourcing and on writing from a neutral point of view. It seems like the main source of contention on the page is the veracity of certain sources, as well as a certain Christian slant which frankly I do not really detect. Also, there has been huge debate on what is history and what is not factual (and hence not considered history).

=======================================================================

Mining Our Reality

This article is pretty outdated in the sense that data sharing between entities has become more secure and standardized. For example, there is a service (I forgot the name, but it is pretty big and used by many businesses) that allows you to get a person’s physical address from his/her email address, without either party disclosing the email address to the other. Both parties run their email addresses through a standard one-way hashing algorithm known to both of them and see if the hashes match up. If there is a hit, they retrieve the other details. In the same vein, many of the internet’s data brokers work the same way. The more egregious ones attempt to link up your real-life activities with your online activities. They do this by offering you membership cards and loyalty coupons so they can track you both in-store and online. Data is being shared, that’s for sure, but there needs to be a line drawn. The service I mentioned at the start of my post is very creepy.

Inferring friendship network structure by using mobile phone data

This study shows the advantages of using mobile phone behavioral data over self-reported data. Increasingly, such features are used to cut down on the cognitive biases inherent in self-reports. For example, Dark Sky, the weather app, uses the pressure sensors in iPhones to crowdsource data on whether it is going to rain in a particular place. Also, Waze, the map app, uses your location data to automatically tell other users if there is a traffic jam on a particular stretch of road, based on how fast you are going. In this respect, data insights from smartphones are able to serve as a proxy for other types of data such as relationships.

Reality Mining: Sensing Complex Social Systems

I’d like to point out that Bluetooth technology has since been revised in view of the privacy concerns of yesteryear. Previously, people could send other people contacts via Bluetooth without the recipient knowing. I created a MIDP app to do such pranks on the trains. Lol. But seriously though, most Bluetooth devices these days do not broadcast themselves publicly anymore – the user has to explicitly say they want to pair their device with another user’s. Nonetheless, I like this study a lot as it took an innocuous piece of technology and used it to identify social patterns. I can see how this is useful in situations like urban planning and crowd control during events. The privacy implications are even more real with the 3m accuracy provided by GPS and GLONASS technology. Moreover, it is easy for apps to get location data on Android phones since the permissions are baked in upon installation.

The Livehoods Project

I found it hard to appreciate the implications of the paper. The clustering algorithm could cluster people’s check-ins into distinct groups, but that’s about it. Without further research like interviews with people in those clusters, not much information can be gathered. Moreover, in any city there are bound to be clusters of folks who make up the identity of an area. I expected that their algorithmic clustering would have highlighted certain distinct features of an area – for example, the flow of people from other regions to Market District to buy groceries. This would be more useful in the sense that we would then be able to see which neighborhoods are underserved in terms of these amenities.

EXPERIENCE

I used to use 4square before it became Swarm, primarily to document places that I have been to in a particular region. So when I was cafe hopping around NYC last winter, I checked in at each place I went. When my friend asked me what to eat in NYC, I just gave her the list of check-ins. Pretty nifty. Over the past few days, I checked in to mostly lecture halls and classrooms. Pretty boring to use on campus actually. But I guess that’s CMU for ya. Anyway, Swarm is so much more gamified than before. And there are more obnoxious ads forced in your face now. I really want to like it, so I will give it a spin during winter break. A few of my friends use it, including my dad, so it is fun to see what they are up to (and where they work and live :O ) Again, if you are checking in to place X every day at 9am and place Y every day at 6pm, people are going to know where you work and live. Use judiciously.

==========================================================

Cooks or Cobblers? Crowd Creativity through Combination

Cooks or cobblers? Cooks take raw materials and conjure a scrumptious meal out of them. But cobblers take something slightly imperfect and mend it to be somewhat functional. Hence, I would have to say this paper strongly proves the point that crowd creativity is like cobbling. The chair, whilst creative (which I think is already pushing the limits of the word’s definition), might not be ergonomic, functional or usable. Also, many users have their own interpretation of what should go into a product, leading to feature bloat. Say you are designing a stool for a person to sit on. Someone might add seat contours so that it fits your butt cheeks snugly, another might want to add an AC below it to cool your bottom during summer, or a deodorizer underneath so your farts won’t stink up the room while seated. Point is, this study centers on the claim that all coordination can be mediated through designs, which I think is a blasé perspective. If you want to see a good example of how such crowdsourcing can work well, take a look at Quirky. I have been a member there for quite some time, and their processes for such work impress me. (https://www.quirky.com/invent)

Redistributing Leadership in Online Creative Collaboration

The author is right in pointing out that Pipeline bears much similarity to GitHub; he argues that Pipeline aims for breadth rather than depth, but GitHub also has very strong administrative tools like the ones Pipeline built, in addition to its world-class version control system. That said, there has been much movement in software development towards depth, aka “niche” areas. You have Foursquare and Swarm as one example, and Facebook’s main app, Messenger, Camera, Groups, Pages and Notify as another, among many other apps that serve only one purpose. There is an article online on why they decided not to integrate all functionality into one single app – those interested can do a quick search to read it. I would say that in terms of task delegation, Pipeline’s tools reduced friction for admins – the main problem they were trying to solve. Yet, I feel that just as GitHub is known for code collaboration and Wikipedia for knowledge editing, perhaps Pipeline should focus on one niche area – image collaboration perhaps? Then they can develop tools such as realtime online editing of images (there are plenty of plugins online to do this) as well as displaying comments on the image canvas.

Science by the Masses

This was a very interesting article even though it is 7 years old. When I started reading it, I doubted that InnoCentive would be successful, because I thought solutions in a technical field like science and technology would be difficult to vet and verify, especially over the web. However, it seems I was wrong. A key factor in their success is that problems in science often require a different perspective, from a person in a field other than the one being researched. This is shown by the conclusion that “the further a challenger was from a person’s field of interest, the more likely they were able to solve it”. Also, such specialized knowledge may be possessed by researchers not employed by the company, and such crowdsourcing allows companies to reach out to a wider audience. Moreover, it is cheaper than traditional research methods – why not harness the collective brainpower of the world’s scientists? Perhaps one day InnoCentive can expand into other domains too, not just science and tech.

=======

DMCA: A Balanced Public Policy Brief

11 years ago, the authors of the paper recognized severe shortcomings arising from the DMCA.

15 Years After Napster: How the Music Service Changed the Industry

This article talks about how Napster forced the industry’s hand in changing how music distribution works. The industry was slow in predicting people’s music needs, and thus Napster, and later iTunes and then Spotify, filled that gap. While I concede that it is copyright infringement, the fact remains that there will still be a big group that will not pay for music no matter how easy any service makes it. This group is content with searching YouTube or Chinese websites to satisfy their music craving, even when popular services are down. We can see this in the recent Popcorn Time takedown – not only did other services take its place, more popped up like the heads of a hydra. This group of users was never a lost revenue stream for companies. In fact, the freemium models of Pandora and Spotify allow artistes and companies to make some money out of this group of people, when previously they could not.

The 2 Teenagers Who Run the Wildly Popular Twitter Feed @HistoryInPics

This is an eye-opening case on the issues of copyright infringement in social media. Even Facebook (the company, not its users) has been accused of piggybacking on the original creators of viral videos, which Facebook users have reuploaded on Facebook. Previously, Facebook embedded YouTube videos in news feeds; while it still does now, its algorithm shows videos uploaded on Facebook’s own platform much more than those on other platforms – to keep eyeballs on its site. Facebook relies on safe harbor protections – it is not responsible for infringements by user-generated content, unless it blatantly ignores DMCA notices from content owners. To this end, I can see the 2 teenagers running @HistoryInPics doing the same to avoid copyright problems – they can simply ask for submissions from their wide swathes of followers. I am sure they would have no lack of submissions. If they receive a violation notice, they simply take the picture down promptly.

Tracking Users for Fun and Profit

Advertising in today’s world has been pretty much accepted by most Internet users. A useful rule of thumb is that if a website is not selling you anything, then you are the product. After all, many websites exist with a profit motive. However, advertising has also spawned a cottage industry in tracking users effectively on both online and offline mediums, and this has caused much concern to other people. The end goal of such tracking is to persuade and influence people to buy more of their products by analyzing their shopping habits. In this short paper, I provide a brief overview on the modes of online advertising and how they attempt to influence people to buy more of what they are peddling.

Social Media Tracking

It is indubitable that Facebook and Twitter have become ubiquitous. They also contain a treasure trove of personal data. Previously, search engines like Google had to infer the age, gender and demographics of a user based on the websites he or she visits. With Facebook, people willingly provide that info when they complete their profile or like pages and links. While there is much advice to fill up your profile with fake personal info, most do not take heed of that because Facebook is ostensibly more useful when you provide real information. After all, who would want other people to wish you happy birthday on your fake birthday? Hence, it is easier for companies like Facebook to target you with more relevant ads. Are you a 21 year old female college student? Perhaps ads for a sale for clothes could be shown. What about an unemployed 40 year old man? Ads purporting get-rich-quick schemes can be displayed. In itself, this might not be a bad thing. However, there should be some concern that such a considerable amount of personal information is concentrated in the hands of a few companies.

In one such case, Facebook intended to partner with other websites to share activities of users on external websites directly on Facebook’s news feed. This service, called Beacon, created an uproar because it did not allow users to opt out of the feature. For example, if one bought a book called Coping with Cancer on Amazon, one’s purchase will show up on other people’s news feeds as “Your friend XXX bought a book “Coping with Cancer” on Amazon. Click here to check it out”. Facebook’s intentions were clear – from persuasion theory, social proof is one way of influencing people to take action on partner websites. Simply put, people are more likely to buy something if their friends bought it. However, they implemented it in a way that third parties were privy to personal information on the Facebook platform. As can be seen, such data sharing among different websites can lead to privacy leaks and unwanted information disclosure.

More worryingly, there are companies such as Lookery that attempt to segment users into various demographics based on their visits across participating websites. In fact, I used Lookery a few years ago and they paid $0.10 per 1000 visitors, which, while seemingly very little, is quite substantial considering they place no ads on your website – only an invisible JavaScript beacon tag is placed. Although they promised that no personally identifiable information is stored or transmitted, prior incidents like the AOL search data leak leave much room for doubt. In that particular case, AOL released detailed search logs of many users for research purposes. Although AOL did not specifically disclose which user queried what terms, it was possible to identify users based on their search terms. In many cases, specific individuals were even identified. It is of concern that the data from such advertising beacons, while in itself anonymized, can be cross-referenced with data from other providers to provide enough information to identify users. Therefore, online advertising can be used to identify users, which is an ethical problem in the broader scheme of persuasion.

The Rise of Data Brokers

The traditional role of data brokers was to provide firms with a way to get demographic data, or to verify personal information in a firm’s database. For example, in the case of fraud prevention in an online shopping portal, firms can cross-reference a potential customer’s email with their shipping address. How it works is that the firm computes a one-way hash of the customer’s email and address and provides it to the data broker, and the latter returns a result indicating if they have an exact match. Neither party has access to the email or physical address that is being queried, since only a one-way hash is being passed, with the hashing algorithm known only to those two parties at that point in time. However, some people are still spooked by the fact that personal data, even though hashed, is still being passed around different companies in the normal course of business. With so many parties involved in a transaction, there are more points of failure where a privacy leak can occur.
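The hash-matching flow described above can be sketched in Python as follows. This is only an illustration under my own assumptions: I use SHA-256 from the standard library, a simple lowercase-and-trim normalisation, and made-up function names and records. I do not know which algorithm or normalisation real brokers use, and a production system would likely add a shared secret, since plain hashes of emails can be guessed by brute force.

```python
import hashlib


def one_way_hash(value):
    """Hash a normalised string; both parties must normalise identically."""
    normalized = " ".join(value.strip().lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def build_broker_index(records):
    """Broker side: store only (email hash, address hash) pairs."""
    return {(one_way_hash(email), one_way_hash(address))
            for email, address in records}


def broker_has_match(index, email_hash, address_hash):
    """Broker side: answer yes/no without seeing the raw values."""
    return (email_hash, address_hash) in index


# Hypothetical broker data and a hypothetical order at the shopping portal.
index = build_broker_index([("jane@example.com", "1 Main St")])

order_email, order_address = " Jane@Example.com ", "1  main st"
print(broker_has_match(index,
                       one_way_hash(order_email),
                       one_way_hash(order_address)))
```

The firm only ever sends the two hashes, and the broker only ever answers whether the pair exists, which is the "exact match" behaviour described in the paragraph above.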

The more contemporary role of data brokers now involves tying real-world and online actions to a single person. For example, if Coca-Cola launches an online advertising campaign on Facebook, measuring its effectiveness would be hard without data brokers. These companies, via deals with multiple retailers, get transactional data on specific products from purchases made in-store, aggregate them and present them to clients to gauge advertising effectiveness. It can work in the opposite way as well – Walmart might want to track and compare the shopping habits of those in-store vis-à-vis those online. In this case, the “bridge” is your loyalty card – if you use your card at a brick and mortar store, then enter it online to enjoy a discount, retailers will know that the online persona is the person registered on your membership card. Data brokers can then help Walmart market products that you are more likely to buy on not just Walmart, but on other websites as well. At this point, your online identity can be associated with a physical one. An example of such a company is Datalogix, which tracks more than $1 trillion in consumer spending across more than 1400 retailers. This data is then passed to data cooperatives, which act as a central clearinghouse for personal information across multiple companies, online and offline, so that it can be homogenized into a machine-readable format and resold to other advertising agencies.

As some critics rightly point out, data brokers operated even before the Internet matured, perhaps even more unscrupulously. If you have written down your details to take part in a sweepstakes or lottery, you have effectively signed away your personal information to a company which purportedly helps facilitate the lottery. In fact, data brokers back then were subject to little regulation. Paid services existed which allowed you to get someone’s address from their phone number, and vice versa, for telemarketing purposes. The modern form is encapsulated in the company Towerdata, which prides itself on finding a person’s social media accounts and physical addresses based on their email addresses. Additionally, its service called Email Intelligence allows interested parties to buy “demographic, interest and purchase data for more effective list segmentation and personalization”. As can be seen, such exchange of personal information is not new, but due to the increased frequency of data leaks, people are more concerned now than in the past.

Conclusion

In sum, online advertising can be seen to infringe on users’ privacy, all for the end of getting people to buy more products. This has very much to do with the ethics discussion we had this week. Previously, advertising was pretty much a benign medium – you searched with a particular keyword, and were shown ads matching that keyword. Now, advertising has gone so far that Walmart and Target are able to know whether you are pregnant, even before you actually conceive – all based on big data analysis of your habits derived from a variety of online and offline sources.

What a world we live in.

Discussion

Here’s an interesting discussion question that allows you to think about the future of advertising.

  1. Do you think privacy and advertising can ever be segregated? That is, can effective advertising happen without breaching users’ privacies? Do such advertising firms exist now?

United Breaks Guitars – Lessons for the Public Relations Playbook

Abstract

The United Breaks Guitars case study shows that social media cannot be overlooked in a company’s customer service operations. Dissatisfaction with the company can propagate like wildfire over the Internet. In the short discussion piece that follows, this author attempts to show the significance of social media in public relations, as well as suggesting how United could have done better amidst this negative publicity.

Introduction

It is indubitable that social media has changed the public relations playbook for many companies. Previously, before the user-generated content Golden Age that Web 2.0 brought about, people who felt aggrieved had little recourse against the company. Now, people are able to tweet their frustrations, write a rant on the company’s Facebook page, or in this case, make a music video. In all cases, the actions taken by them are highly visible. Contrast this to a private email complaint to the company – the latter is definitely less visible and subject to less public scrutiny. In addition to increased visibility, the complaints are also able to be propagated much more quickly. Case in point – when someone uploaded a video of a FedEx delivery person throwing his fragile packages into the porch, that video was shared thousands of times over various social media platforms. Taken together, social media is a potent force to be reckoned with in a company’s public relations playbook – one that requires a different strategy than the one of yore to counteract.

For ease of discussion, I will segment my analysis of the case into distinct time periods in the life cycle of a public relations incident – pre-incident, during incident and post-incident. This allows the reader to gain a perspective on what a company should do in a situation like the one United Airlines was in.

Calm before the Storm

Since social media is such a powerful voice online, the attention given to it should be commensurate with its influence. In the case of United, we see that there was glaringly ineffective use of social media. In July 2009, United Airlines used Twitter only to disseminate promotional messages and flight disruption notices to its followers. While some followers might like the fare deals posted on its feed, United could have taken a more proactive approach on social media. A cursory glance shows that tweets that mention a company on Twitter are usually negative. For United, this would mean complaints about delays, rude ticket agents, inefficiency and the like. There was no effort on United’s part to address these complaints. Today, the situation on the ground (pun intended) has improved drastically, with United actively responding to customers’ complaints on Twitter and inviting them to message the airline so that it can investigate further. In one instance, someone reported an issue with the gate number on their e-ticket. United replied telling them to send a message so that it could ask the Mobile Apps team to rectify this. This is a step in the right direction. Moreover, most of the replies to complaints took less than an hour (minutes in fact), which is commendable. In a time-sensitive business like flights, customers expect their concerns to be addressed swiftly [1]. Customers use social media to air their complaints because it is easy and fast to do so; likewise, they expect a response to come quickly too. As can be seen, United Airlines previously used social media ineffectively.

Next, it would seem like United did not have dedicated personnel for their social media presence. From the paper, “United employees were encouraged to monitor social media for mentions of United Airlines”. Crowdsourcing was a good step by United so that each staff member has a stake in addressing issues concerning the company. This was instrumental (pun unintended) in the early spotting of Carroll’s video, and the subsequent outreach to him by the managing director of customer solutions at United. Without this policy in place, the video might have been seen only after all the mainstream media outlets had reported on it, which could have been even more disastrous. Whilst such crowdsourcing is not a replacement for full-time social media service staff, companies that opt for this route can improve on it further by providing incentives for staff to report unaddressed customer incidents to the main customer service team. All in all, companies will do well to have dedicated customer service agents to address social media issues.

The Aftermath

Of course, the United debacle would not have happened had the United agent approved his claim in the first place. In this part, we take a look at how United handled the social media firestorm. Firstly, United was vague in their Twitter reply to Carroll. While this might have been an off-the-cuff reply, there was no subsequent follow-up on what exactly they did to “make it right”. It was not until after they had reached out to him that they detailed their poor response to his claims. Notice how Rob Bradford reached out to him. He is the managing director of customer solutions at United. What United should have done was to get the CEO of United to reach out to him instead. Since his complaints had been seen by millions of people, the CEO should have apologized, not anyone further down the corporate ladder. This suggests that United was not serious about the matter. Secondly, it would seem that United had taken to tweeting more actively to tell people about its solutions. Perhaps a better approach would be to embrace traditional media and online news sites by writing a press release. They could also have placed a statement on their corporate website or made one at a shareholders’ meeting. Thirdly, the response by United was weak because they did not address any punitive actions that United would take should such an incident happen again. They only promised to use the video in training materials, but failed to communicate to customers how customer service would be overhauled. For example, one can take a look at how Amazon does customer service. While there are customer agents at every step of the way, one can email the head of Amazon, Jeff Bezos, directly at his personal email – jeff@amazon.com. [2] This shows sincerity in reacting to complaints. United could follow in Amazon’s footsteps if they really want to go all the way in this respect. A highly reactive, from-the-ground-up approach would be to fire employees who lack customer service discretion. However, this has the knock-on effect of decreasing employee satisfaction, so the pros and cons definitely need to be weighed. Hence, one can see that United not only exhibited a weak response during the incident, it also failed to show its customers its sincerity in improving its services after the fact. Sometimes, from the customer’s point of view, education of frontline employees is not enough; they want to see a hardline stance on egregious actions by employees.

 

Conclusion

In conclusion, United’s social media action plan seems to have improved since the article was written. The key takeaway is that companies should use social media not only to produce content, but also to take in feedback from their followers. They also need to address such feedback promptly, with dedicated personnel to handle it rather than leaving it to everyone. If a public relations disaster does occur online, companies should get their top executive to address the issue personally, especially once it has blown up to such proportions. In communicating with the public, companies should demonstrate sincerity in changing their practices by thinking from the customer’s perspective. The best solutions involve the customer directly, for example by letting customers contact the CEO. Strategies like educating customer service employees are likely to be seen as impersonal, since customers are not privy to improvements made behind the scenes.

References

  1. http://www.convinceandconvert.com/social-media-research/42-percent-of-consumers-complaining-in-social-media-expect-60-minute-response-time/
  2. http://gizmodo.com/jeff-bezos-if-you-have-a-problem-with-amazon-email-me-1724561248

Is Crowd Wisdom Really Wise?

With all the talk in recent years about crowdsourcing and big data, I wanted to delve a little deeper into the costs of crowd wisdom. It is easy to see how crowd wisdom can be beneficial to society as a whole; for example, there are many crowd volunteering websites that aggregate crowd decisions for social good. One of them is Tomnod, which rose to fame when it used satellite imagery to crowdsource the search for the missing MH370 plane. However, in subtle ways, crowd wisdom might not be all that good.

In one of the more telling failures of crowd wisdom (herd mentality in particular), Sunil Tripathi was implicated as a suspect in the Boston Marathon bombings because a Reddit user posted in the forums that he resembled one of the suspects. Soon after, more people started agreeing with the first poster, and the claim spread like wildfire over Twitter. It gained mainstream notoriety when mainstream media reported it on TV and in newspapers. The lack of fact-checking by the latter was disturbing. However, this shows one thing: crowd wisdom can turn mad. Critics might point out that crowd wisdom works best when people are independent, that is, when they have no knowledge of others’ ideas. While this can be replicated in a laboratory setting, it is difficult to ensure online. People will invariably use search engines or share things on Facebook, which leads to a decoupling between reality and ‘reelity’. This is what makes crowd wisdom a dangerous idea to believe in wholeheartedly.

Secondly, there is the issue of privacy. In recent years, due to the proliferation of smartphones with embedded sensors, app developers have been using those sensors to predict weather or traffic. Dark Sky, a weather app on the Apple App Store, uses pressure sensors on the latest iPhones to crowdsource data on whether it is going to rain in a particular region. Waze, a traffic directions app, uses your phone’s speed and location to crowdsource data on traffic conditions. If you are moving slowly along a particular road, the app may tell other road users that a traffic jam might be imminent, and reroute them accordingly. This seems beneficial until you realize that the same kind of data can be used for nefarious purposes. Tinder, a dating app, previously used your location to tell potential dates how far away you were. Whilst other people saw you as “3 miles away”, a vague sense of distance, some tech-savvy people figured out how to inspect the app's network traffic to find your exact coordinates. It turned out that the actual coordinates were being sent to the other person's phone, which then combined them with its own location to work out the distance. Hence, it was possible to know where your potential ‘dates’ lived, and possibly to stalk them. This is a design flaw more than anything else, but it still shows a weakness of crowd wisdom: when companies attempt to collect everything that it is possible to gather from a user, the possibility of irresponsible disclosure is heightened.
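
To make the flaw concrete, here is a minimal Python sketch of what the client side of such a design might look like. This is not Tinder's actual code. The function names are hypothetical, and the haversine formula is simply my assumption for how a distance could be computed from two pairs of coordinates.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def displayed_distance(my_pos, other_pos):
    """Hypothetical client-side step: the phone already holds the other
    user's exact coordinates and only rounds the result for display."""
    miles = haversine_miles(my_pos[0], my_pos[1], other_pos[0], other_pos[1])
    return f"{round(miles)} miles away"
```

The rounding happens only at display time. Anyone who can read the response before that step sees the exact coordinates, which is why computing the distance on the server and sending only the rounded figure would be the safer design.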

Thirdly, crowd wisdom might not even be accurate at all. As seen in class, the more variables there are in a particular prediction, the less accurate it becomes. For example, estimating the depth of the Mariana Trench in leagues requires people to have some notion of how long a league is, in addition to estimating the depth of the trench. A more popular example that deals with big data is Google Flu Trends, which was recently shut down. Although it wowed researchers by predicting flu trends ahead of full-scale epidemics, it consistently overstated their severity. People were too prepared, it seems. Whilst this might not be a bad thing, I can see the flaws of trying to extract information from crowd data: it might be fairly accurate now, but might not be a few years down the road.
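
The trench example can be illustrated with a small simulation. This is only a sketch under assumptions of my own: each person's guess of the depth and of the length of a league is off by an independent random factor, the noise level is made up, and the "true" values are rough figures (a league is taken as 3 nautical miles).

```python
import math
import random
import statistics

TRUE_DEPTH_M = 11_000   # rough depth of the trench in metres (approximate)
TRUE_LEAGUE_M = 5_556   # one league taken as 3 nautical miles (assumption)
NOISE = 0.3             # assumed spread of each guess on a log scale


def simulate(n_people, seed=0):
    """Return the spread (std dev of log error) of a one-variable estimate
    (depth in metres) and a two-variable estimate (depth in leagues)."""
    rng = random.Random(seed)
    one_var, two_var = [], []
    true_leagues = TRUE_DEPTH_M / TRUE_LEAGUE_M
    for _ in range(n_people):
        depth_guess = TRUE_DEPTH_M * rng.lognormvariate(0, NOISE)
        league_guess = TRUE_LEAGUE_M * rng.lognormvariate(0, NOISE)
        one_var.append(math.log(depth_guess / TRUE_DEPTH_M))
        two_var.append(math.log((depth_guess / league_guess) / true_leagues))
    return statistics.pstdev(one_var), statistics.pstdev(two_var)


spread_metres, spread_leagues = simulate(5000)
print(f"metres: {spread_metres:.3f}  leagues: {spread_leagues:.3f}")
```

Under these assumptions the answers in leagues are more scattered than the answers in metres, because the two independent errors add up. The crowd average can still help, but every extra unknown widens the spread it has to average away.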

Finally, I would like to end with some food for thought. How do we ensure that crowd wisdom is indeed accurate? We might look at the intricate system Wikipedia has in place for its editing: talk pages, editorial processes and the like. Is it possible to generalize this to other online communities and spaces?

Just an ordinary guy, living in an extraordinary world