Crowdsourcing


What is crowdsourcing?


According to the Oxford English Dictionary, crowdsourcing is the practice of obtaining information or input into a particular task or project by enlisting the services of a number of people, either paid or unpaid, typically via the Internet. Although the definition says "typically via the Internet", many other forms of crowdsourcing exist. Since this project uses the Internet for connectivity, the definition above applies here.
Crowdsourcing systems can usually be divided into two categories: implicit and explicit [1]. The explicit category includes evaluating, sharing and networking systems, as well as systems for building artifacts and executing tasks [1]. Evaluating systems allow users to perform tasks such as rating, voting and tagging. The Internet Movie Database (IMDb) [2] is an evaluating system in which users rate movies on a one-to-ten scale. In return, users enjoy benefits such as movie recommendations and the ability to track watched movies, while the site vendor can run analytics on the crowdsourced ratings by gender, age group, country, etc. Websites like YouTube [3] and Flickr [4] are examples of crowdsourced sharing systems: all their content is added by users who wish to share it with the community. Social networking applications like Facebook [5] fall into the networking category. Open source software is a good example of crowdsourcing used to create or improve artifacts. Amazon Mechanical Turk [6] is a very popular crowdsourcing platform on which users complete tasks, usually in return for payment. These tasks range from simple ones, such as surveys, to complex ones, such as analyzing satellite images.
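As a concrete illustration of the analytics that an evaluating system enables, the following sketch averages one-to-ten scores per demographic attribute. The ratings data is entirely hypothetical:

```python
from collections import defaultdict

# Hypothetical user ratings on a one-to-ten scale, tagged with demographics.
ratings = [
    {"movie": "Inception", "score": 9, "age_group": "18-24"},
    {"movie": "Inception", "score": 7, "age_group": "18-24"},
    {"movie": "Inception", "score": 8, "age_group": "25-34"},
]

def average_by(ratings, key):
    """Average score per value of a demographic attribute."""
    groups = defaultdict(list)
    for r in ratings:
        groups[r[key]].append(r["score"])
    return {k: sum(v) / len(v) for k, v in groups.items()}

print(average_by(ratings, "age_group"))  # {'18-24': 8.0, '25-34': 8.0}
```

The same grouping function works for any stored attribute (gender, country, etc.), which is why such systems collect demographics alongside the ratings themselves.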
In addition to these explicit crowdsourcing systems, there are many situations where crowdsourcing happens implicitly. reCAPTCHA [7] is a prime example. It presents two distorted words that must be identified and typed correctly before the user can proceed with a task on a web page. Although the main objective of a CAPTCHA puzzle is to reduce spam, reCAPTCHA also advances the digitization of books: words that cannot be read by computers are sent to the web in the form of CAPTCHAs for humans to decipher.
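The mechanism can be sketched as follows. This is a simplified, hypothetical reconstruction, not reCAPTCHA's actual implementation: each challenge pairs a word with a known answer (the control) with an unknown word from a scanned book; a transcription of the unknown word is trusted only when the control is answered correctly, and the word counts as deciphered once enough users agree.

```python
from collections import Counter

def accept_response(control_answer, control_truth, unknown_answer, votes):
    """Record a transcription of the unknown word only when the user
    also solved the control word correctly."""
    if control_answer.strip().lower() == control_truth.lower():
        votes.append(unknown_answer.strip().lower())

def decipher(votes, threshold=3):
    """Return the majority transcription once enough agreeing votes
    have accumulated, otherwise None."""
    if not votes:
        return None
    word, count = Counter(votes).most_common(1)[0]
    return word if count >= threshold else None

votes = []
accept_response("house", "house", "Liber", votes)   # trusted
accept_response("house", "house", "liber", votes)   # trusted
accept_response("hovse", "house", "fiber", votes)   # control failed, ignored
accept_response("house", "house", "liber", votes)   # trusted
print(decipher(votes))  # liber
```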
The most important part of a crowdsourced system is motivating people to participate. People usually expect personal benefits in return for the effort and time they invest in a crowdsourcing activity. These benefits can be monetary or take other forms. Platforms like Amazon Mechanical Turk pay the people who complete their crowdsourcing tasks, whereas on Stack Overflow [8], a question-and-answer site for programmers, users answer questions to increase their reputation score.

The requirement for crowdsourcing


This research mainly focuses on explicit perception sharing. Instead of relying on passive perception-capturing and analysis techniques such as sentiment analysis, this platform allows people to explicitly share their thoughts and emotions. People are usually reluctant to take part in such tasks, since doing so costs effort, time and money (for example, users pay for network bandwidth). The proposed system must therefore use crowdsourcing techniques to make people actively participate in explicit perception sharing.
Much recent research has shown that crowdsourcing is well suited to human subject research. Since perception analysis deals with the human mind, crowdsourcing is an ideal solution for it. Before crowdsourcing became popular, many human subject researchers ended up generalizing about broad populations from very few individuals [9], and most of the time those individuals belonged to the same or similar social groups. Crowdsourcing breaks this barrier and provides access to a huge population of people who are willing to participate in web-based or mobile-based tasks at their own convenience [9]. The rapid development of the Internet and other communication technologies has made this even more viable [10].
Several recent studies have attempted to use crowdsourcing in some phases of the perception analysis process. Even though their techniques cannot be applied directly to the proposed system, many lessons can be learned from them. Surveys are one of the most basic methods of capturing perceptions, and a study on the viability of crowdsourcing for survey research concluded that crowdsourced data is of comparable quality to conventionally collected data and more diverse [11]. Beyond surveys, many crowdsourced perception-capturing systems are available on the web; OpinionLab [12], Kampyle [13] and IdeaScale [14] are some examples. However, the functionality of all these systems is limited to capturing perceptions about the content of the websites into which they have been integrated.
The development of smartphone technologies has made crowdsourcing applicable to even more situations. Research has been conducted to measure the effectiveness of conducting surveys through smartphones [15]. The primary advantage of smartphone surveys is that they can be tailored to the user's situation. For example, if the user is in a theater, questions asking whether they are enjoying the movie can be pushed to their smartphone. There are three kinds of approaches: via mobile browsers, as separate apps, and as web apps that appear as native apps [16]. Voting is another area to which mobile crowdsourcing can be applied effectively. Mobile voting systems have been shown to provide greater effectiveness in voting for smartphone owners than the voting systems in common use today [17]. Ortag and Huang [18] have tried to collect location-based emotions relevant to pedestrian navigation via mobile crowdsourcing. By doing so, they could aggregate the mental maps (a person's subjective image of the world) of many people to provide guidelines or tips for pedestrian navigation. They also plan to store the crowdsourced emotional data in an open database that allows anyone to use it for various purposes.
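The context-tailoring idea, pushing questions that match the user's current situation, can be sketched minimally as follows (the contexts and questions are invented for illustration):

```python
# Hypothetical mapping from a detected user context to tailored survey questions.
CONTEXT_QUESTIONS = {
    "theater": ["Are you enjoying the movie?"],
    "restaurant": ["How would you rate your meal so far?"],
}

def questions_for(context):
    """Return survey questions tailored to the user's current context;
    an empty list means no prompt should be pushed."""
    return CONTEXT_QUESTIONS.get(context, [])

print(questions_for("theater"))  # ['Are you enjoying the movie?']
```

In a real deployment the context would come from sensors or location data rather than a string label, but the selection logic is the same.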
Crowdsourcing has also been used for analytical purposes. OPTIMISM [19], an opinion mining system for Portuguese politics, has used crowdsourcing to determine which entities in a complex opinion article are mentioned in a positive or negative way. Templeton, Fleischmann and Boyd-Graber have used crowdsourcing via Amazon Mechanical Turk to map the sentiment of text onto a one-to-five scale [20]. In addition to these purely crowdsourced analyzers, there are hybrid approaches, such as systems that use crowdsourcing to train machine learning models for sentiment analysis [21].
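When several workers label the same text on a one-to-five scale, their labels have to be combined into a single score. One simple and common aggregation (an illustrative choice, not necessarily the one used in [20]) is the median, which is robust to an occasional careless label:

```python
import statistics

def aggregate_sentiment(labels):
    """Combine several workers' one-to-five sentiment labels into one score.
    The median resists a single careless or spam label better than the mean."""
    return statistics.median(labels)

# Four careful workers and one outlier; the outlier barely matters.
print(aggregate_sentiment([4, 5, 4, 1, 4]))  # 4
```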

Challenges in crowdsourcing


The key challenge for crowdsourced systems is persuading users to participate. Social networking sites such as Facebook [5] and Twitter [22], which have millions of users, have drawn most of them by word of mouth. Word of mouth spreads through a range of media that potential users respond to, with little direct influence from the owners, administrators or designers of these sites [23]. Situations in which broadcasting and crowdsourcing merge in this way are referred to as crowdcasting [23]. Since a perception analysis system requires a very large subject pool to function correctly, this kind of approach suits it well.
Another important challenge in crowdsourcing is protecting users' privacy. Some users hesitate to participate in crowdsourcing programs because of privacy concerns, while others want their contributions made public, for reasons such as building reputation. Thus, when designing a crowdsourced system, public and private data have to be identified correctly, and it is important to let users customize their privacy settings.
Tracking bad behavior is critical for maintaining the quality of the collected data. Bad behavior includes unexpected user actions such as spamming, submitting fake inputs and manipulating the system, and both general and system-specific precautions have to be taken to prevent them. Ethical and legal considerations also have to be weighed thoroughly when designing a crowdsourced system, since such systems deal with humans and their behavior. For example, if we store users' locations and those are not publicly visible, we have a legal obligation to protect that data. It is likewise unethical to track user behavior without their knowledge.
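One such system-specific precaution can be sketched as follows, assuming submissions arrive as (user, timestamp, answer) records: flag contributors who repeat the same answer or answer implausibly fast. A real system would combine several heuristics like this one:

```python
def flag_spammers(submissions, min_interval=2.0):
    """Flag contributors who submit identical consecutive answers or
    answer faster than min_interval seconds (likely automated input).
    submissions: iterable of (user, timestamp_seconds, answer) tuples,
    ordered by time."""
    flagged = set()
    last = {}  # user -> (timestamp, answer) of previous submission
    for user, ts, answer in submissions:
        if user in last:
            prev_ts, prev_answer = last[user]
            if answer == prev_answer or ts - prev_ts < min_interval:
                flagged.add(user)
        last[user] = (ts, answer)
    return flagged

subs = [
    ("alice", 0.0, "good"), ("alice", 10.0, "bad"),   # plausible
    ("bob",   0.0, "x"),    ("bob",   0.5, "y"),      # too fast
    ("carol", 0.0, "same"), ("carol", 20.0, "same"),  # duplicate answer
]
print(flag_spammers(subs))  # {'bob', 'carol'}
```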
In any kind of crowdsourcing system it is important to know the demographic makeup of the users. Researchers can do their work better if they understand the sampling bias of the population participating in their studies [9]. As with demographics, people's motivations for participating should be researched separately to gain a better understanding. When designing a crowdsourcing system, the knowledge and skills of the target users have to be identified correctly; otherwise users may hesitate to use the system, or the quality of the gathered data will be low.
As discussed previously, mobile crowdsourcing is gaining popularity and is very suitable for perception analysis, but it brings challenges of its own. Unlike desktop computers, smartphones run on battery power, and their network connectivity is more expensive than wired LAN connectivity unless they are connected to a Wi-Fi network. Protecting privacy is also crucial: because mobile phones are closely tied to users' lives, they hold a great deal of information that can affect a user's personal life. A user will therefore not be interested in participating in mobile phone sensing unless they receive a satisfying reward that compensates for the resource consumption and potential privacy breach [24]. This reward can be money, information, entertainment or anything else the user appreciates. Another problem of mobile-phone-based crowdsourcing systems is inaccuracy. For example, GSM-network-based positioning, with city- and district-level tracking, may not give an accurate enough location; it cannot, for instance, distinguish between home and the shop nearby, or the office and the lunch cafe [25].
Human factors also have to be considered when designing mobile-phone-based systems. While mobile phone technology is increasingly familiar to people in the developed world, not all users are comfortable or familiar with smartphones; many subscribers use only the most basic functionality and simple phones [25]. When designing a perception analysis system, these users have to be taken into account too. Smartphone-based research also usually assumes that the user always carries the phone, but in some situations it may be left behind by choice or accident, so detecting such situations is essential for maintaining accuracy.
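Detecting that a phone has been left behind can be approached heuristically, for example from accelerometer readings: a phone showing almost no movement over a long sampling window is probably stationary. This is only an illustrative sketch; the sample values and threshold are assumptions:

```python
import statistics

def phone_left_behind(accel_magnitudes, threshold=0.05):
    """Heuristic: near-zero variance in accelerometer magnitude over a
    sampling window suggests the phone is lying still and may have been
    left behind. accel_magnitudes: list of readings from one window."""
    return statistics.pvariance(accel_magnitudes) < threshold

print(phone_left_behind([1.0, 1.0, 1.001, 1.0]))  # True  (phone lying still)
print(phone_left_behind([0.2, 1.5, 0.1, 2.0]))    # False (phone being carried)
```

A production system would also consult other signals, such as charger state or time of day, before discarding samples from a suspect window.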
It also has to be taken into account that people are not equally capable of participating in all situations. If the user is actually mobile at the time of a sample, he or she may have extremely limited attention and cognitive resources [26]. Even when not mobile, it may be awkward to fill out a questionnaire in a social situation. In these cases, answering may either be postponed to a later time, with a decrease in accuracy, or not happen at all. The system therefore has to be designed in such a way that it does not interfere with users' lives but assists their lifestyle.
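The postponement strategy described above reduces to a simple scheduling rule; the context flags and deferral interval below are illustrative assumptions:

```python
def schedule_prompt(now, is_mobile, in_social_situation, defer_minutes=30):
    """Return the epoch time (seconds) at which to show the questionnaire:
    defer it when the user is moving or in an awkward social situation,
    otherwise prompt immediately."""
    if is_mobile or in_social_situation:
        return now + defer_minutes * 60  # retry after the deferral interval
    return now

print(schedule_prompt(1000, True, False))   # 2800  (deferred 30 minutes)
print(schedule_prompt(1000, False, False))  # 1000  (prompt now)
```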

References

  1. A. Doan, R. Ramakrishnan, and A. Y. Halevy, “Crowdsourcing systems on the world-wide web,” Communications of the ACM, vol. 54, no. 4, pp. 86–96, 2011.
  2.  “IMDb - Movies, TV and Celebrities.” [Online]. Available: http://www.imdb.com/. [Accessed: 09-Apr-2013].
  3. “YouTube.” [Online]. Available: http://www.youtube.com/. [Accessed: 21-Apr-2013].
  4. “Welcome to Flickr - Photo Sharing.” [Online]. Available: http://www.flickr.com/. [Accessed: 21-Apr-2013].
  5. “Facebook.” [Online]. Available: http://www.facebook.com/. [Accessed: 21-Apr-2013].
  6. “Amazon Mechanical Turk - Welcome.” [Online]. Available: https://www.mturk.com/mturk/. [Accessed: 15-Apr-2013].
  7. “reCAPTCHA: Stop Spam, Read Books.” [Online]. Available: http://www.google.com/recaptcha. [Accessed: 21-Apr-2013].
  8. “Stack Overflow.” [Online]. Available: http://stackoverflow.com/. [Accessed: 21-Apr-2013].
  9. L. Schmidt, “Crowdsourcing for human subjects research,” in CrowdConf’10 Proceedings of the 1st International Conference on Crowdsourcing, 2010.
  10. S. D. Gosling, C. J. Sandy, O. P. John, and J. Potter, “Wired but not WEIRD: The promise of the Internet in reaching more diverse samples,” Behavioral and Brain Sciences, vol. 33, no. 2–3, pp. 94–95, 2010.
  11. T. S. Behrend, D. J. Sharek, A. W. Meade, and E. N. Wiebe, “The viability of crowdsourcing for survey research,” Behavior research methods, vol. 43, no. 3, pp. 800–813, 2011.
  12. “OpinionLab | Omnichannel Voice of Customer Feedback | Digital Feedback Management.” [Online]. Available: http://www.opinionlab.com/. [Accessed: 03-Apr-2013].
  13. “Feedback Form - Kampyle.” [Online]. Available: http://www.kampyle.com/. [Accessed: 30-Apr-2013].
  14. “Idea Management - Innovation Management - Crowdsourcing - Suggestion Box - Customer Feedback - IdeaScale.” [Online]. Available: http://ideascale.com/. [Accessed: 21-Apr-2013].
  15. M. Millar and D. A. Dillman, “Encouraging Survey Response via Smartphones,” Survey Practice, vol. 5, no. 3, 2012.
  16. T. D. Buskirk and C. Andres, “Smart Surveys for Smart Phones: Exploring Various Approaches for Conducting Online Mobile Surveys via Smartphones*,” Survey Practice, vol. 5, no. 1, 2013.
  17. B. A. Campbell, C. C. Tossell, M. D. Byrne, and P. Kortum, “Voting on a Smartphone Evaluating the Usability of an Optimized Voting System for Handheld Mobile Devices,” Proceedings of the Human Factors and Ergonomics Society Annual Meeting, vol. 55, no. 1, pp. 1100–1104, Sep. 2011.
  18. F. Ortag and H. Huang, “Location-based emotions relevant for pedestrian navigation,” in Proceedings of the 25th international cartographic conference, Paris, 2011.
  19. M. J. Silva, P. Carvalho, L. Sarmento, E. de Oliveira, and P. Magalhaes, “The design of OPTIMISM, an opinion mining system for portuguese politics,” New Trends in Artificial Intelligence: Proceedings of EPIA, pp. 12–15, 2009.
  20. T. C. Templeton, K. R. Fleischmann, and J. Boyd-Graber, “Comparing values and sentiment using Mechanical Turk,” in Proceedings of the 2011 iConference, 2011, pp. 783–784.
  21. A. Brew, D. Greene, and P. Cunningham, “Using crowdsourcing and active learning to track sentiment in online media,” ECAI 2010, pp. 145–150, 2010.
  22. “Twitter.” [Online]. Available: https://twitter.com/. [Accessed: 21-Apr-2013].
  23.  A. Hudson-Smith, M. Batty, A. Crooks, and R. Milton, “Mapping for the masses accessing Web 2.0 through crowdsourcing,” Social Science Computer Review, vol. 27, no. 4, pp. 524–538, 2009.
  24. D. Yang, G. Xue, X. Fang, and J. Tang, “Crowdsourcing to smartphones: incentive mechanism design for mobile phone sensing,” MobiCom’2012, 2012.
  25. M. Raento, A. Oulasvirta, and N. Eagle, “Smartphones An Emerging Tool for Social Scientists,” Sociological Methods & Research, vol. 37, no. 3, pp. 426–454, Feb. 2009.
  26. A. Oulasvirta, R. Petit, M. Raento, and S. Tiitta, “Interpreting and acting on mobile awareness cues,” Human–Computer Interaction, vol. 22, no. 1–2, pp. 97–135, 2007.