Mechanical Turks, Amazon's Mechanical Turk, web services, blog aggregators, web searches, screen scraping, data and information, copyright and attribution… With the rise of “easy access to information” comes a relatively new, still untapped solution vertical: Answers…
Amazon offers its Mechanical Turk API; Google and Yahoo! both offer Answer products, which in turn rely on people to find and post the answers. A number of startups such as AskMeNow have emerged in this space as well.
And with these new answer solutions also come concerns related to copyright, attribution and compensation. Let me give you an example. If I hit the web and ask the question “What is an answer?”, I get back:
To “reply or respond to”, to “give the correct answer or solution to”.
The above are the answers I got back from a Google search.
But whose definition is this? I see no attribution other than “definitions on the web”. Someone spent the time to define and write this information. So why isn't there an attribution? Is that fair use? I am not a lawyer so I can't really say, but as an author I will say that the use of information without attribution is of great concern.
Content is king. And creating good and accurate content is not only hard, but it is expensive in many ways. And the web provides a shortcut to ask and find information and answers. And I am afraid the attribution problem will get worse with the new wave of Answer products that are coming out…
The following diagram shows the elements of a possible Answers engine:
Where… we have the source for most of the information/answers: the Web. The diagram also illustrates possible ways to extract information from the web: screen scraping, web searches, Mechanical Turks… Some “answer engines” may even implement a cache for performance purposes. You can see how information can easily be extracted from the web, without proper attribution and compensation. The local cache may even store this information without permission. This is a huge problem, because authors may be relying on users visiting their websites to generate revenue through advertising.
Everyone is affected… from bloggers, to authors of books and technical papers, to Wikipedia… to anyone who provides content. The end-user benefits from this, as do the answers companies and the people behind the mturks who research (Google) the information and deliver it. But at whose expense? At the authors' and the content providers' expense. Bloggers and other content providers already have to battle blog aggregators and others that cannibalize without any respect. The next battlefront might be against the answers companies.
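To make the cache concern concrete, here is a minimal sketch of how an answer engine's cache could carry attribution along with every answer it stores. This is only an illustration under my own assumptions: the names `Answer`, `AnswerCache` and `render` are hypothetical, the URL is a placeholder, and no real answers product is known to work this way.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Answer:
    """An answer together with where it came from (hypothetical fields)."""
    text: str
    source_url: str  # page the text was extracted from
    author: str      # who wrote the original content


class AnswerCache:
    """A tiny in-memory cache that refuses to store unattributed answers."""

    def __init__(self) -> None:
        self._store: Dict[str, Answer] = {}

    def put(self, question: str, answer: Answer) -> None:
        # Reject anything that arrives without a source and an author.
        if not answer.source_url or not answer.author:
            raise ValueError("refusing to cache an answer without attribution")
        self._store[question.strip().lower()] = answer

    def get(self, question: str) -> Optional[Answer]:
        # Questions are matched case-insensitively, ignoring outer whitespace.
        return self._store.get(question.strip().lower())


def render(answer: Answer) -> str:
    """Format an answer so that the credit always travels with the text."""
    return f"{answer.text} (source: {answer.author}, {answer.source_url})"


if __name__ == "__main__":
    cache = AnswerCache()
    cache.put(
        "What is an answer?",
        Answer("To reply or respond to.", "https://example.com/definitions", "Example Author"),
    )
    hit = cache.get("what is an answer?")
    if hit is not None:
        print(render(hit))
```

The point of the design is simply that the attribution is part of the cached record itself, so it cannot be dropped on the way from the web to the end-user. Compensation is a separate problem that this sketch does not address.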
Since I don't have an answer to how to enforce attribution and compensation, I will be modifying my weblog's legal terms to not allow the extraction of information by Mechanical Turks, or screen scraping, or blog aggregators, or any other method that extracts information without proper attribution and compensation.
Web 2.0 is about collective use and collaboration… And referring to is not the same as extracting and cannibalizing from… Answers companies that use other people's content, directly or indirectly, must do the fair thing: follow the terms of use, give proper attribution and show the money!
Easy, fast, immediate access to answers has great revenue-generating potential. And it is of great benefit to end-users. On mobile handsets we need such a solution: a cheaper one that competes against the expensive (carrier-based) 411. All I ask is for fair use of information on the web.