Wikipedia and the Credibility of Online Information

Feb 10


Sam Vaknin

The Wikipedia was touted as the greatest reference work in history. A collaborative effort of contributors and editors across time and space, it bloated into hundreds of thousands of articles on subjects both deserving and risible. Anyone with a connection to the Internet and a browser can edit the Wikipedia, regardless of his or her qualifications to do so.


Events in 2005-6 exposed the underbelly and weaknesses of this mammoth enterprise. Entries are routinely vandalized; libel and falsities often find their way into existing articles as a way to settle scores, manipulate public opinion, or express outrage.

The prestigious magazine "Nature" studied Wikipedia articles on the sciences and found them similar in quality to peer-reviewed and edited encyclopedias. Indeed, the problems cluster around the entries that deal with the softer edges of the human experience (where everyone feels qualified to comment and edit): the social "sciences", the humanities, arts and entertainment, politics, current affairs, celebrities, and the like. It is there that "edit wars" and thrashing are most rife. The result is that close to 90% of the Wikipedia contains highly dubious material and attracts the least qualified "experts" and "editors".

This seems to prove the point that the gaining and preservation of knowledge should not be subjected to a democratic process (or, as in the Wikipedia's case, mob rule). As the promoters of "intelligent design" are finding out, what we learn cannot and must not be decided by vocal protests and voting.

The acquisition of expertise and its propagation across the generations by means of works of reference should remain an elitist endeavor. The mechanisms of peer review and editorial boards are far from fail-proof. But they do guarantee a modicum of accuracy and objectivity, which the Wikipedia gravely fails to do.

There are examples of online encyclopedias that actually adhere to basic principles: their authors and editors are qualified to write about the topics they have chosen or have been assigned, and the entries are largely accurate and unbiased. The Stanford Encyclopedia of Philosophy (SEP) is one example. The Open Site Encyclopedia is a hybrid, a cross between the Wikipedia and the SEP models. Still, they haven't been able to attain the stature of the likes of the Encyclopedia Britannica or even the Encarta.

But there is a larger issue at stake. Is the Internet a reliable and credible source of information?

People are conditioned to trust written words, not to mention images. "I read it in the paper" or "As seen on TV" are worn out but still effective clichés. The Internet combines both the written and the seen. It is both a textual and a visual (and audio) medium. Do people trust Internet content? Is the incredible Internet - credible?

In the "brick and mortar" world, credibility is associated with brands. A brand, in effect, guarantees the quality and specifications of a product (think McDonald's hamburgers), its performance (think Palm), level of service and commitment to customer care (Amazon), variety, or price (Wal-Mart). Brands are sustained and enhanced by advertising campaigns. The content or sales pitch of a specific ad is often less important than the message conveyed by the very existence of a campaign: "This company is rich enough (read: stable, reliable, trustworthy, here to stay) to spend millions on advertising."

The Internet has very few brands (Yahoo!, Amazon) - and some of them are tarnished. Some "old media" brands have entered the fray (Barnes and Noble, The Wall Street Journal, the Britannica) - hitherto without much success. The overwhelming bulk of Web content is created or disseminated by small-time entrepreneurs and monomaniacs.

So, how does one establish or acquire credibility in such a diffuse and anarchic medium?

Enter Stanford University's "Web Credibility Project".

They define themselves thus:

"Our goal is to understand what leads people to believe what they find on the Web. We hope this knowledge will enhance Web site design and promote future research on Web credibility. As part of this ongoing project we are:

  a.. Performing quantitative research on Web credibility.
  b.. Collecting all public information on Web credibility.
  c.. Acting as a clearinghouse for this information.
  d.. Facilitating research and discussion about Web credibility.
  e.. Helping designers create credible Web sites."

Examples of current projects:

     Timeliness: How does having out-of-date content affect the credibility of a Web site?
     Interaction: How does having a personalized interaction with a Web site affect its credibility?
     Negative Content: How does displaying negative content associated with a branded web site affect the credibility of the brand?

It is useful to confine ourselves to this definition of trust:

"The subjective belief, perception, or conviction that information provided is true, factual, and objective, and that commitments undertaken, explicitly, or implicitly, will be honored fully and in a timely manner."

Such a perception, belief, or conviction is based on:

  a.. Past experience in general (with spam, with merchants or providers, with a similar product category, with the same type of content, etc.) and personal proclivity to trust or to distrust.
  b.. Experience with the specific merchant or provider (whether personal or gleaned from other people's feedback - reviews, complaints, and opinions).

There is little that a merchant can do about the former. The latter is, expectedly, influenced by:

  a.. Professionalism (as evident in Web site design, e-commerce facilities, user-friendliness, navigability, links to other relevant Web pages, links from other Web sites, ease and speed of download, updated content, proofreading, domain name which matches the company's name, availability, multilingualism, etc.);
  b.. Trustworthiness (lack of bias, good intentions, truthfulness, thoroughness, objectivity, expertise and author credentials, knowledgeable sources and treatment, citations and bibliography), and what the authors of the research call "Real World Feel" (physical address, phone/fax numbers, non-Web e-mail address, photos of facilities and staff, audio recording, ownership by a not-for-profit organization, URL ending with ORG);
  c.. Commercial Web sites are less trusted. Cluttered ads, paid subscriptions, e-commerce enabled forms - all reduce the site's credibility! This is especially true if the entire site is one big ad and when it is hard to distinguish ads from content;
  d.. Track record (how veteran is the merchant, past financial performance, credit history, brand name recognition, lists of customers, etc.);
  e.. Selection (how many products are carried, how often is inventory refreshed, etc.);
  f.. Advertising (is the company's business sufficiently lucrative to support a campaign?);
  g.. Service (good service indicates a reassuring readiness to sacrifice the bottom line to cater to the customer's legitimate concerns, feedback forms, live support, etc.);
  h.. Full disclosure of rates, prices, privacy policy, security issues, etc.;
  i.. Feedback from other users (opinions, reviews, comments, FAQs, support groups, etc.);
  j.. Site rating and certification by trustworthy agencies (like the Better Business Bureau - BBB, VeriSign, TRUSTe) - or awards won (from credible and reputable organizations). Links from other, well-known and believable Web sites.

The Web Credibility Project discovered that trust in e-commerce is also influenced by idiosyncratic factors. Certain domain names (org) are more trusted than others (com). Too many ads, broken links, typos, outdated or old content - all diminish trust. In the absence of proven markers and behavioral guidelines, people seem to resort to extrapolation ("if they can't maintain their own Web site...") and stereotypes (e.g., NGOs are more trustworthy than corporations).

As Web sites proliferate (Google indexes well over 3 billion now) and Web authoring becomes a routine task - the signal-to-noise ratio of useful information to garbage is bound to deteriorate. Search engines already incorporate crude measures of credibility in their rankings (e.g., the number of links from external Web sites). But, to remain useful, search engines (and Web directories) would do well to rate Web content more comprehensively and thoroughly. They should rank Web sites by authoritativeness, reliability, and objectivity, for instance.
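
To make the "number of links from external Web sites" measure concrete, here is a minimal sketch in Python. It is not how any actual search engine computes its rankings; it only illustrates the crude idea of counting how many distinct outside sites point to a given site. The function names and the sample links are hypothetical, invented for this illustration.

```python
from collections import defaultdict
from urllib.parse import urlparse


def inbound_link_counts(links):
    """Count, for each host, how many distinct external hosts link to it.

    `links` is an iterable of (source_url, target_url) pairs.
    Links within the same host are ignored, since a site vouching
    for itself says nothing about its credibility.
    """
    referrers = defaultdict(set)
    for source_url, target_url in links:
        source_host = urlparse(source_url).hostname
        target_host = urlparse(target_url).hostname
        if source_host and target_host and source_host != target_host:
            referrers[target_host].add(source_host)
    return {host: len(sources) for host, sources in referrers.items()}


def rank_by_inbound_links(links):
    """Return (host, count) pairs, most external referrers first."""
    counts = inbound_link_counts(links)
    # Ties are broken alphabetically so the ordering is deterministic.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical sample data, made up purely for illustration.
sample_links = [
    ("http://a.example/page1", "http://c.example/"),
    ("http://b.example/post", "http://c.example/about"),
    ("http://c.example/home", "http://c.example/about"),  # internal, ignored
    ("http://a.example/page2", "http://b.example/"),
]

ranking = rank_by_inbound_links(sample_links)
```

A raw count of this kind says nothing about whether the linking sites are themselves believable, which is exactly why it is only a crude proxy for authoritativeness, reliability, or objectivity.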

Research shows that 75% of all respondents resort to the Internet as a primary information provider. The inundation of irrelevant material has caused most surfers to confine their surfing to 10 Web sites (the equivalent of "anchors" in shopping malls) which they deem reliable, timely, accurate, objective, authoritative, and credible. The rest of the Internet gets the leftovers. This worrying trend can be reversed only through the emergence of independent and commercially viable rating agencies. Web sites (at least the business ones) should be willing to pay for credible ratings to enhance their stickiness and attract monetizable "eyeballs". In the absence of such third-party accreditation, the Internet risks both irrelevance and disrepute.