General or special favouritism? Wikipedia-Google relationship reexamined with Chinese Web data.

The most recent Wikipedia Signpost mentioned the research findings I presented at WikiSym 2013 on Chinese search engine result pages (SERPs). They bear on a question that goes beyond Chinese Internet research: Does Google secretly favour Wikipedia? Or is it simply that general search engines tend to favour successful user-generated encyclopedias (UGEs)? In other words, is the SERPs-UGEs favouritism a particular Google thing or a general feature of search engines?

My findings suggest it is a general favouritism by the SERPs, rather than one specific to Google. The fact that Google China favours Baidu Baike, a user-generated encyclopedia hosted by Google’s Chinese competitor Baidu, is an interesting piece of evidence that the pattern is “search engines favor user-generated encyclopedias”, not “Google favors Wikipedia”. In the following paragraphs, I provide the details by expanding on the summary in the recent Wikipedia Signpost.

It’s “search engines favor user-generated encyclopedias”, not “Google favors Wikipedia”: In “How does localization influence online visibility of user-generated encyclopedias? A study on Chinese-language Search Engine Result Pages (SERPs),”[11] Han-Teng Liao reported on results (some of which were previously published on his blog[12]) comparing the ranking of three Chinese-language user-generated encyclopedias (Wikipedia, Baidu Baike and Hudong) on nine Chinese-language search engine variants (by the three companies Google, Baidu and Yahoo, in mainland China, Singapore, Hong Kong and Taiwan, the former two mostly using simplified Chinese and the latter two traditional), for a collection of search terms. Google China, on the other hand, prefers its competitor’s user-generated encyclopedia, Baidu Baike, over Chinese Wikipedia, which suggests that the pattern reflects a general SERPs-UGEs favouritism rather than a particular Google-Wikipedia one.

(Although I am still revising the draft for formal publication, I provide a link for downloading the working draft here for discussion.)

He found that the three projects generally dominate Chinese-language search results, alongside other user-generated content. That Baidu Baike ranks highly on the search engine run by its parent company might come as no surprise (in fact, Hudong submitted a complaint to a government body last year about this), but Baidu still ranked the wikipedia.org domain a (distant) second in four of the seven search term categories studied.

One important observation is that on Chinese SERPs, user-generated content websites such as user-generated encyclopedias, blogs, knowledge-sharing websites and online forums are highly visible. The findings are especially interesting when we consider Hudong Baike, a commercial competitor of Baidu Baike in China. Based on the findings shown in Table 6 and Table 7 of the working draft, Hudong Baike is hugely disadvantaged by Baidu Search, but not so much by Google China. In other words, although Baidu Search favours its own product, it disfavours Hudong Baike more than it does Chinese Wikipedia, suggesting that for Baidu Search commercial calculation matters more than political censorship of the content. (Both Baidu Baike and Hudong Baike have paid staff who censor/filter content.)

Table 6: Google_CN findings (image: tcat_host_Google_CN)

Table 7: Baidu_CN findings (image: tcat_host_Baidu_CN)
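
To make the comparison concrete, here is a minimal Python sketch of how a term-category-by-host tabulation of this kind could be computed from raw SERP records. It is not the pipeline used for the working draft: the record layout, the function names, the host-to-encyclopedia mapping and the sample records are all assumptions made for illustration, and the sample records are invented rather than taken from the study.

```python
from collections import defaultdict
from urllib.parse import urlparse

# Assumed mapping from hostname suffix to encyclopedia label (illustrative only).
UGE_HOSTS = {
    "wikipedia.org": "Wikipedia",
    "baike.baidu.com": "Baidu Baike",
    "hudong.com": "Hudong",
}


def classify_host(url):
    """Return the encyclopedia label for a result URL, or None if it is not a UGE."""
    host = urlparse(url).hostname or ""
    for suffix, label in UGE_HOSTS.items():
        if host == suffix or host.endswith("." + suffix):
            return label
    return None


def best_ranks(results):
    """Map (engine, category, term) to {label: best (lowest) rank seen}.

    `results` is an iterable of (engine, category, term, rank, url) tuples.
    """
    best = defaultdict(dict)
    for engine, category, term, rank, url in results:
        label = classify_host(url)
        if label is None:
            continue  # skip results that are not one of the three encyclopedias
        key = (engine, category, term)
        if label not in best[key] or rank < best[key][label]:
            best[key][label] = rank
    return best


def top_counts(results):
    """Count, per (engine, category), how often each UGE is the highest-ranked UGE."""
    counts = defaultdict(lambda: defaultdict(int))
    for (engine, category, _term), ranks in best_ranks(results).items():
        winner = min(ranks, key=ranks.get)
        counts[(engine, category)][winner] += 1
    return {key: dict(value) for key, value in counts.items()}


# Invented sample records, for illustration only (not data from the study).
SAMPLE = [
    ("Google_CN", "cat_1", "term_a", 1, "http://baike.baidu.com/view/1.htm"),
    ("Google_CN", "cat_1", "term_a", 3, "http://zh.wikipedia.org/wiki/A"),
    ("Google_CN", "cat_1", "term_b", 2, "http://zh.wikipedia.org/wiki/B"),
    ("Baidu_CN", "cat_1", "term_a", 1, "http://baike.baidu.com/view/1.htm"),
    ("Baidu_CN", "cat_1", "term_a", 9, "http://www.hudong.com/wiki/A"),
]

if __name__ == "__main__":
    for key, value in sorted(top_counts(SAMPLE).items()):
        print(key, value)
```

Counting the highest-ranked encyclopedia per search term is only one possible visibility measure; the share of terms for which a host appears anywhere on the first page would be an equally reasonable alternative.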

Liao also interpreted the results, tentatively, as evidence against the often-voiced (but never substantiated) suspicion that Google artificially favors Wikipedia – in fact, Google as seen in China (in simplified Chinese) tends to rank Baidu Baike above Chinese Wikipedia. Instead, the results appear to indicate a general preference of search engines for user-generated content. (Cf. related earlier coverage: “High search engine rankings of Wikipedia articles found to be justified by quality”)

Admittedly, the findings here are limited to the Chinese-language context, and thus we need more research of this kind on search engine markets that are not dominated by Google. The question of whether the SERPs-UGEs favouritism is general or specific to Google-Wikipedia can be better answered if the same methodology is applied to (or similar research is conducted on) the cases of Russia (where Yandex dominates) and South Korea (where Naver and others dominate). Nonetheless, my research findings are based on a diverse set of search queries (seven categories of 3000 search terms, the largest so far in Chinese Internet social research) and cover various Chinese-speaking regions (mainland China, Hong Kong, Singapore and Taiwan). They therefore provide important empirical findings for the debate on whether Wikipedia (as an institution) receives special treatment from Google (as a corporation), or whether its high rankings simply reflect search engines’ general tendency to favour successful user-generated encyclopedias.
