Wikipedia in mainland China: the critical years of 2005-2008

On October 19, 2005, Wikipedia was blocked in mainland China for the third time (that is, users in mainland China could not reach Wikipedia’s servers because of internet filtering imposed by the Chinese government). Unlike the previous two brief blocks, this block begins a series of longer blocks with brief unblocks in between. In other words, from this moment it becomes “normal” for Wikipedia to be blocked, and it remains so until the eve of the 2008 Beijing Olympic Games. So, when Baidu Baike is launched in 2006 and goes on to gain a large number of users and encyclopedia entries, is it because Baidu Baike succeeds on its own merits or simply because Chinese Wikipedia is blocked?

Some of those who support or sympathize with Baidu Baike argue that the impact of the blocks is overstated and that Baidu Baike won most users in mainland China on its own merits. For instance, they argue that Baidu Baike suits (mainland) China better than Chinese Wikipedia does, mainly because it fits “Chinese characteristics”, Chinese culture or practices. The supporters and sympathizers of Chinese Wikipedia tend to believe that, were it not for the blocks of Wikipedia in mainland China, Chinese Wikipedia’s popularity would surpass that of Baidu Baike.

Based on the data of the internet population in mainland China, I intend to tackle the puzzle by framing this historical development as a case of diffusion of (competing) innovations.

Background: Wikipedia, the timing of blocks and the launch of Baidu Baike

Wikipedia being blocked in mainland China: 2004-2008

Source: The first six blocks are documented in the entry on “Chinese Wikipedia” in the Chinese version of Wikipedia itself. The seventh block refers to the partial block (or partial unblock) in place since the eve of the Beijing Olympics in 2008, with a few “sensitive” entries still being blocked.

The third block, which begins on October 19, 2005, is significant because it marks a shift in blocking patterns. Before this block, blocks are sporadic and unblocks are the norm. After this block, the reverse is true. In other words, the first two blocks seem to be “experimentally” short, and after the third block, it is the unblocks that are short. The blocks of Wikipedia in mainland China can thus be roughly divided into two periods: before October 2005 and after.

It is worth noting that two milestones of Chinese Wikipedia occur before 2005. The first is the decision to merge the simplified and traditional Chinese versions, together with the consensus reached on its official Chinese name. The second is the implementation of its automatic conversion system between simplified and traditional Chinese characters. For Baidu Baike, the major milestone is its launch in 2006.

What can be inferred is that internet users in mainland China have difficulty using Wikipedia when Baidu Baike is launched in 2006, and that this remains the case until the eve of the Beijing Olympic Games in 2008.

The fact that these events coincide has led many observers and users of Chinese Wikipedia to speculate about coordinated efforts by the Chinese government and Baidu to move Wikipedia’s actual and potential users to Baidu Baike, whether for political or economic reasons.

How significant are these blocks? This article applies the theory of innovation diffusion to the growth of internet users, treating it as the main variable in the history of Wikipedia (and Baidu Baike) in mainland China. The relevant statistics and historical data suggest that Baidu Baike is launched and Wikipedia is blocked exactly when mainland Chinese regions experience their fastest catch-up growth, as the “early majority” of internet users forms. As a result, Chinese Wikipedia misses out despite its first-mover advantages, and Baidu Baike enjoys a windfall of new users thanks to its “second-mover” advantages: distribution channels and timing.

Timeline of Chinese Wikipedia and Baidu Baike: 2003-2008

Baidu Baike’s second-mover advantage since 2006, when Chinese Wikipedia is blocked

Using Wikipedia presupposes using the internet, and both are new to Chinese people. According to the theory of innovation diffusion, the key to wider adoption is not how the “tech savvies” persuade the early adopters, because both groups constitute only a minority. The key to successful innovation diffusion is whether they are able to convince the majority, especially the “early majority” users, to adopt new ideas or products, usually on pragmatic grounds.

The theory of diffusion of innovations

Using Wikipedia presupposes using the internet, and both are new to Chinese users. According to the diffusion of innovation theory, the key gap in the diffusion process is not whether the “techies” can persuade the visionary “early adopters”, because both groups constitute only a minority. For an innovation to diffuse successfully through a society or market at large, these “techies” and “visionaries” must convince the majority, especially the “early majority” users who adopt innovations on pragmatic grounds.

To illustrate the social dynamics of such a diffusion process, the theory divides members of a society (or a market) into five categories, based on the length of time each member needs to accept and adopt an innovation. The figure below shows a typical model of innovation diffusion curves, where the group classification depends on the time required for adoption, shown along the horizontal axis. Located in the centre of the “bell curve” are the two majority groups, the “early majority” and the “late majority”, each of which accounts for roughly one-third of the population. The former group takes slightly less time than average, whereas the latter group takes slightly more time than average. If adoption times follow a normal distribution, represented as the “bell curve” in the figure below, the early majority lies between the average minus one standard deviation and the average, and the late majority lies between the average and the average plus one standard deviation.

The model of the diffusion of innovations: the bell-shaped curve and the “S”-shaped curve

Similarly, the two groups to the left of the majority (the “innovators” and the “early adopters”) require the least amount of time to adopt the innovation. Under the ideal model of normal distribution, their adoption time is less than the average minus one standard deviation, with the “innovators” falling below the average minus two standard deviations. The group to the right of the majority (the “laggards”) is expected to be skeptical, waiting the longest of all groups before adoption.

If these groups follow the ideal normal distribution model, then the cumulative distribution is an S-shaped curve. What the S-shaped curve represents is the varying speed of diffusion at different points in time: it begins slowly until the diffusion process expands from the “early adopters” to the “early majority”. It then speeds up until it reaches half of the market (or society), and slows down when the process begins to reach the last group, the “laggards”.
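The link between the bell curve and the S-shaped curve can be checked numerically. The sketch below is only an illustration under the ideal normal-distribution assumption: the function names and the grouping labels are made up for this example, and the cut points at minus one, zero and plus one standard deviation are the idealized boundaries described above.

```python
import math


def normal_cdf(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Upper boundary of each adopter group, measured in standard deviations from
# the average adoption time. Innovators and early adopters are grouped
# together, as in the text. These cut points are the ideal-model assumption.
BOUNDARIES = [
    ("innovators and early adopters", -1.0),
    ("early majority", 0.0),
    ("late majority", 1.0),
    ("laggards", math.inf),
]


def cumulative_shares():
    """Share of all eventual adopters reached once each group has adopted."""
    return {name: normal_cdf(z) for name, z in BOUNDARIES}


if __name__ == "__main__":
    for name, share in cumulative_shares().items():
        print(f"{name}: {share:.1%} of eventual adopters reached")
```

Under these assumptions the cumulative shares come out at roughly 16%, 50%, 84% and 100%, which is why the S-curve is steepest between the first and the third boundary.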

This slow-fast-slow diffusion process implies that the critical moment occurs when the process begins to expand from the “early adopters” to the “early majority”. In other words, before the majority accepts and adopts an innovation, the “early adopters” must persuade the “early majority”, which constitutes the first major test or “chasm” on the way to wider adoption.

Although originally a theory used mainly by scholars in sociology and communication studies, the diffusion of innovation theory has been influential for scholars in management and marketing studies in recent years, because it focuses on the rates and socio-dynamics of (technological) adoption.

Does the growth of internet users in mainland China match the slow-fast-slow S-curve?

Internet penetration rates in Chinese and East Asian regions

I have collected publicly available data released by Chinese and international authorities on population and numbers of internet users. The overall data set spans from 1990 to 2010 and covers 31 administrative regions in mainland China and 17 East Asian countries. Where penetration rates are not directly provided, they can be calculated from these numbers. Altogether, 48 curves (each representing a region) can be plotted, as shown in the figure below. What can be observed is that these regions differ in the level and growth rate of their internet penetration, and that many of these curves are in line with the slow-fast-slow S-shaped curve of innovation diffusion.
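Where a source reports only head counts, the penetration rate is simply the number of internet users divided by the population. A minimal sketch follows; the function name and the example figures are hypothetical and are not taken from the data set.

```python
def penetration_rate(internet_users, population):
    """Return internet users as a percentage of the population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return 100.0 * internet_users / population


# Hypothetical figures for one region in one year (not from the data set).
example_rate = penetration_rate(internet_users=2_500_000, population=10_000_000)
print(f"penetration rate: {example_rate:.1f}%")
```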

Internet penetration rates for 17 East Asian and 31 Chinese regions: 1990-2011

Categorization results: based on the 2010 data

To facilitate meaningful comparison, I group these regions into categories based on their level of internet penetration.

First, I assume that the ultimate penetration rate (or potential market size) is 80%, which matches the current level of developed regions in East Asia (in 2011, 79.53% for Japan and 83.80% for South Korea). Using this as the baseline, the thresholds for the different groups of adopters can be scaled accordingly: the cumulative shares of 16%, 50% and 84% of eventual adopters become penetration rates of 12.8%, 40% and 67.2%. In other words, growth in the internet penetration rate is expected to be slow before it reaches the threshold of 12.8%, and then rapid until it reaches half of the market potential (40%), by which point the early majority has been included in the process. The diffusion process then slows down significantly around 67.2%, when the innovation begins to be adopted by the “laggards”. Any given region can therefore be categorized according to these threshold values. A region is said to be “internet-developed” (category I) when its internet penetration rate is larger than 67.2%. A region is categorized as “slow-growing” (category II) when the rate falls between 40% and 67.2%. A region is categorized as “fast-growing” (category III) when the rate falls between 12.8% and 40%.
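The threshold arithmetic and the categorization rule can be sketched as follows. This is an illustration rather than the procedure actually used for the table: the constant and function names are made up, the handling of values that sit exactly on a boundary is an assumption, and the text does not name a category for regions below 12.8%, so the sketch labels them "uncategorized".

```python
SATURATION = 80.0  # assumed ultimate penetration rate in percent, as in the text

# Cumulative adopter shares (percent of eventual adopters) of the ideal bell
# curve, rounded to 16, 50 and 84, then scaled to the assumed saturation level.
FAST_GROWTH_START = 16 * SATURATION / 100  # 12.8
MIDPOINT = 50 * SATURATION / 100           # 40.0
LAGGARDS_START = 84 * SATURATION / 100     # 67.2


def categorize(rate):
    """Map a penetration rate (percent) to the category used in the text.

    Boundary handling (> versus >=) is an assumption made for this sketch.
    """
    if rate > LAGGARDS_START:
        return "I"    # internet-developed
    if rate >= MIDPOINT:
        return "II"   # slow-growing
    if rate >= FAST_GROWTH_START:
        return "III"  # fast-growing
    return "uncategorized"  # below 12.8%; no category is named in the text


if __name__ == "__main__":
    # 79.53 is the 2011 figure for Japan quoted in the text; the other
    # values are arbitrary inputs chosen to exercise each branch.
    for rate in (79.53, 50.0, 20.0, 5.0):
        print(rate, categorize(rate))
```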

Based on the 2010 data, the categorization outcome is listed in the table below. The non-Chinese East Asian regions in category I are South Korea, Taiwan, Singapore, Japan and Hong Kong, whereas among Chinese regions only Beijing, the capital, has reached this “internet-developed” stage. The table also shows significant differences across Chinese regions: some have high internet penetration rates comparable to those of advanced East Asian regions, while others are still lagging.

Categorization results: based on the 2010 data

To compare the differences between Chinese and East Asian regions systematically, two series of figures are made as follows.

The first series of three figures shows the arithmetic averages, category by category, of Chinese versus East Asian regions. To highlight the difference between the curves of average values, a histogram is added below each figure, indicating the size and timing of the development gap. What can be observed repeatedly is that the difference peaks around 2006. This suggests that Chinese regions grow relatively slowly before 2006, only to catch up with or even surpass their East Asian counterparts after 2006.
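The averaging and gap calculation behind these figures can be sketched as below. The function names, region labels and numbers are invented purely to exercise the code; they do not reproduce the article’s data set or its results.

```python
def yearly_average(series_by_region):
    """Arithmetic mean across regions for each year present in every region."""
    years = sorted(set.intersection(*(set(s) for s in series_by_region.values())))
    n = len(series_by_region)
    return {y: sum(s[y] for s in series_by_region.values()) / n for y in years}


def gap_and_peak(reference_avg, chinese_avg):
    """Per-year gap (reference minus Chinese average) and the year it peaks."""
    years = sorted(set(reference_avg) & set(chinese_avg))
    gap = {y: reference_avg[y] - chinese_avg[y] for y in years}
    peak_year = max(gap, key=gap.get)
    return gap, peak_year


# Invented penetration rates (percent) for two placeholder regions per group.
east_asia = {
    "region_a": {2004: 30.0, 2005: 36.0, 2006: 42.0, 2007: 46.0, 2008: 50.0},
    "region_b": {2004: 20.0, 2005: 24.0, 2006: 30.0, 2007: 34.0, 2008: 38.0},
}
china = {
    "region_x": {2004: 22.0, 2005: 24.0, 2006: 27.0, 2007: 36.0, 2008: 46.0},
    "region_y": {2004: 14.0, 2005: 16.0, 2006: 19.0, 2007: 28.0, 2008: 38.0},
}

gap, peak_year = gap_and_peak(yearly_average(east_asia), yearly_average(china))
print("gap by year:", gap)
print("gap peaks in:", peak_year)
```

The per-year gap is what the histogram under each figure displays; the peak year is simply the year with the largest gap.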

Comparison of Chinese regions with equivalent East Asian ones: category I average (average values of Chinese regions, I-CN, and East Asian ones, I)

Comparison of Chinese regions with equivalent East Asian ones: category II average

Comparison of Chinese regions with equivalent East Asian ones: category III average

The second series of three figures below compares one representative Chinese region with one East Asian region in each category. The first figure contrasts Beijing and Taiwan. The Taiwan curve roughly follows the S-shaped model of innovation diffusion: its growth at the early (from 0% to 12.8%) and late (from 67.2% to 80%) stages is slower than in the middle (from 12.8% to 67.2%). In contrast, the Beijing curve departs from the S-shaped curve around 2002, with an increasingly significant gap signalling a decline in its growth. The trend reverses from 2006 with a much faster growth rate, and Beijing finally catches up around 2008. A similar pattern can be identified in the second figure, where the Shanghai and Malaysia curves are compared: Shanghai’s curve departs from Malaysia’s (and also from the S-shaped curve) around 2002, begins to converge around 2006 and surpasses it from 2008. The third figure compares Shandong (a northern coastal province of China) and Vietnam. Although the difference between the two is expected to be much smaller because both belong to category III (meaning that internet users have yet to reach half of the potential market), a similar gap can be identified between the points where the two curves cross: from around 2004-2005 to 2008-2009.

Comparison of Chinese regions with equivalent East Asian ones: category I example

Comparison of Chinese regions with equivalent East Asian ones: category II example

Comparison of Chinese regions with equivalent East Asian ones: category III example

Based on both the theoretical model (innovation diffusion) and the comparison with other regions (East Asian countries), mainland China has indeed gone through a peculiar development from 2002 to 2008. The above figures suggest that the first point of departure is around 2002, when the growth in penetration rates slows down, and the second is around 2006, when it speeds up. It is worth noting that the growth after 2006 is rapid enough for these Chinese regions to catch up with or even surpass their counterparts. This suggests that some factor may have constrained growth in mainland China during the period of 2002-2006, but that the catch-up during 2006-2008 is fast enough that whatever inhibited the earlier development seems to have had little impact on the subsequent development.

Mere coincidence? Blocking Wikipedia from 2005 to 2008

The internet user growth curve can contextualize the historical development of Baidu Baike and Chinese Wikipedia in mainland China. While Wikipedia is blocked from 2005 to 2008, Baidu Baike grows rapidly from its launch in 2006, exactly the year when the internet penetration rates in Chinese regions (especially the more developed ones such as Beijing and Shanghai) begin to return to the level predicted by the S-shaped curve.

Indeed, the data for 2006 show that no Chinese region has yet entered the more advanced regional categories (I and II). Let us accept that using Wikipedia presupposes using the internet, and thus that the adoption rates of Wikipedia are bounded by the internet penetration rates. It follows that whatever inhibits the growth of internet users in mainland China before 2006 should also inhibit the growth of Wikipedia users during that time.

Whatever these inhibiting factors are before 2005, the historical facts remain that Chinese Wikipedia is blocked from 2005 to the eve of the Beijing Olympic Games in 2008 and that Baidu Baike is launched in 2006. It is fair to state that Baidu Baike in 2006 enjoys an unprecedented windfall from the growth of internet users in mainland China, with the “early majority” as its potential pool of readers and editors. Note that growth during 2006-2008 is fast enough for Chinese regions to catch up with or even overtake both their counterparts in East Asia and their theoretical levels (as determined by the S-curve).

It is worth repeating the key point of the innovation diffusion model: for any innovation to be diffused to the wider membership of a society or a market, the main challenge is to overcome the gap between the “early adopters” and the “early majority”. This is also the main take-away of Geoffrey Moore’s book “Crossing the Chasm: Marketing and Selling High-Tech Products to Mainstream Customers”. What matters is whether the “early adopters” can succeed as opinion leaders in persuading the “early majority”. The innovative vision of the “early adopters” must be put to pragmatic use by the “early majority”.

In the context of the diffusion of internet use, the gap between the “early adopters” and the “early majority” is crossed approximately when the internet penetration rate grows past the 12.8% threshold and then rises rapidly towards 40%. In 2006, only eight coastal provinces have reached this threshold. However, within just two years, all but four regions have passed the 12.8% threshold, including five provinces and cities that have advanced to category II with penetration rates over 40%. These years (2006-2008) can thus be regarded as the years of the big wave of new Chinese internet users, a perfect window of opportunity for any new internet innovation to gain and then maintain a solid user base.

Thus, hypothetically, if the window of opportunity had occurred before Wikipedia was blocked, it would have been difficult for Baidu Baike to replace Chinese Wikipedia even after the block, simply because Chinese Wikipedia might already have built a solid user base that included the “early majority”. Or, were it not for the blocks during the years of 2006-2008, Wikipedia should have at least taken a slice (if not a significant one) of the new “early majority” user base. Chinese Wikipedia missed the window of opportunity offered by the big wave of new internet users, which Baidu Baike exploited to challenge its popularity in mainland China.

Second mover advantage: the lessons from the Browser Wars

Arguably the best-known application of innovation diffusion theory is the second-mover advantage in the Browser Wars. The academic discussions seem to have reached a consensus that the conditions for a second mover to win the fight for the early majority are not necessarily based on technologies or innovations. Second movers can prevail if they exploit their advantageous position in distribution channels before the critical early majority group has made its choice.

In the case of the Browser War between Netscape and Microsoft, many scholars in management and marketing studies argue that the main reason for Microsoft’s success as the second mover is that it exercises its dominance of the distribution channels (a near monopoly of the operating system market for personal computers) before the early majority of personal computer users become internet users, thereby challenging the existing advantages of Netscape.

In other words, regardless of the legality of its marketing strategies (later legal proceedings in Europe and the United States seem to suggest that Microsoft may have violated some of the principles, if not the laws, of fair competition), Microsoft needs to prevent the “early majority” users from following the “innovators” and/or the “early adopters” in using its rival’s Netscape browser. Similarly, Wikipedia faces the challenge of Baidu Baike in mainland China, particularly during the years of 2006-2008, when Baidu Baike has the advantages in timing and distribution and Chinese Wikipedia is blocked.

Conclusion

The evidence based on historical records and the growth curve of internet users in mainland China demonstrates the critical timing of Baidu Baike’s launch as a new competitor against Chinese Wikipedia. First, a significant discrepancy in the internet penetration rates is identified for the period of 2002-2008, which is distinct from other East Asian regions and also from the expected S-curve of the innovation diffusion model. The discrepancy is marked by an initial slow-down from 2002, with the difference peaking around 2005 or 2006, and is then compensated by speedy growth between 2006 and 2008. Thus, 2006 seems to be a critical point in time, when the internet penetration rates in Chinese regions take a positive turn. It is precisely then that Baidu Baike is launched, while Wikipedia is mostly blocked from October 2005 to 2008.

Drawing on the lessons learned from the Browser Wars through the theoretical lens of innovation diffusion theory, I argue that (a) the timing of Baidu Baike’s launch and (b) its absolute advantage in distribution channels (its competitor being blocked) are the two indispensable conditions for Baidu Baike’s rapid growth.

On Baidu Baike’s success in growing its numbers of entries and editors, I do not intend to advocate or reinforce the viewpoint of its skeptics. Its success does not rely only on the fact that Wikipedia is blocked (from 2005 to 2008) for most of the time around Baidu Baike’s launch in 2006. I argue that if Chinese Wikipedia had been blocked during a period other than the years of 2006-2008, when internet penetration speeds up, Baidu Baike might not have been able to attain and maintain its popularity in mainland China. Because of the lack of direct evidence, I do not suggest that blocking Chinese Wikipedia (especially from 2005 to 2008) is a conscious decision made by the Chinese government to help the launch of Baidu Baike.

However, if the lessons of innovation diffusion and the Browser Wars are correct and useful, then Chinese Wikipedia has missed the crucial moment of the years 2006-2008 to grow from its existing user base of “innovators” and “early adopters” and win over the critical “early majority” across Chinese regions. Indeed, before 2005, since some unknown factor has inhibited the growth of internet users in mainland China, the growth of Wikipedia users (or of user-generated encyclopedia users in general) must have been constrained accordingly. After 2006, when the growth of internet users is fast enough to make up the historical difference (compared to other East Asian regions) and the theoretical difference (compared to the S-shaped curve of innovation diffusion), new users are left with Baidu Baike as the only choice because Wikipedia is blocked in mainland China. Therefore, Baidu Baike’s success is not only to be expected but also fits very well with the two essential conditions for the second-mover advantage: timing and distribution channels.

So far, I have only highlighted the plausibility of innovation diffusion theory in explaining the anomalies in the relevant statistics and historical events. Though the conclusions drawn from the theory seem to fit, I cannot prove or disprove whether these lessons informed the Chinese government’s decisions on when to block (and unblock) or Baidu’s decision to launch Baidu Baike in 2006. The timing of their decisions could be merely coincidental rather than strategic, a question this article cannot settle. This article only provides a plausible historical explanation of why Baidu Baike can gain popularity in mainland China immediately after its launch in 2006 while the first mover, Chinese Wikipedia, cannot do so before 2005. While Chinese Wikipedia may be popular among early adopters and innovators before 2005, when it does not face a sustained block, the internet penetration rates suggest that its growth is constrained by the inhibited growth of internet users. The critical moment in competing for the “early majority” among mainland Chinese internet users is therefore confined to the narrow window of 2006-2008. This strongly suggests that the combination of internet penetration timing and distribution channels is more important than content or technical features in explaining the development of the two encyclopedias in mainland China.

The above conclusion has the potential to be applied to other cases beyond user-generated encyclopedias. The same data and similar theoretical formulations can be applied to the history of competition between Baidu and Google China, or to the recent controversy over the competition between Baidu and Qihoo’s 360.

Postscript: Although it is difficult to prove or disprove whether the Chinese government or Baidu uses the above lessons of innovation diffusion theory in policy-making or marketing decisions, it should be noted that the Baidu founder Robin Li was involved in the Browser Wars when he was working in Silicon Valley (on the side of the Netscape browser) and that he devoted a chapter of his book “Silicon Valley Business War” to this topic.