It is official. A viva voce

I am about to face the greatest challenge that all doctoral students must face: a viva voce.

It is officially printed in the Oxford University Gazette, “the authorised journal of record of the University of Oxford” since 1870.

Han-Teng Liao. Examinations for the Degree of Doctor of Philosophy. University of Oxford Gazette.

It reads:

Han-Teng Liao, Keble: ‘Cultural politics of
user-generated encyclopaedias: comparing
Chinese Wikipedia and Baidu Baike’
Oxford Internet Institute, 24 July, 11am
Examiners: W H Dutton, M Thelwall

Five short lines of record, many years of hard work, and one big challenge: the Oxford variant of the UK-style doctoral viva, which one either abjectly fails or completely passes.

Chinese-language literature about Wikipedia: a meta-analysis of academic search engine result pages

What is the status of Chinese-language literature on Wikipedia research? To answer this, Zhang Bin and I have conducted an exploratory meta-search analysis based on 3464 data points produced by CNKI Scholar and Google Scholar (including the versions for Hong Kong and Taiwan). This short paper has also been accepted by OpenSym 2014.

Analyzing the Internet diffusion in China means breaking it up and contextualizing it in East Asia

China is a vast country with both internal and external geopolitical complexity. As China, under Beijing’s governance, has the largest Internet population, the historical process of Internet diffusion in China, i.e. the growing proportion of the population that is online, holds the key to our understanding of the progression of both Internet development and Internet control in the region.

The Economist’s Special Report on China and the Internet of April 6th, 2013 used a choropleth map (see page 2) to tell a story of “a giant cage”. Although it managed to show the differences in Internet penetration rates across provinces in 2011-2012, it failed to show and describe the historical shifts across Chinese provinces. And although it compared China’s Internet population against that of (a) the United States, (b) the European Union and (c) the rest of the world, with a bar chart showing the 1995, 2000, 2005, 2010 and 2012 data points, it failed to place China’s Internet development in an East Asian context.

The Economist's Infographics of the Internet diffusion in China

To amend the situation, I gathered a more complete set of data points covering the period from 1997 to 2012, not just for individual Chinese provinces but also for all other East Asian countries/regions, including Hong Kong and Taiwan, over which China claims sovereignty. It should be noted that both Hong Kong and Taiwan are outside Beijing’s filtering and censorship regime, and that they played important historical roles in reconnecting mainland China to the global capitalist system.

By doing so, I not only broke China up (not politically of course, but analytically) so as to compare Chinese regions (i.e. provinces), but also put China in the context of East Asia, including countries/regions that are more *Internet-advanced or developed*, such as Japan, South Korea, Singapore, Hong Kong and Taiwan, and those that are less so.

As one can see from the series of infographics below, we gain more insights into China’s development this way:

For individual images with higher resolution, please click on this link.

List of Figures

2012 Internet Penetration Rates in East Asia

2011 Internet Penetration Rates in East Asia

2010 Internet Penetration Rates in East Asia

2009 Internet Penetration Rates in East Asia

2008 Internet Penetration Rates in East Asia

2007 Internet Penetration Rates in East Asia

2006 Internet Penetration Rates in East Asia

2005 Internet Penetration Rates in East Asia

2004 Internet Penetration Rates in East Asia

2003 Internet Penetration Rates in East Asia

2002 Internet Penetration Rates in East Asia

2001 Internet Penetration Rates in East Asia

2000 Internet Penetration Rates in East Asia

1999 Internet Penetration Rates in East Asia

1998 Internet Penetration Rates in East Asia

1997 Internet Penetration Rates in East Asia

Each choropleth map shows regions from the lowest to the highest Internet penetration in different colors. A shift in color from purple to green means that the Internet penetration rate is growing from 0% towards 100%.
The histogram in the lower-left corner shows how the population moves from being mostly Internet non-users (in purple) to being Internet users (in green).
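
The color scale described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code behind the maps: the two RGB endpoint values, the linear interpolation between them, and the function names `penetration_rate` and `rate_to_color` are all hypothetical choices made for this sketch.

```python
# Hypothetical endpoint colors (RGB); the actual maps may use different shades.
PURPLE = (128, 0, 128)  # assumed color for 0% penetration
GREEN = (0, 128, 0)     # assumed color for 100% penetration


def penetration_rate(users, population):
    """Internet users as a percentage of the total population."""
    return 100.0 * users / population


def rate_to_color(rate):
    """Linearly interpolate from purple (0%) to green (100%)."""
    # Clamp to the 0-100 range, then rescale to a 0-1 mixing weight.
    t = min(max(rate, 0.0), 100.0) / 100.0
    return tuple(round(p + t * (g - p)) for p, g in zip(PURPLE, GREEN))


# Example with made-up numbers: 25 users per 100 inhabitants.
example_rate = penetration_rate(25, 100)
example_color = rate_to_color(example_rate)
```

A linear scale is the simplest choice; a map with stepped classes would instead assign one fixed color to each range of rates.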

Thus, the infographics show how Chinese Internet population grows across time and space, and also in relation to other neighboring regions such as Japan, South Korea, Hong Kong and of course Taiwan, where I come from.

These infographics helped me to develop several observations and hypotheses about the historical development of China’s filtering regime and the maritime aspect of Internet diffusion in East Asia. They include:

Thus, I would prefer a story of “recentering” the Chinese Internet to the Economist’s story of “a giant cage”. The Chinese regime wants its Internet population to have its cultural and political center (of sources and focus) within mainland China (particularly Beijing). The cultural and political effects of China’s filtering and censorship regime are thus not so much about keeping its users isolated from the world as about disciplining them through the “recentered” information sources and Internet services hosted behind the Great Firewall.

The story of “recentering” can thus better explain and contextualize why Internet penetration rates were indeed depressed before 2005: the initial set-up of the filtering and censorship regime did have a negative impact on Internet adoption. After 2005, however, and especially during late 2005-2008, not only did the filtering and censorship regime become more *established* as everyday practice by censors and users alike, but domestic services and sources (mainland Chinese ones of course) also began to marginalize the *foreign* ones.

The English verb “to analyze” has a Latin/Greek origin meaning “to break up”. The infographics here thus demonstrate the analytical power of breaking up data points. It is up to analysts to tell a story by synthesizing data points that have first been broken down. As the Web increasingly allows both researchers and users to design and use infographics, it may help us to think outside the box. Let us keep breaking things up for better understanding.

Many people to thank for my winning of the OxTALENT Awards in the category of Interactive Infographics

I have won an OxTALENT Award this year (2014) in the category of Interactive Infographics. Part of the University Teaching Awards Scheme, the annual OxTALENT Awards recognise and reward excellence in teaching and learning supported by ICT at Oxford. Although I could not attend the award ceremony held in Oxford, as I was in Hong Kong for the Chinese Internet Research Conference, I would like to thank the people and communities that have helped directly and indirectly.

OxTALENT Awards Ceremony 2014

First, I have to thank the Oxford Internet Institute for exposing me to various new ideas and helping me to get the advice I need, including Ralph Schroeder (my supervisor, for shaping the research idea), Mark Graham and Scott Hale (for GIS mapping suggestions).

Second, I thank the Chinese Internet Research conference/community for providing the intellectual background on the topics of Internet diffusion in China and the impact of Chinese filtering/censorship regime.

Third, my gratitude goes to Adrian Herzog in Switzerland for his Java applet cartography tool called “MAPresso”. He took the time to answer my questions so that I could fix some of the issues I encountered.

Also, my thanks to Lin-ting Hsia, a Taiwanese colleague in Oxford who took the time to attend the ceremony on my behalf (photo courtesy of Lin-ting Hsia).

OxTALENT Awards 2014: Interactive Infographics   Han-Teng Liao

Although I did not manage to present my infographics at the ceremony, I will write a blog post on the infographics for a general audience.

What do Chinese-language microblog users do with Baidu Baike and Chinese Wikipedia?

What do Chinese-language microblog users do with Baidu Baike and Chinese Wikipedia? To answer this, I have conducted a case study of information engagement based on more than 40,000 microblog posts provided by WeiboScope (University of Hong Kong) and the DiscoverText-Weibo dataset (Texifter.com). If I manage to secure travel funding for OpenSym 2014, I will present the findings this year in Berlin.

Exploring mainland Chinese interactions with the world, using the GDELT dataset

This blog post documents my preliminary trial of the GDELT (Global Database of Events, Language, and Tone) dataset, a project created by Kalev H. Leetaru, currently the Yahoo! Fellow in Residence of International Values, Communications Technology & the Global Internet. The GDELT dataset has been used by social scientists (especially computational ones) and journalists. The GDELT project, which monitors the world’s broadcast, print, and web news to aggregate world events, is now supported by Google Ideas.

Actions

I have tried the following:

  • Retrieve the dataset for the time period from 20140504 to 20140605
  • Select the records where mainland China (country code CHN) was documented as an actor (either as actor 1 or actor 2)
  • Map/visualize the location of the events
  • Map/visualize the location of the pair of actors involved
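
The selection steps above can be sketched in Python as follows. This is a minimal illustration and not the working code used for the maps in this post: the sample rows are made up, the helper names `select_events` and `event_points` are hypothetical, and the column names (`SQLDATE`, `Actor1CountryCode`, `Actor2CountryCode`, `ActionGeo_Lat`, `ActionGeo_Long`) are assumed to follow the GDELT event table layout and should be checked against the official codebook.

```python
import csv
import io

# Made-up, tab-separated sample standing in for GDELT event records.
# Real GDELT files have many more columns; a header row is assumed here.
SAMPLE = (
    "SQLDATE\tActor1CountryCode\tActor2CountryCode\tActionGeo_Lat\tActionGeo_Long\n"
    "20140504\tCHN\tUSA\t39.9\t116.4\n"
    "20140510\tRUS\tUKR\t50.4\t30.5\n"
    "20140601\tJPN\tCHN\t35.7\t139.7\n"
    "20140610\tCHN\tVNM\t21.0\t105.8\n"
)


def select_events(rows, country="CHN", start="20140504", end="20140605"):
    """Keep rows inside the period where `country` is actor 1 or actor 2."""
    selected = []
    for row in rows:
        # YYYYMMDD strings sort chronologically, so string comparison works.
        in_period = start <= row["SQLDATE"] <= end
        involved = country in (row["Actor1CountryCode"], row["Actor2CountryCode"])
        if in_period and involved:
            selected.append(row)
    return selected


def event_points(rows):
    """Extract (latitude, longitude) pairs of the event locations."""
    return [(float(r["ActionGeo_Lat"]), float(r["ActionGeo_Long"])) for r in rows]


rows = list(csv.DictReader(io.StringIO(SAMPLE), delimiter="\t"))
chn_events = select_events(rows)
points = event_points(chn_events)  # ready to hand to a plotting library
```

The resulting list of coordinates can then be passed to whatever mapping tool is preferred; mapping the pairs of actors would use the actor location columns in the same way.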

Resulting maps

The resulting maps are as follows:

Event locations for mainland China (CHN) during the time period of 20140504 to 20140605

Pairs of actor locations for mainland China (CHN)-involved events during the time period of 20140504 to 20140605

I will release the working code later so that other researchers can try something different. My thanks to Rolf Fredheim and David Masad. Rolf Fredheim mapped the Russian data using R, and David Masad mapped the same Russian data using Python. I could not have produced the maps above in such a short time had they not shared their code. Readers who are interested in trying this themselves are encouraged to read their two blog posts.

The exact same maps with shaded relief visualization are also available:

Event locations for mainland China (CHN) during the time period of 20140504 to 20140605

Pairs of actor locations for mainland China (CHN)-involved events during the time period of 20140504 to 20140605

Notes for future work

It has been an interesting exercise, with multiple implications and room for improvement. I keep notes here for future reference. Any suggestions are welcome.

  • Explore the language coverage bias of the GDELT dataset by comparing it with other databases that focus on events/protests in mainland China (for both the Chinese studies and the Chinese Internet research communities).
  • Release easy-to-follow tutorials for Python users to access the data, with a project called “PyGDELT” on GitHub.
  • Compare and discuss the systemic bias of the GDELT dataset (established media) versus popular web datasets (such as Twitter), with mainland Chinese datasets (such as Weibo) as additional points of comparison.
