The deadline (1 March, 2015) is approaching for the Policy and Internet: Special Issue on Social Data Science and the Chinese Web. The Call For Papers can be found here: http://www.oii.ox.ac.uk/news/?id=1128
Using a similar approach and the same data sets as described in the previous blog post, we can also estimate the size of economies by language. I summarize the findings here.
The estimate tells us the total size of the “formal economic activities” that can be tapped into by targeting each group of language users. One of the major caveats of using the Unicode CLDR Version 25 data is also discussed here.
One tangible way to get a sense of what the new Internet world is like is to see Internet users through the lens of languages. Based on the per-country language user percentages from the Unicode CLDR Version 25 data, the population data from the IMF World Economic Outlook Database (October 2014), and the Internet penetration rates from the ITU, I have compiled a table of the top 20 languages by number of Internet users in the world. The table shows interesting developments for languages such as Arabic, Russian, Hindi, Wuu Chinese, and Egyptian Arabic, and of course the closing gap between English and Chinese.
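The calculation described above can be sketched roughly as follows: for each country, multiply population by Internet penetration, then split the result across languages by their share of speakers. This is only a minimal sketch. The function name, the record layout, and all the numbers are hypothetical placeholders rather than values from CLDR, the IMF, or the ITU, and it assumes that speakers of every language in a country are online at the same rate as the country as a whole.

```python
def internet_users_by_language(countries):
    """Sum estimated Internet users (in millions) per language across countries.

    Assumption: each language group in a country is online at the same
    rate as the country's overall population.
    """
    totals = {}
    for country in countries:
        # Estimated Internet users in this country, in millions.
        online = country["population_m"] * country["internet_penetration"]
        for language, share in country["language_shares"].items():
            totals[language] = totals.get(language, 0.0) + online * share
    return totals


# Hypothetical placeholder records, NOT real statistics.
countries = [
    {
        "name": "Country A",
        "population_m": 100.0,
        "internet_penetration": 0.5,
        "language_shares": {"lang_x": 0.8, "lang_y": 0.2},
    },
    {
        "name": "Country B",
        "population_m": 40.0,
        "internet_penetration": 0.25,
        "language_shares": {"lang_x": 0.5, "lang_z": 0.5},
    },
]

estimates = internet_users_by_language(countries)
# Rank languages by estimated Internet users, largest first.
ranking = sorted(estimates.items(), key=lambda item: item[1], reverse=True)
for language, users in ranking:
    print(f"{language}: {users:.1f} million")
```

Because language shares in one country can overlap (bilingual speakers), totals built this way may count some users more than once.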
While I agree with almost all of the main points made by Evgeny Morozov in his latest opinion piece, “Who’s the true enemy of internet freedom – China, Russia, or the US?”, which rightly criticizes “the aggressive efforts of Washington to exploit the fact that so much of the world’s communications infrastructure is run by Silicon Valley”, I do not like its negative tone and overall conclusion. In this essay, I want to strike a more positive tone, not only to stay more hopeful but also to introduce the notion that digital sovereignty can promote internet freedom and cyber trust, using cases such as Finland and Sweden.
All too often, cyber-attacks and cyber-spying (cyber espionage) activities are reported as country-level aggregate numbers, showing China, the U.S., etc. as the most active countries. These reports should be normalized by “factoring out the size of the domain when you wish to compare”. This blog post demonstrates how this can be done using the pyCountrySize project that I have developed, applied to one indicator from Akamai’s Internet attack traffic report and to the Ghostnet incident (a cyber-spying event).
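One simple way to “factor out the size of the domain” is to compare each country's share of incidents with its share of some size indicator, such as Internet population. The sketch below is written in plain Python and does not use the pyCountrySize API. The function name and all the numbers are hypothetical placeholders, not figures from Akamai or any other report.

```python
def normalize_by_size(counts, sizes):
    """Return each country's share of incidents divided by its share of size.

    A ratio above 1.0 means the country is over-represented relative to
    its size; a ratio below 1.0 means it is under-represented.
    """
    total_count = sum(counts.values())
    # Only consider the countries that appear in the incident counts.
    total_size = sum(sizes[country] for country in counts)
    ratios = {}
    for country, count in counts.items():
        count_share = count / total_count
        size_share = sizes[country] / total_size
        ratios[country] = count_share / size_share
    return ratios


# Hypothetical placeholder values, NOT real statistics.
incident_counts = {"Country A": 60, "Country B": 30, "Country C": 10}
internet_population_m = {"Country A": 600, "Country B": 100, "Country C": 300}

ratios = normalize_by_size(incident_counts, internet_population_m)
for country, ratio in sorted(ratios.items(), key=lambda item: item[1], reverse=True):
    print(f"{country}: {ratio:.2f}")
```

In this made-up example, Country A has the most incidents in absolute terms but is exactly proportional to its size, while the much smaller Country B is over-represented by a factor of three.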
This blog post documents my preliminary work on extracting and repackaging the Country Size data sets for Python users (pyCountrySize). It currently contains the following country-size indicators: Population (in millions, LP from the IMF WEO), Economy Size (in billions, PPPGDP from the IMF WEO), Internet Population (in millions, derived from the IMF WEO population data set and ITU Internet penetration rates), and Internet Hosts (in millions, extracted from the CIA World Factbook). This post demonstrates how the package can be used in Python, how it can support meaningful cross-country comparisons, and how it can be applied systematically to Internet-related data sets to ask and answer questions about the fair (more equal or more proportional) distribution of Internet resources and responsibilities. The results should not only put China and the U.S. in their places but also open new spaces for us to look for better ideas among countries beyond them.
Thanks to the kind invitation of Dr. Tyng-Ruey Chuang, a tenured associate research fellow at the Institute of Information Science, I will give a research seminar talk on Tuesday, 23 December 2014, titled “Big data industry of online expressions and attention”.