
The deadline is approaching for P&I’s Special Issue on Social Data Science and the Chinese Web.

The deadline (1 March 2015) is approaching for the Policy and Internet special issue on Social Data Science and the Chinese Web. The call for papers can be found here: http://www.oii.ox.ac.uk/news/?id=1128

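The deadline is approaching, so please make good use of the remaining time.

Policy and Internet special issue: "Social Data Science and the Chinese Web"

Policy and Internet is the first major interdisciplinary journal to study the impact of the Internet on public policy. The journal plans to publish a special issue, "Social Data Science and the Chinese Web", in the second half of 2015. It will discuss the development of social data science in China and the wider Chinese-language web, including innovative uses of data science methods (such as big data) and research methods that can advance the study of the Chinese web. The call is open to authors worldwide, submissions must be in English, and the deadline is 1 March 2015. The full call for papers in English is available at: http://www.oii.ox.ac.uk/news/?id=1128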

Chinese language surpassing English in Internet users any time soon?

In the previous blog post, I found that the gap between English-language and Chinese-language Internet users is closing, from about 55 million in 2011 to around 36 million in 2013. Does this mean that Chinese will surpass English in Internet users any time soon?

The size of economies by language: the top 20 languages based on CLDR Version 25 data

Using a similar approach and the same data sets as described in the previous blog post, we can also estimate the size of economies by language. I have summarized the findings here.
The result tells us the total size of the “formal economic activities” that can be tapped into by targeting each group of language users. One of the major caveats of using the Unicode CLDR Version 25 data is also discussed here.

Internet users by language: the top 20 languages based on CLDR Version 25 data

One tangible way to get a sense of what the new Internet world is like is to see Internet users through the lens of languages. Based on the language user percentages for each country from the Unicode CLDR Version 25 data, the population data from the IMF World Economic Outlook Database (October 2014), and the Internet penetration rates from the ITU, I have compiled a table of the top 20 languages by number of Internet users in the world. There are interesting developments for languages such as Arabic, Russian, Hindi, Wuu Chinese, and Egyptian Arabic, and of course the closing gap between English and Chinese.
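A minimal sketch of how the three inputs named above could be combined is given here. It assumes that the users of a language in a country are online at the same rate as that country's population as a whole, which is an assumption of this sketch and not something the excerpt states. The country codes, language names, and all numbers are hypothetical placeholders, not real CLDR, IMF, or ITU figures.

```python
# Hypothetical placeholder inputs (NOT real CLDR / IMF / ITU figures).
population_millions = {"AA": 100.0, "BB": 50.0}
internet_penetration = {"AA": 0.50, "BB": 0.80}  # fraction of population online
# Share of each country's population that uses each language.
# Shares may sum to more than 1 because people can use several languages.
language_share = {
    "AA": {"lang_x": 0.90, "lang_y": 0.20},
    "BB": {"lang_y": 1.00},
}


def internet_users_by_language(population, penetration, shares):
    """Sum estimated Internet users (in millions) per language across countries.

    Assumes each language group is online at the country-wide rate.
    """
    totals = {}
    for country, langs in shares.items():
        online = population[country] * penetration[country]
        for lang, share in langs.items():
            totals[lang] = totals.get(lang, 0.0) + online * share
    return totals


def top_languages(totals, n=20):
    """Return the n languages with the most estimated Internet users."""
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:n]


if __name__ == "__main__":
    totals = internet_users_by_language(
        population_millions, internet_penetration, language_share
    )
    for lang, users in top_languages(totals):
        print(f"{lang}: {users:.1f} million")
```

Because language shares within a country can overlap, the per-language totals from a calculation like this should not be expected to add up to the world's Internet population.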

Data normalization of cyber-attacks and cyber-spying activities

All too often, cyber-attacks and cyber-spying (cyber espionage) activities are reported as aggregated numbers per country, showing China, the U.S., etc. as the most active countries. These reports should be normalized by “factoring out the size of the domain when you wish to compare”. This blog post demonstrates how this can be done with the pyCountrySize project that I have developed, applied to one indicator from Akamai’s Internet attack traffic report and to the Ghostnet incident (a cyber-spying event).
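The idea of "factoring out the size of the domain" can be sketched in a few lines of plain Python. This sketch does not use the pyCountrySize API; the function name `normalize_by_size`, the country codes, and all numbers are hypothetical placeholders chosen for illustration only.

```python
def normalize_by_size(counts, sizes):
    """Divide each country's raw count by its size indicator.

    Countries with a missing or zero size are skipped, because a
    per-size rate is undefined for them.
    """
    return {
        country: counts[country] / sizes[country]
        for country in counts
        if sizes.get(country)
    }


# Hypothetical placeholder data (NOT taken from any real report).
incident_counts = {"AA": 400, "BB": 100, "CC": 5}
internet_users_millions = {"AA": 800.0, "BB": 50.0}  # "CC" has no size data

per_million_users = normalize_by_size(incident_counts, internet_users_millions)

if __name__ == "__main__":
    # Rank by the normalized rate instead of the raw count.
    ranked = sorted(per_million_users.items(), key=lambda kv: kv[1], reverse=True)
    for country, rate in ranked:
        print(f"{country}: {rate:.2f} incidents per million Internet users")
```

In this toy data set "AA" has four times the raw count of "BB", yet "BB" ranks higher once the counts are divided by Internet population, which is the kind of reversal that raw country totals can hide.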

Exploring Country Size data sets: one systematic approach towards fair (more equal or more proportional) distribution of Internet resources and responsibilities

This blog post documents my preliminary work on extracting and repackaging the Country Size data sets for Python users (pyCountrySize). The package currently contains country-size indicators for Population (in millions, LP from the IMF WEO), Economy Size (in billions, PPPGDP from the IMF WEO), Internet Population (in millions, derived from the IMF WEO population data set and ITU Internet penetration rates), and Internet Hosts (in millions, extracted from the CIA World Factbook). This post demonstrates how it can be used in Python, how it can be used to conduct meaningful cross-country comparisons, and how it can be applied systematically to Internet-related data sets to ask and answer questions about the fair (more equal or more proportional) distribution of Internet resources and responsibilities. The results should not only put China and the U.S. in their places but also open new spaces for us to look for better ideas among countries beyond them.
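As a rough illustration of the "more proportional" question, the sketch below derives an Internet population from population and penetration figures and then compares each country's share of a resource with its share of Internet population. It is written in plain Python and does not use the pyCountrySize API. The function names, the ratio used as a proportionality measure, the country codes, and all numbers are assumptions made for this example.

```python
def internet_population(pop_millions, penetration_pct):
    """Internet population in millions = population x penetration rate (%)."""
    return {
        c: pop_millions[c] * penetration_pct[c] / 100.0
        for c in pop_millions
        if c in penetration_pct
    }


def shares(values):
    """Convert absolute values into fractions of the total."""
    total = sum(values.values())
    return {k: v / total for k, v in values.items()}


def proportionality_ratio(resource, size):
    """Resource share divided by size share, over countries present in both.

    A ratio of 1.0 means the resource is held in proportion to size;
    above 1.0 means more than proportional, below 1.0 means less.
    """
    common = [c for c in resource if size.get(c)]
    r = shares({c: resource[c] for c in common})
    s = shares({c: size[c] for c in common})
    return {c: r[c] / s[c] for c in common}


# Hypothetical placeholder data (NOT real IMF / ITU / CIA figures).
population = {"AA": 100.0, "BB": 50.0}
penetration = {"AA": 50.0, "BB": 80.0}
hosts_millions = {"AA": 10.0, "BB": 30.0}

net_pop = internet_population(population, penetration)
ratios = proportionality_ratio(hosts_millions, net_pop)

if __name__ == "__main__":
    for country, ratio in sorted(ratios.items()):
        print(f"{country}: {ratio:.2f}")
```

Other definitions of fairness (for example, an equal amount per country) would call for a different measure. The ratio here only captures the proportional reading.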

Big data industry of online expressions and attention

Thanks to the kind invitation of Dr. Tyng-Ruey Chuang, tenured associate research fellow at the Institute of Information Science, I will give a research seminar talk on Tuesday, 23 December 2014, titled “Big data industry of online expressions and attention”.
