Megacities Through the Lens of Social Media

Urbanization and Megacities

Between 1950 and 2014, the worldwide urban population grew from 746 million to 3.9 billion, and experts project it will reach 5 billion in 2030 and 6.3 billion by 2050. [1] This growth is primarily due to a pronounced urbanization trend. While less than one-third of the global population was urban in 1950 (29.5 percent), more than half (53.6 percent) is urban today, and two-thirds (66.4 percent) is projected to be urban by 2050. This urbanization trend is uneven: it does not lead to more urban areas but rather to bigger metropolitan areas, with megacities [2] expected to grow at a faster pace than other urban settlements.

Since the 1970s, the number of megacities has more than tripled (from eight to 34), and it is expected to roughly double again by 2050 (to more than 60). Almost all of these new megacities are emerging in geopolitical hotspots of the developing world, primarily in Southeast Asia and sub-Saharan Africa. [1,3] The U.S. Department of Defense, therefore, must consider the challenges presented by engagement in such environments when planning for its future.

In a recent article, Maj. Christopher Bowers drew analogies between U.S. Army operations in 2004 and 2008 in Sadr City (a Shiite-controlled, impoverished suburban district of Baghdad) and the projected challenges of future operations in megacities. [4] He focused primarily on the challenges of scale, human terrain variations and governance. Indeed, the physical challenges of operating in such dense, highly three-dimensional, socially uneven and often ungovernable environments are immense.

However, and this is the key focus of this contribution, the advanced functional complexity of these large urban environments further compounds these challenges: megacities function at the intersection of the physical, social and cyber spaces. Accordingly, military operations in these locations must be prepared to engage in environments where news, ideas and opinions are often shaped in cyberspace and propagated across the physical urban landscape. These processes lead to the formation and reformation of social networks to connect (or divide) populations, and facilitate the mobilization of these communities in response to ongoing events.

Across continents and events, from protests in the Arab world and disasters in the Far East, to reactions to terrorist activities in the West, social media has been the communication avenue of choice for the general public. [5,6] Advancing the capability to analyze crowd-generated content in the form of social media feeds is a substantial scientific challenge with considerable implications for future DoD operations.

Social Media and Intelligence

The term social media typically refers to services like Facebook, Twitter, Flickr and YouTube, which enable the general public to communicate with peers, sharing information instantly and constantly in an effortless and intuitive way. By bypassing the need for advanced computing skills to participate, and by fostering social interaction in cyberspace, social media revolutionized information dissemination and presented an alternate means for community formation.

Today, Facebook has nearly 1.5 billion monthly active users worldwide (exceeding the population of either China or India), while Instagram and Twitter have more than 400 million and 300 million users, respectively. [7] While these are global applications, a number of regional services exist as well. For example, the Chinese instant messaging platform Tencent QQ exceeds 800 million active accounts, while the Russian VKontakte service has 100 million local active users. These communities contribute massive amounts of crowd-generated data. Every minute, more than 300,000 status updates are posted on Facebook and 450,000 new tweets are generated, while 65,000 new photos are uploaded to Instagram, [8] leading to the emergence of a new big data paradigm. [9]

Analyzing the content of these contributions is all about finding connection patterns. Connections among users (e.g., formed as they respond to, or follow, other users) reveal the underlying social structure of the user community. Word co-occurrences lead to the formation of semantic connections among the terms used in social media (e.g., words that are used commonly together in the context of a particular discussion), and in doing so reveal the complex narrative of this public discourse. Connections among locations (e.g., coordinates from which the contributions originate, or of references to specific locations) reveal the geographical footprint of various communities. It is through the analysis of these multiple connections that one can decode the convoluted content of social media feeds. This can be of interest to a variety of applications, including intelligence.
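As a rough illustration of the semantic connections described above, the following minimal sketch counts word co-occurrences across a handful of posts. The sample posts, the stopword list and the function name `cooccurrence_counts` are all hypothetical, and the tokenization is deliberately crude; it is meant only to show the idea, not the method used in the cited studies.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_counts(posts, stopwords=frozenset()):
    """Count how often each unordered pair of terms appears in the same post."""
    pair_counts = Counter()
    for text in posts:
        # Crude tokenization: split on whitespace, strip punctuation, lowercase.
        tokens = {w.strip(".,!?#@:;\"'").lower() for w in text.split()}
        tokens = {t for t in tokens if t and t not in stopwords}
        # Every unordered pair of distinct terms in one post is one semantic link.
        for a, b in combinations(sorted(tokens), 2):
            pair_counts[(a, b)] += 1
    return pair_counts

# Hypothetical sample posts, for illustration only.
posts = [
    "Protest at the square tonight",
    "Police close the square",
    "Protest at the square grows as police arrive",
]
links = cooccurrence_counts(posts, stopwords={"at", "the", "as"})
print(links.most_common(3))
```

Pairs with high counts are the strongest edges of the semantic network; the same counting pattern applies to user-to-user and location-to-location connections.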

The significance of social media for intelligence was demonstrated quite vividly during the Arab Spring events across North Africa and the Middle East in early 2011. Platforms like Twitter, Facebook and YouTube were instrumental in reporting news from these events, [10] and in supporting the organization and coordination of related activities. [11] While this is widely considered a watershed moment for the use of social media in geopolitical events, it was not the first such instance. Twenty months before the Arab Spring, in June 2009, social media platforms were used to broadcast to the world real-time information from the clashes in the streets of Tehran following the rigged Iranian presidential election, bypassing the state-imposed crackdown on crisis coverage. [11]

Since these first glimpses of its communication power, social media has been used in response to natural disasters and the Fukushima nuclear accident in the Far East, used to communicate information following terrorist attacks in the streets of Boston and Paris, and abused by the Islamic State of Iraq and the Levant in the Middle East. [12,13,14]

Decoding Urban Complexity through Social Media Analysis

With megacities emerging as theaters of events and operations, a new framework is needed for monitoring, analysis and modeling. Toward this goal, megacities can be viewed alternatively as information hubs. Five years ago, in 2010, then-Google CEO Eric Schmidt pointed out that the world was generating nearly five exabytes of data every two days, the equivalent of the sum of information generated by humanity from the dawn of civilization up until the beginning of this millennium. [15] Most of that information is generated in cities through smart devices (from traffic cameras to smart appliances) or by their residents (through their social media activities). Accordingly, operations in megacities are operations in information-rich environments.

A novel framework for studying megacities, therefore, must be characterized by the collaborative use of authoritative (e.g., mapping and census data) and crowd-generated (e.g., content harvested from social media) content (see Figure 1). It is through the collaborative analysis of these data sources that one can fully capture the complex structure and functionality of these large urban areas. Lessons learned from ongoing studies within this emerging framework highlight some notable particularities of operating in such information-rich environments; they are listed below.

Figure 1. An emerging framework to study urban systems. (Released)

Cities Are More Complex Spaces Than Their Geometries

While the three-dimensional layout of a city remains important, cities cannot be viewed as pure geometrical spaces. Buildings and road networks enable and support certain actions and operations, and as such, it is critical to maintain the most up-to-date information for them; but cities are more than their geometries. Human activities and perceptions augment geometry by assigning to locations different sociocultural meaning, transforming these locations into places. While some of these places are well established and widely known (e.g., the theater district in Manhattan, or the artsy Bastille district in Paris), other places are more dynamic, occurring, for example, temporarily in response to particular events.

For example, Cairo’s Tahrir Square gained a totally different meaning on Jan. 25, 2011, when it was occupied by 50,000 protesters, marking the beginning of the revolution against Hosni Mubarak’s regime. [16] Capturing such information is becoming feasible through the analysis of crowd-generated content.

Figure 2 shows different sociocultural hotspots in Singapore, detected by analyzing tweets originating from the city over a period of a month. Tweets were classified by analyzing their content into one of various thematic categories, and spatial clusters were identified to mark the corresponding hotspots—in this particular case, entertainment, politics and military.
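The two steps described above, thematic classification followed by spatial clustering, can be sketched as follows. This is only a simplified stand-in under stated assumptions: the keyword lexicon `THEMES`, the grid cell size and the `min_count` threshold are hypothetical, and a plain grid count substitutes for whatever classifier and clustering method the underlying study actually used.

```python
import math
from collections import Counter, defaultdict

# Hypothetical keyword lexicon; a real study would likely use a trained classifier.
THEMES = {
    "entertainment": {"concert", "movie", "festival"},
    "politics": {"election", "parliament", "vote"},
    "military": {"army", "parade", "soldiers"},
}

def classify(text):
    """Return the theme with the most keyword hits, or None if nothing matches."""
    words = set(text.lower().split())
    hits = {theme: len(words & keywords) for theme, keywords in THEMES.items()}
    best = max(hits, key=hits.get)
    return best if hits[best] > 0 else None

def hotspots(tweets, cell_deg=0.01, min_count=2):
    """Bin classified tweets into lat/lon grid cells and keep dense cells per theme."""
    counts = defaultdict(Counter)
    for lat, lon, text in tweets:
        theme = classify(text)
        if theme is None:
            continue  # tweet matches no theme in the lexicon
        cell = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        counts[theme][cell] += 1
    return {theme: {cell: n for cell, n in cells.items() if n >= min_count}
            for theme, cells in counts.items()}

# Hypothetical geotagged tweets as (latitude, longitude, text).
tweets = [
    (1.2903, 103.8512, "Great concert tonight"),
    (1.2907, 103.8518, "Festival crowd near the bay"),
    (1.3521, 103.8198, "Election rally downtown"),
    (1.3005, 103.8003, "Stuck in traffic again"),
]
result = hotspots(tweets)
print(result)
```

Running the same procedure over successive time windows would show how the hotspots move or fade, which is the temporal variation the text refers to.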

Figure 2. Sociocultural hotspots in Singapore detected through the classification of tweets originating from these locations: entertainment (shades of red), politics (shades of blue) and military (shades of green). (Released)

In a similar manner, one could identify hotspots associated with other sociocultural issues that elicit public mentions in social media (e.g., health, finance) and at various levels of granularity (e.g., identifying references to a particular health issue as opposed to health at large). The aggregate of these hotspots is the equivalent of a semantic map of the city, identifying meaningful sociocultural subdivisions and their variations over time. Through this process one can identify, for example, friendly or hostile areas and their evolution in response to certain events; identify hotspots for or against a certain issue; and even monitor the progress of a civil unrest event in space and through time. [17] Harvesting such local knowledge directly from crowd-generated content offers the additional advantage of eliminating the potential biases often associated with cross-cultural analysis.

Containment is Challenging in a Networked World

By substituting physical with virtual interaction, social media has introduced a novel avenue for community building, transcending established boundaries to diffuse ideas and information across space. This leads to the formation of highly connected communities that are spatially distributed. As a result, an area of operations is no longer geometrically bound: the individuals operating within it may be connected (virtually, even though not physically) to other groups or individuals at distant locations, beyond the boundaries of that area of operations. Accordingly, events that occur at these distant locations may affect the area of operations, often in an unpredictable manner.

As an example, Figure 3A shows the formation of an international community through the discourse on Twitter regarding Syria. [18] Remote communities participate in this debate, influencing and being influenced by the local Syrian community. The effect of this process is to connect these remote locations, creating a virtual community that transcends space, comprising locals and foreigners alike.

Figure 3A (left): The discussion about Syria on Twitter. The size of the nodes indicates the level of participation by different international communities, proportional to the number of tweets originating from these countries normalized by the local populations.

Figure 3B (right): Social networks embedded within a geographical context, leading to connected, non-contiguous areas of operations. (Released)

From an operational standpoint, this is visualized in Figure 3B: three distinct neighborhoods (the three disjointed gray blobs) are linked through the connections of individuals or groups within them (the colored nodes, with each color denoting a particular online community). While nodes within each neighborhood are connected (via spatial proximity) to other nodes within it, certain nodes are also connected (via social proximity) to distant nodes in the other neighborhoods. Accordingly, operations within each neighborhood would be affected by the events occurring in the rest, establishing a non-contiguous area of operations. Advancing the ability to identify these connections and gaining a better understanding of the footprint of an area of operations could lead to substantial operational benefits.
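One simple way to make this idea concrete is to treat neighborhoods as linked whenever at least one online tie crosses between them, and then group neighborhoods that are linked directly or indirectly. The sketch below does this with a basic graph traversal; the user names, neighborhood labels and the function `linked_neighborhoods` are hypothetical, and real analyses would likely weight ties rather than treat them as all-or-nothing.

```python
from collections import defaultdict

def linked_neighborhoods(home, social_ties):
    """Group neighborhoods joined, directly or indirectly, by cross-neighborhood ties.

    home: dict mapping each user to the neighborhood where the user is located.
    social_ties: iterable of (user_a, user_b) online connections.
    """
    adjacency = defaultdict(set)
    for hood in home.values():
        adjacency[hood]  # touch the key so isolated neighborhoods are included
    for a, b in social_ties:
        hood_a, hood_b = home[a], home[b]
        if hood_a != hood_b:  # only ties that cross neighborhood boundaries matter
            adjacency[hood_a].add(hood_b)
            adjacency[hood_b].add(hood_a)

    seen, groups = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        # Depth-first traversal collects every neighborhood reachable from start.
        stack, group = [start], set()
        while stack:
            hood = stack.pop()
            if hood in group:
                continue
            group.add(hood)
            stack.extend(adjacency[hood] - group)
        seen |= group
        groups.append(group)
    return groups

# Hypothetical users and neighborhoods.
home = {"u1": "A", "u2": "A", "u3": "B", "u4": "C", "u5": "D"}
ties = [("u1", "u2"), ("u2", "u3"), ("u3", "u4")]
groups = linked_neighborhoods(home, ties)
print(groups)
```

In this toy example neighborhoods A, B and C form one non-contiguous area of operations, while D remains on its own.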

Information Authority is a New Challenge

In the new paradigm of information dissemination through social media, information authority is a challenge. Figure 4 shows the network of retweets (top) and the spatial distribution (bottom) of Twitter traffic during the first 10 minutes after the Boston Marathon bombing. The network of retweets captures the communities formed through retweet activities: once a user retweets a post, the user is connected to the original author. Through this process, communities emerge as node clusters, indicating groups of users that are sharing stories. Bigger communities (i.e., ones with larger membership) are more influential than smaller ones.
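The construction of such a retweet network can be sketched in a few lines. The example below links each retweeter to the original author, treats connected groups of users as communities, and picks the most retweeted account in each group as its central node. These are simplifying assumptions: the account names are hypothetical, and connected components stand in for whatever community detection method produced Figure 4.

```python
from collections import Counter, defaultdict

def retweet_communities(retweets):
    """Build a retweet network and return (central_author, members), largest first.

    retweets: iterable of (retweeter, original_author) pairs.
    """
    neighbors = defaultdict(set)
    times_retweeted = Counter()
    for retweeter, author in retweets:
        # A retweet connects the retweeting user to the original author.
        neighbors[retweeter].add(author)
        neighbors[author].add(retweeter)
        times_retweeted[author] += 1

    seen, communities = set(), []
    for start in list(neighbors):
        if start in seen:
            continue
        # Collect every user reachable from start through retweet links.
        stack, members = [start], set()
        while stack:
            user = stack.pop()
            if user in members:
                continue
            members.add(user)
            stack.extend(neighbors[user] - members)
        seen |= members
        # Central node: the most retweeted member (ties broken by name).
        central = max(members, key=lambda u: (times_retweeted[u], u))
        communities.append((central, members))
    communities.sort(key=lambda c: len(c[1]), reverse=True)
    return communities

# Hypothetical accounts, for illustration only.
retweets = [
    ("user_a", "news_one"), ("user_b", "news_one"), ("user_c", "news_one"),
    ("user_b", "user_a"), ("user_d", "city_police"),
]
communities = retweet_communities(retweets)
for central, members in communities:
    print(central, len(members))
```

Ranking communities by membership mirrors the observation that bigger communities are more influential, and comparing the central accounts shows at a glance whether official sources lead or trail the discussion.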

Figure 4. Top: A retweet network formed through interactions during the first 10 minutes after the Boston Marathon bombing of April 15, 2013.

Figure 4. Bottom: The geographical distribution of these retweets. (Released)

Figure 4 overlays upon each community cluster the name of its central node, i.e., a highly influential member of that community. It is interesting to observe that during this critical period the top news-disseminating node was not an official government account: Boston Police is rather peripheral in the discussion, representing a relatively small community at the upper right-hand side of the graph.

Instead, the top news-disseminating node at that time was the Twitter account of Anonymous (@YourAnonNews), surpassing even official news organizations (@cnnbrk, @nypost, @BostonGlobe) and the ever-popular celebrities (@AlfredoFlores, @LilTunechi). This exemplifies the challenge of authority in this participatory information ecosystem: a large part of the population was getting its information from unvetted sources, and as such may be vulnerable to the manipulative dissemination of misinformation.

Given the spatial footprint of these communities (Figure 4, bottom), one can easily see that this information may very well be provided by overseas accounts. To further emphasize this vulnerability, in the fall of 2014, Russian government-affiliated hackers tested their abilities to disseminate false information and spread panic. In September 2014, the hackers used fake social media accounts to fabricate a story about a fictional disaster at a real chemical plant in Louisiana, and followed in December 2014 with posts reporting a fake outbreak of Ebola in Atlanta. [19]

These challenges represent a new type of cybersecurity concern, where the issue is not denial of service (as is usually the case with traditional cybersecurity attacks) but rather the denial of information, or the spread of misinformation. Accordingly, operations in information-rich urban areas may be subject to such challenges, leading to highly volatile environments.


Conclusion

Megacities are challenging operational environments, as they function at the intersection of the physical, social and cyber spaces. By viewing them as information hubs, the DoD can gain a better understanding of the way in which they are organized and operate. Therefore, a novel approach for studying megacities is emerging, characterized by the collaborative use of authoritative and crowd-generated content.

Harvesting information from social media allows the military to capture the complex sociocultural multidimensionality and the multiple links that characterize these modern urban environments. It also offers the added advantage of gaining such knowledge directly from the local population. In this approach, data is an operational commodity. However, the immersion in such a data-rich framework comes at the cost of a challenged authority, with official government agencies enjoying only a limited presence compared to other leaders of the social media ecosystem. Refining analytical capabilities will help overcome this challenge and take full advantage of the presented opportunities.


References

1. United Nations (2014, July 10). World Urbanization Prospects: The 2014 Revision. UN Department of Economic and Social Affairs / Population Division. Retrieved from (accessed January 13, 2016).

2. Oxford English Dictionary. The Definition of a Megacity (A city with a population of 10 million or more). Retrieved from (accessed January 13, 2016).

3. Pasick, A. (2014, July 11) Almost all of the world’s largest cities will be in Asia and Africa by 2030. Retrieved from (accessed January 13, 2016).

4. Bowers, C. O. (2015). Future Megacity Operations—Lessons from Sadr City. Military Review. Retrieved from
20150630_art006.pdf (accessed January 13, 2016).

5. Gerbaudo, P. (2012). Tweets and the streets: Social media and contemporary activism. London, England: Pluto Press.

6. Gao, H., Barbier, B., & Goolsby, R. (2011). Harnessing the crowdsourcing power of social media for disaster relief. IEEE Intelligent Systems, (3), 10-14. Retrieved from (accessed January 13, 2016).

7. (2015, November 1). Leading social networks worldwide as of November 2015, ranked by number of active users (in millions). Retrieved from (accessed January 13, 2016).

8. (2015, August 25). How much data is generated every minute on top digital and social media? Retrieved from (accessed January 13, 2016).

9. Croitoru, A., Crooks, A. T., Radzikowski, J., Stefanidis, A., Vatsavai, R. R., & Wayant, N. (2014). Geoinformatics and social media: A new big data challenge. Big Data Techniques and Technologies in Geoinformatics, CRC Press, Boca Raton, FL, 207-232.

10. Shane, S. (2011, January 29). Spotlight again falls on web tools and change. The New York Times. Retrieved from (accessed January 13, 2016).

11. Pollock, J. (2011, September/October). Streetbook: How Egyptian and Tunisian Youth Hacked the Arab Spring. Technology Review. Retrieved from (accessed January 13, 2016).

12. Stefanidis, A., Crooks, A., & Radzikowski, J. (2013). Harvesting ambient geospatial information from social media feeds. GeoJournal, 78(2), 319-338. Retrieved from (accessed January 13, 2016). doi: 10.1007/s10708-011-9438-2

13. Rosen, R. (2015, November 13). “They are slaughtering by one.” Hostage posts to Facebook during Paris attack. Retrieved from hostage-post-social-media-facebook-twitter-benjamin-cazenoves-bataclan-eagles-death-metal/ (accessed January 13, 2016).

14. Berger, J., & Morgan, J. (2015, March). The ISIS Twitter Census: Defining and describing the population of ISIS supporters on Twitter. Retrieved from (accessed January 13, 2016).

15. Siegler, M. G. (2010, August 4). Eric Schmidt: Every two days we create as much information as we did up to 2003. TechCrunch.com. Retrieved from (accessed January 13, 2016).

16. Aboelezz, M. (2014). The Geosemiotics of Tahrir Square: A study of the relationship between discourse and space. Journal of Language and Politics, 13(4), 599-622. Retrieved from (accessed January 13, 2016). doi: 10.1075/jlp.13.4.02abo

17. Croitoru, A., Wayant, N., Crooks, A., Radzikowski, J., & Stefanidis, A. (2015). Linking cyber and physical spaces through community detection and clustering in social media feeds. Computers, Environment and Urban Systems, 53, 47-64. doi: 10.1016/j.compenvurbsys.2014.11.002

18. Stefanidis, A., Cotnoir, A., Croitoru, A., Crooks, A., Rice, M., & Radzikowski, J. (2013). Demarcating new boundaries: Mapping virtual polycentric communities through social media content. Cartography and Geographic Information Science, 40(2), 116-129. Retrieved from (accessed January 13, 2016). doi: 10.1080/15230406.2013.776211

19. Chen, A. (2015, June 2) The agency. The New York Times. Retrieved from (accessed January 13, 2016).
