Mobile phone handover data for measuring and analyzing human population mobility in western Ethiopia: implication for malaria epidemiology and elimination efforts | Malaria Diary

Study framework

This study was conducted in the Benishangul-Gumuz and Gambella regions of western Ethiopia (Fig. 1). These two regions were selected because they have the highest malaria burdens in Ethiopia and are regionally significant for malaria epidemiology. About half of Ethiopia’s population (52.9 per 100 people) subscribes to Ethio Telecom’s mobile service. The total population of Benishangul-Gumuz is estimated at 1,109,565, of whom 50.8% are male; 78.4% of the population lives in rural areas. There are 10 urban areas in the region. Assosa, the regional capital, is located 665 km from Addis Ababa. The region consists of 20 districts and three municipal administrations, divided into 3 zones. Pawe, Metekel, Assosa and Kamashi are the major population centers (towns) in the region. The total area of the region is approximately 49,289.46 km², with over 784,345 inhabitants and an estimated population density of 15.91 people per km². Its altitude varies from 580 to 2731 m. The hottest period is from February to April, with temperatures of 35℃ to 40℃. Annual rainfall ranges from 800 to 2000 mm. About 97% of the population is at risk of contracting malaria, and 98% of all Kebeles in the region are malarious. Various public and private development activities, such as agriculture and mining, are currently underway [20].

Fig. 1

Study sites and main administrative units

The Gambella region has 3 zones with 13 districts and a municipal administration; its districts include Gambella, Abobo, Lare, Itang, Jor and Akobo. The region covers an area of 29,782.82 km². There are 25 urban areas and 237 rural areas in the region. The total population of the region is 468,017. In addition, there are approximately 271,000 refugees in the region [21]. The average annual temperature ranges from 17.3℃ to 28.3℃. Annual rainfall ranges from 900 to 1500 mm at lower elevations and from 1900 to 2100 mm at higher elevations. The altitude varies from a minimum of 300 m to a maximum of 2200 m. Malaria transmission is perennial in all the Kebeles of the region. Agricultural and mining development activities are ongoing in the area [22].

Data acquisition

This study was based on an anonymized dataset from about 121 and 17 cell towers (each with geographic coordinates) in the Benishangul-Gumuz and Gambella regions, respectively. Inter-cell and intra-cell handover (transfer) data were obtained from these sites. In areas with low traffic and sparsely distributed towers (i.e. rural areas), only handover data were used. A total of 330,732,648 handovers were observed in the Benishangul-Gumuz region, while 696,010,712 handovers were observed in the Gambella region. The data come from Ethio Telecom, the only mobile operator in Ethiopia.

The distribution of cell towers follows the spatial distribution of population density, with denser clustering in urban areas and sparser coverage in suburban and rural areas of both regions. Using data from the operations support system (OSS) network, it was possible to retrieve active mobile users together with mobility-related information (handovers (HO), distribution of active users per cell, traffic distribution, traffic channel (TCH) data) from February 2019 to the end of August of the same year for the Benishangul-Gumuz region (7 months in total) and from January to October 2020 (10 months in total) for the Gambella region. Our goal was to capture short-term (hourly) and long-term trends or seasonality in travel behavior; the study period includes significant public and religious holidays, which may show different mobility patterns.

To model user mobility, the most important data are handovers (HO), the number of active users in the specified time period, the distribution of traffic over the entire network, and location update (LU) statistics. HO reflects the movement of connected (active) users, whereas LU does the same only for inactive (idle) users (note that connected users do not use the SDCCH to update their location, but the TCH). In addition, a mobility model contains a set of rules that make it possible to predict statistically how long a call will hold a channel in a cell and if or when the call will generate a handover request to the closest cell.
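The kind of rule a mobility model encodes can be illustrated with a standard teletraffic simplification. The sketch below assumes that both the call duration and the time a user stays in one cell are exponentially distributed; the source does not state which distributions the authors used, and the function names and the numbers in the example are hypothetical.

```python
def handover_probability(mean_call_s, mean_dwell_s):
    """Probability that a call leaves the cell before it ends.

    Assumes exponential call duration (rate mu) and exponential cell
    dwell time (rate eta); then P(handover) = eta / (mu + eta).
    """
    mu = 1.0 / mean_call_s    # call completion rate, per second
    eta = 1.0 / mean_dwell_s  # cell boundary crossing rate, per second
    return eta / (mu + eta)


def mean_channel_holding_s(mean_call_s, mean_dwell_s):
    """Mean time a call occupies a channel in one cell.

    The channel is released when the call ends or the user leaves the
    cell, whichever comes first, so the mean is 1 / (mu + eta).
    """
    return 1.0 / (1.0 / mean_call_s + 1.0 / mean_dwell_s)


# Hypothetical values: 2-minute calls, 6-minute average stay in a cell.
p_handover = handover_probability(120.0, 360.0)
holding_s = mean_channel_holding_s(120.0, 360.0)
```

Under these assumed values a quarter of calls would trigger a handover, and a channel would be held for 90 seconds on average. Shorter dwell times (faster-moving users or smaller cells) push the handover probability up, which is why handover counts can serve as a proxy for movement.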

The dataset is based on anonymous traffic records collected throughout the study area. Each record includes a timestamp, an anonymous cell ID and a location area code (LAC). The daily handover records are abundant and provided a sufficient basis for analysis and evaluation. The dataset was also sorted and categorized by temporal and spatial properties, such as day/night and rural/urban/dense urban. Rural areas were taken to be countryside and villages, while urban areas included only cities.
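A minimal sketch of this kind of categorization is shown here, using only the three record fields named in the text. The sample records, the cell-to-area mapping and the 06:00 to 18:00 day window are all assumptions made for illustration; the source does not specify how day and night were delimited.

```python
from collections import Counter
from datetime import datetime

# Hypothetical records: (ISO timestamp, anonymous cell ID, location area code).
RECORDS = [
    ("2019-02-01T07:15:00", "cell_A", 101),
    ("2019-02-01T07:45:00", "cell_B", 101),
    ("2019-02-01T19:05:00", "cell_A", 101),
    ("2019-02-01T23:30:00", "cell_C", 102),
]

# Assumed lookup from cell ID to area type; not provided by the source.
AREA_TYPE = {"cell_A": "urban", "cell_B": "rural", "cell_C": "rural"}


def period_of_day(hour, day_start=6, day_end=18):
    """Label an hour as 'day' or 'night' (the 06:00-18:00 split is assumed)."""
    return "day" if day_start <= hour < day_end else "night"


def summarise(records, area_type):
    """Count records per hour of day and per (period, area type) category."""
    hourly = Counter()
    by_category = Counter()
    for stamp, cell_id, _lac in records:
        hour = datetime.fromisoformat(stamp).hour
        hourly[hour] += 1
        by_category[(period_of_day(hour), area_type[cell_id])] += 1
    return hourly, by_category


hourly, by_category = summarise(RECORDS, AREA_TYPE)
```

Here `hourly` gives the hourly profile and `by_category` the day/night by rural/urban breakdown. A "dense urban" class would only need another value in the lookup table.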

Data on the host-seeking behavior of mosquitoes were obtained from the entomological work of the research teams and from other publications [23, 24]. Briefly, mosquito collection in Gambella was conducted for a total of eight man-nights, with collection running from 6:00 p.m. to 6:00 a.m. A similar study was conducted by another team in Benishangul-Gumuz. Mosquitoes were collected between July and December 2017, from 6:00 p.m. to 6:00 a.m.

Data analysis

In this research, a human mobility algorithm was used to analyze information at the mobile phone level. Spatial patterns and characteristics of human mobility were studied at different spatial and temporal scales. A scale factor was calculated to convert the value of the mobility parameter for use in a human mobility model. Human mobility patterns and the relative probability of human mobility were determined for different scenarios using parameter values based on plausible assumptions (detailed in Supplementary File 1) [25,26,27,28,29,30,31]. Hourly movement patterns in the study settings were compared with the hourly feeding behavior of malaria vector mosquitoes to understand their implications for malaria ecology.
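One simple way to compare two hourly profiles is to convert each to proportions and measure how much they overlap. The sketch below uses an overlap coefficient (the sum of the hour-by-hour minimum of the two proportions); this is an illustrative choice, not the comparison method reported by the authors, and the counts are hypothetical.

```python
def to_proportions(counts):
    """Rescale raw counts so that they sum to 1."""
    total = sum(counts)
    if total == 0:
        raise ValueError("counts must not sum to zero")
    return [c / total for c in counts]


def overlap_coefficient(p, q):
    """Shared area of two discrete distributions (0 = disjoint, 1 = identical)."""
    if len(p) != len(q):
        raise ValueError("profiles must cover the same time bins")
    return sum(min(a, b) for a, b in zip(p, q))


# Hypothetical night-time bins; values are not taken from the study.
bins = ["18:00", "21:00", "00:00", "03:00"]
handovers = [40, 30, 20, 10]  # human movement events per bin
bites = [5, 10, 20, 15]       # mosquitoes caught host-seeking per bin

overlap = overlap_coefficient(to_proportions(handovers), to_proportions(bites))
```

A value near 1 would mean people move most at the hours when vectors bite most; a value near 0 would mean the two activities are separated in time.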


MapInfo Pro 16, Matlab 2018b and Quantum GIS were used for visualization and analysis. MapInfo was used to create raster maps that reflect actual human mobility. The software can render rasters and can combine data from various sources and display them on a single seamless map. To generate a smooth map, a kernel density estimation method was applied to the raster, which estimates the density of mobility data in a given area.
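The idea behind kernel density estimation can be sketched in a few lines. The example below uses a Gaussian kernel on planar coordinates, with each tower weighted by its handover count; the kernel type, bandwidth, coordinates and weights are assumptions for illustration, since the source does not report the settings used in MapInfo.

```python
import math


def gaussian_kde_grid(points, weights, xs, ys, bandwidth):
    """Weighted 2-D Gaussian kernel density evaluated on a regular grid.

    points    : list of (x, y) tower positions in planar units
    weights   : one non-negative weight per point (e.g. handover count)
    xs, ys    : grid coordinates along each axis
    bandwidth : kernel standard deviation, in the same units as the points
    Returns grid[row][col], with rows following ys and columns following xs.
    """
    norm = 1.0 / (2.0 * math.pi * bandwidth ** 2 * sum(weights))
    grid = []
    for y in ys:
        row = []
        for x in xs:
            total = 0.0
            for (px, py), w in zip(points, weights):
                dist_sq = (x - px) ** 2 + (y - py) ** 2
                total += w * math.exp(-dist_sq / (2.0 * bandwidth ** 2))
            row.append(norm * total)
        grid.append(row)
    return grid


# Hypothetical towers: a busy one at the origin, a quieter one 4 units east.
towers = [(0.0, 0.0), (4.0, 0.0)]
handover_counts = [3.0, 1.0]
density = gaussian_kde_grid(towers, handover_counts, [0.0, 2.0, 4.0], [0.0], 1.0)
```

In this toy grid the estimated density is highest at the busy tower, lower at the quiet tower, and lowest midway between them. A larger bandwidth would smooth the two peaks into one.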