- 95% of people in the world use mobile phones.
- Researchers have begun using call detail records to predict mobile phone location.
- This kind of big data is challenging because it's huge and non-uniform.
- Predicting mobility patterns could be useful for traffic and urban planning.
Predicting mobile phone location using big data
Scientists studying urban deer populations use tracking collars to learn about animal behavior in urban environments. But what if scientists want to study human behavior in urban environments? Recently, an international team of researchers took the next step in studying call detail record (CDR) data — and were able to predict the next movements of mobile phone users with a high degree of accuracy.
It’s estimated that 95% of people in the world already have “tracking devices” on them at all times — their mobile phones. This makes for a ready field of study. Indeed, using CDR data to model human mobility has become common. Unlike deer, however, people don’t have to be trapped first.
To predict cell phone users' movement, the researchers applied sequential pattern mining techniques to real anonymized data obtained from one of the largest mobile phone operators in Turkey. They studied a region covering roughly 25,000 square kilometers (around 9,650 square miles) with an average of 30 records per day for each user.
Making sense of all that big data, which came from more than a million mobile phone users over a period of one month, was challenging because of its huge size and non-uniform nature. The standard AprioriAll method, a sequential pattern mining technique based on the Apriori algorithm, would have been too computationally costly, and the researchers were only interested in patterns of a given length. So they modified their frequent pattern extraction algorithm to make a single pass over the data.
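To give a feel for why fixing the pattern length allows a single pass, here is a minimal Python sketch. It is not the researchers' algorithm: the function name `frequent_patterns`, the `min_support` threshold, the toy location labels, and the choice to count only contiguous patterns are all assumptions made for illustration.

```python
from collections import Counter

def frequent_patterns(sequences, length, min_support):
    """Count fixed-length contiguous patterns in one pass over the sequences.

    Hypothetical helper for illustration; not the method from the study.
    """
    counts = Counter()
    for seq in sequences:
        # A set ensures each pattern is counted at most once per sequence,
        # so the count is the number of sequences that contain the pattern.
        seen = {tuple(seq[i:i + length]) for i in range(len(seq) - length + 1)}
        counts.update(seen)
    # Keep only patterns that occur in at least `min_support` sequences.
    return {p: c for p, c in counts.items() if c >= min_support}

# Toy daily location sequences (made-up labels, not real data).
sequences = [
    ["A", "B", "C", "D"],
    ["A", "B", "C"],
    ["B", "C", "D"],
    ["A", "C", "D"],
]
patterns = frequent_patterns(sequences, length=3, min_support=2)
print(patterns)
```

Because the length is known in advance, there is no need to build up candidates level by level and rescan the data for each level, which is where Apriori-style methods spend much of their time.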
“On the computing side, we used a distributed architecture,” said Pinar Karagoz, corresponding author for the study. “For problems involving big data, like our problem, distributed computer architecture is more suitable. Our distributed computer system is similar to systems used by Google and Facebook but much smaller.”
The researchers concentrated on three sub-problems: predicting the location and time of the next activity of any given mobile phone user, predicting the location of the next activity of the user when their location changes, and predicting both the location and the time of the activity of the user when their location changes. The results were recently published in The Computer Journal.
While CDR data is mainly used for generating customer invoices, it also contains location information. However, user location information is not always precise since it corresponds to cell tower location. Because the distribution of cell towers is uneven, the estimated user location has error margins.
The researchers wanted to determine common daily patterns regardless of activity. A frequent user movement pattern may correspond to a few hours of any given day. When a new sequence emerges, researchers can compare it to existing sequences, and thus try to predict the potential next location and time. Using this approach, the number of frequent patterns becomes much smaller and easier to handle.
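The matching step can be pictured with a short Python sketch. It assumes frequent patterns are stored as tuples of (location, time slot) pairs with a support count, and that the prediction is simply the final element of the best-supported pattern whose beginning matches the user's most recent activity. The function `predict_next`, the data layout, and the toy values are hypothetical, not taken from the study.

```python
def predict_next(patterns, recent):
    """Return the most supported continuation of `recent`, or None if nothing matches.

    `patterns` maps tuples of (location, slot) pairs to support counts.
    Hypothetical helper for illustration only.
    """
    best, best_support = None, 0
    for pattern, support in patterns.items():
        prefix, last = pattern[:-1], pattern[-1]
        if not prefix or len(recent) < len(prefix):
            continue  # pattern too short to split, or not enough history
        # Compare the tail of the new sequence with the pattern's prefix.
        if tuple(recent[-len(prefix):]) == prefix and support > best_support:
            best, best_support = last, support
    return best

# Toy patterns: (location, time slot) pairs with made-up support counts.
patterns = {
    (("home", 0), ("work", 1)): 40,
    (("home", 0), ("gym", 1)): 5,
    (("work", 1), ("home", 2)): 30,
}
prediction = predict_next(patterns, [("home", 0)])
print(prediction)
```

When no stored pattern matches, the sketch returns `None` rather than guessing, which reflects the trade-off between accuracy and the number of predictions a model is able to make.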
Exact user activity times were generally not significant, so the researchers used a simple abstraction approach and divided each day into predefined time intervals. If no action was recorded in a given time interval, they dropped it from the daily user activity sequence (DUAS). If more than one action was recorded, they selected the most frequent location as the default. Then they converted the DUAS into a sequence of daily user location–time pairs obtained under the time abstraction.
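The time abstraction described here can be sketched in a few lines of Python. The four-hour interval width, the function name `build_duas`, and the record layout (hour of day, base station ID) are assumptions for illustration; the article does not say which intervals the researchers actually used.

```python
from collections import Counter, defaultdict

def build_duas(records, slot_hours=4):
    """Turn one user's records for one day into (location, time slot) pairs.

    `records` is an iterable of (hour_of_day, base_station_id) tuples.
    Hypothetical helper; the interval width is an assumed parameter.
    """
    by_slot = defaultdict(list)
    for hour, station in records:
        by_slot[hour // slot_hours].append(station)
    # Intervals with no records never appear in `by_slot`, so they are
    # dropped from the sequence without any extra handling.
    return [
        (Counter(stations).most_common(1)[0][0], slot)  # most frequent location
        for slot, stations in sorted(by_slot.items())
    ]

# Toy records for a single user and day (made-up station IDs).
records = [(8, "S1"), (9, "S2"), (10, "S2"), (13, "S3"), (22, "S1")]
duas = build_duas(records)
print(duas)
```

In this toy example the three morning records collapse into a single pair for station S2, and the empty intervals do not appear in the result at all.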
Their analysis found that 80% of the time, the location of the users’ next activity was the same as their current location in terms of base station IDs (i.e., they were connected to the same base station). Since only 20% of consecutive events occurred at different base stations, simply guessing that a user will stay put would already be right eight times out of ten. For that reason, it did not make sense to present results at or below roughly 80% accuracy.
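The 80% figure works as a baseline that any useful predictor has to beat. As a rough illustration, the Python sketch below measures how often a "stay put" guess is correct on a sequence of base station IDs. The function `stay_put_accuracy` and the toy sequence are hypothetical; the sequence was chosen so that the result matches the proportion reported in the article.

```python
def stay_put_accuracy(sequence):
    """Fraction of consecutive events that share the same base station.

    This is the accuracy of a naive predictor that always guesses
    "next location = current location". Hypothetical helper.
    """
    pairs = list(zip(sequence, sequence[1:]))
    if not pairs:
        return 0.0  # fewer than two events: nothing to predict
    return sum(current == nxt for current, nxt in pairs) / len(pairs)

# Toy sequence of base station IDs: one location change in five transitions.
events = ["S1", "S1", "S1", "S2", "S2", "S2"]
baseline = stay_put_accuracy(events)
print(f"{baseline:.0%}")
```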
The research team concluded that their proposed model could achieve high accuracy while still producing an acceptable number of predictions. Their next challenge will be to try to broaden predictions of the next location change. “Mobility pattern extraction has useful applications such as location-based recommendations,” said Karagoz. “More importantly, mobility patterns hold great promise for traffic and urban planning.”
For now, location prediction promises several immediate benefits for society. Identifying patterns could provide insights into the spread of biological and mobile viruses. It could also prove valuable for mobile phone operators looking to optimize service or predict data congestion.