On Prophesying Online Gamer Departure

Pin-Yun Tarng, Kuan-Ta Chen, and Polly Huang



Most of the MMORPG (massively multiplayer online role-playing game) industry's revenue comes from the sale of subscriptions and virtual items, especially to loyal "hardcore" players who stay in a game for more than a year. Understanding players' behavior and how long they will stay in the game is hence vital to game operators. If a player's departure is predictable, measures can be taken to prevent it from happening.
This paper strives to develop a practical scheme for predicting player unsubscription. Players have various degrees of predictability, so we approach the task by first classifying them with a support vector machine, then using the same tool to model their playing patterns before and after a given date. In the case of hardcore players, the scheme allows us to predict departure two months in advance with a compound accuracy of over 80%. We have also conducted a generalizability analysis to show that our scheme generalizes across different MMORPGs and can also be applied to avatar usage prediction.

1  Introduction

Online gaming has become increasingly popular in recent years. In [5], it is reported that over 55% of Internet users are now also online gamers, of which 90% have experience with role-playing games [7]. The market size for online games reached 6 billion US dollars in 2007 [2], with the most common business model being the sale of virtual items or monthly subscriptions in which gamers must pay for credits to continue their adventures in the virtual world. From the perspective of game operators, predicting how many people will join a game and how long they will stay in the game is crucial, since these two factors dominate their revenues.
Predicting how many players will join a game before a game's launch is very difficult, if not impossible, since it involves many non-game factors, such as marketing strategies, the release date of the game (whether it is launched during the summer vacation), the artistic design (whether it is manga-like or realistic), and cultural references (whether it is Oriental or Occidental). Predicting how long a player will stay once he joins a game is more feasible, as it should correlate with the extent of his involvement in the virtual world. Usually, this can be inferred from the player's external behavior, such as how quickly his avatar advances to new levels and how much time he spends in the game every day.
This study is an extension of our previous game hour analysis [11]. Our goal is to provide a practical scheme for predicting player unsubscription that takes a player's game hours as input and determines whether or not he will renew an expiring subscription. Predicting unsubscription decisions is important to game operators because the decisions affect their revenues directly. Our rationale is that, if we can predict the departure of a player before he actually quits a game, the game operator can take remedial measures to prevent it from happening and improve the game along the way based on the feedback provided by such a player.
Predicting unsubscription decisions can provide the following benefits:
  1. Players usually quit a game because they are dissatisfied with the game's design or content. Thus, to some degree, player unsubscriptions should indicate low user satisfaction. In other words, if we can predict which players will leave the game in the near future, we may have a chance to stop them from leaving, or at least understand their reasons and make future improvements. To this end, operators could conduct surveys to determine the causes of player dissatisfaction and improve the game accordingly. (It is also likely that operators would not receive useful comments, because dissatisfied players who are totally disappointed with a game may be reluctant to take surveys from game companies.)
  2. Predicting gamer unsubscription facilitates forecasts on the number of future players as well. Even though we may predict the number of players directly by using time series modeling [3], unsubscription prediction provides more information because we can predict which players will leave the game rather than just how many players will leave. With such information, game operators can optimize their network and server allocation beforehand.
Our study is based on real-life traces collected from ShenZhou Online [12], a mid-scale commercial MMORPG (massively multiplayer online role-playing game) in Taiwan. The traces we acquired from the operator contain the playing histories of 162,980 accounts over a span of four years. We propose a classification method that first identifies players' involvement patterns over time with 90% accuracy, and then devise a prediction model that detects whether gamers are leaving the game in the near future. For "hardcore" players who subscribe to the game for more than a year, the prediction reaches 85% accuracy. Furthermore, we analyze the generalizability of our scheme by collecting 2,132 questionnaires from real-life gamers, and find that players of different MMORPGs have similar playing patterns leading up to their unsubscription. We also apply our methods to World of Warcraft [1] avatar traces, and find that we can detect with 80% accuracy whether gamers are discarding their avatars in the near future.
The remainder of this paper is organized as follows. Section 2 reviews a selection of related studies. The origin and collection of our traces are described in Section 3. In Section 4 we observe how gamers play during their subscription and classify them into one of two groups: fade-out and sudden-out. We present our method for predicting gamer unsubscription in Section 5, while the generalizability of our scheme is analyzed in Section 6. Finally, we summarize our findings and results in Section 7.

2  Related Work

Based on a set of World of Warcraft traces, Pittman et al. [10] attempted to establish a realistic, empirical model for predicting player behavior and server population fluctuation over time. The authors conjectured that at least four types of information are required for such a model: 1) the server's population variation over time; 2) the arrival rate and session duration of players; 3) the spatial distribution of avatars in the virtual world; and 4) the movements of avatars over time (how many distinct regions the avatars visit and how long they stay in a region).
Chambers et al. [3] presented a comprehensive analysis of player behavior and game server workload of Counter-Strike, a well-known first-person shooting game. They found that 1) gamers are extremely difficult to satisfy and display zero loyalty if a game server is not properly set up and provisioned; 2) the popularity of a game follows a power law, making it difficult to provision at launch time; 3) server workload exhibits predictable patterns on daily and weekly scales but loses them over longer terms; 4) a shared game hosting infrastructure poses significant challenges; and 5) software updates are a great burden on game hosting and must be planned for.
In our previous pilot study [11], we analyzed what time players enter the game's virtual world and how long they stay in the game, and investigated whether a player's future game hours can be predicted from his observed behavior, but stopped short of building an unsubscription prediction model. The study nevertheless paved the way for the selection of parameters in our player classification method, and the goal it stated is largely accomplished in the current paper.

3  Data Description

Developed and distributed by UserJoy Technology Co., Ltd., ShenZhou Online [12] sustains at any moment thousands of online players, who must purchase "game points" if they wish to continue their adventures in the virtual world beyond the 30-day free trial period. A screenshot of the game is given in Figure 1. As in typical MMORPGs, ShenZhou Online players can fight with random creatures, trade in marketplaces, take on quests, and train themselves to become masters of particular skillsets.
Figure 1: A screenshot of ShenZhou Online
Courtesy of UserJoy, we were able to obtain the traces of 162,980 ShenZhou Online accounts from March 1, 2003 to February 15, 2007. A total of 102,233,240 sessions were logged. Not all of the traces are suitable for our analysis, however, as about two thirds of the accounts were never seen again after their respective trial periods. Furthermore, some accounts obviously had sessions falling outside the covered 1,447 days of the traces. To be on the safe side, we only use traces whose first activity is six months or more later than March 1, 2003 and whose last activity (after which we assume the player quit) is six months or more earlier than February 15, 2007. The base of our study thus consists of the un-curtailed traces of 20,514 accounts.
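The six-month margin rule can be sketched as a simple date filter. This is only an illustration under stated assumptions: the paper does not say how "six months" is measured, so the 183-day margin, the function name `is_uncurtailed`, and the three sample accounts are all hypothetical. The separate exclusion of accounts that never returned after the trial period is not modeled here.

```python
from datetime import date, timedelta

TRACE_START = date(2003, 3, 1)
TRACE_END = date(2007, 2, 15)
# Assumption: "six months" is approximated as 183 days.
MARGIN = timedelta(days=183)

def is_uncurtailed(first_activity: date, last_activity: date) -> bool:
    """True if the account's history sits at least MARGIN inside both trace ends."""
    return (first_activity >= TRACE_START + MARGIN
            and last_activity <= TRACE_END - MARGIN)

# Hypothetical accounts: (first activity, last activity).
accounts = {
    "a": (date(2003, 3, 5), date(2004, 1, 1)),    # starts too close to the trace start
    "b": (date(2004, 1, 10), date(2005, 6, 30)),  # fully inside the safe interval
    "c": (date(2005, 2, 1), date(2007, 1, 20)),   # still active near the trace end
}
kept = [name for name, (first, last) in accounts.items()
        if is_uncurtailed(first, last)]
print(kept)  # → ['b']
```

Only account "b" survives, since the other two may have started before, or continued after, the period the traces cover.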

4  Classification of Online Gamers

4.1  Identifying the Groups

Figure 2: The playing history of six gamers.
Prediction of a gamer's unsubscription would be feasible if his playing history prior to his departure exhibited one or more features common among departing players. The most intuitive of those features may be an ever-decreasing daily playtime. The two gamers on the left of Figure 2, which juxtaposes the entire playing history of six randomly chosen gamers, indeed conform to this intuition, while the others display no obvious and exploitable trends. As a matter of fact, only 312 out of a random set of one thousand gamers fit the "fade-out" pattern, as determined by the human eye. The other 688 would be categorized as "sudden-out," with no noticeable tendency in daily playtime or login frequency. They could go from playing for more than 12 hours every day one week to complete disappearance the following week.
We figure that if we focus on the fade-out group of gamers, the prediction of departure will attain a higher degree of success. Consequently, a scientific method is needed to separate the "predictables" from the "unpredictables."

4.2  Automating the Classification

We base our automated classification method on gamers' average daily playtime and playing density. First, we randomly choose 2,000 gamers from our traces and classify them with the human eye. Among all the sample gamers, 613 are fade-out and 1,387 are sudden-out. Second, we divide each gamer's history into k periods of equal length, and evaluate the average daily playtime and playing density in each period. The playing density is the fraction of available days on which the gamer played. For example, if a gamer logged in to the game at least once on each of 15 days in June, his playing density in June is 0.5.
In the case of k=3, suppose a gamer has a subscription length of 90 days; then his average daily playtime and playing density, or his 3-period features, are computed separately for the 1st through the 30th day, the 31st through the 60th day, and the 61st through the 90th day. In addition, we normalize the two statistics by scaling each player's maximum to 1, so that the classification is not influenced by a few gamers' exceptionally large or small playtime and density.
We use the support vector machine (SVM) [4] as the classifier. The traces of the 2,000 sample gamers, along with their k-period features and predetermined categories, serve as the training data set. Note that if we divide the playing history too coarsely (small k), the trend we extract would be blurred, jeopardizing the overall classification accuracy. On the other hand, if we use too large a k, the average daily playtime and playing density in some periods might be dominated by a few big days. To find the optimal value of k, we experiment within the range of [2,20], and compare the resulting classification accuracy in Figure 3. Each accuracy estimate is computed via ten-fold cross-validation. We conclude that 10-period features contain the most trend information and yield the best classification.
Now that an SVM model with k=10 is established, the traces of the remaining accounts can be processed. Of the remaining 18,514 accounts, 5,503 (29.7%) are deemed fade-out while 13,011 (70.3%) are labeled sudden-out.
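The k-period features described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `k_period_features` is hypothetical, playtime is assumed to be averaged over all days in a period (including days without a login), and any days left over when the history length is not divisible by k are assumed to fall into the last period.

```python
def k_period_features(daily_hours, k):
    """Split a daily-playtime history into k equal periods and return
    (normalized average daily playtime, normalized playing density) per period.

    daily_hours: hours played on each day of the subscription (0 = no login).
    Assumption: leftover days are folded into the last period.
    """
    n = len(daily_hours)
    if k < 1 or n < k:
        raise ValueError("need at least k days of history")
    size = n // k
    playtime, density = [], []
    for i in range(k):
        start = i * size
        end = n if i == k - 1 else start + size
        period = daily_hours[start:end]
        # Average over every day in the period, played or not (assumption).
        playtime.append(sum(period) / len(period))
        # Fraction of days in the period with at least one login.
        density.append(sum(1 for h in period if h > 0) / len(period))

    def normalize(values):
        # Scale so that the player's own maximum becomes 1.
        peak = max(values)
        return [v / peak if peak > 0 else 0.0 for v in values]

    return normalize(playtime), normalize(density)

# Hypothetical 90-day history: 4 h daily, then 2 h every other day,
# then 1 h every third day.
history = [4.0] * 30 + [2.0, 0.0] * 15 + [1.0, 0.0, 0.0] * 10
playtime, density = k_period_features(history, k=3)
```

For this toy history the normalized playtime falls from 1 to 0.25 to about 0.08, and the density from 1 to 0.5 to about 0.33, which is the kind of monotone decline the fade-out label is meant to capture.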
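The search over k with ten-fold cross-validation might look like the sketch below. It shows the procedure only: to stay dependency-free, a nearest-centroid rule stands in for the SVM, and the feature vectors are synthetic toy trends rather than ShenZhou Online data, so the accuracies it produces say nothing about Figure 3. All names (`cross_val_accuracy`, `train_nearest_centroid`, `synthetic`) are hypothetical.

```python
import random

def cross_val_accuracy(samples, labels, train, folds=10, seed=0):
    """Estimate accuracy by k-fold cross-validation.

    train(samples, labels) must return a predict(sample) function.
    """
    order = list(range(len(samples)))
    random.Random(seed).shuffle(order)
    correct = 0
    for f in range(folds):
        held_out = set(order[f::folds])  # every folds-th shuffled index
        train_x = [samples[i] for i in order if i not in held_out]
        train_y = [labels[i] for i in order if i not in held_out]
        predict = train(train_x, train_y)
        correct += sum(predict(samples[i]) == labels[i] for i in held_out)
    return correct / len(samples)

def train_nearest_centroid(samples, labels):
    """Stand-in classifier (NOT an SVM): predict the label of the closest class mean."""
    centroids = {}
    for label in set(labels):
        rows = [s for s, y in zip(samples, labels) if y == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]

    def predict(sample):
        return min(
            centroids,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(sample, centroids[c])),
        )
    return predict

rng = random.Random(1)

def synthetic(kind, k):
    # Hypothetical toy trends: fade-out declines linearly, sudden-out stays flat.
    base = [1 - i / k for i in range(k)] if kind == "fade-out" else [0.8] * k
    return [min(1.0, max(0.0, v + rng.uniform(-0.05, 0.05))) for v in base]

labels = ["fade-out", "sudden-out"] * 50
accuracy_by_k = {}
for k in range(2, 21):  # the range [2,20] explored in the text
    samples = [synthetic(y, k) for y in labels]
    accuracy_by_k[k] = cross_val_accuracy(samples, labels, train_nearest_centroid)
best_k = max(accuracy_by_k, key=accuracy_by_k.get)
```

With real traces, `samples` would hold each gamer's k-period features and `train` would wrap an SVM trainer; the value of k with the highest cross-validated accuracy is then selected.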

4.3  Predictive Classification

Figure 3: The classification accuracy of different values of k.
So far we have been classifying players with their complete traces. In real-life prediction, however, only incomplete data is available, and the approximate time of a gamer's final login is to be predicted. Therefore, we need to check whether the gamers could be correctly classified by their incomplete traces.
Figure 4: Predictivity of our classification method.
The traces of the 2,000 sample gamers, their last n days cut off (n ∈ [3,60]), are fed into the SVM model as the training data set along with their re-computed 10-period features. The predictivity of our classification method, ten-fold cross-validated, is shown in Figure 4, where lines are drawn for various subscription lengths. It can be seen that gamers with longer subscription lengths tend to be more resilient to the cutting, as 82% of the hardcore players (subscribing for more than one year) are correctly classified even with their last month truncated. This is good news because hardcore gamers generate most of the operator's revenue and are most valued by the gaming industry [8].

5  Predicting Gamer Unsubscription

5.1  Prediction Model

To predict whether a gamer is leaving in d days (d ∈ [3,60]), we first construct an SVM for each d. As in our classification method, we use the traces of the 2,000 sample gamers as the training data set. For each sample gamer, we place the prediction point d days before his quitting. Two random observation windows, each starting from the gamer's first login day, are derived for each gamer. One of the observation windows, dubbed leaving, contains the prediction point, which implies that the gamer will unsubscribe within d days after the last day of the window. The other window, tagged staying, does not contain the prediction point, which implies that the gamer will stay in the game for at least d days after the last day of the window. The 10-period features are extracted from each window and fed to the SVM along with the corresponding window type.
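One way to read the window construction is sketched below. The paper does not specify how the window lengths are drawn, so the uniform sampling, the minimum window length `min_days`, and the name `sample_windows` are assumptions made for illustration. Both windows start at the first login day and differ only in where they end.

```python
import random

def sample_windows(total_days, d, min_days=10, rng=random):
    """Return (leaving_len, staying_len), window lengths in days from first login.

    The prediction point is day total_days - d, i.e. d days before quitting.
    Assumption: lengths are drawn uniformly; min_days keeps windows long
    enough to compute 10-period features.
    """
    prediction_point = total_days - d
    if prediction_point <= min_days:
        raise ValueError("history too short for this d")
    # Leaving window: reaches the prediction point, so at most d days remain.
    leaving_len = rng.randint(prediction_point, total_days)
    # Staying window: ends before the prediction point, so more than d days remain.
    staying_len = rng.randint(min_days, prediction_point - 1)
    return leaving_len, staying_len

# Hypothetical gamer with a 200-day history, predicting departure within d = 30 days.
rng = random.Random(42)
history = [3.0] * 200
leaving_len, staying_len = sample_windows(len(history), d=30, rng=rng)
training_rows = [
    (history[:leaving_len], "leaving"),
    (history[:staying_len], "staying"),
]
```

Each row in `training_rows` would then be reduced to its 10-period features before being passed to the classifier for that value of d.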
Figure 5: Unsubscription prediction accuracy
We applied the above procedure on both fade-out and sudden-out groups to give Figure 5. The accuracy estimates are again computed via ten-fold cross-validation. Our method is especially useful for predicting hardcore fade-out gamers, reaching an accuracy as high as 90%, while only about 70% is attained for sudden-out gamers. The difficulty of predicting sudden-out gamers lies in their irregular behavior, which may or may not be due to their own social activities.

5.2  The Complete Scheme

The complete unsubscription prediction scheme is the combination of the classification method (Section 4.2) and the prediction model (Section 5.1) and takes a player's incomplete trace as input. A gamer categorized as sudden-out is difficult to predict, so we opt to leave him be and do not apply the prediction model to him. As a result, the complete scheme gives a three-way output: leaving, staying, or unpredictable.
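A minimal sketch of how the two stages might be chained is given below. The function `predict_departure` and the output label strings are illustrative names, and the two toy rules passed in are crude placeholders for the trained SVM classifier and SVM predictor, not the models used in this paper.

```python
def predict_departure(trace, classify, predict_leaving):
    """Chain classification and prediction into a three-way output."""
    if classify(trace) == "sudden-out":
        return "unpredictable"  # the prediction model is not applied
    return "leaving" if predict_leaving(trace) else "staying"

def toy_classify(trace):
    # Placeholder rule (NOT the SVM): fade-out if the second half of the
    # history averages less than half the playtime of the first half.
    half = len(trace) // 2
    first = sum(trace[:half]) / half
    second = sum(trace[half:]) / (len(trace) - half)
    return "fade-out" if second < 0.5 * first else "sudden-out"

def toy_predict_leaving(trace):
    # Placeholder rule (NOT the SVM): leaving if the last week averages
    # under half an hour per day.
    return sum(trace[-7:]) / 7 < 0.5

steady = [3.0] * 60
fading_gone = [4.0] * 30 + [1.0] * 23 + [0.0] * 7
fading_active = [4.0] * 30 + [1.5] * 30

results = [
    predict_departure(t, toy_classify, toy_predict_leaving)
    for t in (steady, fading_gone, fading_active)
]
print(results)  # → ['unpredictable', 'leaving', 'staying']
```

Keeping the two stages as interchangeable callables mirrors the structure of the scheme: the classifier decides whether a prediction is attempted at all, and only fade-out gamers reach the second stage.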
Figure 6: Accuracy of our final prediction scheme.
The ten-fold cross-validated prediction accuracy of our scheme, shown in Figure 6, can be seen as a logical aggregation of Figures 4 and 5. We can see that the accuracy for predicting whether a gamer is going to leave in a month is around 82% for gamers who subscribed for more than one year, and 85% for gamers who subscribed for more than two years. The results indicate that, for hardcore gamers, it is feasible to use our scheme to predict whether they are quitting the game in the near future.
Figure 7: False positives and false negatives of our prediction scheme
The error rate of our scheme for the three classes of output is given in Figure 7. The probability of wrongly identifying a leaving hardcore player as staying is only about 10%. Put another way, we can correctly detect 90% of leaving gamers with our final prediction scheme, if we do not count the false unpredictables. Game operators, knowing in advance which gamers are leaving the game, can target their resources at preventing them from quitting. Nevertheless, a certain amount of resource inefficiency is inevitable, as there is a 15% false leaving rate.
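The error rates discussed above could be tallied from labeled outcomes as sketched below. The exact definitions behind Figure 7 are not spelled out in the text, so this sketch assumes that each rate is taken over gamers of a given true class who received a leaving or staying verdict, with gamers routed to "unpredictable" excluded from the denominator. The function name, the dictionary keys, and the sample counts are hypothetical and chosen only to echo the 10% and 15% figures.

```python
from collections import Counter

def error_rates(pairs):
    """pairs: iterable of (actual, predicted).

    actual is "leaving" or "staying"; predicted may also be "unpredictable".
    """
    counts = Counter(pairs)

    def confusion(actual, predicted):
        # Assumption: only gamers who received a verdict are counted.
        decided = counts[(actual, "leaving")] + counts[(actual, "staying")]
        return counts[(actual, predicted)] / decided if decided else 0.0

    total = sum(counts.values())
    skipped = sum(n for (_, predicted), n in counts.items()
                  if predicted == "unpredictable")
    return {
        "false_staying": confusion("leaving", "staying"),
        "false_leaving": confusion("staying", "leaving"),
        "unpredictable_share": skipped / total if total else 0.0,
    }

# Hypothetical outcomes for 40 gamers.
pairs = (
    [("leaving", "leaving")] * 9 + [("leaving", "staying")] * 1
    + [("staying", "staying")] * 17 + [("staying", "leaving")] * 3
    + [("leaving", "unpredictable")] * 10
)
rates = error_rates(pairs)
```

Here one of ten decided leaving gamers is labeled staying and three of twenty staying gamers are labeled leaving, while a quarter of all gamers receive no verdict at all.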

6  Generalizability Analysis

6.1  Generalizability across Games

There is a myriad of MMORPGs on the market, and we need to make sure our scheme is applicable to at least most of them. In other words, we need to see whether players of different games exhibit similar playing trends; that is, if we classify players according to their playing behavior (e.g., decreasing or increasing playtime), whether their distribution across the categories is similar from game to game. If the gamer population shows similar behavior in different games, our prediction methods should be applicable to other games.
We empirically classify gamers into five categories with respect to their playtime trends: fluctuating ("depends"), decreasing, changeless, increasing, and periodical. In order to see the gamer distribution of different games, we conducted an online survey that yielded 2,132 valid questionnaires from gamers who had played an MMORPG before but had already quit it. In the survey, 74% of the gamers were male and 93% were students when they played the game. Furthermore, 77% of the gamers turned to another game after they had left the previous one.
Figure 8: The distribution of playtime trends of different MMORPGs
We compare the playtime distribution of the three most popular MMORPGs in the survey, World of Warcraft (WoW), Lineage [9], and Ragnarok Online [6], with that of ShenZhou Online in Figure 8. We can see that Lineage, Ragnarok Online, and ShenZhou Online feature similar distributions, while WoW gamers tend to maintain their playing patterns towards the end of their subscription, resulting in an exceptionally low proportion of decreasing playtime. The discrepancy may be related to the games' respective artistry and population sizes, since players are attracted to well-designed games and games where they can find many acquaintances.
Figure 9: The distribution of reasons for unsubscribing different MMORPGs
We have also investigated what makes gamers quit a game (Figure 9). We found that the top seven reasons are (in order of frequency):
  1. They had more important things to do, such as obligatory military service or school entrance exams;
  2. They became bored with it;
  3. Their friends left;
  4. They realized that playing MMORPGs is a waste of time after all;
  5. Their accounts were hacked;
  6. They turned to other newer games;
  7. They had no more money to spend on entertainment.
We find that no matter what the game is, gamers quit for similar reasons. In addition, WoW gamers tend to quit the game citing "more important things" instead of boredom, a testimony to the game's attractiveness.
From our questionnaires, we find that gamers of different MMORPGs have similar playing trends and unsubscription reasons. Therefore, since our scheme can successfully predict the departure of ShenZhou Online gamers, it should be applicable to other MMORPGs, too.

6.2  Prediction of Avatar Usage

Figure 10: Prediction accuracy of our scheme for WoW avatar usage
If the game operator is able to predict whether a given avatar is going to be discarded by its user, it will know in advance how popular a race or career path will be in the near future. To verify the applicability of our scheme to avatar usage, we apply it to the two-year traces of WoW avatars we collected in [11]. The verification serves two purposes: it confirms whether our scheme applies to another MMORPG (WoW rather than ShenZhou Online) and to avatars (rather than accounts). The result, shown in Figure 10, is only slightly worse than that of unsubscription prediction for ShenZhou Online gamers, achieving 80% accuracy at d ≥ 30 for hardcore players. The drop in accuracy may be due to the fact that our WoW data covers a shorter time span and that complete traces lasting longer than a year account for only 6% of the data. Furthermore, avatar usage is likely to be more unpredictable, as gamers might own multiple avatars at the same time, and might discard an avatar simply for the sake of change.

7  Conclusion

We proposed in this paper a generalizable scheme for predicting online gamer departures. The scheme is the combination of a player classification method and a prediction model, and takes a player's to-date playing history as input. It first identifies the player as fade-out or sudden-out according to his playtime trend. If he falls into the fade-out group, then given a specific date in the near future, the scheme predicts whether or not he will have left the game by that date. We devised and tested the scheme with four years' worth of ShenZhou Online traces, and achieved a compound accuracy of over 80% in predicting the decisions of hardcore players.
The ability to predict a gamer's departure is coveted by the MMORPG industry as it allows the game operators to target their resources on keeping subscribers motivated and to benefit from these loyal customers, not only financially, but also in terms of the improvement of current games and the design of future ones. To this end, we hope that our scheme will prove helpful to operators, as well as gamers who may enjoy a better gaming environment because of it.


[1] Blizzard Entertainment Inc. World of Warcraft. http://www.worldofwarcraft.com/.
[2] S. Carless. Analyst: Online Game Market $13 Billion By 2011. Gamasutra, 2007. http://www.gamasutra.com/.
[3] C. Chambers, W.-C. Feng, S. Sahu, and D. Saha. Measurement-based Characterization of A Collection of Online Games. In IMC'05: Proceedings of the 5th Conference on Internet Measurement 2005, pages 1-14, 2005.
[4] N. Cristianini and J. Shawe-Taylor. An Introduction to Support Vector Machines and Other Kernel-based Learning Methods. Cambridge University Press, 2000.
[5] Z. Z. Eric Wan, Xin Xu. 2006 Online Game Report. In Pacific Epoch Red Innovation Report Series, 2006.
[6] Gravity Co., Ltd. Ragnarok Online. http://www.ragnarokonline.com/.
[7] MMOGChart. Total MMOG Active Subscriptions. 2009. http://www.mmogchart.com/.
[8] J. Mulligan and B. Patrovsky. The Market for Online Games. Peachpit, 2003. http://www.peachpit.com/.
[9] NCsoft Corporation. Lineage. http://www.lineage.com/.
[10] D. Pittman and C. GauthierDickey. A Measurement Study of Virtual Populations in Massively Multiplayer Online Games. In NetGames '07: Proceedings of the 6th ACM SIGCOMM Workshop on Network and System Support for Games, pages 25-30, 2007.
[11] P.-Y. Tarng, K.-T. Chen, and P. Huang. An Analysis of WoW Players Game Hours. In NetGames '08: Proceedings of the 7th ACM SIGCOMM Workshop on Network and System Support for Games, pages 47-52. ACM, 2008.
[12] UserJoy Technology Co., Ltd. ShenZhou Online. http://www.ewsoft.com.tw/index.php.

Sheng-Wei Chen (also known as Kuan-Ta Chen)
Last Update September 19, 2017