AOBTM - Adaptive Online Biterm Topic Model
Problem Statement
- App reviews are dynamic in nature
- The topics discussed in app reviews change over time across different versions of an app
- These topic changes should be analyzed to reveal important issues in each app update
Conventional (Inadequate) Approaches
- Conventional topic models, such as LDA, PLSA, BTM
- Online Topic Modeling algorithms, such as OLDA, OBTM
Proposed Approach & Contributions
- Adaptive Online Biterm Topic Model (AOBTM)
- Parallel algorithms to automatically determine the optimal number of topics
- Parallel algorithms to automatically determine the best number of previous versions to consider
- Open-source code: github.com/Mohammad-Abdul-Hadi/AOBTM-Adaptive-Online-Biterm-Topic-Modeling
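
For readers unfamiliar with the term, a biterm is an unordered pair of words that co-occur in the same short text; biterm topic models are trained on these pairs rather than on whole documents. The following is a minimal sketch of biterm extraction only, not the AOBTM implementation. The function name `extract_biterms` and the sample reviews are hypothetical and chosen for illustration.

```python
from collections import Counter
from itertools import combinations


def extract_biterms(tokens):
    """Return every unordered word pair (biterm) from one tokenized short text."""
    # Sort each pair so ("login", "crash") and ("crash", "login") count as the same biterm.
    return [tuple(sorted(pair)) for pair in combinations(tokens, 2)]


# Hypothetical, already-tokenized app reviews (illustrative data only).
reviews = [
    ["app", "crash", "login"],
    ["login", "crash", "update"],
]

# Corpus-level biterm counts, the raw material a biterm topic model works from.
biterm_counts = Counter(b for review in reviews for b in extract_biterms(review))
```

Here the whole review is treated as a single co-occurrence window, which is a reasonable assumption for short texts such as app reviews.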
Research Questions
- Can AOBTM achieve better performance compared to baselines?
- How do different parameter settings impact the performance of AOBTM?
- How discriminative and coherent are the topics discovered when the parameters are set by the proposed parallel algorithms?
Observations
- AOBTM delivers the highest PMI_Scores in every dataset
- AOBTM delivers the highest Dis_Scores in every dataset except for Tweets2020
- AOBTM also delivers the highest scores in every dataset for Precision, Recall, and F_hybrid
- We acknowledge that AOBTM is time-expensive, but its runtime is still comparable to that of other adaptive online algorithms when the dataset is small
- AOBTM outperforms AOLDA in runtime on the NOAA Radar dataset, which has the lowest average number of short texts per version
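
The PMI_Score reported above is a topic-coherence measure based on pointwise mutual information. This summary does not give the exact formula used, so the sketch below assumes one common formulation: the average of log(p(wi, wj) / (p(wi) p(wj))) over all pairs of a topic's top words, with probabilities estimated from document-level co-occurrence. The function name `pmi_score`, the smoothing constant `eps`, and the sample documents are assumptions for illustration.

```python
import math
from itertools import combinations


def pmi_score(top_words, documents, eps=1e-12):
    """Average pairwise PMI of a topic's top words (assumed formulation)."""
    doc_sets = [set(doc) for doc in documents]
    n_docs = len(doc_sets)

    def prob(*words):
        # Fraction of documents that contain all of the given words.
        return sum(all(w in d for w in words) for d in doc_sets) / n_docs

    scores = []
    for wi, wj in combinations(top_words, 2):
        joint = prob(wi, wj)
        # eps keeps the logarithm finite when a pair never co-occurs.
        scores.append(math.log((joint + eps) / (prob(wi) * prob(wj) + eps)))
    return sum(scores) / len(scores)


# Hypothetical tokenized documents (illustrative data only).
docs = [["a", "b"], ["a", "b"], ["c"], ["d"]]
coherent = pmi_score(["a", "b"], docs)    # words that always appear together
incoherent = pmi_score(["a", "c"], docs)  # words that never appear together
```

Under this formulation, a topic whose top words tend to appear together receives a higher score than one whose top words rarely co-occur.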
Conclusions
- Proposed a novel adaptive topic modeling algorithm, AOBTM
- AOBTM discovers coherent and discriminative topics from short texts
- AOBTM addresses the problems with conventional topic models by adopting a version-sensitive strategy
- Proposed two parallel algorithms to determine the value of the two most important parameters of our model automatically
- The results of several experiments on different datasets confirm the performance of AOBTM compared to the state-of-the-art algorithms
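
This summary does not describe how the two parallel parameter-selection algorithms work internally. As a rough illustration of the general idea only, the sketch below scores several candidate values concurrently and keeps the best one. Everything in it is hypothetical: `toy_score` stands in for "train a model with this setting and return a quality score", and `pick_best` is not a function from the AOBTM code base.

```python
from concurrent.futures import ThreadPoolExecutor


def toy_score(k):
    # Hypothetical stand-in for training with k topics and measuring topic quality.
    # This toy function simply peaks at k = 8.
    return -(k - 8) ** 2


def pick_best(candidates, score_fn, max_workers=4):
    """Score all candidate values concurrently and return the best one with all scores."""
    candidates = list(candidates)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves the order of the input candidates.
        scores = list(pool.map(score_fn, candidates))
    best_score, best_candidate = max(zip(scores, candidates))
    return best_candidate, dict(zip(candidates, scores))


best_k, all_scores = pick_best(range(2, 16), toy_score)
```

Threads are used here for simplicity. For CPU-bound model training, a process pool would likely be the better choice. The same pattern could apply to either parameter: the number of topics or the number of previous versions to consider.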