RQ1: How accurate and efficient are the PTMs at classifying app reviews compared to existing tools?
RQ2: How does the performance of the PTMs change when they are pre-trained on an app-review dataset instead of a generic dataset (e.g., Wiki-documents, book corpus)?