FACOS: Finding API Relevant Contents on Stack Overflow with Semantic and Syntactic Analysis

Abstract

Finding relevant API content on Stack Overflow is a critical yet time-consuming task for developers. FACOS (Finding API Contents On Stack Overflow) combines semantic and syntactic analysis to automatically surface Stack Overflow posts most relevant to a given API query, outperforming keyword-based baselines by leveraging both code structure and natural language semantics.

Publication
arXiv preprint arXiv:2111.07238

Overview

Stack Overflow is the primary resource for developers seeking API usage examples, error explanations, and best practices. However, finding the most relevant posts for a specific API query remains a challenge: keyword search often surfaces too many irrelevant results, while semantic search alone misses syntactically precise matches.

FACOS addresses this by combining two complementary kinds of analysis:

  • Semantic analysis: leveraging pre-trained models (PTMs) to understand the intent behind API-related queries and map them to conceptually similar Stack Overflow discussions.
  • Syntactic analysis: using code structure and API call signatures to identify posts with exact or near-exact API usage patterns.
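
To make the syntactic signal concrete, the following is a minimal sketch of matching an API name against call-like tokens in a post's code snippet. It is an illustration under stated assumptions, not the FACOS implementation: the regular expression, the function names `extract_calls` and `syntactic_score`, and the 1.0 / 0.5 / 0.0 scoring levels are all hypothetical choices made for this example.

```python
import re

# Hypothetical pattern: an optional "receiver." followed by a method name and "(".
CALL_PATTERN = re.compile(r"(?:(\w+)\.)?(\w+)\s*\(")


def extract_calls(code: str) -> list[tuple[str, str]]:
    """Return (receiver, method) pairs for call-like tokens in a snippet."""
    return [(m.group(1) or "", m.group(2)) for m in CALL_PATTERN.finditer(code)]


def syntactic_score(api: str, code: str) -> float:
    """Illustrative score: 1.0 if the method is called and the API's type name
    also appears in the snippet, 0.5 if only the method name is called, else 0.0."""
    parts = api.split(".")
    method = parts[-1]
    type_name = parts[-2] if len(parts) > 1 else ""
    best = 0.0
    for _receiver, called in extract_calls(code):
        if called != method:
            continue
        # Treat the type name appearing as a whole word as evidence of the right API.
        if type_name and re.search(rf"\b{re.escape(type_name)}\b", code):
            return 1.0
        best = 0.5
    return best
```

A call such as `syntactic_score("List.add", "items.add(x)")` yields the partial score, since the method name matches but nothing in the snippet confirms the type. That ambiguity is the kind of case where a semantic signal could help.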

Approach

FACOS fuses both signals into a unified ranking model, trained and evaluated on a curated benchmark of API queries paired with their ground-truth relevant Stack Overflow posts. The hybrid approach consistently outperforms pure semantic and pure syntactic baselines.
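
One simple way to picture the fusion step is a weighted sum of a semantic similarity and a syntactic match score, followed by sorting. The sketch below assumes this linear combination purely for illustration. The source describes a trained unified ranking model, which this does not reproduce. The `Candidate` type, the `alpha` weight, and the toy embeddings (standing in for pre-trained model outputs) are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    post_id: str
    embedding: list[float]  # stand-in for a pre-trained model's embedding of the post
    syntactic: float        # syntactic match score in [0, 1], computed elsewhere


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank(query_embedding: list[float], candidates: list[Candidate],
         alpha: float = 0.5) -> list[str]:
    """Rank post ids by alpha * semantic + (1 - alpha) * syntactic, best first."""
    scored = [
        (alpha * cosine(query_embedding, c.embedding) + (1 - alpha) * c.syntactic,
         c.post_id)
        for c in candidates
    ]
    return [post_id for _, post_id in sorted(scored, key=lambda s: s[0], reverse=True)]
```

Setting `alpha=1.0` reduces this to a pure semantic ranking and `alpha=0.0` to a pure syntactic one, which mirrors the two baselines the hybrid approach is compared against.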

Published at: arXiv preprint arXiv:2111.07238, 2021 · Citations: 7

Mohammad Abdul Hadi
AI Security Researcher (Sr. Software Engineer)

AI Security Researcher at Huawei R&D — LLM architecture, malware analysis, and agentic multi-agent systems. 150+ citations across A* and A-rated conferences.
