Join us
Please do NOT email Prof. Chattopadhyay directly. She would be unable to respond promptly owing to the high volume of emails she receives :(
Recent Publications
- Cognitive Biases in LLM-Assisted Software Development
IEEE/ACM International Conference on Software Engineering
Abstract: The widespread adoption of Large Language Models (LLMs) in software development is transforming programming from a generative activity into one dominated by prompt engineering and the evaluation of AI-generated solutions. This shift introduces new cognitive challenges that amplify existing decision-making biases or create entirely novel ones. Chief among these are cognitive biases: thinking patterns that lead people away from logical reasoning, often resulting in errors, poor decisions, or sub-optimal actions. Despite LLMs becoming integral to modern development workflows, we lack a systematic understanding of how cognitive biases manifest and affect developer decision-making in these AI-collaborative contexts. This paper presents the first comprehensive study of cognitive biases in LLM-assisted programming, using a mixed-methods approach of observational studies with n=14 student and professional developers, followed by surveys of n=22 additional developers. First, we qualitatively analyze our data using the bias categorization for traditional, non-LLM workflows reported in prior work. Our findings suggest that traditional software development biases are inadequate to explain why LLM-related actions are more likely to be biased. Through systematic analysis of 239 cognitive bias types, we develop a novel taxonomy of 90 biases in 15 categories specific to developer-LLM interactions, validated with cognitive psychologists. We found that 48.8% of all programmer actions were biased, and that 56.4% of programmer-LLM interactions involved biased actions. Based on our survey analysis, we present practical tools and practices for programmers, along with recommendations for builders of LLM-based code generation tools, to help mitigate cognitive biases in human-AI programming.
- Observing Without Doing: Pseudo-Apprenticeship Patterns in Student LLM Use
International Conference on Computing Education Research
Abstract: Large Language Models (LLMs) such as ChatGPT have quickly become part of student programmers' toolkits, whether allowed by instructors or not. This paper examines how introductory programming (CS1) students integrate LLMs into their problem-solving processes. We conducted a mixed-methods study with 14 undergraduates completing three programming tasks while thinking aloud, with permission to access any resources they chose. The tasks varied in open-endedness and familiarity to the participants and were followed by surveys and interviews. We find that students frequently adopt a pattern we call pseudo-apprenticeship, in which students engage attentively with expert-level solutions provided by LLMs but fail to participate in the stages of cognitive apprenticeship that promote independent problem-solving. This pattern was compounded by disconnects between students' intentions, actions, and self-perceived behavior when using LLMs. We offer design and instructional interventions for promoting learning and addressing the patterns of dependent AI use observed.
- From Prompts to Propositions: A Logic-Based Lens on Student-LLM Interactions
International Conference on Computing Education Research
Abstract:
- Beyond the Page: Enriching Academic Paper Reading with Social Media Discussions
ACM Symposium on User Interface Software and Technology
Abstract: Researchers actively engage in informal discussions about academic papers on social media. They share insights, promote papers, and discuss emerging ideas in an engaging and accessible way. Yet, this rich source of scholarly discourse is often isolated from the paper reading process and remains underutilized. A natural question thus arises: What if we bring these peer discussions on social media into the reading experience? What might be the benefits of reading research papers alongside informal social insights? To explore the design space of such integration, we conducted a formative study with eight researchers. Participants recognized the value of social media in expanding their perspectives and connecting with fellow researchers. However, they also reported significant distraction and cognitive overload when confronted with streams of noisy, unstructured social media comments. Guided by the design goals derived from their feedback, we introduce SURF, a novel reading interface that enriches academic papers with Social Understanding of Research Findings. SURF organizes social media clutter into digestible threads and presents them contextually within the paper, allowing readers to seamlessly access peer insights without disrupting their reading process. In a within-subjects usability study (N=18), participants achieved significantly deeper comprehension and higher self-efficacy with SURF, while reporting lower cognitive load. They also noted SURF's various benefits beyond paper reading, such as facilitating literature review and fostering social engagement within the academic community. Some participants envisioned SURF and academic social media as a potential supplement to the traditional peer-review process.
- Exploring the Challenges and Opportunities of AI-assisted Codebase Generation
IEEE Symposium on Visual Languages and Human-Centric Computing
Abstract:
- ELI-Why: Evaluating the Pedagogical Utility of Language Model Explanations
Annual Meeting of the Association for Computational Linguistics
Abstract: Language models today are widely used in education, yet their ability to tailor responses for learners with varied informational needs and knowledge backgrounds remains under-explored. To this end, we introduce ELI-Why, a benchmark of 13.4K 'Why' questions to evaluate the pedagogical capabilities of language models. We then conduct two extensive human studies to assess the utility of language model-generated explanatory answers (explanations) on our benchmark, tailored to three distinct educational grades: elementary, high school, and graduate school. In our first study, human raters assume the role of an 'educator' to assess model explanations' fit to different educational grades. We find that GPT-4-generated explanations match their intended educational background only 50% of the time, compared to 79% for lay human-curated explanations. In our second study, human raters assume the role of a learner to assess whether an explanation fits their own informational needs. Across all educational backgrounds, users deemed GPT-4-generated explanations 20% less suited on average to their informational needs, when compared to explanations curated by lay people. Additionally, automated evaluation metrics reveal that explanations generated across different language model families for different informational needs remain indistinguishable in their grade-level, limiting their pedagogical effectiveness.
- Code Today, Deadline Tomorrow: Procrastination Among Software Developers
IEEE/ACM International Conference on Software Engineering
Abstract: Procrastination, the action of delaying or postponing something, is a well-known phenomenon that is relatable to all. While it has been studied in academic settings, little is known about why software developers procrastinate. How does it affect their work? How can developers manage procrastination? This paper presents the first investigation of procrastination among developers. We conduct an interview study with (n=15) developers across different industries to understand the process of procrastination. Using qualitative coding, we report the positive and negative effects of procrastination and factors that triggered procrastination, as perceived by participants. We validate our findings using member checking. Our results reveal 14 negative effects of procrastination on developer productivity. However, participants also reported eight positive effects, four impacting their satisfaction. We also found that participants reported three categories of factors that trigger procrastination: task-related, personal, and external. Finally, we present 19 techniques reported by our participants and studies in other domains that can help developers mitigate the impacts of procrastination. These techniques focus on raising awareness and task focus, help with task planning, and provide pathways to generate team support as a mitigation means. Based on these findings, we discuss interventions for developers and recommendations for tool building to reduce procrastination. Our paper shows that procrastination has unique effects and factors among developers compared to other populations.
- Trust Dynamics in AI-Assisted Development: Definitions, Factors, and Implications
IEEE/ACM International Conference on Software Engineering
Abstract: Software developers increasingly rely on AI code generation utilities. To ensure that "good" code is accepted into the code base and "bad" code is rejected, developers must know when to trust an AI suggestion. Understanding how developers build this intuition is crucial to enhancing developer-AI collaborative programming. In this paper, we seek to understand how developers (1) define and (2) evaluate the trustworthiness of a code suggestion, and (3) how trust evolves when using AI code assistants. To answer these questions, we conducted a mixed-method study consisting of an in-depth exploratory survey with (n=29) developers followed by an observation study (n=10). We found that comprehensibility and perceived correctness were the most frequently used factors to evaluate code suggestion trustworthiness. However, the gap in developers' definition and evaluation of trust points to a lack of support for evaluating trustworthy code in real time. We also found that developers often alter their trust decisions, keeping only 52% of original suggestions. Based on these findings, we extracted four guidelines to enhance developer-AI interactions. We validated the guidelines through a survey with (n=7) domain experts and survey members (n=8). We discuss the validated guidelines, how to apply them, and tools to help adopt them.
- A Tale of Two Communities: Exploring Academic References on Stack Overflow
The ACM Web Conference
Abstract: Stack Overflow is widely recognized by software practitioners as the go-to resource for addressing technical issues and sharing practical solutions. While not typically seen as a scholarly forum, users on Stack Overflow commonly refer to academic sources in their discussions. Yet, little is known about these referenced academic works and how they intersect with the needs and interests of the Stack Overflow community. To bridge this gap, we conducted an exploratory large-scale study of the landscape of academic references on Stack Overflow. Our findings reveal that Stack Overflow communities with different domains of interest engage with academic literature at varying frequencies and speeds. These contrasting patterns suggest that some disciplines may have diverged in their interests and development trajectories from the corresponding practitioner community. Finally, we discuss the potential of Stack Overflow for gauging the real-world relevance of academic research.
- Generating Function Names to Improve Comprehension of Synthesized Programs
IEEE Symposium on Visual Languages and Human-Centric Computing
Abstract: Despite great advances in program synthesis techniques, they remain algorithmic black boxes. Although they guarantee that when synthesis is successful, the implementation satisfies the specification, they provide no additional information regarding how the implementation works or the manner in which the specification is realized. One possibility to answer these questions is to use large language models (LLMs) to construct human-readable explanations. Unfortunately, experiments reveal that LLMs frequently produce nonsensical or misleading explanations when applied to the unidiomatic code produced by program synthesizers. In this paper, we develop an approach to reliably augment the implementation with explanatory names. We recover fine-grained input-output data from the synthesis algorithm to enhance the prompt supplied to the LLM, and use a combination of a program verifier and a second language model to validate the proposed explanations before presenting them to the user. Together, these techniques massively improve the accuracy of the proposed names, from 24% to 79%. Through a pair of small user studies, we find that users significantly prefer the explanations produced by our technique (76% of responses indicating the appropriateness of the presented names) to the baseline (with only 2% of responses approving of the suggestions), and that the proposed names measurably help users in understanding the synthesized implementation.
ace projects
Science in Stack Overflow
Exploring references to academic literature on Stack Overflow (SO) to understand how scientific knowledge diffuses into this practitioner-centric forum.
Sustainable Software Engineering
Redefining the notion of sustainability for software projects, investigating various open-source and industrial projects to derive metrics, and helping future engineers build sustainable software.
Trustworthy AI code generation
Developing a method to ensure trustworthy AI-generated code suggestions through statistical tools and program analysis techniques.
Codebase Generation Model Evaluation
The project examines the performance of AI in creating codebases, assessing developer experience and code effectiveness.
Expanding the Scope of Changes Made by Code Prompts
Exploring the limitations of current automatic code suggestion models in generating complete code bases from abstract prompts, and proposing a human-in-the-loop framework to extend their capabilities.
Cognitive control and intervention in autonomous systems
To design good collaboration in human-autonomous systems, we need to understand the cognitive barriers humans face and provide good explanation and intervention systems. In these projects, we look at different autonomous systems and study how human cognition needs support for seamless and safe operation.











