Department of Informatics s.e.a.l

GitHub/IDE Plugin for Comment Analysis

Introduction

The Code Comment Analysis Project extends the capabilities developed in previous research, "How to identify class comment types? A multi-language approach for class comment classification" by Rani et al. [1]. This continuation project improves the understanding and handling of code comments in GitHub projects by classifying them into informational types and assessing their compliance with established commenting guidelines.

Comments are important for understanding developers' intents and for supporting maintenance tasks, so this project aims to analyze and improve the quality of code documentation. Using Large Language Models (LLMs), the project will categorize comments into five to seven information types and verify their adherence to established guidelines. The classified comments will then be highlighted in an interface, making comment contents easier to navigate and understand across large codebases.
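To give a rough idea of the pipeline, the following is a minimal sketch in Python, not the project's actual implementation. It extracts comments from a piece of Python source and assigns each one a label. The label names (`todo`, `deprecation`, `usage`, `summary`) and the functions `extract_comments`, `classify_comment`, and `analyze` are hypothetical, and the keyword-based classifier is only a placeholder for the LLM-based classification the project intends to use.

```python
import io
import tokenize

# Hypothetical information types and trigger keywords. The project description
# only mentions "5-7 information types" and does not name them.
KEYWORD_RULES = {
    "todo": ("todo", "fixme"),
    "deprecation": ("deprecated",),
    "usage": ("example", "usage"),
}


def extract_comments(source: str) -> list[tuple[int, str]]:
    """Return (line_number, comment_text) pairs for every '#' comment."""
    comments = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT:
            # Drop the leading '#' characters and surrounding whitespace.
            comments.append((tok.start[0], tok.string.lstrip("#").strip()))
    return comments


def classify_comment(text: str) -> str:
    """Placeholder classifier: keyword matching standing in for an LLM call."""
    lowered = text.lower()
    for label, keywords in KEYWORD_RULES.items():
        if any(keyword in lowered for keyword in keywords):
            return label
    # Assumed fallback label when no keyword matches.
    return "summary"


def analyze(source: str) -> list[dict]:
    """Extract and classify all comments in the given source text."""
    return [
        {"line": line, "text": text, "type": classify_comment(text)}
        for line, text in extract_comments(source)
    ]
```

An interface could then use the `line` and `type` fields of each result to highlight comments by category. Swapping `classify_comment` for a model-backed function would leave the rest of the sketch unchanged.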

Technical Skills and Technologies:

This project requires a good understanding of software development. Familiarity with machine learning concepts and some hands-on experience in Python and PyTorch are also important. Additionally, candidates should have an interest in learning to apply Large Language Models for text analysis and natural language processing tasks.

The technology stack for this project includes Python, PyTorch, the GitHub API, HTML/CSS/JavaScript for frontend development, REST APIs for integration, and SQL/NoSQL databases for backend data management. Docker will be used to streamline deployment and testing and to keep the application easy to maintain.

Contact: Dr. Pooja Rani, rani@ifi.uzh.ch

[1] https://www.sciencedirect.com/science/article/pii/S0164121221001448