Content Details
Title
DUPLICATE BUG REPORT DETECTION USING DATA AUGMENTATION TECHNIQUE
Author(s)
Anisa Tariq
Abstract
In software projects, developers, testers, and end users identify bugs and report them to the triager. Bug Tracking Systems (BTS) such as Bugzilla or Jira are used to manage these bug reports. The same bug may be reported by multiple people, which generates duplicate bug reports in the system. Duplicate Bug Report Detection (DBRD) is important because handling duplicates consumes significant human resources. Researchers have proposed a range of machine learning techniques to detect duplicate bug reports. Existing techniques perform well when a large number of bug reports is available as training data, but their performance decreases significantly on small datasets. To overcome this challenge, data augmentation is used to increase the number of bug reports for projects that have only a small number of bug reports as training data. Various data augmentation techniques, such as synonym replacement, random insertion, component shuffling, and class balancing, are used to enlarge the bug data. To validate the performance of data augmentation for duplicate bug detection, we use several deep learning models: CNN, LSTM, and BERT. We also compare the results of these models to analyze which performs better with the augmented bug report data. Our results show that data augmentation improves the results for all three models in terms of accuracy, precision, recall, F1-score, and AUC score. The accuracy achieved on augmented data is 94.77%, 94.77%, and 96.30% for LSTM, CNN, and BERT, respectively.
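Two of the augmentation techniques named in the abstract, synonym replacement and random insertion, can be illustrated with a minimal sketch. This is not the thesis's implementation: the `SYNONYMS` table, the function names, and the parameters `n`, `copies`, and `seed` are all hypothetical, and a real pipeline would likely draw synonyms from a lexical resource rather than a hand-written dictionary.

```python
import random

# Hypothetical toy synonym table (an assumption for illustration only);
# a real system might use WordNet or word embeddings instead.
SYNONYMS = {
    "crash": ["failure", "abort"],
    "error": ["fault", "exception"],
    "button": ["control", "widget"],
    "open": ["launch", "start"],
}


def synonym_replacement(tokens, n, rng):
    """Replace up to n tokens that have an entry in SYNONYMS."""
    out = list(tokens)
    candidates = [i for i, t in enumerate(out) if t.lower() in SYNONYMS]
    rng.shuffle(candidates)
    for i in candidates[:n]:
        out[i] = rng.choice(SYNONYMS[out[i].lower()])
    return out


def random_insertion(tokens, n, rng):
    """Insert a synonym of a randomly chosen known token at a random position, n times."""
    out = list(tokens)
    known = [t for t in tokens if t.lower() in SYNONYMS]
    if not known:
        return out  # nothing to draw synonyms from; leave the report unchanged
    for _ in range(n):
        word = rng.choice(known)
        synonym = rng.choice(SYNONYMS[word.lower()])
        out.insert(rng.randint(0, len(out)), synonym)
    return out


def augment(report, copies=2, n=1, seed=0):
    """Return 2 * copies augmented variants of one bug report summary."""
    rng = random.Random(seed)
    tokens = report.split()
    variants = []
    for _ in range(copies):
        variants.append(" ".join(synonym_replacement(tokens, n, rng)))
        variants.append(" ".join(random_insertion(tokens, n, rng)))
    return variants


if __name__ == "__main__":
    for variant in augment("App crash when I open the settings button"):
        print(variant)
```

Each augmented variant would keep the duplicate/non-duplicate label of the report it was derived from, which is how a small training set could be enlarged before training a classifier such as CNN, LSTM, or BERT.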
Type
Thesis/Dissertation MS
Faculty
Engineering and Computer Science
Department
Engineering
Language
English
Attachment
93b92a5894.pdf (timestamp: 2025-12-31 12:00:02)