Makar: A Framework for Multi-source Studies based on Unstructured Data

Birrer, Mathias; Rani, Pooja; Panichella, Sebastiano; Nierstrasz, Oscar (2021). Makar: A Framework for Multi-source Studies based on Unstructured Data. In: 2021 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER). Honolulu, HI, USA. March 9 2021 to March 12 2021. 10.1109/SANER50967.2021.00069

Rani21c.pdf - Published Version
Available under License Publisher holds Copyright.

To perform development and maintenance tasks, developers frequently seek information from sources such as mailing lists, Stack Overflow (SO), and Quora. Researchers analyze these sources to understand developers' information needs in these tasks. However, extracting and preprocessing unstructured data from multiple sources, and building and maintaining a reusable dataset, is often a time-consuming and iterative process. Additionally, the lack of tools for automating this data analysis process makes it harder to reproduce previous results or datasets. To address these concerns, we propose Makar, which provides various data extraction and preprocessing methods to support researchers in conducting reproducible multi-source studies. To evaluate Makar, we conduct a case study that analyzes code-comment-related discussions from SO, Quora, and mailing lists. Our results show that Makar helps in preparing reproducible datasets from multiple sources with little effort, and in identifying the data relevant to specific research questions in less time than state-of-the-art tools, which is of critical importance for studies based on unstructured data. Tool webpage: https://github.com/maethub/makar

Item Type:

Conference or Workshop Item (Paper)

Division/Institute:

08 Faculty of Science > Institute of Computer Science (INF)
08 Faculty of Science > Institute of Computer Science (INF) > Software Composition Group (SCG) [discontinued]

UniBE Contributor:

Rani, Pooja; Nierstrasz, Oscar

Subjects:

000 Computer science, knowledge & systems

ISBN:

978-1-7281-9630-5

Language:

English

Submitter:

Oscar Nierstrasz

Date Deposited:

24 Feb 2022 12:15

Last Modified:

02 Mar 2023 23:35

Publisher DOI:

10.1109/SANER50967.2021.00069

Uncontrolled Keywords:

scg-pub snf-asa3 scg21 jb21

BORIS DOI:

10.48350/165151

URI:

https://boris.unibe.ch/id/eprint/165151
