
Commit c6c7c03

iterater authored and mallamanis committed
Adding ICCS 2023, ICCQ 2023
1 parent caf2b61 commit c6c7c03

2 files changed: 23 additions, 0 deletions

Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
---
layout: publication
title: Test-based and metric-based evaluation of code generation models for practical question answering
authors: S. Kovalchuk, D. Fedrushkov, V. Lomshakov, A. Aliev
conference: ICCQ
year: 2023
additional_links:
  - {name: "IEEE", url: "https://ieeexplore.ieee.org/document/10114665"}
tags: ["code generation", "test generation", "natural language generation", "evaluation", "metrics", "natural language processing"]
---
We performed a comparative analysis of code generation model performance evaluated with common NLP metrics versus a test-based evaluation. The investigation was carried out in the context of question answering with code (the text-to-code problem) and aimed to check the applicability of both approaches for evaluating generated code in a fully automatic manner. We used the CodeGen and GPT-Neo pretrained models applied to a question answering problem over a Stack Overflow-based corpus (APIzation). For test-based evaluation, industrial test-generation solutions (Machinet, UTBot) were used to provide automatically generated tests. The analysis showed that performance evaluation based solely on NLP metrics or solely on tests provides a rather limited assessment of generated code quality: we see evidence of predictions with both high and low NLP metric scores that pass the tests, and of predictions that fail them. With the early results of our empirical study discussed in this paper, we believe that combining both approaches may broaden the possibilities for building, evaluating, and training code generation models.
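
The contrast between the two evaluation views can be illustrated with a minimal sketch (not the paper's actual pipeline or data): a token-level BLEU score is computed with NLTK for a toy reference/candidate pair, while the test-based check simply executes the candidate and asserts its behaviour. The reference, candidate, tokenization, and test below are invented placeholders.

```python
# Illustrative sketch only: contrasts metric-based (BLEU) and test-based
# evaluation of one generated snippet. The reference, candidate, and test
# are invented placeholders, not the paper's data or tooling.
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

reference = "def add(a, b):\n    return a + b"
candidate = "def add(x, y):\n    return x + y"   # generated code to evaluate

# Metric-based view: token-level BLEU between reference and candidate.
bleu = sentence_bleu(
    [reference.split()], candidate.split(),
    smoothing_function=SmoothingFunction().method1,
)

# Test-based view: execute the candidate and check its behaviour directly.
scope = {}
exec(candidate, scope)                 # define the generated function
test_passed = scope["add"](2, 3) == 5  # a tiny hand-written unit test

print(f"BLEU = {bleu:.3f}, test passed = {test_passed}")
```

As the paper observes, the two views can disagree: a candidate with different identifiers scores a low BLEU yet passes the test, while a near-verbatim copy with a wrong operator can score high and fail.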
Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
---
layout: publication
title: Fine-Tuning Large Language Models for Answering Programming Questions with Code Snippets
authors: V. Lomshakov, S. Kovalchuk, M. Omelchenko, S. Nikolenko, A. Aliev
conference: ICCS
year: 2023
additional_links:
  - {name: "LNCS", url: "https://link.springer.com/chapter/10.1007/978-3-031-36021-3_15"}
  - {name: "Papers with Code", url: "https://paperswithcode.com/paper/fine-tuning-large-language-models-for"}
tags: ["program synthesis", "question answering", "large language models"]
---
We study the ability of pretrained large language models (LLMs) to answer questions from online question answering forums such as Stack Overflow. We consider question-answer pairs where the main part of the answer consists of source code. On two benchmark datasets, CoNaLa and a newly collected dataset based on Stack Overflow, we investigate how a closed-book question answering system can be improved by fine-tuning the LLM for the downstream task, prompt engineering, and data preprocessing. We use publicly available autoregressive language models such as GPT-Neo, CodeGen, and PanGu-Coder, and after the proposed fine-tuning achieve a BLEU score of 0.4432 on the CoNaLa test set, significantly exceeding the previous state of the art for this task.
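
For intuition, a hedged sketch of the closed-book setting follows, using one of the publicly available models named in the abstract (GPT-Neo via Hugging Face transformers). The checkpoint size, prompt template, and generation settings are illustrative assumptions, not the paper's fine-tuned setup or exact prompts.

```python
# Minimal closed-book answer-generation sketch with an off-the-shelf GPT-Neo
# checkpoint. The prompt template and decoding settings are assumptions for
# illustration; the paper fine-tunes the model and engineers prompts further.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "EleutherAI/gpt-neo-125M"   # small public checkpoint for a quick demo
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

question = "How do I reverse a list in Python?"   # made-up example question
prompt = f"Question: {question}\nAnswer (code only):\n"

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=40,
    do_sample=False,                       # greedy decoding for reproducibility
    pad_token_id=tokenizer.eos_token_id,   # GPT-Neo has no dedicated pad token
)

# Strip the prompt tokens and keep only the generated answer.
answer = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(answer)
```

Generated answers of this kind would then be compared against the reference snippets with BLEU, which is the metric the reported 0.4432 result on CoNaLa refers to.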
