Publications

Refine Results

(Filters Applied) Clear All

Software vulnerability detection using LLM: does additional information help?

Summary

Unlike conventional machine learning (ML) or deep learning (DL) methods, Large Language Models (LLM) possess the ability to tackle complex tasks through intricate chains of reasoning, a facet often overlooked in existing work on vulnerability detection. Nevertheless, these models have demonstrated variable performance when presented with different prompts (inputs), motivating a surge of research into prompt engineering – the process of optimizing prompts to enhance their performance. This paper studies different prompt settings (zero-shot and few-shot) when using LLMs for software vulnerability detection. Our exploration involves harnessing the power of both natural language (NL) unimodal and NL-PL (programming language) bimodal models within the prompt engineering process. Experimental results indicate that LLM, when provided only with source code or zero-shot prompts, tends to classify most code snippets as vulnerable, resulting in unacceptably high recall. These findings suggest that, despite their advanced capabilities, LLMs may not inherently possess the knowledge for vulnerability detection tasks. However, fewshot learning benefits from additional domain-specific knowledge, offering a promising direction for future research in optimizing LLMs for vulnerability detection.
READ LESS

Summary

Unlike conventional machine learning (ML) or deep learning (DL) methods, Large Language Models (LLM) possess the ability to tackle complex tasks through intricate chains of reasoning, a facet often overlooked in existing work on vulnerability detection. Nevertheless, these models have demonstrated variable performance when presented with different prompts (inputs), motivating...

READ MORE

VulSim: Leveraging similarity of multi-dimensional neighbor embeddings for vulnerability detection

Summary

Despite decades of research in vulnerability detection, vulnerabilities in source code remain a growing problem, and more effective techniques are needed in this domain. To enhance software vulnerability detection, in this paper, we first show that various vulnerability classes in the C programming language share common characteristics, encompassing semantic, contextual, and syntactic properties. We then leverage this knowledge to enhance the learning process of Deep Learning (DL) models for vulnerability detection when only sparse data is available. To achieve this, we extract multiple dimensions of information from the available, albeit limited, data. We then consolidate this information into a unified space, allowing for the identification of similarities among vulnerabilities through nearest-neighbor embeddings. The combination of these steps allows us to improve the effectiveness and efficiency of vulnerability detection using DL models. Evaluation results demonstrate that our approach surpasses existing State-of-the-art (SOTA) models and exhibits strong performance on unseen data, thereby enhancing generalizability.
READ LESS

Summary

Despite decades of research in vulnerability detection, vulnerabilities in source code remain a growing problem, and more effective techniques are needed in this domain. To enhance software vulnerability detection, in this paper, we first show that various vulnerability classes in the C programming language share common characteristics, encompassing semantic, contextual...

READ MORE

Showing Results

1-2 of 2