Authors
Listed:
- Antonio Sabbatella
(Department of Computer Science, Systems and Communications, University of Milan-Bicocca, 20126 Milan, Italy)
- Andrea Ponti
(Department of Economics, Management, and Statistics, University of Milan-Bicocca, 20126 Milan, Italy)
- Ilaria Giordani
(Oaks srl, 20125 Milan, Italy)
- Antonio Candelieri
(Department of Economics, Management, and Statistics, University of Milan-Bicocca, 20126 Milan, Italy)
- Francesco Archetti
(Department of Computer Science, Systems and Communications, University of Milan-Bicocca, 20126 Milan, Italy)
Abstract
Prompt optimization is a crucial task for improving the performance of large language models on downstream tasks. In this paper, a prompt is a sequence of n-grams selected from a vocabulary, and the aim is to select the prompt that is optimal with respect to a given performance metric. Prompt optimization can therefore be cast as a combinatorial optimization problem, where the number of possible prompts (i.e., the size of the combinatorial search space) is the size of the vocabulary (i.e., the number of possible n-grams) raised to the power of the prompt length. Exhaustive search is impractical, so an efficient search strategy is needed. We propose a Bayesian Optimization method performed over a continuous relaxation of the combinatorial search space. Bayesian Optimization is the dominant approach in black-box optimization owing to its sample efficiency, modular structure, and versatility. We use BoTorch, a library for Bayesian Optimization research built on top of PyTorch. Specifically, we focus on Hard Prompt Tuning, which directly searches for an optimal prompt to be added to the text input without requiring internal access to the Large Language Model, which is used as a black box (as with GPT-4, available only as a Model-as-a-Service). Albeit preliminary and based on "vanilla" Bayesian Optimization algorithms, our experiments with RoBERTa as the large language model, on six benchmark datasets, show good performance compared against other state-of-the-art black-box prompt optimization methods and enable an analysis of the trade-off between the size of the search space, accuracy, and wall-clock time.
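To make the approach concrete, the sketch below shows how such a loop can be assembled with standard BoTorch components: a Gaussian process surrogate is fitted to continuous points, Expected Improvement proposes the next candidate, and each continuous candidate is projected onto the nearest vocabulary embeddings to obtain a discrete prompt before the black-box metric is queried. This is a minimal illustration under assumed toy dimensions; the embedding table and the evaluate_prompt objective are placeholders (in the paper, the objective would be the downstream accuracy of the black-box LLM prompted with the decoded tokens), not the authors' exact setup.

import torch
from botorch.models import SingleTaskGP
from botorch.fit import fit_gpytorch_mll
from botorch.acquisition import ExpectedImprovement
from botorch.optim import optimize_acqf
from gpytorch.mlls import ExactMarginalLogLikelihood

torch.manual_seed(0)
VOCAB_SIZE, EMB_DIM, PROMPT_LEN = 100, 4, 3   # toy sizes (assumption)
DIM = PROMPT_LEN * EMB_DIM                    # dimension of the relaxed space
embeddings = torch.randn(VOCAB_SIZE, EMB_DIM, dtype=torch.double)  # stand-in token embeddings

def project_to_tokens(x):
    """Map a continuous point to the nearest token id for each prompt position."""
    positions = x.view(PROMPT_LEN, EMB_DIM)
    dists = torch.cdist(positions, embeddings)   # (PROMPT_LEN, VOCAB_SIZE)
    return dists.argmin(dim=-1)                  # one token id per position

def evaluate_prompt(token_ids):
    """Placeholder for the black-box metric (e.g., LLM accuracy with this prompt).
    Here: a synthetic smooth score, purely for illustration."""
    return embeddings[token_ids].sum().tanh()

bounds = torch.stack([-3 * torch.ones(DIM), 3 * torch.ones(DIM)]).double()

# Initial random design in the continuous relaxation.
train_X = torch.rand(8, DIM, dtype=torch.double) * 6 - 3
train_Y = torch.stack(
    [evaluate_prompt(project_to_tokens(x)) for x in train_X]
).unsqueeze(-1)

for _ in range(20):  # Bayesian Optimization loop
    gp = SingleTaskGP(train_X, train_Y)
    fit_gpytorch_mll(ExactMarginalLogLikelihood(gp.likelihood, gp))
    acq = ExpectedImprovement(gp, best_f=train_Y.max())
    cand, _ = optimize_acqf(acq, bounds=bounds, q=1,
                            num_restarts=5, raw_samples=64)
    y = evaluate_prompt(project_to_tokens(cand.squeeze(0))).view(1, 1)
    train_X = torch.cat([train_X, cand])
    train_Y = torch.cat([train_Y, y])

best = train_X[train_Y.argmax()]
print("best token ids:", project_to_tokens(best).tolist())

The projection step is what connects the continuous relaxation back to the combinatorial prompt space: the GP and the acquisition function operate entirely in the relaxed space, while the expensive black-box metric is only ever evaluated on valid discrete prompts.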
Suggested Citation
Antonio Sabbatella & Andrea Ponti & Ilaria Giordani & Antonio Candelieri & Francesco Archetti, 2024.
"Prompt Optimization in Large Language Models,"
Mathematics, MDPI, vol. 12(6), pages 1-14, March.
Handle:
RePEc:gam:jmathe:v:12:y:2024:i:6:p:929-:d:1361476