Authors
- Sandi Ljubic
(University of Rijeka, Faculty of Engineering, Vukovarska 58, HR-51000 Rijeka, Croatia;
Center for Artificial Intelligence and Cybersecurity, University of Rijeka, R. Matejcic 2, HR-51000 Rijeka, Croatia)
- Alen Salkanovic
(University of Rijeka, Faculty of Engineering, Vukovarska 58, HR-51000 Rijeka, Croatia;
Center for Artificial Intelligence and Cybersecurity, University of Rijeka, R. Matejcic 2, HR-51000 Rijeka, Croatia)
Abstract
In the field of human–computer interaction (HCI), text entry methods can be evaluated through controlled user experiments or predictive modeling techniques. While the modeling approach requires a language model, the empirical approach necessitates representative text phrases for the experimental stimuli. In this context, finding a phrase set with the best language representativeness belongs to the class of optimization problems in which a solution is sought in a large search space. We propose a genetic algorithm (GA)-based method for extracting a target phrase set from the available text corpus, optimizing its language representativeness. Kullback–Leibler divergence is utilized to evaluate candidates, considering the digram probability distributions of both the source corpus and the target sample. The proposed method is highly customizable, outperforms typical random sampling, and exhibits language independence. The representative phrase sets generated by the proposed solution facilitate a more valid comparison of the results from different text entry studies. The open source implementation enables the easy customization of the GA-based sampling method, promotes its immediate utilization, and facilitates the reproducibility of this study. In addition, we provide heuristic guidelines for preparing the text entry experiments, which consider the experiment’s intended design and the phrase set to be generated with the proposed solution.
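The candidate evaluation mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`digram_distribution`, `kl_divergence`, `fitness`) are hypothetical, and the lowercasing of text, the base-2 logarithm, and the direction of the divergence (sample relative to corpus) are assumptions made here for illustration. A lower score would indicate a sample whose digram statistics lie closer to those of the source corpus.

```python
import math
from collections import Counter


def digram_distribution(phrases):
    """Relative frequency of each adjacent character pair in the phrases."""
    counts = Counter()
    for phrase in phrases:
        text = phrase.lower()  # assumption: case is ignored
        counts.update(text[i:i + 2] for i in range(len(text) - 1))
    total = sum(counts.values())
    if total == 0:
        raise ValueError("phrases contain no digrams")
    return {digram: count / total for digram, count in counts.items()}


def kl_divergence(p, q):
    """D_KL(p || q) in bits; infinite if p has mass where q has none."""
    total = 0.0
    for digram, p_x in p.items():
        q_x = q.get(digram, 0.0)
        if q_x == 0.0:
            return math.inf
        total += p_x * math.log2(p_x / q_x)
    return total


def fitness(sample, corpus_dist):
    """Hypothetical fitness: divergence of the sample from the corpus."""
    return kl_divergence(digram_distribution(sample), corpus_dist)
```

With this direction of the divergence, a sample drawn from the corpus itself always yields a finite score, because every digram in the sample also occurs in the corpus. A genetic algorithm could then treat each candidate phrase set as an individual and minimize `fitness` over generations; the selection, crossover, and mutation operators are not described in the abstract and are left out here.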
Suggested Citation
Sandi Ljubic & Alen Salkanovic, 2023.
"Generating Representative Phrase Sets for Text Entry Experiments by GA-Based Text Corpora Sampling,"
Mathematics, MDPI, vol. 11(11), pages 1-26, June.
Handle:
RePEc:gam:jmathe:v:11:y:2023:i:11:p:2550-:d:1162100