Large Language Models for Inclusive Image Captioning
| dc.contributor.author | Rostyslav Zatserkovnyi | |
| dc.contributor.author | Roksoliana Zatserkovna | |
| dc.contributor.author | Zoriana Novosad | |
| dc.date.accessioned | 2026-04-06T15:12:00Z | |
| dc.date.issued | 2025-03-21 | |
| dc.description | 29th International Conference on Information Technology (IT) Žabljak, 19 – 22 February 2025 | |
| dc.description.abstract | The recent rapid development of artificial intelligence (AI) has opened up many new possibilities for making digital content more accessible and inclusive. One of the most exciting advancements in this area is the use of large language models (LLMs) for image captioning. Trained on vast amounts of text data, these models can, among other capabilities, generate detailed descriptions of images. This can make visual content on the Web, which often has missing or unreliable captions, more understandable for individuals with visual impairments. In this article, we investigate the possibility of using LLMs to improve image captioning of visual web content. We discuss the current capabilities of this new class of ML models, comparing several free and open-source LLMs that can be used for the task. Finally, we propose the architecture of a novel system that visually impaired web users can use to automatically caption visual content on the websites they visit. | |
| dc.identifier.citation | R. Zatserkovnyi, Z. Novosad and R. Zatserkovna, "Large Language Models for Inclusive Image Captioning," 2025 29th International Conference on Information Technology (IT), Zabljak, Montenegro, 2025, pp. 1-5 | |
| dc.identifier.issn | 2836-3736 (Print) | |
| dc.identifier.issn | 2836-3744 (Electronic) | |
| dc.identifier.other | https://doi.org/10.1109/IT64745.2025.10930278 | |
| dc.identifier.uri | https://ieeexplore.ieee.org/document/10930278/ | |
| dc.identifier.uri | https://dspace.lute.lviv.ua/handle/123456789/2375 | |
| dc.language.iso | en | |
| dc.publisher | IEEE | |
| dc.subject | Visualization | |
| dc.subject | Large language models | |
| dc.subject | Visual impairment | |
| dc.subject | Data models | |
| dc.subject | Service-oriented architecture | |
| dc.subject | Information technology | |
| dc.title | Large Language Models for Inclusive Image Captioning | |
| dc.type | Article |
Files
Original bundle
- Name: LLM-Inclusive-Image-Captioning-Camera-Ready.pdf
- Size: 376.15 KB
- Format: Adobe Portable Document Format
License bundle
- Name: license.txt
- Size: 1.71 KB
- Format: Item-specific license agreed to upon submission