Please use this identifier to cite or link to this item: http://hdl.handle.net/20.500.11861/9286
DC Field | Value | Language
dc.contributor.author | Shafqat Ali, Muhammad | en_US
dc.contributor.author | Dr. AZHAR Muhammad | en_US
dc.contributor.author | Masood, Saba | en_US
dc.contributor.author | Lee, Bumshik | en_US
dc.contributor.author | Iqbal, Tanzeela | en_US
dc.contributor.author | Amjad, Adeen | en_US
dc.date.accessioned | 2024-04-02T11:51:56Z | -
dc.date.available | 2024-04-02T11:51:56Z | -
dc.date.issued | 2023 | -
dc.identifier.citation | Ali, M. S., Azhar, M., Masood, S., Lee, B., Iqbal, T., & Amjad, A. (2023). Efficient video summarization with hydra attentive vision transformer. In IEEE (Ed.). 2023 International conference on frontiers of information technology (FIT). 2023 International Conference on Frontiers of Information Technology (FIT), Islamabad, Pakistan (pp. 196-201). IEEE. | en_US
dc.identifier.isbn | 9798350395785 | -
dc.identifier.isbn | 9798350395792 | -
dc.identifier.issn | 2473-7569 | -
dc.identifier.issn | 2334-3141 | -
dc.identifier.uri | http://hdl.handle.net/20.500.11861/9286 | -
dc.description.abstract | The objective of video summarization is to produce succinct, condensed synopses that accurately depict the content of the original video without sacrificing any of its vital features. Effective deep summarization models have emerged in this field, made feasible by advances in gated recurrent unit (GRU) and long short-term memory (LSTM) networks. However, for very long videos, GRU and LSTM models are unlikely to capture long-term dependencies well. In recent years, significant progress in supervised video summarization has been achieved through sequence-to-sequence learning techniques. However, traditional recurrent neural networks (RNNs) have limitations when modeling long sequences, and using transformers to represent sequences requires a large number of input parameters, resulting in increased computational complexity. We present a new video summarization method that addresses these issues with a Hydra Attention-based Vision Transformer framework. The proposed method captures long-term dependencies and extracts significant features from video sequences. By combining Hydra Attention with transformer-based sequence modeling, the proposed architecture improves the accuracy and reduces the processing time of video summarization. In empirical evaluations on the SumMe and TVSum datasets, our solution outperforms state-of-the-art approaches in both performance and computational efficiency. | en_US
dc.language.iso | en | en_US
dc.publisher | IEEE | en_US
dc.title | Efficient video summarization with hydra attentive vision transformer | en_US
dc.type | Conference Paper | en_US
dc.relation.conference | 2023 International Conference on Frontiers of Information Technology (FIT) | en_US
dc.identifier.doi | 10.1109/FIT60620.2023.00044 | -
item.fulltext | No Fulltext | -
crisitem.author.dept | Department of Applied Data Science | -
Appears in Collections: Applied Data Science - Publication
Scopus citations: 1 (checked on Jan 12, 2025)
Page view(s): 34 (last week: 0), checked on Jan 13, 2025
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.