Effects of Positivization on the Paragraph Vector Model

doi:10.1109/INISTA.2019.8778304

dc.contributor.authors	Gerek A., Yuney M.C., Erkaya E., Ganiz M.C.
dc.date.accessioned	2022-03-15T02:14:19Z
dc.date.accessioned	2026-01-11T13:44:51Z
dc.date.available	2022-03-15T02:14:19Z
dc.date.issued	2019
dc.description.abstract	Natural language processing (NLP) is an important field of artificial intelligence. One of the fundamental problems in NLP is to create vector (distributed) representations of words such that the vectors of words with similar meanings lie close together in space. Among the most popular algorithms for creating these representations are word embedding models such as word2vec and fastText. Similarly, the paragraph vector model (doc2vec) creates distributed representations of documents while simultaneously creating distributed representations of the words in those documents. These models produce dense, low-dimensional (usually in the low hundreds) vector representations that may include negative values. In this study, we focus on these negative values and introduce a family of regularization methods in which the document, word, and/or context vectors of the paragraph vector model are forced to have only positive components. We measure the effects on several tasks: text classification, semantic similarity, and analogy. Although positivization greatly increases the sparsity of the word embeddings and would be expected to result in a loss of information, our results show almost no reduction in the performance of the regularized embeddings on these tasks. We also observe an increase in classification accuracy in one case. We foresee that these approaches can be beneficial in machine learning systems that require non-negative vectors. © 2019 IEEE.
dc.identifier.doi	10.1109/INISTA.2019.8778304
dc.identifier.isbn	9781728118628
dc.identifier.uri	https://hdl.handle.net/11424/248026
dc.language.iso	eng
dc.publisher	Institute of Electrical and Electronics Engineers Inc.
dc.relation.ispartof	IEEE International Symposium on INnovations in Intelligent SysTems and Applications, INISTA 2019 - Proceedings
dc.rights	info:eu-repo/semantics/closedAccess
dc.subject	analogy
dc.subject	artificial intelligence
dc.subject	natural language processing
dc.subject	regularization
dc.subject	semantic similarity
dc.subject	text classification
dc.subject	word embeddings
dc.title	Effects of Positivization on the Paragraph Vector Model
dc.type	conferenceObject
dspace.entity.type	Publication
oaire.citation.title	IEEE International Symposium on INnovations in Intelligent SysTems and Applications, INISTA 2019 - Proceedings

Collections

Araştırma Çıktıları (Research Outputs)
