CLUSTER ANALYSIS OF TEXTS: A STUDY OF THE ABSTRACTS IN THE CAPES THESES AND DISSERTATIONS DATABASE
DOI:
https://doi.org/10.48090/ciki.v%25vi%25i.589
Keywords:
Document Clustering, Open Data, Data Mining, K-Means, Knowledge Discovery in Text
Abstract
Knowledge discovery in large volumes of information has a wide field of application. Its main tasks, classification, clustering and association, have been used in different areas of knowledge to identify useful knowledge in large volumes of data. This article analyzes the application of data mining techniques, in particular the K-Means clustering algorithm, to verify their effectiveness for analyzing data from the Brazilian Open Data Portal, a public data repository organized and made available to the population. The dataset used was extracted from the theses and dissertations database made available by CAPES (Coordination for the Improvement of Higher Education Personnel). The data were processed and loaded into the Apache Solr® platform, where they were indexed, and the clusters were generated with the Carrot2 software, using the K-Means algorithm with customized configurations. Clusters were generated both year by year and for the consolidated data, under different configurations of the algorithm, making it possible to compare the terms obtained. It was concluded that the results of the tools used depend directly on the choice of the initial number of clusters, but that the potential for discovering non-obvious clusters is evident.
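To make the dependence on the number of clusters concrete, the following is a minimal pure-Python sketch of K-Means over TF-IDF vectors. It is not the Carrot2 implementation and does not use Apache Solr; the toy documents, the TF-IDF weighting, the random seed, and all function names are assumptions chosen for illustration only.

```python
import math
import random
from collections import Counter

# Hypothetical toy "abstracts"; the real CAPES records are not reproduced here.
DOCS = [
    "data mining clustering algorithm text documents",
    "text clustering algorithm data documents mining",
    "public health hospital patients treatment",
    "hospital patients health treatment nursing",
    "soil crops irrigation agriculture yield",
    "agriculture crops soil yield irrigation",
]

def tfidf(docs):
    """Return the sorted vocabulary and one L2-normalized TF-IDF vector per document."""
    tokenized = [d.lower().split() for d in docs]
    vocab = sorted({t for doc in tokenized for t in doc})
    df = Counter(t for doc in tokenized for t in set(doc))  # document frequency
    n = len(docs)
    vectors = []
    for doc in tokenized:
        tf = Counter(doc)
        vec = [tf[t] * (math.log(n / df[t]) + 1.0) for t in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vocab, vectors

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(vectors, k, seed=0, max_iter=100):
    """Plain Lloyd iteration; initial centroids are k documents sampled with a fixed seed."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each document goes to its nearest centroid.
        new_labels = [min(range(k), key=lambda c: sq_dist(v, centroids[c]))
                      for v in vectors]
        if new_labels == labels:
            break  # assignments are stable, so the centroids will not move
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

def top_terms(vocab, centroid, n=3):
    """Label a cluster with the n highest-weighted terms of its centroid."""
    ranked = sorted(zip(centroid, vocab), key=lambda p: (-p[0], p[1]))
    return [term for weight, term in ranked[:n] if weight > 0]

vocab, vectors = tfidf(DOCS)
for k in (2, 3):
    labels, centroids = kmeans(vectors, k)
    print(f"k={k} labels={labels}")
    for c in range(k):
        print("  cluster", c, top_terms(vocab, centroids[c]))
```

Running the loop with different values of `k` (or a different `seed`) changes both the grouping and the terms that describe each cluster, which is the kind of comparison across configurations that the abstract describes.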
License
DECLARATION OF ASSIGNMENT AND TRANSFER OF PATRIMONIAL RIGHTS ON ARTICLE PUBLISHED IN THE CIKI PROCEEDINGS AND AUTHORIZATION FOR PUBLICATION
The AUTHOR, in accordance with Law No. 9.610 of February 19, 1998, hereby declares to whom it may concern that they assign and transfer, in a universal, definitive, irreversible, exclusive and gratuitous manner, all of their economic rights as author of the article submitted to the International Congress of Knowledge and Innovation - ciKi.
The AUTHOR guarantees:
- That the article is original, except for the citations of other published works, provided that the limitations expressed in articles 46 and 47 of Law 9.610 of February 19, 1998 are observed;
- That the article does not contain any slanderous or defamatory statements and does not infringe any intellectual, commercial or industrial property rights of any third parties;
- To promptly compensate the International Congress of Knowledge and Innovation - ciKi for any indemnities, losses or expenses arising from a breach of the guarantees expressed in items 1 and 2 above.
With this assignment and transfer of the patrimonial rights pertaining to the author's right, the International Congress of Knowledge and Innovation - ciKi and its successors are exempt from any copyright payment to the AUTHOR or to their heirs or successors.
The AUTHOR further declares that the International Congress of Knowledge and Innovation - ciKi is fully authorized to use this article, in whole or in part, edited or complete, in Portuguese and in all other languages, in print, in electronic media and on the internet, for commercial purposes or not, including distributing, adapting, creating derivative works, and assigning rights to third parties in Brazil and/or abroad, for purposes including but not limited to: teaching, study and research; publication and dissemination; quotation; use in telecommunication media in general; and audiovisual use in general, including all existing or future digital technologies capable of storing and reproducing data.
The AUTHOR is guaranteed the moral rights over the article, including the attribution of their name as author of the article that is the object of this transfer.
Whenever the AUTHOR intends to use the article, they must make a written request to the International Congress of Knowledge and Innovation - ciKi, and must always credit the original publication with a complete and legible bibliographic reference.