Trend analysis of the Brazilian scientific production in Information Science area: a text mining exploratory study

Authors

  • Caio Cesar Trucolo, Universidade de São Paulo - USP
  • Luciano Antonio Digiampietri, Universidade de São Paulo - USP

      DOI:

      https://doi.org/10.5380/atoz.v3i2.41341

      Keywords:

      Trend analysis, Information science, Social networks

      Abstract

      Introduction: Trend analysis can be used as a strategy to identify subjects or research areas that have the potential to become popular but are not yet widespread. This work identifies trends through text mining and historical analysis of the scientific production (scientific papers) of PhD holders in the Information Science area. Method: This exploratory work was carried out in three steps. The first step was gathering data from the curricula registered on the Lattes platform. The second consisted of automatically extracting the most important terms from the publication titles. In the third step, linear and nonlinear regressions were performed on the frequency-based importance index of the extracted terms. Results: Trends identified in the Information Science area for the short, medium, and long term are presented. Conclusions: This work presents and applies a trend identification method that can be seen as a first step toward exploiting the full potential of trend analysis of the national scientific production. In addition, general information about trend analysis and the behavior of trends over time are discussed.
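
To make the third step of the method more concrete, the following is a minimal sketch of fitting a linear trend to a frequency-based index of title terms. It is not the authors' implementation: the whitespace tokenizer, the "share of a year's title tokens" index, and the function names `yearly_term_share`, `linear_slope`, and `term_trends` are all assumptions made for illustration, and only the linear case is shown. The example titles are invented toy data.

```python
from collections import Counter, defaultdict


def yearly_term_share(titles_by_year):
    """Map each term to {year: share of that year's title tokens}.

    Assumption: a term's importance in a year is approximated by its
    relative frequency among all title tokens of that year.
    """
    shares = defaultdict(dict)
    for year, titles in titles_by_year.items():
        tokens = [word for title in titles for word in title.lower().split()]
        counts = Counter(tokens)
        total = len(tokens)
        for term, n in counts.items():
            shares[term][year] = n / total
    return shares


def linear_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs (needs >= 2 distinct xs)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def term_trends(titles_by_year):
    """Return {term: slope of its yearly share}; a positive slope suggests growth."""
    years = sorted(titles_by_year)
    shares = yearly_term_share(titles_by_year)
    return {
        term: linear_slope(years, [by_year.get(y, 0.0) for y in years])
        for term, by_year in shares.items()
    }


if __name__ == "__main__":
    # Invented toy titles, used only to exercise the functions.
    toy_titles = {
        2010: ["library catalog"],
        2011: ["library networks"],
        2012: ["social networks"],
    }
    ranked = sorted(term_trends(toy_titles).items(), key=lambda kv: kv[1], reverse=True)
    for term, slope in ranked:
        print(f"{term}: {slope:+.3f}")
```

Ranking terms by slope is only one possible reading of "trend"; a nonlinear fit, as mentioned in the abstract, would replace `linear_slope` with a different model.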

      Author Biographies

      Caio Cesar Trucolo, Universidade de São Paulo - USP

      Undergraduate degree in Information Systems - USP; Master's student in Information Systems - USP

      Luciano Antonio Digiampietri, Universidade de São Paulo - USP

      Undergraduate degree in Computer Science - UNICAMP; PhD in Computer Science - UNICAMP

      Published

      2014-12-31

      How to Cite

      Trucolo, C. C., & Digiampietri, L. A. (2014). Trend analysis of the Brazilian scientific production in Information Science area: a text mining exploratory study. AtoZ: Novas práticas Em informação E Conhecimento, 3(2), 87–94. https://doi.org/10.5380/atoz.v3i2.41341