Comprehending international important Ramsar wetland documents using latent semantic topic model in kernel space
Ping Lin
College of Electrical Engineering, Yancheng Institute of Technology, Yancheng, Jiangsu, China
Key Laboratory for Advanced Technology in Environmental Protection of Jiangsu Province, Yancheng Institute of Technology, Yancheng, Jiangsu, China
Shanchao Jiang
College of Electrical Engineering, Yancheng Institute of Technology, Yancheng, Jiangsu, China
Key Laboratory for Advanced Technology in Environmental Protection of Jiangsu Province, Yancheng Institute of Technology, Yancheng, Jiangsu, China
Du Li
College of Electrical Engineering, Yancheng Institute of Technology, Yancheng, Jiangsu, China
Key Laboratory for Advanced Technology in Environmental Protection of Jiangsu Province, Yancheng Institute of Technology, Yancheng, Jiangsu, China
Zhiyong Zou
College of Mechanical and Electrical Engineering, Sichuan Agricultural University, Ya'an, Sichuan, China
Corresponding Author
Yongming Chen
College of Electrical Engineering, Yancheng Institute of Technology, Yancheng, Jiangsu, China
Key Laboratory for Advanced Technology in Environmental Protection of Jiangsu Province, Yancheng Institute of Technology, Yancheng, Jiangsu, China
Correspondence: Yongming Chen, College of Electrical Engineering, Yancheng Institute of Technology, No.1 Middle Road Hope Avenue, Yancheng, 224051 Jiangsu, China. Email: billrange007@gmail.com
Abstract
A kernel-based statistical semantic topic model is introduced for comprehending three categories of documents on internationally important Ramsar wetlands: the Lashi Lake wetland in Yunnan Province, the Yancheng wetland in Jiangsu Province, and the Zoige wetland in Sichuan Province, China. Latent Dirichlet allocation (LDA) features represent the semantic components of the wetland documents. Kernel principal component analysis (KPCA) maps the topic components into the kernel space to obtain low-dimensional principal components. Support vector machines (SVMs) are then used to comprehend the semantic distribution of the distinct wetland documents in the kernel space. In this application, the LDA+KPCA+SVM algorithm reaches 77.0% training and 75.9% test accuracy, with mean average precision scores of 0.902 (training) and 0.840 (test). The proposed kernel-based model outperforms the traditional LDA+SVM and LDA+PCA+SVM models.
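The three-stage pipeline described in the abstract can be sketched roughly as follows with scikit-learn. This is an illustrative sketch only, not the authors' implementation: the toy corpus, the labels, the number of topics, the RBF kernel, the number of principal components, and the linear SVM kernel are all assumptions, since the abstract does not state them.

```python
from sklearn.decomposition import KernelPCA, LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Hypothetical toy corpus standing in for the wetland documents
# (invented text, not the data used in the paper).
docs = [
    "plateau lake dam impoundment lakeside vegetation succession",
    "lake water management village plateau dam vegetation",
    "plateau lake lakeside village water dam birds",
    "coastal reserve saltmarsh crane tidal flat vegetation zones",
    "coastal wetland reserve tidal soil phosphorus crane",
    "saltmarsh coastal tidal flat reserve vegetation soil",
    "alpine peat meadow methane swamp soil plateau",
    "peat swamp alpine meadow methane emissions soil",
    "alpine meadow peat river oxbow swamp methane",
]
labels = ["lashi"] * 3 + ["yancheng"] * 3 + ["zoige"] * 3

# Word counts -> LDA topic proportions -> KPCA components -> SVM.
# All hyperparameters below are assumed values chosen for illustration.
pipeline = make_pipeline(
    CountVectorizer(),
    LatentDirichletAllocation(n_components=5, random_state=0),
    KernelPCA(n_components=3, kernel="rbf"),
    SVC(kernel="linear"),
)
pipeline.fit(docs, labels)

# Low-dimensional kernel-space features produced by the first three stages.
kernel_features = pipeline[:-1].transform(docs)

# Predicted wetland label for each document.
predictions = pipeline.predict(docs)
```

Replacing `KernelPCA` with `PCA`, or removing that step entirely, would give rough analogues of the LDA+PCA+SVM and LDA+SVM baselines named in the abstract. A corpus this small says nothing about the reported accuracy figures.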