Md. Billal Hossain, Mohammad Shamsul Arefin and Mohammad Ashfak Habib, “Developing a Framework for Acquisition and Analysis of Speeches”, Progress in Advanced Computing and Intelligent Engineering (Scopus Index), Springer, to appear.
Speech plays a vital role in human communication. Proper delivery of a speech can enable a person to connect with a large number of people. Nowadays, many valuable speeches are delivered by popular figures throughout the world, and it would be very helpful if important information could be extracted from those speeches by analyzing them. An automatic speech-to-text converter can facilitate the task of speech analysis. A great deal of work on converting speech to text has been carried out over the last few decades. This paper presents a framework for acquiring speech along with the location of the speaker and then converting that speech into text. We have worked with speeches in three different languages. To evaluate our framework, we collected speeches from several locations, and the results show that the framework can be used for efficient collection and analysis of speeches.
Md. Rashadur Rahman, Mohammad Shamsul Arefin, Md. Billal Hossain, Mohammad Ashfak Habib and A. S. M. Kayes, “Towards a Framework for Acquisition and Analysis of Speeches to Identify Suspicious Contents through Machine Learning”, Complexity (Indexing: SCIE, Scopus Index, Impact Factor: 2.462, CiteScore: 3.20, Q1), to appear.
The most prominent form of human communication and interaction is speech. It plays an indispensable role in expressing emotions and in motivating, guiding, and cheering people. An ill-intentioned speech can mislead people, societies, and even a nation. A misguided speech can trigger social controversy and can result in violent activities. Every day a large number of speeches are delivered around the world, far too many to inspect manually. To prevent vicious actions resulting from misguided speeches, the development of an automatic system that can efficiently detect suspicious speech has become imperative. In this study, we present a framework for acquiring speech along with the location of the speaker and converting the speeches into text. We then propose a system based on long short-term memory (LSTM), a variant of the recurrent neural network (RNN), to classify speeches as suspicious or non-suspicious. We consider speeches in the Bangla language and have developed our own dataset, which contains about 5000 suspicious and non-suspicious samples, for training and validating our model. To evaluate the effectiveness of the system, we compare its accuracy with that of other machine learning algorithms: logistic regression, SVM, KNN, Naive Bayes, and decision tree. The experimental results show that our proposed deep learning-based model achieves the highest accuracy among the compared algorithms.
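To make the classification step concrete, the following is a minimal sketch of one of the baseline methods named in the abstract, Naive Bayes, applied to transcribed text. It is not the authors' implementation and not the proposed LSTM model. The class name `NaiveBayesTextClassifier`, the whitespace tokenizer, the `accuracy` helper, and the four English toy sentences are all assumptions made for illustration; the paper itself works with about 5000 Bangla samples.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: simple lowercase whitespace tokenization; real Bangla
    # transcripts would need a language-appropriate tokenizer.
    return text.lower().split()


class NaiveBayesTextClassifier:
    """Hypothetical multinomial Naive Bayes with Laplace smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                       # smoothing strength
        self.class_counts = Counter()            # documents per label
        self.word_counts = defaultdict(Counter)  # token counts per label
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            self.class_counts[label] += 1
            for tok in tokenize(text):
                self.word_counts[label][tok] += 1
                self.vocab.add(tok)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            denom = total_words + self.alpha * len(self.vocab)
            for tok in tokenize(text):
                if tok in self.vocab:  # ignore tokens never seen in training
                    count = self.word_counts[label][tok]
                    score += math.log((count + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def accuracy(model, texts, labels):
    # Fraction of samples whose predicted label matches the true label.
    correct = sum(model.predict(t) == y for t, y in zip(texts, labels))
    return correct / len(labels)


# Toy placeholder data (illustrative only, not from the paper's dataset).
train_texts = [
    "attack the building tonight",
    "bring weapons to the rally",
    "welcome to the annual school festival",
    "thank you all for joining the celebration",
]
train_labels = ["suspicious", "suspicious", "non-suspicious", "non-suspicious"]

model = NaiveBayesTextClassifier().fit(train_texts, train_labels)
print(model.predict("bring weapons tonight"))
print(accuracy(model, train_texts, train_labels))
```

A comparison like the one described in the abstract would compute the same accuracy measure for each classifier on a held-out validation split rather than on the training sentences used here.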