Abstract:
Unknown words often disrupt communication and reading. The proposed system detects words
that are unfamiliar to a user from temporal data, electroencephalography (EEG) signals, and
the user's facial expressions. Specifically, for each word on which the user's gaze dwells,
word familiarity is predicted from the time the user spends focused on that word, the EEG
signals recorded from the user's brain waves, and the user's facial expressions while reading
that word. Word familiarity refers to whether or not the user is familiar with a word while
reading the text. The proposed system tracks the gaze coordinates together with their
timestamps to find the duration of the gaze fixation on a particular word. This duration is
then fed to a Stochastic Gradient Descent classifier to predict word familiarity.
Similarly, EEG signals are processed using wavelet decomposition, and four features are
computed from the beta and gamma frequency bands. Word familiarity is then predicted using
a Random Forest classifier. A decision fusion approach is also used to boost prediction
performance. The results show that the characteristics of brain waves at the time of
unknown-word perception or confusion can be detected. Facial expressions of the users are
also used for prediction. A video is recorded while the user reads the text. Image frames
are extracted from the video, and from each frame a set of 68 Cartesian coordinate points
is generated.
The sequential dataset is generated by taking the difference between the coordinate points
of adjacent frames. Word familiarity is then predicted using an LSTM classifier, and the
results are compared with those of an HMM classifier. A dictionary-based pop-up window has
been developed to show the meaning of a word when the user is found to be unfamiliar with
it. Datasets of 12-15 users were collected for the different models while the users read
25 words. An accuracy of 82% is obtained on the EEG dataset using the proposed classifier
combination approach, 72.9% with temporal analysis, and 80.26% on the facial expression
dataset using the LSTM classifier. Finally, a comparative study with other popular
classification techniques is also discussed.
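
As an illustration of the temporal feature described above, the following is a minimal
sketch of how a per-word gaze duration could be derived from timestamped gaze coordinates.
It is not the system's actual implementation: the function name dwell_time_per_word, the
(timestamp, x, y) sample format, and the rectangular word bounding boxes are all
assumptions made for this example, and no fixation-detection filtering is applied.

```python
from collections import defaultdict


def dwell_time_per_word(samples, word_boxes):
    """Sum the time the gaze spends inside each word's bounding box.

    samples: list of (timestamp_seconds, x, y), sorted by timestamp (assumed format).
    word_boxes: dict mapping word -> (x_min, y_min, x_max, y_max) (assumed layout data).
    """

    def word_at(x, y):
        # Return the first word whose box contains the gaze point, if any.
        for word, (x0, y0, x1, y1) in word_boxes.items():
            if x0 <= x <= x1 and y0 <= y <= y1:
                return word
        return None

    totals = defaultdict(float)
    # Credit each interval between consecutive samples to the word
    # under the earlier sample of the pair.
    for (t0, x, y), (t1, _, _) in zip(samples, samples[1:]):
        word = word_at(x, y)
        if word is not None:
            totals[word] += t1 - t0
    return dict(totals)


# Hypothetical screen layout and gaze trace, for illustration only.
boxes = {"ephemeral": (0, 0, 100, 20), "cat": (110, 0, 140, 20)}
gaze = [
    (0.0, 10, 5),
    (0.5, 50, 10),
    (1.25, 120, 8),
    (1.5, 300, 300),  # gaze is off every word here
    (2.0, 30, 10),
    (2.5, 30, 10),
]
durations = dwell_time_per_word(gaze, boxes)
print(durations)
```

In this sketch, the resulting per-word durations would be the single input feature passed
to a classifier; a longer dwell on a word is the signal the temporal model relies on.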