PSYCHOVISUAL BASED IMAGE SEGMENTATION

Sharma, Ashu

Full metadata record

DC Field	Value	Language
dc.contributor.author	Sharma, Ashu	-
dc.date.accessioned	2026-03-09T07:17:46Z	-
dc.date.available	2026-03-09T07:17:46Z	-
dc.date.issued	2020-03	-
dc.identifier.uri	http://localhost:8081/jspui/handle/123456789/19436	-
dc.guide	Ghosh, Jayanta Kumar	en_US
dc.description.abstract	The astounding visual capabilities of humans serve as a standard for evaluating image analysis approaches. However, this standard is applied only in a limited way. To use human visual capabilities as a useful standard, these capabilities must first be comprehensively characterized. Psycho-vision, the study of the psychology of vision, is a key characteristic of the human visual system (HVS). Studying psycho-visual aspects is therefore a sine qua non for progress in image analysis tasks. In this regard, considerable advancement has been made in the field of psycho-visual image analysis. Segmentation, an initial step of image analysis, has also been attempted with psycho-visual methods, which have produced far better results than conventional methods. However, segmentation of complex images without a predefined target remains overlooked and merits a great deal of scholarly attention. This study therefore explores the psycho-visual segmentation of images when target objects are not known prior to analysis. Existing methods for such segmentation usually focus on boundary delineation of salient or target objects. However, saliency is not a well-defined term, so the salient objects in complex images are not unique. Generally, when the target object is not defined in a complex image, the boundaries of all possible objects are taken as a reference. If multiple people prepare such reference images, the extent of the segments varies from person to person. On the other hand, segmentation done by a single person may be consistent but does not account for the variation in human perception. This remains true even when that person is an expert in image interpretation. In either case, the issues of over- and under-segmentation and ambiguity remain unresolved. 
Therefore, the existing approaches are inadequate for assessing complex image segmentation in the absence of target objects. For these reasons, human variability in such cases needs to be brought to a single validated level. The objective of this research is therefore to study psycho-visual segmentation of complex images without a known target object. The HVS uses attentional search to locate the perceptually relevant areas of a visual scene through eye movements. The relevance of an area is decided by the perceptual process and the intended aim. Thus, gaze data are strongly correlated with the human perceptual process. Correspondingly, verbalizing thoughts during an attentional search can directly reflect the spontaneous mental process. Therefore, a correlation between the verbal (psycho) and gaze (visual) data can be expected. With this concept, the research question and, accordingly, the methodology have been framed. The research question asks ‘how to draw segments in complex images for which the quantified verbal data significantly correlate with gaze data in a condition where target objects are not predefined.’ To answer this, the methodology assesses segments drawn on the basis of theories of human perception. Each time, the verbal-gaze data correlation has been analysed, and finally the segments with the most significant correlation have been considered the psycho-visual segments. The methodology has been implemented on natural-colour high-spatial-resolution satellite (HSRS) images of urban areas, which are visually complex in nature. The verbal and gaze data have been recorded through the concurrent think-aloud (CTA) protocol and eye-tracking technology, respectively. The gaze data have been quantified using four different metrics, viz. First Fixation (FF), Fixation Duration (FD), % share of FD (FD%), and Fixation Count (FC). 
Thus, for the four metrics FF, FD, FD%, and FC, four correlation coefficients R1, R2, R3, and R4 have been obtained, respectively. The qualitative and quantitative correlation analysis starts with the existing notion of full segmentation and proceeds to the proposed premise of segmentation. For the existing full segmentation, the four correlation coefficients have been obtained as 0.53, 0.60, 0.53, and 0.55. The best CTA-gaze data correlation has been achieved for segments formed by affordance-based grouping of objects, with preference given to perceptual grouping. For this proposed premise, correlation coefficients of 0.64, 0.72, 0.76, and 0.69 have been obtained. The qualitative and quantitative comparison with the existing full segmentation has shown the superiority of the proposed premise. Finally, the validation process ensures the credibility of the proposed premise. Thus, the study concludes that affordance-based grouping of objects, with preference given to perceptual grouping, represents the psycho-visual segmentation of complex HSRS images without a predefined target more closely than full segmentation does. Because the data are collected directly from humans, the output segmented images serve as a reference for assessing psycho-visual segmentation methods. The methodology adopted here also serves as a guideline for other eye-tracking-based studies of complex images. Development of a mathematical model of perception is a promising direction for future work.	en_US
dc.language.iso	en	en_US
dc.publisher	IIT Roorkee	en_US
dc.title	PSYCHOVISUAL BASED IMAGE SEGMENTATION	en_US
dc.type	Thesis	en_US
Appears in Collections:	DOCTORAL THESES (Civil Engg)

Files in This Item:

File	Description	Size	Format
ASHU SHARMA 13914002.pdf		5.7 MB	Adobe PDF