Quality assessment database for super-resolved images: QADS

Fei Zhou, Rongguo Yao, Bangwen Zhang, Qun Chen, Bozhi Liu, Guoping Qiu

Introduction

This website provides further details of the Quality Assessment Database for super-resolved images (SRIs), QADS, which is designed for evaluating the performance of image super-resolution algorithms.

Process of psychovisual experiments

A total of 100 subjects participated in the subjective evaluation to acquire reliable mean opinion scores (MOS). The subjects, who are mainly postgraduates from different disciplines, range in age from 18 to 38. All of them have normal or corrected-to-normal vision and were naive as to the purpose of the study. Some statistics about the participating subjects are shown in Fig.1. The subjects' disciplines varied, as shown in Fig.1a, and the gender ratio is relatively balanced, as shown in Fig.1b. In addition to postgraduates, some undergraduates and post-docs also participated in our psychovisual experiments, as shown in Fig.1c.

The physical environment of the experiment is shown in Fig.2. We turned the light on only to make the photograph in Fig.2 visible. During the subjective evaluation, the room had no background light, to prevent any interference.

QADS contains 20 source images and 980 SRIs, that is, 49 SRIs for each reference image. The subjective evaluation was conducted round by round, where a round means that a subject evaluates the 49 SRIs of a given reference image. Fig.3 shows the user interface designed specifically for the subjective evaluation of image quality.

At the top of the software interface, each subject enters his/her name and the number of the corresponding reference image, which ranges from 1 to 20, and then presses the ‘start’ button to begin. Four image windows are shown simultaneously in the interface. The top-left window shows one SRI, denoted by “SRI 1”, while the top-right window shows another SRI, denoted by “SRI 2”. The image at the bottom-left can be controlled by the subject during the evaluation. It shows the SRI from the top-left window when the subject presses the key “1” on the keyboard, switches to the SRI from the top-right window when the key “2” is pressed, and shows the reference image when the key “3” is pressed. Alternatively, the subject can click the buttons below the bottom-left image. During the subjective evaluation, subjects were instructed to click the button “>” if they felt the left image had higher quality than the right one, and the button “<” in the opposite case. We also allowed the subjects to click “=” if they felt the top two images had the same visual quality or could not perceive any difference between them.

Before the subjects began to evaluate the quality of the SRIs, we trained them to place their left ring finger on the number key “1”, and their middle finger and index finger on “2” and “3” respectively, while holding the mouse in their right hand, as shown in Fig.4. They were required to complete a round of training evaluation before the formal evaluation; the results of the training round were not recorded in the database. We reminded them to make each decision as quickly as possible during the subjective evaluation. All the subjects voluntarily chose the number of rounds to evaluate, from 1 to 20. That is, each subject completed at least 1 and at most 20 rounds. Nevertheless, each subject was limited to 3 rounds per day to avoid visual fatigue. Additionally, subjects had to rest for 30 minutes before starting the next round.

Generally, each subject spent between 25 and 30 minutes on a round of SRIs. It took us about two months to complete the psychovisual experiments. The raw data obtained from the subjects' evaluations were normalized using Eq.(1), and the MOS was calculated as the average of the subjects' scores after the removal of outliers.

Data post-processing

In Eq.(1), \(S_{ijl}\) is the normalized score for the \(j\)-th SRI (\(1 \le j \le 49\)) of the \(i\)-th reference image (\(1 \le i \le 20\)) rated by subject \(l\), \(S_{Rijl}\) is the raw score from the round evaluated by subject \(l\), and \(MOS_{max}\) is the maximum MOS in the round evaluated by subject \(l\).

The steps for outlier detection are as follows:

1. Compute the mean \(\overline{S}_{ij}\) and the standard deviation \(\overline{ST}_{ij}\) of the normalized scores \(S_{ijl}\) for each SRI over the subjects who rated it, where \(L\) is the number of subjects.

2. If \(S_{ijl}\) lies outside the interval \([\overline{S}_{ij} - 2\,\overline{ST}_{ij},\ \overline{S}_{ij} + 2\,\overline{ST}_{ij}]\), then \(S_{ijl}\) is regarded as an outlier.

Given a round of subjective evaluation, the round is rejected if its ratio of outliers exceeds 5%.
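The outlier steps above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the code used to build QADS: the function names, the dictionary layout (one list of normalized scores per subject, for a single reference image), and the use of the population standard deviation (dividing by \(L\)) are all assumptions made here. Computing the MOS only from the rounds that were not rejected is likewise one possible reading of "after the removal of outliers".

```python
from statistics import fmean, pstdev


def rejected_rounds(rounds, k=2.0, max_ratio=0.05):
    """Return the subjects whose round is rejected for one reference image.

    rounds: {subject_id: [normalized score of each SRI]} (hypothetical layout).
    A score is an outlier if it lies outside mean +/- k * std for its SRI;
    a round is rejected if more than max_ratio of its scores are outliers.
    """
    subjects = list(rounds)
    n_sri = len(rounds[subjects[0]])
    outlier_count = {subject: 0 for subject in subjects}
    for j in range(n_sri):
        column = [rounds[subject][j] for subject in subjects]
        mean = fmean(column)
        std = pstdev(column)  # assumption: population std (divide by L)
        for subject in subjects:
            if abs(rounds[subject][j] - mean) > k * std:
                outlier_count[subject] += 1
    return {s for s in subjects if outlier_count[s] / n_sri > max_ratio}


def mos_per_sri(rounds, rejected):
    """Average the scores of each SRI over the rounds that were kept."""
    kept = [scores for subject, scores in rounds.items() if subject not in rejected]
    return [fmean(column) for column in zip(*kept)]


# Toy data: nine consistent subjects and one who rates everything at 1.0.
toy_rounds = {f"subject{n}": [0.5] * 20 for n in range(9)}
toy_rounds["subject9"] = [1.0] * 20
toy_rejected = rejected_rounds(toy_rounds)
toy_mos = mos_per_sri(toy_rounds, toy_rejected)
```

In the toy data every score of the tenth subject falls outside the two-standard-deviation interval, so that round is rejected and the remaining nine rounds determine the MOS.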

Analysis of texture descriptors

The core of our framework is to use different strategies for the structural and textural components. Although SIFT features are employed for the textural similarity in this work, we do not wish to exclude other texture descriptors. We believe that any descriptor that satisfies the following two requirements can be adopted in our framework. First, it should describe the distribution of textures rather than the feature at a given pixel. Our motivation is that similar textural distributions result in similar visual perception, so a histogram-like descriptor is preferable. That is why we do not directly use Gabor or phase congruency features, and why the LBP operator is also not the first choice. Second, it should be able to describe the distribution of messy textures. In the textural component, the intensities are generally very weak and messy. Therefore, when an LBP histogram is computed within a cell, many non-uniform patterns appear, yet all the non-uniform patterns are summed into a single bin of the LBP histogram. Thus, we prefer the SIFT-based histogram feature to the LBP histogram in this work. We performed a simple experiment to see the impact of the texture descriptor. The final results using different texture descriptors are provided below.
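To illustrate the point about non-uniform patterns, the following sketch counts them for the common 8-neighbor "uniform" LBP variant, in which a code is uniform if its circular bit string has at most two 0/1 transitions. The 8-neighbor setting and all names below are assumptions for illustration; this page does not specify the LBP configuration used in the experiment.

```python
def transitions(code, bits=8):
    """Number of 0/1 changes in the circular bit string of an LBP code."""
    rotated = (code >> 1) | ((code & 1) << (bits - 1))
    return bin(code ^ rotated).count("1")


# Uniform codes have at most two transitions; each one gets its own bin.
UNIFORM_CODES = [code for code in range(256) if transitions(code) <= 2]
BIN_OF = {code: index for index, code in enumerate(UNIFORM_CODES)}
NON_UNIFORM_BIN = len(UNIFORM_CODES)  # one shared bin for all other codes


def uniform_lbp_histogram(codes):
    """Histogram of LBP codes with all non-uniform codes merged into one bin."""
    hist = [0] * (NON_UNIFORM_BIN + 1)
    for code in codes:
        hist[BIN_OF.get(code, NON_UNIFORM_BIN)] += 1
    return hist


# Two flat patterns (uniform) and two noisy patterns (non-uniform).
example_hist = uniform_lbp_histogram(
    [0b00000000, 0b11111111, 0b01010101, 0b00110101]
)
```

Under this variant, 58 of the 256 possible codes are uniform, so the remaining 198 codes all fall into the last bin. For weak, noisy textures, where such codes are frequent, most of the distribution therefore collapses into a single histogram entry.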

\[
\begin{array}{|c|c|c|c|c|c|}
\hline
\text{Criteria} & \text{LBP operator} & \text{Gabor} & \text{Phase congruency} & \text{LBP histogram}& \text{SIFT}\\
\hline
\text{SROCC} & 0.8333 & 0.8560 &0.5257 &0.8534 &0.9232 \\
\hline
\text{KROCC} & 0.6256 & 0.6717 &0.3696 &0.6581 &0.7541 \\
\hline
\text{PLCC} & 0.8393 & 0.8615 &0.5454 &0.8561 &0.9230 \\
\hline
\text{RMSE} & 0.1571 & 0.1395 &0.2308 &0.1420 &0.1057 \\
\hline
\end{array}
\]
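For readers unfamiliar with the four criteria in the table, the sketch below shows how they are commonly computed between predicted quality scores and MOS values. It is a plain-Python illustration with hypothetical function names and toy numbers, not the evaluation code behind the table. IQA studies often fit a nonlinear mapping to the predictions before computing PLCC and RMSE; this page does not say whether one was used, and the sketch omits that step. The Kendall coefficient here is the simple tau-a form, which assumes no tied scores.

```python
from math import sqrt
from statistics import fmean


def plcc(x, y):
    """Pearson linear correlation coefficient."""
    mx, my = fmean(x), fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        for k in range(start, end + 1):
            result[order[k]] = (start + end) / 2 + 1
        start = end + 1
    return result


def srocc(x, y):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    return plcc(ranks(x), ranks(y))


def krocc(x, y):
    """Kendall rank-order correlation (tau-a, assumes no ties)."""
    n = len(x)
    total = 0
    for i in range(n):
        for j in range(i + 1, n):
            product = (x[i] - x[j]) * (y[i] - y[j])
            total += (product > 0) - (product < 0)
    return total / (n * (n - 1) / 2)


def rmse(x, y):
    """Root-mean-square error between predictions and MOS."""
    return sqrt(fmean([(a - b) ** 2 for a, b in zip(x, y)]))


# Toy numbers (not taken from QADS).
predicted = [0.1, 0.4, 0.35, 0.8]
mos = [0.2, 0.5, 0.3, 0.9]
```

Higher SROCC, KROCC, and PLCC indicate better agreement with the subjective scores, while a lower RMSE is better.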

Download

QADS.zip

SR_IQA_Code.zip

STD mat files: refImg-t, refImg-s, srImg-t, srImg-s

Please cite the following paper if you use the QADS database in your research:

Fei Zhou, Rongguo Yao, Bozhi Liu, and Guoping Qiu. “Visual quality assessment for super-resolved images: Database and method”, to be published in IEEE Transactions on Image Processing. DOI: 10.1109/TIP.2019.2898638.

Copyright

All rights of the QADS Database are reserved. The database is available only for academic research and noncommercial purposes. Any commercial use of this database is strictly prohibited.

Contact information

Fei Zhou, Assistant Professor

The College of Information Engineering, Shenzhen University.

E-mail: flying.zhou@163.com