HOUGH TRANSFORM GENERATED STRONG IMAGE HASHING SCHEME FOR COPY DETECTION

The rapid development of image editing software has resulted in widespread unauthorized duplication of original images. This has created a need for robust image hashing techniques that can reliably identify duplicate copies of an original image while distinguishing them from different images. In this paper, we propose an image hashing technique based on the discrete wavelet transform and the Hough transform, which is robust to a large number of image processing attacks, including shifting and shearing. The input image is first pre-processed to remove minor effects. The discrete wavelet transform is then applied to the pre-processed image to produce wavelet coefficients, from which edges are detected using a Canny edge detector. Finally, the Hough transform is applied to the edge-detected image to generate an image hash, which is used for image identification. Experiments show that the proposed hashing technique has better robustness and discrimination performance than state-of-the-art techniques. The normalized average mean value difference is also calculated to show the performance of the proposed technique towards various image processing attacks. The proposed copy detection scheme can perform copy detection over large databases and can be considered a prototype for developing an online real-time copy detection system.
Received: 7 April 2018 Accepted: 30 August 2018 Published: 1 October 2018
Journal of ICT, 17, No. 4 (October) 2018, pp: 653–678


INTRODUCTION
The use of digital media in day-to-day life is increasing with the adoption of a large number of smart devices such as smartphones. These devices make many application-oriented tasks easy, including capturing and editing images. Editing software does not require any special technical expertise, so images can be manipulated easily (Liu, Wang, Lian, & Wang, 2011). The massive creation and widespread dispersion of data, arising from its easy-to-copy nature, poses new challenges for the protection of intellectual property in multimedia data (Kang & Wei, 2009). Protecting the copyright of an image is a matter of great concern (Qazi, Hayat, Khan, Madani, Khan, Kolodziej, Lin & Wu, 2013). To ensure that a given image is original and not a modified version, image authentication techniques are required (Battiato, Farinella, Messina, & Puglisi, 2012). Traditionally, authentication issues are addressed by cryptographic hashes, which are sensitive to each bit of the input message. As a result, a change in even a single bit of the input data leads to a significant change in the hash value (Qureshi & Deriche, 2015). However, because of this high sensitivity to the input data, such hash functions are not suitable for image authentication.
In this context, we need to explore the area of image forensics (Redi, Tatak, & Dugelay, 2011), which involves a combination of techniques used not only to verify the authenticity of an image but also to verify ownership and detect unauthorized copies. Currently, two approaches, watermarking and content-based copy detection, are used to detect unauthorized copies. In watermarking, an authenticator is generated and added to the media content, and is later used to verify the authenticity of the original content (Rey & Dugelay, 2002). Content-based copy detection (CBCD) is an alternative to digital watermarking in which the multimedia content itself is used to establish its ownership (Hsiao, Chen, Chien, & Chen, 2007). In image-based copy detection, unique features that can be used for identification are extracted from an image.
Over the last few years, a number of significant works have been proposed in the area of image hashing, which is an extension of content-based copy detection techniques (Tang, Yang, Huang, & Zhang, 2014b). In image hashing, the generated unique feature is represented as small, preferably bit-level, data to form the image hash, which is used for image identification (Tang, Zhang, Dai, Yang, & Wu, 2013a). Ideally, image hashing should be able to discriminate between similar and dissimilar images, i.e. the mechanism should exhibit both robustness and discrimination. In addition, it should be robust to various kinds of image processing attacks while fulfilling the properties required by specific applications.

LITERATURE REVIEW
In the past, researchers have implemented many algorithms related to various aspects of image hashing. Some of the notable algorithms, categorized on the basis of the transformation or functionality used, are as follows.
Tang, Yang, Huang, and Zhang (2014b) proposed image hashing based on dominant discrete cosine transform (DCT) coefficients, which have been proven to perform well in classification and in detecting image copies. Tang, Wang, and Zhang (2010) used a mechanism based on a dictionary that represents the characteristics of various image blocks. The proposed mechanism showed a low collision probability. DCT-based techniques fail against geometric transformations (Al-Qershi & Kho, 2013).
Lei, Wang, and Huang (2011) proposed a novel robust hashing method based on the Radon transform and the discrete Fourier transform (DFT). The algorithm performs well in detecting copies with a small hash size. Wu, Zhou, and Niu (2009) proposed a hashing algorithm based on the Radon and wavelet transforms, which can identify content changes. Unfortunately, the Radon transform is not resistant to all geometric transformations, such as shifting and shearing (Wu, Zhou, & Niu, 2009).
Ahmed, Siyal, and Abbas (2010) proposed a robust hash-based scheme in which the pixels of each block are randomly modulated to produce the hash. Such algorithms have proven to exhibit good time-frequency localization property. Karsh, Laskar, and Aditi (2017) proposed an image hashing scheme in which a four-level 2D-DWT is applied along with SVD to produce the image hash.
Chen and Hsieh (2015) proposed an algorithm in which 128-dimensional SIFT features are extracted from a normalized image. The proposed scheme significantly reduces the retrieval time with a minor loss of accuracy. Ling, Yan, Zou, Liu, and Feng (2013) proposed a fast image copy detection approach based on visual words defined by local fingerprints. The mechanism outperforms similar state-of-the-art methods in terms of precision and efficiency. Lv and Wang (2012) proposed a technique similar to Ling et al. (2013), wherein the Harris detector is used to select the most stable key-points, which are less vulnerable to image processing attacks after the application of SIFT.
Tang, Dai, Zhang, Huang, and Yang (2014) proposed a block-based robust image hashing based on color-vector angles and the discrete wavelet transform (DWT). The proposed mechanism is robust to normal digital operations, including rotation up to 5°. Tang, Huang, Dai, and Yang (2012b) proposed the use of multiple histograms, in which the normalized image is divided into rings of equal area and ring-based histogram features are then extracted. The proposed mechanism is claimed to be resilient against rotation by any arbitrary angle. Tang, Zhang, Huang, and Dai (2013) proposed another hashing scheme on the basis of ring-based entropies, which outperforms similar techniques in terms of time complexity. Tang, Zhang, Li and Zhang (2016) proposed a robust image hashing method based on ring partition and four statistical features, i.e. mean, variance, skewness and kurtosis.
Tang, Zhang, and Zhang (2014) proposed image hashing based on ring partition and nonnegative matrix factorization (NMF). Here, NMF is applied to the secondary image produced on the basis of ring partition. The algorithm shows good robustness against rotation and has good discriminative capability. Tang, Wang, Zhang, Wei, and Su (2008) proposed a robust hashing scheme in which NMF is applied to produce a coefficient matrix, which is coarsely quantized to produce the final hash. The algorithm exhibits a low collision probability. Karsh, Laskar, and Richhariya (2016) proposed image hashing on the basis of ring-based projected gradient non-negative matrix factorization (PG-NMF) features and local features. The PG-NMF features are combined with salient region-based features to produce the final hash. The method is robust to content-preserving operations and is capable of localizing the counterfeit area.
Tang, Dai, and Zhang (2012a) proposed perceptual hashing for color images using seven invariant moments. These moments are invariant to translation, scaling and rotation and have been widely used in image classification and image matching.
Although many hashing algorithms have been reported, there are still some practical problems in hashing design. More effort is needed to develop high-performance algorithms with a desirable balance between robustness and discrimination, particularly with respect to shifting and shearing attacks. A few algorithms have claimed varying degrees of success against shearing (Lei, Wang, & Huang, 2011; Zou et al., 2013; Lv & Wang, 2012). For shifting, however, only one work has reported results (Wu, Zhou, & Niu, 2009).
In this work, we propose an image hashing scheme based on the Hough transform and DWT, which is robust to shifting and shearing while giving comparable performance against other image processing attacks. The key advantage of the Hough transform is that it is tolerant of gaps in the edges and relatively unaffected by noise and occlusion in the image. DWT can be used to convert a signal to a short, approximation-based representation. As shifting and shearing attacks change the orientation of the image while keeping the rest of the image content unchanged, the Hough transform is applied to obtain a unique edge-based feature for better identification. Many experiments have been conducted to validate the efficacy of our technique. Receiver operating characteristics (ROC) curve comparisons with some representative hashing algorithms are also made, and the results indicate that the proposed hashing outperforms the compared algorithms in terms of classification performance. The rest of the paper is arranged as follows: the next section describes the proposed image hashing, followed by a section that gives the experimental results.

PROPOSED IMAGE HASHING
In this section, we analyze the basic properties of the DWT, followed by a brief description of the Canny edge detector. The Hough transform, which is used to generate the image hash on the basis of Canny edge detection, is then explained in detail. Finally, the proposed approach, which is based on the features given above, is presented.

Discrete Wavelet Transformation
In image processing, the 2D wavelet transform is of great importance. The transformation is first applied along the rows of the image and then along the columns. This process generates four sub-band regions LL, LH, HL and HH, where LL represents the blur (approximation) and LH, HL and HH represent the horizontal, vertical and diagonal differences respectively (Lu & Hsu, 2005; Thanki, Dwivedi, & Borisagar, 2017). DWT decomposes a signal into a set of mutually orthogonal wavelet basis functions, and it is invertible, which means that the original signal can be completely recovered from its DWT representation. The main advantage of using the wavelet transformation is its efficiency in converting a signal to a short representation (Tang, Dai, Zhang, Huang, & Yang, 2014a).
Let $A$ be an $N \times N$ matrix, $W_N$ a wavelet transformation matrix and $W_N^T$ the transpose of $W_N$. The product $A W_N^T$ processes the rows of $A$ into weighted averages and differences. Similarly, the product $W_N A$ transforms the columns of $A$ into weighted averages and differences. Thus, the two-dimensional DWT can be represented as $W_N A W_N^T$. In our implementation, we have used the Daubechies wavelet transform, where a four-term orthogonal filter is constructed using the low-pass filter $h = (h_0, h_1, h_2, h_3)$ and the high-pass filter $g = (g_0, g_1, g_2, g_3)$. Such a wavelet transform, built from the given $h$ and $g$ and applied to vectors of length $N = 8$, can be written in block format as follows:
$$W_8 = \begin{bmatrix} H \\ G \end{bmatrix} \qquad (1)$$
where $H$ and $G$ are the $4 \times 8$ blocks built from $h$ and $g$ respectively. Next, we compute $W_8 W_8^T$; if $W_8$ is orthogonal, this gives:
$$W_8 W_8^T = \begin{bmatrix} H H^T & H G^T \\ G H^T & G G^T \end{bmatrix} = \begin{bmatrix} I_4 & 0_4 \\ 0_4 & I_4 \end{bmatrix} \qquad (2)$$
where $I_4$ is the $4 \times 4$ identity matrix and $0_4$ is the $4 \times 4$ zero matrix. After computation we get the following value for $I_4$:
$$H H^T = \begin{bmatrix} a & b & 0 & b \\ b & a & b & 0 \\ 0 & b & a & b \\ b & 0 & b & a \end{bmatrix} = I_4 \qquad (3)$$
where $a = h_0^2 + h_1^2 + h_2^2 + h_3^2$ and $b = h_0 h_2 + h_1 h_3$. Orthogonality therefore requires $a = 1$ and $b = 0$. In this way, nonlinear equations in the filter coefficients are generated and are used to produce the Daubechies filter components.
The filter components generated above are used to produce the different sub-bands of the input image. In our proposed algorithm, we use only the approximation values (LL) of the transformed image for the further steps of hash generation.
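The orthogonality conditions on the filter can be checked numerically. The following is a minimal sketch, assuming the standard closed-form Daubechies four-term coefficients and one common convention for deriving g from h (the alternating flip); the row layout produced by the hypothetical helper `filter_block` is an illustrative choice and may differ from the ordering used in the original matrix.

```python
import numpy as np

# Standard closed-form Daubechies four-term low-pass filter (assumed here).
s3, s2 = np.sqrt(3.0), np.sqrt(2.0)
h = np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / (4 * s2)

# High-pass filter via the alternating-flip rule g_k = (-1)^k h_{3-k}
# (one common convention; sign and ordering conventions vary between texts).
g = np.array([h[3], -h[2], h[1], -h[0]])


def filter_block(f, n=8):
    """Stack n/2 copies of filter f, each circularly shifted by two places."""
    base = np.zeros(n)
    base[:4] = f
    return np.array([np.roll(base, 2 * i) for i in range(n // 2)])


H, G = filter_block(h), filter_block(g)
W8 = np.vstack([H, G])  # block form: H on top of G

a = np.sum(h ** 2)                 # h0^2 + h1^2 + h2^2 + h3^2
b = h[0] * h[2] + h[1] * h[3]      # h0*h2 + h1*h3
is_orthogonal = np.allclose(W8 @ W8.T, np.eye(8))
print(a, b, is_orthogonal)
```

With these coefficients, a evaluates to 1 and b to 0 up to floating-point error, so the product of the 8 x 8 matrix with its transpose is the identity.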

Hough Transform
Hough transform is used to identify specific shapes in an image. It converts all the points on a curve to a single location in another parameter space by coordinate transformation (Fig. 1). Hough transform is applied to the image obtained from an edge detection algorithm such as Canny edge detection, which returns a binary image containing 1's where it finds edges in the input image and 0's elsewhere (Shih, 2010). The Hough transform used to detect straight lines relies on the following parametric representation of the line (Aminuddin et al., 2017):
$$r = x \cos\theta + y \sin\theta$$
Here r is the distance from the origin to the line, measured along a vector perpendicular to the line, and theta is the angle between the x-axis and this perpendicular vector (Shih, 2010). The result of the Hough transform is a parameter space matrix whose rows and columns correspond to the values of r and theta, respectively. For every point of interest in the image, r is calculated for every theta and rounded off to the nearest allowed value, and the corresponding accumulator cell is incremented by one. At the end of this procedure, any value T in the matrix means that T points in the XY plane lie on the line specified by distance r and angle theta. Peak values in the matrix represent the potential lines in the input image. The algorithm for Hough transform can be given as follows:
1. Identify the maximum and minimum values of r and theta.
2. Subdivide the parametric space into accumulator cells.
3. Initialize the accumulator cells to be all zeros.
4. For all edge points (x, y) in the image:
   a. Use gradient direction for theta.
   b. Compute r from the equation.
   c. Increment A(r, theta) by one.
5. In the end, any value Q in A(r, theta) means Q points in the XY plane lie on the line specified by angle theta and r.
6. Peak values in the accumulator matrix A(r, theta) represent potential lines in the input image.
Hough transform maps each of the points in the input image into a sinusoid in the parameter space. As noted above, the Hough transform is tolerant of gaps in the edges and is therefore relatively unaffected by noise in the image.
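The voting procedure can be sketched in a few lines. The following is an illustrative sketch rather than the implementation used in the experiments: it follows the "every theta" variant described in the text (not the gradient-direction shortcut of step 4a), and the function name `hough_accumulator`, the 1-degree angle step and the unit r bins are assumptions.

```python
import numpy as np


def hough_accumulator(edges, theta_deg=None):
    """Vote every edge pixel into an (r, theta) accumulator."""
    edges = np.asarray(edges, dtype=bool)
    if theta_deg is None:
        theta_deg = np.arange(-90, 90)  # assumed 1-degree steps
    thetas = np.deg2rad(theta_deg)
    rows, cols = edges.shape
    # Largest possible |r| for any pixel; used to shift negative r to >= 0.
    r_max = int(np.ceil(np.hypot(rows - 1, cols - 1)))
    acc = np.zeros((2 * r_max + 1, len(thetas)), dtype=np.int64)
    ys, xs = np.nonzero(edges)  # row index is y, column index is x
    for j, theta in enumerate(thetas):
        r = xs * np.cos(theta) + ys * np.sin(theta)
        bins = np.round(r).astype(int) + r_max  # nearest r bin
        acc[:, j] = np.bincount(bins, minlength=acc.shape[0])
    return acc, np.arange(-r_max, r_max + 1), theta_deg


# Toy edge map: 20 collinear points on the horizontal line y = 5.
edges = np.zeros((40, 40), dtype=bool)
edges[5, :20] = True
acc, r_values, theta_values = hough_accumulator(edges)
```

In this toy case all 20 points vote for the cell at theta = -90 degrees and r = -5, which is the parametric form of the line y = 5 under the representation above; no cell can exceed 20 votes because each point votes once per angle.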

Implementation Approach
The hash generation algorithm consists of the steps of preprocessing, transformation and hash generation. The process for feature extraction is shown in Fig. 2.

Preprocessing
The first step is normalization, in which the input image is normalized by image resizing and color space conversion. Image resizing is used to resize the original image to a standard size of 512×512. The resized image is then converted to a grayscale image for further processing and hash generation.

Transformation and Hash Generation
In the next step, the processed image is filtered through a 2D DWT using the Daubechies wavelet filter. The wavelet transform generates four sub-bands, of which we use the approximation coefficients, of size 256×256, for further processing. Canny edge detection is then applied to the approximation matrix to produce a binary image (BW). Note that BW is a logical matrix of size 256×256. The Hough transform is then applied to BW to produce a matrix of size 1445×360, where the rows correspond to the distance bins and the columns correspond to the angle theta. The row-wise mean is calculated to produce a column vector of size 1445×1. This integer column vector is used as the image hash for image identification.
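The reported accumulator size can be reproduced with simple arithmetic. The sketch below assumes a MATLAB-style accumulator layout with a bin spacing of 0.5 for both distance and angle; this spacing is an inference that is consistent with the reported 1445×360 size, not a setting stated in the paper. The random accumulator is only a placeholder used to show the row-wise mean step.

```python
import math
import numpy as np


def accumulator_shape(rows, cols, rho_step, theta_step):
    """Number of distance and angle bins for a rows x cols edge image."""
    diag = math.hypot(rows - 1, cols - 1)          # longest possible distance
    n_rho = 2 * math.ceil(diag / rho_step) + 1     # symmetric about r = 0
    n_theta = len(np.arange(-90.0, 90.0, theta_step))
    return n_rho, n_theta


# Assumed bin spacing of 0.5 in both dimensions for a 256 x 256 edge map.
n_rho, n_theta = accumulator_shape(256, 256, 0.5, 0.5)

# Placeholder accumulator standing in for the real Hough output.
rng = np.random.default_rng(0)
acc = rng.integers(0, 10, size=(n_rho, n_theta))

# Row-wise mean collapses the angle axis: one value per distance bin.
image_hash = acc.mean(axis=1)
print(n_rho, n_theta, image_hash.shape)
```

Averaging over the angle axis keeps one value per distance bin, which is what reduces the 1445×360 matrix to a hash of length 1445.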

Similarity Metric
To measure the similarity between a pair of hashes, the L1 norm is used, which is one of the standard methods for measuring the hash distance (Lei, Wang, & Huang, 2011). Let $h_1$ and $h_2$ be two image hashes of length $n$; then the hash distance can be calculated as follows:
$$HD = \lVert h_1 - h_2 \rVert_1 = \sum_{i=1}^{n} \lvert h_1(i) - h_2(i) \rvert \qquad (9)$$
If the hash distance (HD) is less than a predefined threshold T, the images are considered to be visually identical. Otherwise, they are classified as different images.
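As a small illustration of the decision rule, the following sketch computes a plain L1 distance between two hash vectors and compares it with a threshold. The function names and the toy hash values are hypothetical, and any scaling of the hash applied in the actual experiments is not modelled here.

```python
import numpy as np


def hash_distance(h1, h2):
    """Plain L1 distance between two hashes of equal length."""
    h1 = np.asarray(h1, dtype=float)
    h2 = np.asarray(h2, dtype=float)
    if h1.shape != h2.shape:
        raise ValueError("hashes must have the same length")
    return float(np.sum(np.abs(h1 - h2)))


def is_copy(h1, h2, threshold):
    """Classify a pair as visually identical when the distance is below T."""
    return hash_distance(h1, h2) < threshold


# Toy hashes (hypothetical values, much shorter than a real 1445-entry hash).
d = hash_distance([0.2, 0.5], [0.3, 0.9])
print(d, is_copy([0.2, 0.5], [0.3, 0.9], threshold=0.78))
```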

EXPERIMENTAL RESULTS
To demonstrate the efficacy of the proposed mechanism, we conduct a series of experiments to verify the proposed approach's accuracy, efficiency and sensitivity against a number of image processing attacks.

Robustness Analysis
The proposed technique is applied to test images from the USC-SIPI image database (USC-SIPI, 2007). A sample of some of the standard images is shown in Fig. 3. Each original image is used to create 88 modified versions by applying a number of image processing operations such as rescaling, brightness adjustment, contrast adjustment, gamma correction, Gaussian low-pass filtering and rotation, as these are used in most research papers in the area of image hashing (Tang et al., 2014b; Tang et al., 2013a; Tang et al., 2008; Tang et al., 2014c). The modified versions were created using MATLAB with the attack parameters shown in Table 1. For example, we take an input image such as 'Airplane' and create its brightness-adjusted copies by changing its intensity values as given in Table 1. Duplicate copies of 'Airplane' for the other attacks are created in the same way, using the attacks given in Table 1. This process is repeated for every original image used in the experiment. After generating the "duplicates", hashes are extracted from all the images, including the originals, and the hash distance is calculated between each original image and its duplicate copies. The hash distance in this scenario therefore represents the distance between the hash of an original image and the hashes of its attacked copies. The hash distance values, categorized by attack for all the considered images, are given in Table 2.
Table 2 presents the maximum, minimum and mean of the hash distance under different attacks. It is observed that all the mean values are less than 0.7, while the maximum distance, taking into account all attacks, is less than 2.6. It is therefore justifiable to choose a distance value of 0.78 as the threshold at which the proposed technique is resistant to most image processing operations. In this experiment, we used 10 original images and created 88 copies of each to produce 880 copies. Out of these 880 duplicate copies, our system correctly identifies 850 as copies, i.e. 96.59% of visually similar images are correctly identified as copies of the original images. Ideally, we are looking for the threshold value at which the percentage of visually similar images identified as copies is high and the percentage of different images identified as similar is low. It is inferred that the threshold value of 0.78 gives good experimental results.

Discrimination Analysis
To demonstrate discriminability, 36 different images of sizes ranging from 225×225 to 2144×1424 are collected from the USC-SIPI database (USC-SIPI, 2007). The hash distance is calculated between each pair of the 36 images to generate 630 different hash distances. The distribution of these hash distances is shown in Fig. 4. The maximum, minimum, mean and standard deviation calculated on the basis of the 630 hash distances are 5.93, 0.417, 1.8 and 0.99 respectively. If the threshold is 0.78, then about 4.44% of different images are falsely identified as similar images, because 28 of the 630 calculated hash distances have values less than the threshold of 0.78. In general, a small threshold improves discrimination but simultaneously decreases robustness. The threshold must therefore be chosen depending on the requirements of the specific application.
The mean value of discrimination is 1.8, which is more than four times larger than the highest mean of robustness, except for Gaussian noise and median filter. For Gaussian noise and median filter, the mean of discrimination is almost three times larger than the highest mean of robustness. Also, the maximum value of discrimination is 5.93, which is more than six times larger than the maximum value of robustness, except for Gaussian noise and salt & pepper noise. For Gaussian noise and salt & pepper noise, this value is more than twice the maximum value of robustness. Ideally, a copy detection technique should exhibit very low values corresponding to robustness and high values corresponding to discrimination. This would imply that the technique is capable of correctly identifying duplicated copies while at the same time rejecting different images. In view of this definition of robustness and discrimination, the proposed hashing exhibits promising results, as evidenced by the graph shown above.
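The counts quoted in the robustness and discrimination analyses follow from simple arithmetic, sketched below using only the figures reported in the text (36 images, 28 distances below the threshold, 850 of 880 copies detected). The variable names are illustrative.

```python
from math import comb

# Number of unordered image pairs among 36 distinct images.
n_pairs = comb(36, 2)

# Share of different-image pairs whose distance falls below the threshold.
false_positive_pct = 100 * 28 / n_pairs

# Share of attacked copies correctly identified in the robustness test.
detected_pct = 100 * 850 / 880

print(n_pairs, false_positive_pct, detected_pct)
```

This confirms the 630 pairwise distances; the false identification share works out to roughly 4.4% and the detection share to roughly 96.6%.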

Normalized Average Mean Value Difference
The observations based on the hash distance presented in the previous section were categorized on the basis of different attacks. In this section, the analysis is performed image-wise, and the maximum, minimum, mean and standard deviation of the hash distance are calculated by considering all but one of the attacks. Since 16 different kinds of attacks have been considered, such an analysis results in 16 different maximum, minimum, mean and standard deviation values. A further set of values is obtained when all the attacks are considered together. The difference is then taken between the mean value obtained by considering all the attacks and the mean value obtained by considering all but one of the attacks. Finally, the difference values are averaged in order to reach a conclusion. One of the important reasons for such an analysis is to examine the effect of different attacks on the proposed approach. However, for the sake of brevity and without sacrificing understandability, only the mean values of the proposed technique are included in the calculation in this article.
Each acronym used in Table 3 indicates the mean values obtained by considering all the attacks except the attack represented by that acronym. For instance, column 3 gives the mean values obtained for the image by considering all attacks except brightness adjustment (BA). The other acronyms are: contrast adjustment (CA), cropping (Crop), gamma correction (GC), Gaussian low-pass filtering (GLPF), Gaussian noise (GN), JPEG compression (JPEG), median filter (MF), rescaling (RE), rotation (RO), salt and pepper noise (S&P), speckle noise (SPK), shifting-H (S-H), shifting-V (S-V), shifting-HV (S-HV) and shearing (Shea). To make a fair comparison of the obtained mean values, four state-of-the-art techniques are referenced.
Table 3 presents the mean values obtained by the proposed approach. Similarly, we obtained the mean values using the techniques reported by Tang et al. (2014b), Tang et al. (2013a), Tang et al. (2008) and Ou et al. (2009) respectively. There may be slight differences between the values reported by Tang et al. (2014b), Tang et al. (2013a), Tang et al. (2008) and Ou et al. (2009) and the values obtained by us. This is because our dataset differs from the datasets used in those works, and the approach adopted to generate duplicate copies of the originals also differs. However, to make a fair comparison in our experiment, all the reported techniques and the proposed technique are evaluated on the same dataset. The comparison can therefore be used to draw findings from the calculated results.
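The leave-one-attack-out analysis described in this section can be sketched as follows. This is an interpretation of the prose rather than the exact procedure: the function name, the use of absolute differences, the pooling of distances across attacks and the toy values are all assumptions, and the final normalization step used for the normalized values is not specified in the text and is therefore omitted.

```python
import numpy as np


def average_mean_value_difference(distances_by_attack):
    """Average gap between the all-attack mean and each leave-one-out mean."""
    pooled = np.concatenate(
        [np.asarray(v, dtype=float) for v in distances_by_attack.values()]
    )
    overall_mean = pooled.mean()
    diffs = {}
    for attack in distances_by_attack:
        rest = np.concatenate(
            [
                np.asarray(v, dtype=float)
                for name, v in distances_by_attack.items()
                if name != attack
            ]
        )
        # Mean over all attacks versus mean with this attack left out.
        diffs[attack] = abs(overall_mean - rest.mean())
    return float(np.mean(list(diffs.values()))), diffs


# Hypothetical hash distances for one image under three attacks.
toy = {"BA": [0.2, 0.4], "RO": [0.6, 0.8], "S-H": [0.1, 0.3]}
avg_diff, per_attack = average_mean_value_difference(toy)
print(avg_diff, per_attack)
```

An attack whose removal shifts the mean strongly has a large influence on the hash distance, so a small average difference suggests that no single attack dominates.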

Performance Comparison with State-of-the-art Techniques
The performance of the proposed technique is also compared with state-of-the-art techniques in terms of robustness and discriminability using the ROC curve. The techniques compared are those of Tang et al. (2014b), Tang et al. (2013a), Tang et al. (2008) and Ou et al. (2009). In Tang et al. (2014b), the images were pre-processed by resizing to 512×512, applying Gaussian filtering and then converting to YCbCr for hash generation. In Tang et al. (2013a), the image is resized to 512×512, followed by color conversion to the YCbCr and HSI color models. In Tang et al. (2008), the image is resized to 512×512, followed by gray-scale conversion for hash generation. In Ou et al. (2009), the images are resized to 512×512, followed by conversion to YCbCr and application of 5×5 Gaussian filtering to generate the final image used for hash generation. To represent the performance in terms of robustness and discriminability, the receiver operating characteristics (ROC) curve is employed, which is plotted between the true positive rate (TPR) and the false positive rate (FPR). These parameters are defined as:
$$TPR = \frac{n_1}{N_1}, \qquad FPR = \frac{n_2}{N_2} \qquad (10)$$
where $n_1$ is the number of visually identical images correctly identified as copies and $N_1$ is the total number of identical images. Similarly, $n_2$ is the number of different images incorrectly identified as copies and $N_2$ is the total number of different images. TPR and FPR can be used to evaluate the robustness and the discriminability respectively. If two algorithms exhibit the same TPR, then the algorithm with the lower FPR is considered to be better performing. Similarly, if two algorithms exhibit the same FPR, then the algorithm with the higher TPR is considered to be better performing. In order to draw the ROC curve, the above parameters must be calculated for varying thresholds. In general, a small threshold improves the discrimination but simultaneously decreases the robustness.
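The threshold sweep behind an ROC curve can be sketched as below. The function name `roc_points` and the toy distance values are hypothetical; the rule "distance below threshold means copy" follows the similarity metric described earlier.

```python
import numpy as np


def roc_points(similar_distances, different_distances, thresholds):
    """Return (threshold, TPR, FPR) for each threshold in the sweep."""
    similar = np.asarray(similar_distances, dtype=float)
    different = np.asarray(different_distances, dtype=float)
    points = []
    for t in thresholds:
        tpr = np.mean(similar < t)    # n1 / N1: copies detected as copies
        fpr = np.mean(different < t)  # n2 / N2: different images flagged
        points.append((float(t), float(tpr), float(fpr)))
    return points


# Toy hash distances (hypothetical values, not taken from the experiments).
similar = [0.1, 0.5, 0.9, 0.3]
different = [0.6, 1.5, 2.0, 3.0]
points = roc_points(similar, different, thresholds=[0.5, 0.78, 1.0])
print(points)
```

Raising the threshold moves along the curve towards higher TPR and higher FPR, which is the robustness and discrimination trade-off noted in the text.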
The ROC curves for the different algorithms, including the proposed one, are given in Fig. 6. The thresholds used for producing the ROC curves for all the algorithms are given in Table 6. From Fig. 6, it is evident that the ROC curve of the proposed technique lies closer to the top-left corner (FPR = 0, TPR = 1) than the curves of the techniques reported in Tang et al. (2014b), Tang et al. (2013a), Tang et al. (2008) and Ou et al. (2009). The value of TPR when FPR = 0 for Tang et al. (2014b), Tang et al. (2013a), Tang et al. (2008) and Ou et al. (2009) is 0.61, 0.85, 0.38 and 0.30 respectively, while for the proposed technique the value is 0.93. Similarly, the value of FPR when TPR = 1 for Tang et al. (2014b), Tang et al. (2013a), Tang et al. (2008) and Ou et al. (2009) is 0.91, 0.21, 1.0 and 0.98 respectively, while for the proposed technique the value is 0.13. Taking into account the robustness and discriminability values from the previous subsection, along with the TPR and FPR values obtained in this subsection, it is clear that the proposed hashing technique outperforms some of the notable hashing techniques.
Table 7 rows: Tang et al. (2014b): total time 48.348 s, 200 images, average time 0.2417 s; Tang et al. (2013a): total time 57.469 s, 200 images, average time 0.2873 s; Tang et al. (2008): total time 228.458 s, 200 images, average time 1.1422 s.

Distribution of Hash Distance
To evaluate the distribution of the hash distance, two image datasets were employed: one of similar images and the other of dissimilar images.
To produce the dataset of similar images, 225 unique images are taken, such as Airplane, Baboon and Lena, in addition to images from the 17 Category Flower dataset (17 Category). For each of these images, 6 copies are generated using different image processing attacks to produce a set of 1350 similar images. The attacks applied include rotation, rescaling, Gaussian noise and brightness adjustment, among others. Similarly, for the dataset of dissimilar images, 1350 different images are taken from the 17 Category Flower dataset (17 Category). After arranging the images in the datasets, the hash distance is calculated using the proposed algorithm.
The distributions of the hash distance for similar and dissimilar images are given in Fig. 7 and Fig. 8 respectively. Here, the threshold value used is 0.78, as identified during the robustness and discrimination analysis. It is evident from these figures that the hash distance between similar images is below the threshold of 0.78, with a few exceptions. More specifically, out of 1350 similar images, 1305 return a hash distance less than the threshold, i.e. 96.66% of the images in the dataset return a hash distance below the threshold. Correspondingly, most of the dissimilar images return a hash distance well above the threshold: out of 1350 different images, 58 return hash distances below the threshold. Therefore, 4.29% of different images are identified as similar ones. This analysis supports the efficacy of the proposed approach. Also, the number of outliers in both categories of images (similar and dissimilar) conforms to the true positive and false positive rates obtained in the robustness and discrimination analyses.

Comparison between Different Variants of Wavelet Transform
Implementing a hashing technique based on the DWT requires the calculation of wavelet coefficients at a chosen level. Any change in the level of DWT changes the corresponding coefficients. This section examines the effect of the DWT level by calculating the hash for different levels of DWT while keeping the rest of the parameters constant. Note that the proposed technique makes use of a single level of 2D DWT. To demonstrate the effect of this variation, a set of 300 images was taken from the 17 Category Flower dataset (17 Category). A further 88 "copies" were produced from a single image by applying various image processing operations (Table 1), leading to a total dataset size of 388 images. Initially, single-level 2D DWT is applied in the proposed algorithm to find duplicate copies of the original image. In the next iteration, two-level 2D DWT is used to find the duplicate copies. This procedure is repeated until level six of the 2D DWT is reached.
To represent the results of this analysis effectively, the results are ranked on the basis of the hash distance. The ranking represents the order in which multiple copies of a single image are found in a large dataset of images. For copy detection, ideally the rank at which all the copies are found should be equal to the number of copies, i.e. the copied images should appear at the top with low ranks and the non-copied images should have higher ranks than the copied ones. The ranks of the first 83 copies of the original image at one, two, three, four, five and six levels of DWT are 83, 90, 105, 100, 196 and 193 respectively. From these values we can conclude that lower levels of DWT generally produce lower ranks and higher levels produce higher ranks. Specifically, at level one we obtained the lowest rank of 83 among all the compared levels. Therefore, the best performance of the proposed technique is obtained when one-level 2D DWT is used for hash generation and comparison. Note that the result for all 88 copies is not considered in this analysis because, owing to some outliers, higher ranks are obtained for all levels of DWT. However, that result conforms to the results obtained for 83 copies. The ranks are shown in Fig. 9.
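The ranking measure can be illustrated with a short sketch. The function name `rank_of_kth_copy` and the toy distances are hypothetical; the idea is simply to sort the database by hash distance to the query and report the position at which the k-th true copy appears.

```python
import numpy as np


def rank_of_kth_copy(distances, is_copy, k):
    """1-based rank at which the k-th true copy appears, sorted by distance."""
    order = np.argsort(np.asarray(distances, dtype=float), kind="stable")
    flags = np.asarray(is_copy, dtype=bool)[order]
    found = np.cumsum(flags)                  # copies retrieved so far
    positions = np.nonzero(found >= k)[0]
    if positions.size == 0:
        raise ValueError("fewer than k copies in the dataset")
    return int(positions[0]) + 1


# Toy database of five images: hash distances to the query and copy labels.
distances = [0.1, 0.9, 0.2, 0.5, 0.3]
is_copy = [True, False, True, False, True]
ideal_rank = rank_of_kth_copy(distances, is_copy, k=3)
print(ideal_rank)
```

In the ideal case the rank equals k, as in this toy example; any non-copy that sorts ahead of a copy pushes the rank above k, which corresponds to the higher ranks reported for deeper DWT levels.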

CONCLUSION
In this paper, we have presented a robust image hashing technique that employs the discrete wavelet transform and the Hough transform to generate an image hash, which is used to identify duplicate copies of original images. Many experiments have been conducted to validate the performance of the proposed hashing. The normalized average mean value difference (NAMVD) is calculated to show that the proposed technique is remarkably robust to various shifting operations while also performing well for other content-preserving operations such as rotation and contrast adjustment. Compared with four standard algorithms, the proposed technique achieves better performance in terms of ROC curves, which shows that it gives better classification in terms of robustness and discrimination. The proposed technique is also evaluated to determine the effect of different levels of DWT on its performance. The results show that level one gives the best results among the compared levels. Lastly, the execution time of the proposed approach is measured and found to be the smallest among the referenced techniques. Therefore, the proposed method can be used for content-based image authentication in large-scale image databases.

Figure 1. Parametric description of a straight line.

Figure 2. Basic block diagram of the proposed hashing.

Figure 4. Distribution of hash distance for discrimination.

Figure 5. Normalized average mean difference values for different techniques.

Figure 6. ROC curve comparison between proposed and other hashing algorithms.

Figure 7. Distribution of hash distance for similar images.

Figure 8. Distribution of hash distance for different images.

Figure 9. Ranking of results based upon hash distance for different levels.

Table 1
Generation of Duplicate Copies of Original Images

Table 2
Maximum, Minimum, Mean and Standard Deviation of Hash Distances for Different Attacks

Table 3
Mean values of hash distances under different attacks for the proposed technique

Table 4
Average mean values difference of compared techniques for different attacks

Table 5
Normalized average mean values difference of compared techniques for different attacks

Table 6
Thresholds used for generating ROC curves of different algorithms

The running time of the proposed algorithm is analyzed by generating image hashes for 200 different images. The image hashes are generated on a computer with an Intel Pentium Core-2-Duo processor with a clock frequency of 1.8 GHz and 4 GB of RAM. The MATLAB version used was R2014b. The average time for hash generation for Tang et al. (2014b), Tang et al. (2013a), Tang et al. (2008) and Ou et al. (2009) is 0.24, 0.28, 1.14 and 0.5 seconds respectively, while for the proposed algorithm it is 0.22 seconds. It is evident from Table 7 that the execution time of the proposed technique is the smallest among the compared algorithms.

Table 7
Summary of execution time for different algorithms