Satellite image classification using proposed singular value decomposition method

In this work, satellite images of Razaza Lake and the surrounding district in Karbala province are classified for the years 1990, 1999 and 2014 using two software packages (MATLAB 7.12 and ERDAS Imagine 2014). Two proposed classification methods are implemented in MATLAB: an unsupervised method based on the mean value and a supervised method based on Singular Value Decomposition (SVD). Unsupervised (K-Means) and supervised (Maximum Likelihood Classifier) methods are applied in ERDAS Imagine. The aim is to obtain the most accurate results, to compare the results of each method, and to calculate the changes that took place in 1999 and 2014 compared with 1990. The classification results indicate that water and hills decreased, while vegetation, wet land and barren land increased in 1999 and 2014 compared with 1990. Classification accuracy was assessed by comparing a number of random points, chosen on the study area from field work and geographical data, with the classification results. The classification accuracies of the proposed SVD method are 92.5%, 84.5% and 90% for the years 1990, 1999 and 2014, respectively, while the classification accuracies of the unsupervised method based on the mean value are 92%, 87% and 91% for the same years, respectively.


Introduction
Remotely sensed imagery can be used in a number of applications. A principal application of remotely sensed data is to create a classification map of the identifiable or meaningful features or classes of land cover types in a scene [1]. Classification is one of the data mining methods used to assign an object to a predefined group, and it is one of the most frequently used decision-making tasks of human activity. A classification problem occurs when an object needs to be assigned to a predefined group or class based on a number of observed attributes related to that object. Classification also plays a very important role in remote sensing and satellite image analysis [2]. Image classification is a complex process that may be affected by many factors. A huge number of classification techniques can be found in the literature; most of them are categorized as either supervised or unsupervised methods. Supervised techniques often require some prior knowledge in order to select the correct regions of interest (ROI); an inadequate selection of ROIs, or of the number of regions actually present, often yields inadequate classification results. Unsupervised methods, in turn, need to identify the correct number of regions present in the processed image. In this paper, we classify Landsat satellite images of Razaza Lake and the surrounding district in Karbala province for the years 1990, 1999 and 2014 using two software packages (MATLAB 7.12 and ERDAS Imagine 2014). Both unsupervised and supervised classification methods are used: an unsupervised method (mean value) and a supervised method (Singular Value Decomposition) implemented in MATLAB, and an unsupervised method (K-Means) and a supervised method (Maximum Likelihood Classifier) applied in ERDAS Imagine.

The study area
The chosen study area is Razaza Lake and the surrounding area, with a total area of 7101.4 km². It is bounded by longitudes 42° 83ʹ to 43° 62ʹ E and latitudes 32° 03ʹ to 33° 42ʹ N, and lies 15 km west of Karbala. Razaza Lake is linked in the north to Habbaniyah Lake by the Nazim Al-Warawr Canal, and is surrounded on the other three sides by desert land interspersed with some hills. The satellite images of the study area were captured by Landsat-5, Landsat-7 and Landsat-8 for the years 1990, 1999 and 2014, respectively. Bands 2, 3 and 4 were chosen, with 30 m spatial resolution. Fig. 1 shows the original map of Iraq, while Fig. 2 shows the study area for the years 1990, 1999 and 2014.

The proposed classification methods
Two classification methods, unsupervised and supervised, were applied to the study area for the years 1990, 1999 and 2014 using MATLAB programs, as follows:

Unsupervised classification using mean value
The unsupervised classification is based on the mean value. The mechanism of this classification can be illustrated in the following steps:
1. Read the satellite image.
2. Partition the satellite image into fixed-size blocks. In this work, a block size of (3*3) is used.
3. Compute the mean value of each block, and then determine the maximum and minimum of these block means.
4. To obtain the classes, subtract the minimum value from the maximum value and divide the result by the number of classes.
5. Add the result of step (4) to the minimum value to get the upper limit of the first class, then add it to that limit to get the second class, and so on.
6. Finally, the classification decision is made by comparing the mean value of each block with the class limits: a block is assigned to a class when its mean value lies within that class.
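The steps above can be sketched in a few lines of Python. This is a minimal illustration rather than the MATLAB program used in this work. It assumes a single-band image given as a list of rows, because the text does not say how the three bands are combined into one mean. It also assumes that incomplete edge blocks are dropped and that a mean lying exactly on a class limit goes to the upper class. The names `block_means` and `mean_value_classify` are hypothetical.

```python
def block_means(image, block=3):
    """Mean of each non-overlapping block x block tile (incomplete edge tiles are dropped)."""
    rows, cols = len(image), len(image[0])
    means = []
    for r in range(0, rows - rows % block, block):
        row_means = []
        for c in range(0, cols - cols % block, block):
            tile = [image[r + i][c + j] for i in range(block) for j in range(block)]
            row_means.append(sum(tile) / len(tile))
        means.append(row_means)
    return means


def mean_value_classify(image, n_classes=5, block=3):
    """Label each block 0..n_classes-1 by the equal-width interval containing its mean."""
    means = block_means(image, block)
    flat = [m for row in means for m in row]
    lo, hi = min(flat), max(flat)
    width = (hi - lo) / n_classes          # step 4: (max - min) / number of classes
    if width == 0:                          # uniform image: everything is one class
        return [[0] * len(row) for row in means]
    # steps 5-6: class i covers [lo + i*width, lo + (i+1)*width); the maximum
    # mean is clamped into the last class (an assumption of this sketch)
    return [[min(int((m - lo) / width), n_classes - 1) for m in row] for row in means]


# Hypothetical 6x6 image made of four constant 3x3 blocks
demo = [[0] * 3 + [50] * 3 for _ in range(3)] + [[100] * 3 + [200] * 3 for _ in range(3)]
print(mean_value_classify(demo, n_classes=4))  # [[0, 1], [2, 3]]
```

Because the class limits are derived from the minimum and maximum block means of each image, the same label index may cover different intensity ranges in different years.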

Supervised classification using singular value decomposition
The need to minimize the amount of digital information stored and retrieved is an ever-growing concern in the modern world, and singular value decomposition (SVD) is an effective tool for minimizing data storage and data transfer. SVD is one technique of semantic indexing (SI). In this work, SVD is proposed as a supervised classification method. It consists of two phases, training and classification. The training phase is responsible for storing the classes in the database, while the classification phase computes the similarity measure between the SVD of the target image and the SVD of the classes found in the database. The following steps illustrate the mechanism of the proposed classification:
1. Read the satellite image.
2. Partition the satellite image into fixed-size blocks. In this work, a block size of (3*3) is used. Each block of the satellite image represents a query vector (q).
3. The selected satellite image consists of three bands (green, red and near infrared) and five classes (water, vegetation, wet land, hills and barren land). Select five blocks, each representing one class, and place them in the document matrix A (m*n), where m is the number of elements per class and n is the number of classes.
4. Compute the transpose of the document matrix A, then calculate AᵀA.
5. Compute the k largest eigenvalues of AᵀA and the corresponding eigenvectors.
6. Compute the coordinates of the documents in the k-dimensional orthogonal space by the SVD-free approach [3].
7. Compute the coordinates of the query vector q [3], where S is the square root of S².
8. Compute the similarity coefficients between the query vector and the documents [3]:

$$ z = \frac{q_c \, v_i^{T}}{\lVert q_c \rVert \, \lVert v_i \rVert} \tag{3} $$

$$ \mathrm{sim}(i) = 1 - \arccos(z) \tag{4} $$

where q_c is the coordinate vector of the query, v_i is the i-th row of the document coordinate matrix, written v(i,:) in MATLAB notation, and sim is the similarity.
9. Final classification: the classification decision is made according to the similarity measures of each block; the block is assigned to the class that gives the highest similarity measure.

Fig. 3 shows the block diagram of the mean value and singular value decomposition (SVD) classification methods for the satellite images using MATLAB software.
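The SVD-based steps can be illustrated with the following Python sketch. It is an illustration under stated assumptions, not the MATLAB program used in this work. The document coordinates are taken as the rows of V_k, the eigenvectors of AᵀA for the k largest eigenvalues. The query coordinates are computed as q_c = qᵀ A V_k S_k⁻², the usual fold-in form of an SVD-free approach. Both formulas are assumed here, since the corresponding equations from [3] are not reproduced in the text. The eigen-decomposition uses a small cyclic Jacobi routine so the example needs only the standard library. The names `jacobi_eigen` and `svd_free_classify` are hypothetical, and so are the training vectors.

```python
import math


def jacobi_eigen(M, sweeps=100, tol=1e-30):
    """Eigenvalues and eigenvectors (columns of V) of a small symmetric matrix."""
    n = len(M)
    A = [[float(x) for x in row] for row in M]
    V = [[float(i == j) for j in range(n)] for i in range(n)]
    scale = sum(x * x for row in A for x in row) or 1.0
    for _ in range(sweeps):
        off = sum(A[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off <= tol * scale:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if A[p][q] == 0.0:
                    continue
                diff = A[q][q] - A[p][p]
                sign = 1.0 if diff >= 0 else -1.0
                # rotation angle in [-pi/4, pi/4] that zeroes A[p][q]
                theta = 0.5 * math.atan2(sign * 2.0 * A[p][q], abs(diff))
                c, s = math.cos(theta), math.sin(theta)
                for M2, by_col in ((A, True), (A, False), (V, True)):
                    for k in range(n):
                        if by_col:      # columns p, q: M2 <- M2 J
                            a, b = M2[k][p], M2[k][q]
                            M2[k][p], M2[k][q] = c * a - s * b, s * a + c * b
                        else:           # rows p, q: M2 <- J^T M2
                            a, b = M2[p][k], M2[q][k]
                            M2[p][k], M2[q][k] = c * a - s * b, s * a + c * b
    return [A[i][i] for i in range(n)], V


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def svd_free_classify(class_vectors, query, k=None):
    """Return (best class index, similarities); class_vectors are the columns of A."""
    n = len(class_vectors)
    AtA = [[dot(ci, cj) for cj in class_vectors] for ci in class_vectors]
    eigvals, V = jacobi_eigen(AtA)
    keep = sorted(range(n), key=lambda j: eigvals[j], reverse=True)[:k]
    docs = [[V[i][j] for j in keep] for i in range(n)]          # rows of V_k
    qA = [dot(query, c) for c in class_vectors]                 # q^T A
    qc = [sum(qA[i] * V[i][j] for i in range(n)) / eigvals[j] for j in keep]
    sims = []
    for d in docs:
        z = dot(qc, d) / (math.sqrt(dot(qc, qc)) * math.sqrt(dot(d, d)))   # Eq. (3)
        z = max(-1.0, min(1.0, z))    # guard against rounding (added in this sketch)
        sims.append(1.0 - math.acos(z))                                     # Eq. (4)
    return max(range(n), key=sims.__getitem__), sims


# Hypothetical flattened training blocks, one per class
classes = [[10, 12, 11, 9], [40, 90, 42, 88], [200, 180, 190, 30]]
best, sims = svd_free_classify(classes, [40, 90, 42, 88])
print(best)  # 1: the query equals the second training block
```

The sketch assumes linearly independent training blocks and a non-zero query, as it divides by the retained eigenvalues and by vector norms. In practice each class vector would hold the 27 values of a 3*3 block over three bands, and `k` would be chosen smaller than the number of classes when a reduced space is wanted.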
The results of applying the mean value classification method are shown in the corresponding figures. Table 4 presents the changes in the years 1999 and 2014 compared with 1990. The results indicate that five classes are found in comparison with the original image. These classes represent the five major features in the study area (water, vegetation, wet land, hills and barren land).

Accuracy assessment
Classification accuracy assessment is a general term for comparing the classification with geographical data that are assumed to be true, in order to determine the accuracy of the classification process. Usually, the assumed true data are derived from ground truth. Ground truth, or field survey, is carried out in order to observe and collect information about the actual conditions on the ground at a test site and to determine the relationship between the remotely sensed data and the objects being observed. It is recommended to collect ground truth at the time of data acquisition, or at least within a period during which the environmental conditions do not change. It is usually not practical to ground truth or otherwise test every pixel of a classified image; therefore, a set of reference pixels is usually used. Reference pixels are points on the classified image for which the actual data are known, and they are selected randomly [4]. The most common tool used for classification accuracy assessment is the confusion (or error) matrix. A confusion matrix is a square array of dimension n × n, where n is the number of classes. The matrix shows the relationship between two samples of measurements taken from the area that has been classified. The first sample represents test data that have been collected via field observation, inspection of agricultural records, air photo interpretation, or other similar means. The second sample is composed of the labels of the pixels, allocated by the classifier, that correspond to the test data points.
The columns in a confusion matrix represent test data, while rows represent the labels assigned by the classifier.
The main diagonal entries of the confusion matrix represent the number of pixels that are given the same identification by the test data and the classifier; these are the pixels that are considered to be correctly classified. Several indices of classification accuracy can be derived from the confusion matrix. The overall accuracy is obtained by dividing the sum of the main diagonal entries of the confusion matrix by the total number of samples. In order to assess the accuracy of each information class separately, the concepts of producer's accuracy and user's accuracy can be used. For each information class i in a confusion matrix, the producer's accuracy is calculated by dividing the entry (i, i) by the sum of column i, while the user's accuracy is obtained by dividing the entry (i, i) by the sum of row i. Thus, the producer's accuracy gives the proportion of pixels of class i in the test data set that are correctly recognized by the classifier, and the user's accuracy gives the proportion of pixels identified by the classifier as belonging to class i that agree with the test data [5,6]. For example, Fig. 17 shows an error matrix with producer's, user's and overall accuracy calculations for a simple 6-class case. In this work, the classification accuracy was assessed for the years 1990, 1999 and 2014. A number of random points, chosen on the study area during field work, were compared with the classification results.
Fig. 17: Example error matrix with producer's, user's and overall accuracy calculations [6].
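The three indices defined above can be computed directly from a confusion matrix, as in the following Python sketch. It follows the convention stated in the text, where rows hold the labels assigned by the classifier and columns hold the test data. The function name `accuracy_from_confusion` and the 3-class matrix are hypothetical and are not taken from Tables 5-10.

```python
def accuracy_from_confusion(matrix):
    """Overall, producer's and user's accuracy from a square confusion matrix.

    matrix[r][c] = number of reference pixels of class c labelled r by the classifier.
    A class with an empty row or column gets None for the affected index.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    diag = [matrix[i][i] for i in range(n)]
    col_sums = [sum(matrix[r][c] for r in range(n)) for c in range(n)]
    row_sums = [sum(row) for row in matrix]
    overall = sum(diag) / total
    producers = [diag[i] / col_sums[i] if col_sums[i] else None for i in range(n)]
    users = [diag[i] / row_sums[i] if row_sums[i] else None for i in range(n)]
    return overall, producers, users


# Hypothetical 3-class example with 60 reference pixels
example = [
    [20, 2, 0],
    [0, 15, 5],
    [0, 3, 15],
]
overall, producers, users = accuracy_from_confusion(example)
print(round(overall, 3))   # 0.833  (50 correct out of 60)
print(producers)           # [1.0, 0.75, 0.75]
```

In this example the first class has a producer's accuracy of 1.0 but a user's accuracy of 20/22, because two pixels of another class were labelled as the first class. This is why the two per-class indices are reported separately.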
Tables 5-7 present the error matrices with producer's, user's and overall accuracy calculations for unsupervised classification using the mean value for the years 1990, 1999 and 2014, respectively, while Tables 8-10 present the error matrices with producer's, user's and overall accuracy calculations for supervised classification using SVD for the same years.
The classification accuracies for the unsupervised classification method based on the mean value are 92%, 87% and 91% for the years 1990, 1999 and 2014, respectively, which is very close to the supervised classification results. The fluctuations in the results of the K-means method, by contrast, make them unreliable; this is because of the many thresholds used in its operation (i.e. maximum number of iterations, minimum number of points in the class, maximum class standard deviation, maximum class distance, maximum number of merge pairs, maximum standard deviation from the mean, and maximum distance error).
5. The classification results indicate that water and hills decreased, while vegetation, wet land and barren land increased in the years 1999 and 2014 compared with 1990. The contraction of the water area is due to scarcity of rainfall and desertification, whereas the vegetation and wet land increased. The decrease in hills, and the corresponding increase in barren land, is due to erosion factors.