2013

Acceleration of Vertex Component Analysis for Spectral Unmixing with CUDA Shih-Chieh Wei1, Bormin Huang2, and Antonio Pl...

3 downloads 56 Views 3MB Size
Acceleration of Vertex Component Analysis for Spectral Unmixing with CUDA Shih-Chieh Wei1, Bormin Huang2, and Antonio Plaza3 1

Department of Information Management, Tamkang University, Taiwan

2

Space Science and Engineering Center, University of Wisconsin-Madison, USA

3

Department of Technology of Computers and Communications, University of Extremadura, Spain Abstract

Hyperspectral images can be used to identify the unique materials present in an area.Due to the limited spatial resolution, each pixel of the image is considered as a mixture of several different pure substances or endmembers. Several spectral unmixing methods have been developed for endmember extraction in an image. Among them, the vertex component analysis (VCA) algorithm is a popular one for its superior performance. As there are a lot of matrix/vector operations involved in the VCA algorithm, this work aims to apply the highly parallel computing power of recent GPUs which are reported to have good success in acceleration of many compute intensive applications. In the experiment, the compute unified device architecture (CUDA) which provide more convenient programming model is used. The speedup is measured with respect to standard C code on a single core CPU for evaluation.Our experiments are performed on a typical case where the number of extracted endmembers is 30 from the 188-band Cuprite hyperspectral dataset.The resultsshow that a speedup of 42x can be achieved on a pure GPU implementation using CULA and CUBLAS libraries. As VCA involves Singular Value Decomposition (SVD) operation and SVD is faster on CPU than GPU for small data sizes as is our case, a speedup of 58x can be achieved on a hybrid implementation when SVD is carried out on CPU. Index Terms:Spectral Unmixing, Vertex Component Analysis, CUDA.. 1. Introduction Hyperspectral remote sensing has been widely applied in various research fields such as geology, meteorological research, urban studies, and agriculture [1]. Hyperspectral imaging instruments collect hundreds of images on different wavelength channelsfor the same area on the surface of the Earth [2].For example, the Airborne Visible/Infrared Imaging Spectrometer (AVIRIS), operated byNASA’sJet Propulsion Laboratory, captures images using224 spectral bands withnarrow spectral resolution of 10 nm and swath width about 10.6 km. Fig. 1 shows the concept of hyperspectral imaging. Hyperspectralimages providereflectance valuesthat can be interpreted as spectral signatures providing detailed information about the observed materials and which can be very useful to conduct image analysis and interpretation tasks. Due to the often more limited spatial resolution of hyperspectral images, a single pixel inthe scene can comprise the response of different spectrally pure materials. Every pixel is therefore a mixtureof several different substances.To accurately characterize thesesubstances, spectral unmixingalgorithmsaregenerally High-Performance Computing in Remote Sensing III, edited by Bormin Huang, Antonio J. Plaza, Zhensen Wu, Proc. of SPIE Vol. 8895, 889509 · © 2013 SPIE · CCC code: 0277-786X/13/$18 · doi: 10.1117/12.2031527

Proc. of SPIE Vol. 8895 889509-1 Downloaded From: http://spiedigitallibrary.org/ on 08/27/2015 Terms of Use: http://spiedigitallibrary.org/ss/TermsOfUse.aspx

designned to unmix a pixel into a series of refleectance curves of different substances [3, 4, 5].Nowadayys many popular specttral unmixing algorithms asssumelinear innteraction betw weendifferent pure p substancees members), weig ghted by the abuundance fractioons of each endmember [6]. (endm

Fig. 1. The hypperspectral imaggingconcept1

VCA) algorithm m is a popular approach to perform p automaatic endmember The veertex componeent analysis (V extraction in hypersspectral sceness [7]. Likely thhe Pixel Purityy Index (PPI) [8] and the N-FINDR N [9], it mes the presencce of pure pixxels in hypersspectral imageesandgenerally shows good performance as a assum compaared to other ap pproaches.In order o to apply this t algorithm in i scenarios thhat requirea reaal-time responsse, the allgorithm shou uld be accelerrated.In previous research, a parallel im mplementationn of VCA waas conduuctedon Symmeetrical Multiprrocessing (SMP P) cluster enviironments [10]].It was experiimentally show wn that, when w the numb ber of the SMP P nodes increased to 20, the method obtainneda speedup of o 9x. Howeveer, this im mplementation neglectsthe diimensionality reductionstep r i included in thee original VCA A algorithm annd exhibiits higher comp putational cost.. Reccent developm ment in graphiic processing units (GPUs)) opened a neew door for general-purposse massivvely parallel implementation i ns using com mmodity speciaalized hardwaare. With the help of GPU Us, researcchers can signiificantly acceleerate processinng algorithms. After A NVIDIA A released the Compute C Unifieed Devicee Architecturee (CUDA) prrogramming model, m paralleel programminng on GPUs became morre convennient and wideely available, too the point thatt researchers caan now port their algorithms to domestic annd professsional GPUs usingCUDA u [11, 12, 13].In this work, we w propose a new n strategy too accelerate thhe 1

Datta source: http://sspeclab.cr.usgs.ggov

Proc. of SPIE Vol. 8895 889509-2 Downloaded From: http://spiedigitallibrary.org/ on 08/27/2015 Terms of Use: http://spiedigitallibrary.org/ss/TermsOfUse.aspx

vertexx component analysis (VCA A) algorithm on GPUs too enhance its computationaalefficiency annd obtainnreal-time perrformance. In our experim ments using real hyperspeectral data, the t GPU-baseed implem mentation of VCA V competes with thestandaard C code and gets better perrformance. Thee remainder of this paper is structured as foollows. Sectionn 2 introduces the t vertex com mponent analyssis algoritthm. Section 3 describes general g GPU programming p aspects. Section 4describesour GPU-baseed implem mentation.Secttion 5providess experimentall results, and section 6conncludes the paaper with som me remarkks. 2.Verttex Componen nt Analysis (V VCA) Algorithm Thee vertex compo onent analysis algorithm is an a unsupervisedd spectral unm mixing method.. It assumes thhat each pixel p in the hyp perspectral imaage is linearly mixed by seveeral endmembersin proportioons given by thhe correspponding abund dance fractionss. In a linear mixing m system, we denote the number of speectral bands asb, the nuumber of endm members to be found in a single pixel aspp. With these definitions d in mind, m the linear mixturre model is giv ven by: (1) whereodenotesthe b--dimensional vector v of a singgle pixel observvation on contiguous spectrall bands, M is thhe m mixing matrixw which presentss the p existingg endmemberss,andwhere a column c in this matrix containns the enndmember sig gnature of a certain c endmeember. is a scale s factor which w modelstthe illuminatioon variabbility caused by y surface topoography, is thee p-dimensionnal vector ofabbundance contrributionsof eacch endmeember in the mixture, m andn iss the system nooise.

F 2. An exampple of VCA iteraatively projects Fig.

Thee VCA assumees that the verttices of a simpplex are the enndmembers to be found, andd that the affinne transfoormation of a simplex s is also a simplex.This algorithm performsthe folloowing two stepps: 1) Dimensionallity reduction and noise chaaracterization.. This step first calculates thee signal-to-noisse ratio (SNR) of input imagge,and comparees it with athrreshold SNRthh. The spectrall image will be b projected onto asubspaceusingprincipal component analysis (PCA) [14]if SNR

Proc. of SPIE Vol. 8895 889509-3 Downloaded From: http://spiedigitallibrary.org/ on 08/27/2015 Terms of Use: http://spiedigitallibrary.org/ss/TermsOfUse.aspx

SNR , otherwisse

the projection n is performed with singular value v decompoosition (SVD) [15]. 2) VCA main procedure.Thiss step iterativelly projects dataa onto a directiion orthogonall to the subspacce spanned by th he endmemberrs already deterrmined. Theenndmember signnature to be fouund correspondds to the extrem me of the projeection. The alggorithm runs until u all endmeembers are souught out. Fig. 2 illustrates graaphically this prrocedure in a case c of two enddmembers withh scale factor

1.

In thhescenario sho own by Fig. 2, all mixed pixeelsbelong to thhe simplex S.Thhe simplex Sp is the projectivve projecction onto the p-dimensional p plane. The daata are first proojected onto thhe direction f1. The extreme of o this prrojection is thee firstendmembberma. In the next iteration, thhe data are proojected onto diirection f2 whicch is orthhogonal to ma and a find endmeenbermb. 3. GPU Programming Aspects A GPU U computing in nvolved the usse of OpenGL or DirectX thrrough graphicss APIs in earlyy years. With thhe rapid development d of o general-purppose GPU, masssively parallell programmingg using GPUsbeecameextremelly populaar. After NVID DIA released thhe CUDA archhitecture, GPU programmers get a new easiier way to finissh the poorting from CP PU to GPUs. With W CUDA, an a extension of o standard C, many applicaations have beeen develooped in recent years. Parallell programs cann be createdin CUDA and exxecutedusing alarge a amount of o threadds under the Single S Instructtion Multiple Thread (SMIIT) strategy. Threads T are sttructured into a grid-block-thread thrree layer modeel.

Grid Block (o, o)

Block (l, Io>

Shafired Memory (116/48KB)

Z

Shared Ailemory(1 6/48]KB)

Register

egister

-t--"

threa

th[read (1, 0)

Loca Mem

L