- The IBM Virtual Shared Disk component, providing disk driver level support for GPFS cluster-wide disk accessibility.
- The IBM Recoverable Virtual Shared Disk component, providing the capability to fence a node from accessing certain disks, which is a prerequisite for successful recovery of that node. It also provides for transparent failover of disk access in the event of the failure of a disk server.
Software simulation of a SAN is provided by the IBM Virtual Shared Disk component of RSCT. Disks attached in this manner are formatted into virtual shared disks for use by GPFS through the mmcrvsd command provided by GPFS or the createvsd command provided by the RSCT subsystem.
The IBM Virtual Shared Disk subsystem supports two methods of external disk access:
- A non-concurrent mode in which only one virtual shared disk server has access to a shared external disk at a given time. A primary and a backup server are defined.
- A concurrent mode in which multiple servers are defined to access the disk concurrently.
The IBM Recoverable Virtual Shared Disk component allows a secondary or backup server to be defined for a logical volume, providing the fencing capabilities required to preserve data integrity in the event of certain system failures. See “Node failure” on page 14. Therefore, the IBM Recoverable Virtual Shared Disk component is required even if there are no twin-tailed disks.
The Reliable Scalable Cluster Technology: Managing Shared Disks manual contains installation, management, and usage information for both the IBM Virtual Shared Disk and the IBM Recoverable Virtual Shared Disk.
Virtual shared disk server considerations
Will your virtual shared disk servers be dedicated servers, or will you also be using them to run applications? If you will have non-dedicated servers, consider running less time-critical applications on these nodes. If you run time-critical applications on a virtual shared disk server, servicing disk requests from other nodes might conflict with the demands of these applications.
The special functions of the GPFS file system manager consume extra processing time. If possible, avoid using a virtual shared disk server as the file system manager. The virtual shared disk server consumes both memory and processor cycles that could impact the operation of the file system manager. See “The file system manager” on page 72.
The actual processing capability required for virtual shared disk service is a function of the application I/O access patterns, the type of node, the type of disk, and the disk connection. You can later run iostat on the server to determine how much of a load your access pattern will place on a virtual shared disk server.
Ensure that you have sufficient resources to run the IBM Virtual Shared Disk program efficiently. This includes enough buddy buffers of sufficient size to match your file system block size, as well as setting other parameters in the communications subsystem. See the Reliable Scalable Cluster Technology: Managing Shared Disks manual for your environment and search on Performance and tuning considerations for virtual shared disks.
Disk distribution
Plan how to distribute your disks among the virtual shared disk servers. Two considerations should guide your decision. One involves providing sufficient disks and adapters on the system to yield the required I/O bandwidth. The other involves knowing approximately how much storage capacity you will need for your data. Dedicated virtual shared disk servers should have sufficient disks and adapters to drive the I/O load you expect them to handle. See “Disk I/O” on page 65 for further information on configuring your disk I/O options.
Prepare a list of disks that each virtual shared disk server will be using. This list will be helpful when creating disk descriptors during file system creation. If you have multi-tailed disks, and want to configure for primary and backup virtual shared disk servers (to protect against virtual shared disk server node failure), record the disk device name on the primary server, and the node numbers of the primary and backup servers. For example, if your virtual shared disk servers are nodes 1, 3, 5, and 7:
Disk (on primary)   Primary node   Backup node
hdisk2              1              3
hdisk3              3              1
hdisk2              5              n/a
hdisk2              7              n/a
In this case, nodes 1 and 3 share disks using multi-tailing and will back up each other. However, nodes 5 and 7 will each bear the full responsibility of serving their disks. These are the disks that will be made into virtual shared disks from which your GPFS file system will be constructed.
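The planning list above can be kept in a small script and turned into draft disk descriptor lines. The following Python sketch is illustrative only: `PLAN`, `node_name`, and `to_descriptor` are hypothetical names that are not part of GPFS, and the server names `node1`, `node3`, and so on are assumed for the example. It uses the seven-field descriptor layout described under "Virtual shared disk creation considerations" and assumes that unspecified fields are left empty.

```python
# Planning list from the example: (disk on primary, primary node, backup node).
# None stands for "n/a", meaning no backup server is defined.
PLAN = [
    ("hdisk2", 1, 3),
    ("hdisk3", 3, 1),
    ("hdisk2", 5, None),
    ("hdisk2", 7, None),
]


def node_name(number):
    # Assumption: node N is known to the cluster under the name "nodeN".
    return f"node{number}"


def to_descriptor(disk, primary, backup):
    # Field order:
    # DiskName:PrimaryServer:BackupServer:DiskUsage:FailureGroup:DesiredName:StoragePool
    backup_name = node_name(backup) if backup is not None else ""
    # Assumption: fields that are not specified are simply left empty.
    return ":".join([disk, node_name(primary), backup_name, "", "", "", ""])


descriptors = [to_descriptor(*row) for row in PLAN]

# Servers that bear full responsibility for their disks (no backup defined).
unprotected = [primary for _, primary, backup in PLAN if backup is None]

for line in descriptors:
    print(line)
print("Servers without a backup:", unprotected)
```

Keeping the list in this form makes it easy to spot servers that have no backup before any virtual shared disks are created.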
Disk connectivity
If your disks are capable of twin-tailing and you wish to exploit this capability, you must select an alternate node as the backup virtual shared disk server. See the Reliable Scalable Cluster Technology: Managing Shared Disks manual for your environment for help in selecting these nodes.
Virtual shared disk creation considerations
GPFS uses virtual shared disks to access raw logical volumes as if they were local at each of the nodes. Although the Managing Shared Disks book is the definitive source for instructions on how to create virtual shared disks, you can have GPFS create them through the mmcrvsd command.
For performance reasons, GPFS creates one virtual shared disk for each physical disk specified for the file system, and assigns an optimal partition size based on the disk’s capacity. A virtual shared disk name is also automatically generated. If you want to take advantage of the flexibility available in creating virtual shared disks, follow the instructions in the Reliable Scalable Cluster Technology: Managing Shared Disks manual, then pass the newly created virtual shared disk to the GPFS file system by specifying the virtual shared disk name (see “Disks for your file system” on page 33).
The mmcrvsd command expects as input a file, DescFile, containing a disk descriptor, one per line, for each of the disks to be processed. Disk descriptors have the format:
DiskName:PrimaryServer:BackupServer:DiskUsage:FailureGroup:DesiredName:StoragePool
DiskName
  The physical device name of the disk you want to define as a virtual shared disk. This is the /dev name for the disk on the node on which the mmcrvsd command is issued and can be either an hdisk name or a vpath name for an SDD device. Each disk will be used to create a single virtual shared disk.
  Alternatively, the name of a virtual shared disk created using AIX and virtual shared disk commands. In this case, the virtual shared disk is registered in the GPFS configuration database for subsequent use by GPFS commands.
PrimaryServer
  The name of the primary virtual shared disk server node.
BackupServer
  The name of the backup virtual shared disk server node.
DiskUsage
  What is to be stored on the disk.
    dataAndMetadata
      Indicates that the disk contains both data and metadata. This is the default.
    dataOnly
      Indicates that the disk contains data and does not contain metadata.
    metadataOnly
      Indicates that the disk contains metadata and does not contain data.
    descOnly
      Indicates that the disk contains no data or metadata and is used solely to keep a copy of the file system descriptor. Such a disk allows file system descriptor quorum to be maintained.
  Disk usage considerations:
    1. The DiskUsage parameter is not utilized by the mmcrvsd command but is copied intact to the output file that the command produces. The output file may then be used as input to the mmcrnsd command.
    2. RAID devices are not well-suited for performing small block writes. Since GPFS metadata writes are often smaller than a full block, you may find using non-RAID devices for GPFS metadata better for performance.
FailureGroup
  A number identifying the failure group to which this disk belongs. All disks that are attached to the same adapter or to the same virtual shared disk server have a common point of failure and should therefore be placed in the same failure group, as shown in Figure 14.
  GPFS uses this information during data and metadata placement to assure that no two replicas of the same block will become unavailable due to a single failure. You can specify any value from -1 (where -1 indicates that the disk has no point of failure in common with any other disk) to 4000. If you specify no failure group, the value defaults to the primary virtual shared disk server node number plus 4000, thereby creating distinct failure groups.
  If you plan to use both twin-tailed disks and replication, assign disks to the failure groups with their primary servers, as shown in Figure 15 on page 90. This arrangement would assure availability of replicated data if either server failed.
  Failure group considerations: The FailureGroup parameter is not utilized by the mmcrvsd command but is copied intact to the output file that the command produces. The output file may then be used as input to the mmcrnsd command.
DesiredName
  Specify the name you desire for the virtual shared disk to be created. This name must not already be used by another GPFS disk name, and it must not begin with the reserved string “gpfs”. If a desired name is not specified, the virtual shared disk is assigned a name according to the convention:
    gpfsNNvsd
      Where NN is a unique nonnegative integer not used in any prior virtual shared disk.
  These global disk names must be subsequently used on all GPFS commands. GPFS commands, other than the mmcrvsd command, will not accept physical disk device names.
  If a desired name is specified on the disk descriptor, mmcrvsd uses that name as the basis for the names of the global volume group, local logical volume, and local volume group according to the convention:
    DesiredNamegvg
      The global volume group
    DesiredNamelv
      The local logical volume
    DesiredNamevg
      The local volume group
  If a desired name is not specified on the disk descriptor, mmcrvsd assigns the names of the global volume group, local logical volume, and local volume group according to the convention:
    gpfsNNgvg
      Where NN is a unique nonnegative integer not used in any prior global volume group named with this convention.
    gpfsNNlv
      Where NN is a unique nonnegative integer not used in any prior logical volume named with this convention.
    gpfsNNvg
      Where NN is a unique nonnegative integer not used in any prior volume group named with this convention.
StoragePool
  Specifies the name of the storage pool that the NSD is assigned to. This field is ignored by the mmcrnsd command, and is passed unchanged to the output descriptor file produced by the mmcrnsd command.
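The field rules above can be checked before a descriptor file is handed to mmcrvsd. The following Python sketch is a hypothetical pre-check, not a GPFS utility: `DiskDescriptor`, `parse_descriptor`, and `effective_failure_group` are illustrative names, and the handling of omitted trailing fields (treated as empty) is an assumption. It applies only rules stated in this section: the four DiskUsage values with dataAndMetadata as the default, the FailureGroup range of -1 to 4000, the default of primary server node number plus 4000, and the reserved “gpfs” prefix for DesiredName.

```python
from dataclasses import dataclass
from typing import Optional

DISK_USAGES = {"dataAndMetadata", "dataOnly", "metadataOnly", "descOnly"}
FIELD_COUNT = 7  # DiskName through StoragePool


@dataclass
class DiskDescriptor:
    disk_name: str
    primary_server: str
    backup_server: str
    disk_usage: str
    failure_group: Optional[int]  # None means "not specified"
    desired_name: str
    storage_pool: str


def parse_descriptor(line: str) -> DiskDescriptor:
    parts = line.strip().split(":")
    if len(parts) > FIELD_COUNT:
        raise ValueError(f"too many fields: {line!r}")
    # Assumption: omitted trailing fields are treated as empty.
    parts += [""] * (FIELD_COUNT - len(parts))
    disk, primary, backup, usage, group, desired, pool = parts
    if not disk:
        raise ValueError("DiskName is required")
    usage = usage or "dataAndMetadata"  # documented default
    if usage not in DISK_USAGES:
        raise ValueError(f"unknown DiskUsage: {usage!r}")
    failure_group = int(group) if group else None
    if failure_group is not None and not -1 <= failure_group <= 4000:
        raise ValueError("FailureGroup must be between -1 and 4000")
    if desired.startswith("gpfs"):
        raise ValueError("DesiredName must not begin with 'gpfs'")
    return DiskDescriptor(disk, primary, backup, usage,
                          failure_group, desired, pool)


def effective_failure_group(desc: DiskDescriptor, primary_node_number: int) -> int:
    # Documented default: primary server node number plus 4000.
    if desc.failure_group is not None:
        return desc.failure_group
    return primary_node_number + 4000


desc = parse_descriptor("hdisk3:node3:node1")
print(desc.disk_usage, effective_failure_group(desc, 3))  # dataAndMetadata 4003
```

Because the default failure group depends on the node number rather than the node name, the sketch takes the number as a separate argument instead of guessing it from the server name.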
Upon successful completion of the mmcrvsd command, the disk descriptors in the input file are rewritten:
- The physical device or vpath name is replaced with the created virtual shared disk name.
- The primary and backup servers are omitted.
- The DiskUsage and FailureGroup fields are not changed.
The rewritten disk descriptor file, DescFile, can then be used as input to the mmcrnsd command. The DiskUsage and FailureGroup specifications in the disk descriptor are only preserved in the DescFile file rewritten by the mmcrvsd command. If you do not use this file, you must accept the default values or specify these values when creating disk descriptors for subsequent mmcrfs, mmadddisk, or mmrpldisk commands.
If necessary, the DiskUsage and FailureGroup values for a disk can be changed with the mmchdisk command. The virtual shared disk name cannot be changed.
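To make the rewriting rules concrete, the following Python sketch models what happens to one descriptor line. It is a simulation for illustration, not the mmcrvsd implementation, and `next_vsd_name` and `rewrite_descriptor` are hypothetical names. Several details are assumptions that the text does not state: the omitted server fields are modeled as empty fields, the DesiredName field is left empty in the output, and NN is chosen as the smallest unused nonnegative integer.

```python
import re

FIELD_COUNT = 7  # DiskName through StoragePool


def next_vsd_name(used_names):
    """Return a gpfsNNvsd name whose NN is not used by any name in used_names.

    Assumption: the smallest unused nonnegative integer is chosen.
    """
    used = set()
    for name in used_names:
        match = re.fullmatch(r"gpfs(\d+)vsd", name)
        if match:
            used.add(int(match.group(1)))
    nn = 0
    while nn in used:
        nn += 1
    return f"gpfs{nn}vsd"


def rewrite_descriptor(line, used_names):
    """Model the descriptor rewrite described above for a single line."""
    fields = line.strip().split(":")
    # Assumption: omitted trailing fields are treated as empty.
    fields += [""] * (FIELD_COUNT - len(fields))
    disk, primary, backup, usage, group, desired, pool = fields[:FIELD_COUNT]
    # The device name is replaced by the virtual shared disk name:
    # the desired name if one was given, otherwise a generated gpfsNNvsd name.
    vsd_name = desired or next_vsd_name(used_names)
    used_names.add(vsd_name)
    # Servers are omitted; DiskUsage and FailureGroup are passed through as-is.
    return ":".join([vsd_name, "", "", usage, group, "", pool])


used = set()
print(rewrite_descriptor("hdisk2:node1:node3:dataOnly:1::", used))
print(rewrite_descriptor("hdisk3:node3:node1:metadataOnly:2:fs1meta:", used))
```

The sketch passes DiskUsage and FailureGroup through untouched, which mirrors the point that these values survive only in the rewritten DescFile.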
Virtual shared disk server and disk failure
One means of data protection is the use of a RAID controller, which masks disk failures with parity disks. An ideal configuration is shown in Figure 16, where a RAID device is twin-tailed to two nodes. This protects against server failure as well.
If node 1, the primary server, fails, its responsibilities are assumed by node 2, the backup server, as shown in Figure 17 on page 92.
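The outcome of this failover can be expressed as a simple selection rule. The Python sketch below is a simplified model with a hypothetical function name, `serving_node`; it captures only which server ends up handling requests for a non-concurrent disk and does not model the fencing and recovery work performed by the IBM Recoverable Virtual Shared Disk component.

```python
def serving_node(primary, backup, failed):
    """Return the node serving a non-concurrent disk, or None if none can.

    primary, backup: node numbers (backup may be None when no backup is defined).
    failed: set of node numbers that are currently down.
    """
    if primary not in failed:
        return primary
    if backup is not None and backup not in failed:
        return backup
    return None


# Twin-tailed RAID device: node 1 is primary, node 2 is backup.
print(serving_node(1, 2, failed=set()))   # 1
print(serving_node(1, 2, failed={1}))     # 2
```

A disk with no backup server defined becomes unavailable as soon as its primary server fails, which is why the twin-tailed configuration is described as ideal.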
If your disks are SAN-attached to the virtual shared disk servers, an ideal configuration is shown in Figure 18.
Another means of data protection is through the use of concurrent virtual shared disks, as shown in Figure 19 on page 93. Concurrent disk access allows you to use multiple servers to satisfy disk requests by taking advantage of the concurrent disk access environment supplied by AIX. For further information regarding concurrent virtual shared disks, see the Reliable Scalable Cluster Technology: Managing Shared Disks manual for your environment.
Figure 17. Backup node serving RAID device
[Figure labels: High Performance Switch; node 1 (primary vsd server); node 2 (secondary vsd server); node 3 (vsd client); FC switch 1; FC switch 2; RAID controller/ESS]
You can also protect your file system against disk failure by mirroring data at the logical volume manager (LVM) level, writing the data twice to two different disks. The addition of twin-tailed disks to such a configuration adds protection against server failure by allowing the IBM Recoverable Virtual Shared Disk program to route requests through a backup server.