
Study of the Application of Disagreement-based Collaborative Regression in Log Interpretation


Academic year: 2020


2018 International Conference on Computational, Modeling, Simulation and Mathematical Statistics (CMSMS 2018)
ISBN: 978-1-60595-562-9

Yu-zhe ZHENG1, Zhao-hui YE1 and Cong-hui ZHANG2

1Department of Automation, Tsinghua University, Beijing 100084, China

2China Oilfield Services Limited, Sanhe, Hebei 065201, China

Keywords: Semi-supervised learning, Disagreement-based, Collaborative regression, Log interpretation.

Abstract. In the field of log interpretation, it is easy to acquire a large amount of data, but obtaining the label information is costly, so labeled samples are often insufficient. The secondary interpretation of porosity is a typical application task with few labeled samples and a large number of unlabeled samples. Manual interpretation has the disadvantages of strong subjectivity and low accuracy. A disagreement-based, co-training style semi-supervised regression algorithm is proposed as an alternative to manual interpretation. Two kNN regressors with disagreement are employed. Each of them labels the unlabeled samples with a high confidence level for the other, in order to improve the regression estimates. The method was verified through experiments, which showed that the generalization performance of this semi-supervised model is better than that of the supervised models in such cases.

Introduction

In the process of petroleum exploration [1], people use special instruments, based on acoustic waves, electricity, and radioactivity, to measure various parameters of the stratum at different depths in the well, and then analyze these parameters. This is called log interpretation. Because reservoir resources are generally distributed in underground interconnected pores, cracks or rocks, the prediction of porosity is important in the process of log interpretation. Prediction of porosity includes primary interpretation and secondary interpretation. Traditionally, the acquired data are used to calculate porosity during primary interpretation, based on a response equation. The secondary interpretation, however, needs to be carried out through core analysis. Core analysis is the process of measuring the actual porosity of rock samples taken at some depths underground and then correcting the primary interpretation result with these samples. In actual production, the result of primary interpretation is often inaccurate, and core analysis is costly. Moreover, manual secondary interpretation is highly subjective and places high demands on technicians.

In fact, the analysis of a well can be carried out with the aid of the existing historical data of other wells in the same area. Artificial intelligence technology can independently discover and learn rules from existing historical data and predict the output of new samples. Its way of processing data is completely different from traditional theory. Many scholars have applied artificial intelligence technology to log interpretation [2][3][4], but most of these are supervised methods. In the log interpretation industry, the cost of getting the label information is usually high, so it is often hard to obtain many labeled samples. Therefore, the accuracy of these supervised learning methods is often not high.

Because core analysis can only be carried out at some depths, the secondary interpretation of porosity is a typical application where the dataset consists of a small number of labeled data and a large number of unlabeled data. Experience shows that supervised methods generally tend to overfit when labeled samples are scarce, while semi-supervised methods can make use of unlabeled samples and perform better. At present, there are few studies of semi-supervised learning applications in the log interpretation industry. In this paper, we propose to apply a semi-supervised [...] Experiments with the actual production data of China Oilfield Services Limited show that collaborative regression is superior to other methods in this application.

Semi-supervised Learning

With the development of modern information technology, it is usually easy to acquire a large number of unlabeled samples in many fields, but it takes some cost to get the label information [5]. The generalization performance of supervised learning methods is limited by the number of labeled samples, and if only unsupervised learning is adopted, the value of the labeled samples is wasted, while semi-supervised learning methods can make use of both labeled and unlabeled samples [6].

At present, the most popular semi-supervised learning method is disagreement-based collaborative learning, which takes advantage of the differences between multiple classifiers or regressors to make use of unlabeled samples. It has the advantages of few assumptions, simple and effective learning methods, and a wide application scope, so it is currently the mainstream algorithm in semi-supervised learning. Figure 1 is a schematic diagram of disagreement-based collaborative learning [7].

For the application of disagreement-based collaborative learning to regression problems, Zhou and Li [8] put forward that the authenticity of the sample labels estimated by the regressor can be determined by examining the confidence level of the samples, and that samples with a high confidence level are consistent with the trend of regression. In this paper, a disagreement-based collaborative learning method is designed for the secondary interpretation of porosity, which will be detailed in the next section.

Figure 1. Schematic diagram of the disagreement-based collaborative learning.

Secondary Interpretation of Porosity Based on Disagreement-based Collaborative Learning

Traditional supervised methods only utilize labeled samples. This algorithm mainly solves the problem of how to make use of unlabeled samples to improve the generalization performance when labeled samples are insufficient.

Let $L = \{(x_1, y_1), (x_2, y_2), \dots, (x_n, y_n)\}$ denote the labeled sample set, and $U = \{x'_1, x'_2, \dots, x'_n\}$ denote the unlabeled sample set. Disagreement-based collaborative regression utilizes the sets $L$ and $U$ to train a regressor $f: X \to Y$. The process of the algorithm is designed as follows.

Initialize the Data Set

Randomly pick $N_L$ samples from the labeled dataset to form the set $L$ used for training; the remaining labeled data are retained as the test set. Then randomly pick $N_U$ samples from the unlabeled dataset to form the set $U$.
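The initialization step can be sketched as follows. This is only an illustrative sketch: the function name `initialize`, the fixed seed and the placeholder data are assumptions, not part of the paper.

```python
import random

def initialize(labeled, unlabeled, n_l, n_u, seed=0):
    """Split the labeled data into a training set L and a test set,
    and draw the working set U from the unlabeled data (hypothetical helper)."""
    rng = random.Random(seed)
    shuffled = list(labeled)
    rng.shuffle(shuffled)
    train, test = shuffled[:n_l], shuffled[n_l:]   # L and the held-out test set
    pool = rng.sample(list(unlabeled), n_u)        # U, picked at random
    return train, test, pool

# Placeholder data only: 300 labeled and 4700 unlabeled one-attribute samples.
labeled = [((i / 300.0,), 0.0) for i in range(300)]
unlabeled = [(i / 4700.0,) for i in range(4700)]
train, test, pool = initialize(labeled, unlabeled, n_l=150, n_u=100)
```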

Configure Regressor

In this paper, the kNN regressor, which is simple but effective, is used as the base learner. The configuration of the regressor includes determining the neighbor number $k$ and the distance metric. [...] the defined distance metric: $X_1, X_2, \dots, X_k$. Suppose their labels are $Y_1, Y_2, \dots, Y_k$; then the label of $X_u$ is estimated to be:

$$Y_u = \frac{Y_1 + Y_2 + \cdots + Y_k}{k} \tag{1}$$
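A minimal sketch of the kNN estimate in equation (1) is given below. The function name `knn_predict` and the toy one-attribute samples are assumptions made for illustration; the Minkowski order `p` is exposed as a parameter because the two regressors used later differ only in their distance metric.

```python
def knn_predict(labeled, x_u, k=3, p=2):
    """Estimate the label of x_u as the mean label of its k nearest neighbors.

    labeled: list of (features, label) pairs; p: order of the Minkowski distance.
    """
    def dist(a, b):
        return sum(abs(ai - bi) ** p for ai, bi in zip(a, b)) ** (1.0 / p)

    neighbors = sorted(labeled, key=lambda s: dist(s[0], x_u))[:k]
    return sum(y for _, y in neighbors) / len(neighbors)   # equation (1)

# Toy labeled set (made-up values): one attribute, porosity-like labels.
L = [((0.0,), 0.10), ((0.1,), 0.12), ((0.2,), 0.14), ((0.9,), 0.30)]
y_u = knn_predict(L, (0.05,), k=3)   # mean of 0.10, 0.12 and 0.14
```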

Suppose there are some noisy labeled samples in $L$; as shown in Figure 2, C is a noisy sample. When only one kNN regressor is employed, suppose an unlabeled sample $X_1$ is labeled and then put into $L$. A sample $X_2$ which is very close to $X_1$ will then suffer from the noise more seriously than $X_1$.

Figure 2. Single regressor affected by noise.

But if two regressors with certain differences are employed and $X_1$ is labeled by the other regressor, $X_2$ may suffer from the noise only once. So it is wiser to use two regressors to reduce the effect of noise.

For a sample $X_u$ and two kNN regressors $K_1$ and $K_2$, let $\Omega_1 = \{X_{11}, X_{12}, \dots, X_{1k}\}$ denote the set of $k$-nearest neighboring samples of $X_u$ on $K_1$, and $\Omega_2 = \{X_{21}, X_{22}, \dots, X_{2k}\}$ denote the set on $K_2$. If $\Omega_1$ and $\Omega_2$ are not the same, $K_1$ and $K_2$ are different on $X_u$. The disagreement level of the two regressors can be measured by the difference on the labeled set $L$.

Initialize $L_1$ and $L_2$ with $L$; they denote the labeled sets of $K_1$ and $K_2$.

Collaborative Training

In classification problems, classifiers can provide an estimated probability for every class. Suppose that the probability of a sample $X_1$ belonging to class A is 0.7 and to class B is 0.3, while the probability of a sample $X_2$ belonging to class A is 0.9 and to class B is 0.1; then obviously $X_2$ can be labeled with more confidence.

Unfortunately, in regression problems, the possible predictions are infinite. The definition of authenticity is the key to this algorithm. Samples with a high confidence level should be consistent with the trend of regression, so a sample with a high confidence level should make the mean squared error (MSE) of the regressor on the labeled sample set decrease. Since repeatedly measuring the MSE on the whole labeled set carries a great computational load, the following method is adopted as an approximation.

For an unlabeled sample $X_a$ in $U$, use $K_1$ to predict its label $Y_a$. Let $\Omega = \{X_{a1}, X_{a2}, \dots, X_{ak}\}$ denote the set of $k$-nearest neighboring samples of $X_a$, with labels $Y_{a1}, Y_{a2}, \dots, Y_{ak}$. Let $K_1(X)$ denote the value estimated by $K_1$ for $X$. Then the error of $K_1$ on $\Omega$ is defined as:

$$E = \sum_{i=1}^{k} \left[ Y_{ai} - K_1(X_{ai}) \right]^2 \tag{2}$$

Let $K'_1$ denote the refined regressor which has utilized the information provided by $(X_a, Y_a)$. The error of $K'_1$ on $\Omega$ is:

$$E' = \sum_{i=1}^{k} \left[ Y_{ai} - K'_1(X_{ai}) \right]^2 \tag{3}$$

Then the confidence level of $X_a$ can be defined as:

$$T_a = E - E' \tag{4}$$
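The confidence computation of equations (2) to (4) can be sketched as below. All names are hypothetical, and one detail is an assumption the paper does not settle: when the regressor is evaluated on a sample that is already in the labeled set, that sample is allowed to count as its own nearest neighbor.

```python
def minkowski(a, b, p=2):
    return sum(abs(u - v) ** p for u, v in zip(a, b)) ** (1.0 / p)

def k_nearest(labeled, x, k, p):
    return sorted(labeled, key=lambda s: minkowski(s[0], x, p))[:k]

def knn_predict(labeled, x, k, p):
    nearest = k_nearest(labeled, x, k, p)
    return sum(y for _, y in nearest) / len(nearest)

def confidence(labeled, x_a, k=3, p=2):
    """Return (T_a, Y_a): the error reduction on Omega and the pseudo-label."""
    y_a = knn_predict(labeled, x_a, k, p)          # label predicted by K1
    omega = k_nearest(labeled, x_a, k, p)          # k nearest labeled samples
    refined = labeled + [(x_a, y_a)]               # training set of K1'
    e = sum((y - knn_predict(labeled, x, k, p)) ** 2 for x, y in omega)      # (2)
    e_ref = sum((y - knn_predict(refined, x, k, p)) ** 2 for x, y in omega)  # (3)
    return e - e_ref, y_a                          # (4)

# Toy example with made-up values and k = 2.
labeled = [((0.0,), 0.0), ((0.3,), 0.3), ((1.0,), 1.0)]
t_a, y_a = confidence(labeled, (0.1,), k=2)
```

In this toy case the pseudo-labeled point lies between its two neighbors, so adding it lowers the squared error on both of them and the confidence is positive.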

The sample in $U$ which maximizes $T_a$ is the sample with the highest confidence. Let $X_m$ denote this sample. $K_1$ will put $(X_m, Y_m)$ into $L_2$, and in the same iteration $K_2$ will put its own sample with the highest confidence into $L_1$.

The following is an analysis of the feasibility of this criterion.

First, assume that $X_m$ is among the $k$-nearest neighbors only of some samples in $\Omega$. In this case, maximizing $T_a$ evidently also makes the MSE of the regressor on the whole labeled set decrease most.

Second, assume that $X_m$ is not among the $k$-nearest neighbors of any sample in $\Omega$. In this case, according to (4), its confidence level is zero, which contradicts its selection as the sample with the highest (positive) confidence.

Third, assume that $X_m$ is among the $k$-nearest neighbors of some samples in $\Omega$ and of some other samples not in $\Omega$. In this case it is hard to evaluate whether $X_m$ makes the MSE of the regressor on the whole labeled set decrease most. Nevertheless, experiments show that in most cases this method is effective.

Stopping Condition

Repeat the training process described in the Collaborative Training section until any of these conditions occurs:

1) The maximum number of iterations is reached.

2) The maximum amount of time is exceeded.

3) No sample that can make (4) positive exists in $U$.

The final output for a new sample $X$ is:

$$Y = \frac{K_1(X) + K_2(X)}{2} \tag{5}$$
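Putting the steps together, the whole training loop with its three stopping conditions and the averaged output of equation (5) might look like the sketch below. It is not the authors' implementation: the function names, the re-sampling of the pool in every iteration, the self-inclusive kNN evaluation and the toy two-attribute data are all assumptions.

```python
import random
import time

def minkowski(a, b, p):
    return sum(abs(u - v) ** p for u, v in zip(a, b)) ** (1.0 / p)

def k_nearest(labeled, x, k, p):
    return sorted(labeled, key=lambda s: minkowski(s[0], x, p))[:k]

def knn_predict(labeled, x, k, p):
    nearest = k_nearest(labeled, x, k, p)
    return sum(y for _, y in nearest) / len(nearest)

def confidence(labeled, x_a, k, p):
    """Return (T_a, Y_a) for candidate x_a, following equations (2)-(4)."""
    y_a = knn_predict(labeled, x_a, k, p)
    omega = k_nearest(labeled, x_a, k, p)
    refined = labeled + [(x_a, y_a)]
    e = sum((y - knn_predict(labeled, x, k, p)) ** 2 for x, y in omega)
    e_ref = sum((y - knn_predict(refined, x, k, p)) ** 2 for x, y in omega)
    return e - e_ref, y_a

def co_train(labeled, unlabeled, k=3, p1=2, p2=5,
             max_iter=100, max_seconds=600.0, pool_size=100, seed=0):
    rng = random.Random(seed)
    l1, l2 = list(labeled), list(labeled)          # L1 and L2 both start as L
    remaining = list(unlabeled)
    start = time.monotonic()
    for _ in range(max_iter):                              # condition 1
        if time.monotonic() - start > max_seconds:         # condition 2
            break
        pool = rng.sample(remaining, min(pool_size, len(remaining)))
        picks = []
        for own, p in ((l1, p1), (l2, p2)):
            best = None
            for x in pool:
                t, y = confidence(own, x, k, p)
                if t > 0 and (best is None or t > best[0]):
                    best = (t, x, y)
            picks.append(best)
        if picks[0] is None and picks[1] is None:          # condition 3
            break
        # K1's most confident sample goes into L2, K2's goes into L1.
        for best, target in ((picks[0], l2), (picks[1], l1)):
            if best is not None:
                _, x, y = best
                target.append((x, y))
                if x in remaining:
                    remaining.remove(x)

    def predict(x):
        # Equation (5): average of the two refined regressors.
        return (knn_predict(l1, x, k, p1) + knn_predict(l2, x, k, p2)) / 2.0

    return predict, l1, l2

# Toy data (made up): two attributes, label = their sum, so labels lie in [0, 2].
data_rng = random.Random(1)
labeled = [((a, b), a + b) for a in (0.0, 0.5, 1.0) for b in (0.0, 1.0)]
unlabeled = [(data_rng.random(), data_rng.random()) for _ in range(30)]
predict, l1, l2 = co_train(labeled, unlabeled, max_iter=10, pool_size=10)
y_new = predict((0.6, 0.4))
```

Both picks are computed before either labeled set is updated, so that within one iteration each regressor judges candidates on the set it had at the start of that iteration.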

Experiments

The experimental data are from the actual production data of China Oilfield Services Limited in the Bohai area. Two sets of data from different wells are used in the experiment. Each dataset is made up of 300 labeled samples and 4700 unlabeled samples. Each sample has 7 attributes: diameter, neutron, acoustic time, gamma ray, deep resistivity, density, and photoelectric absorption index. All the input attributes have been normalized to [0.0, 1.0]. The output is porosity.

The $k$ value of regressor $K_1$ is set to 3, and the Euclidean distance is used:

$$D(X_a, X_b) = \left( \sum_{l=1}^{d} \left| X_{a,l} - X_{b,l} \right|^2 \right)^{1/2} \tag{6}$$

The $k$ value of regressor $K_2$ is also set to 3, but the Minkowski distance is used and the $p$ value is set to 5:

$$D(X_a, X_b) = \left( \sum_{l=1}^{d} \left| X_{a,l} - X_{b,l} \right|^p \right)^{1/p} \tag{7}$$
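The two distance metrics can rank neighbors differently, which is the source of disagreement between $K_1$ and $K_2$. The short sketch below illustrates this with made-up two-attribute points; the function name `minkowski` and the points are assumptions for illustration.

```python
def minkowski(x_a, x_b, p):
    """Minkowski distance of order p; p = 2 gives the Euclidean distance (6)."""
    return sum(abs(u - v) ** p for u, v in zip(x_a, x_b)) ** (1.0 / p)

query = (0.0, 0.0)
b = (0.6, 0.6)   # moderately far on both attributes
c = (0.8, 0.1)   # far on one attribute, close on the other

d2_b, d2_c = minkowski(query, b, 2), minkowski(query, c, 2)   # K1: p = 2
d5_b, d5_c = minkowski(query, b, 5), minkowski(query, c, 5)   # K2: p = 5

nearest_k1 = "b" if d2_b < d2_c else "c"
nearest_k2 = "b" if d5_b < d5_c else "c"
```

With $p = 2$ the point c is the nearer one, while with $p = 5$ the largest single-attribute difference dominates and b becomes the nearer one, so the two regressors can end up with different neighbor sets for the same sample.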

The maximum number of iterations $T$ is set to 100, and the maximum amount of time is set to 600 seconds. 50% of the labeled samples are used for training and the remaining 50% are kept as the test set. In each iteration, the unlabeled dataset $U$ contains 100 samples which are randomly picked.

A nonlinear regression algorithm, which is usually adopted in manual log interpretation, is tested for comparison. Moreover, a back-propagation neural network is also tested as a representative of supervised learning.

The experiment was repeated five hundred times for each dataset. The results are shown in Table 1 and Table 2. Average relative error and correlation coefficient are employed to measure the generalization performance. The data in Table 1 and Table 2 refer to the average value over the five hundred runs.
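The two evaluation measures can be sketched as follows. The paper does not spell out their formulas, so the definitions here are assumptions: the average relative error is taken as the mean of |prediction − truth| / |truth|, and the correlation coefficient as the Pearson coefficient. The sample values are made up.

```python
import math

def average_relative_error(y_true, y_pred):
    """Mean of |prediction - truth| / |truth| (assumed definition)."""
    return sum(abs(p - t) / abs(t) for t, p in zip(y_true, y_pred)) / len(y_true)

def correlation_coefficient(y_true, y_pred):
    """Pearson correlation between true and predicted values (assumed definition)."""
    n = len(y_true)
    mean_t, mean_p = sum(y_true) / n, sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)

core = [0.10, 0.20, 0.25]        # made-up core-analysis porosity values
pred = [0.11, 0.18, 0.25]        # made-up predictions
are = average_relative_error(core, pred)
r_linear = correlation_coefficient(core, [2 * t for t in core])
```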


Table 1. Experiment results on set 1.

Algorithm                          Average Relative Error    Correlation Coefficient
Collaborative Regression           12.7%                     0.8576
Nonlinear Regression               20.4%                     0.7922
Back-propagation neural network    16.3%                     0.8254

Table 2. Experiment results on set 2.

Algorithm                          Average Relative Error    Correlation Coefficient
Collaborative Regression           8.6%                      0.9137
Nonlinear Regression               18.4%                     0.8126
Back-propagation neural network    13.2%                     0.8679

Table 1 and Table 2 show that Collaborative Regression has a lower average relative error and a higher correlation coefficient than the other two algorithms on both sets, which indicates that it can exploit unlabeled samples to improve generalization performance. Secondary interpretation of porosity in log interpretation is a typical application task where labeled samples are insufficient, so the generalization performance of supervised learning methods is likely to be limited. Disagreement-based collaborative regression performs better in such cases.

Conclusion

Secondary interpretation of porosity in log interpretation is a typical application task where labeled samples are insufficient. The generalization performance of supervised models is related to the number of labeled samples, so their accuracy is often limited. This paper proposes to use disagreement-based collaborative regression to solve this problem. The algorithm employs two kNN regressors with disagreement, and in every iteration each regressor labels the unlabeled sample with the highest confidence level for the other regressor. This algorithm can exploit unlabeled samples to improve regression estimates. The experimental results show that on both sets disagreement-based collaborative regression is superior to the other algorithms.

This method does not require rich experience or high cost, so it can avoid the shortcomings of manual interpretation and serve as a substitute for it. In actual production, especially in newly developed areas where labeled data are fairly scarce, it has certain application value.

References

[1] B.T. Sun, C.C. Zhou, and J.W. Zhao. Identification and Evaluation of petroleum reservoir logging, 1st ed. Petroleum Industry Press: Beijing, China, 2014, pp. 1-15.

[2] A. Dashti, E. Sefidari. Physical properties modeling of reservoirs in Mansuri oil field, Zagros region, Iran. Petroleum Exploration and Development, vol. 43, pp. 559-563, April 2016.

[3] R.B. Han, et al. Selection of Model Variables for Pattern Recognition Methods and Its Application in Low Resistivity Pay Identification. Well Logging Technology, vol. 41, pp. 171-175, February 2017.

[4] M. Li, K.G. Chen, Z. Yang, J.H. Zhang, X. Liu. Complicated Lithology Identification in Heavy Oil Reservoir Based on Pattern Recognition. Well Logging Technology, vol. 41, pp. 453-457, April 2017.

[5] J.Y. Liang, J.W. Gao, Y. Chang. Research Progress of Semi-supervised Learning. Journal of Shanxi University, vol. [...]

[6] X.J. Zhu. Semi-supervised Learning Literature Survey. Madison: University of Wisconsin, 2008.

[7] Z.H. Zhou. Disagreement-based Semi-supervised Learning. Acta Automatica Sinica, vol. 39, pp. 1871-1878, November 2013.

[8] Z.H. Zhou, M. Li. Semi-supervised regression with co-training. International Joint Conference on Artificial Intelligence [...]
