2018 International Conference on Computational, Modeling, Simulation and Mathematical Statistics (CMSMS 2018)
ISBN: 978-1-60595-562-9

Study of the Application of Disagreement-based Collaborative Regression in Log Interpretation

Yu-ze ZHENG1, Zhao-hui YE1 and Cong-hui ZHANG2
1Department of Automation, Tsinghua University, Beijing 100084, China
2China Oilfield Services Limited, Sanhe, Hebei 065201, China

Keywords: Semi-supervised learning, Disagreement-based, Collaborative regression, Log interpretation.

Abstract. In the field of log interpretation, it is easy to acquire a large amount of data, but obtaining label information is costly, so labeled samples are often insufficient. The secondary interpretation of porosity is a typical application task with few labeled samples and a large number of unlabeled samples. Manual interpretation has the disadvantages of strong subjectivity and low accuracy. A disagreement-based co-training style semi-supervised regression algorithm is proposed as an alternative to manual interpretation. Two kNN regressors with disagreement are employed. Each of them labels unlabeled samples with a high confidence level for the other, to improve the regression estimates. The method was verified through experiments, which showed that the generalization performance of this semi-supervised model is better than that of the supervised models tested in such cases.

Introduction

In the process of petroleum exploration [1], people use special instruments, based on acoustic waves, electricity, and radioactivity, to measure various parameters of the stratum at different depths in the well, and then analyze the parameters. This is called log interpretation. Because reservoir resources are generally distributed in underground interconnected pores, cracks or rocks, prediction of porosity is important in the process of log interpretation. Prediction of porosity includes primary interpretation and secondary interpretation. Traditionally, people use the acquired data to calculate porosity in the process of primary interpretation, based on a response equation. The secondary interpretation, however, needs to be carried out through core analysis. Core analysis is the process of measuring the actual porosity of rock samples taken at some depths underground and then correcting the primary interpretation result with these samples. In actual production, the result of primary interpretation is often inaccurate, and core analysis is costly. Moreover, secondary interpretation through manual methods is highly subjective and places high demands on technicians.

In fact, the analysis of a well can be carried out with the aid of the existing historical data of other wells in the same area. Artificial intelligence technology can independently discover and learn rules from existing historical data and predict the output of new samples. Its way of processing data is completely different from traditional theory. Many scholars have applied artificial intelligence technology to log interpretation [2][3][4], but most of these are supervised methods. In the log interpretation industry, the cost of getting label information is usually high, so it is often hard to obtain many labeled samples. Therefore, the accuracy of these supervised learning methods is often not high.

Because core analysis can only be carried out at some depths, the secondary interpretation of porosity is a typical application where the dataset consists of a small number of labeled data and a large number of unlabeled data. Experience shows that supervised methods generally tend to overfit in situations where labeled samples are scarce, while semi-supervised methods can make use of unlabeled samples and perform better. At present, there are few studies of semi-supervised learning applications in the log interpretation industry. In this paper, we propose to apply a semi-supervised method, disagreement-based collaborative regression, to the secondary interpretation of porosity. Experiments with the actual production data of China Oilfield Services Limited show that collaborative regression is superior to other methods in this application.

Semi-supervised Learning

With the development of modern information technology, it is usually easy to acquire a large number of unlabeled samples in many fields, but getting the label information comes at some cost [5]. The generalization performance of supervised learning methods is limited by the number of labeled samples, and if only unsupervised learning is adopted, the value of the labeled samples is wasted, while semi-supervised learning methods can make use of both labeled and unlabeled samples [6].

At present, the most popular semi-supervised learning method is disagreement-based collaborative learning, which takes advantage of the differences between multiple classifiers or regressors to make use of unlabeled samples. It has the advantages of few assumptions, simple and effective learning methods, and a wide application scope, so it is currently the mainstream algorithm in semi-supervised learning. Figure 1 is a schematic diagram of disagreement-based collaborative learning [7].

For the application of disagreement-based collaborative learning to regression problems, Zhou and Li [8] proposed that the authenticity of the sample labels estimated by a regressor can be determined by examining the confidence level of the samples, where a sample with a high confidence level is one that is consistent with the trend of regression. In this paper, a disagreement-based collaborative learning method is designed for the secondary interpretation of porosity, which is detailed in the next section.

Figure 1. Schematic diagram of the disagreement-based collaborative learning.

Secondary Interpretation of Porosity Based on Disagreement-based Collaborative Learning

Traditional supervised methods only utilize labeled samples. This algorithm mainly addresses the problem of how to make use of unlabeled samples to improve the generalization performance when labeled samples are insufficient.

Let L={(x1,y1),(x2,y2)...(xn,yn)} denote the labeled sample set, and U={x1',x2'...xn'} denote the unlabeled sample set. Disagreement-based collaborative regression utilizes the sets L and U to train a regressor f: X → Y. The process of the algorithm is designed as follows.

Initialize the Data Set

Randomly pick NL samples from the labeled dataset to form the set L used for training; the remaining labeled data is retained as the test set. Then randomly pick NU samples from the unlabeled dataset to form the set U.
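
The initialization step can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the paper: the function name `init_sets`, the `(feature_tuple, label)` sample layout, the fixed random seed and the toy data are all hypothetical.

```python
import random

def init_sets(labeled, unlabeled, n_l, n_u, seed=0):
    """Split the labeled data into a training set L and a test set, then draw U."""
    rng = random.Random(seed)                   # fixed seed so the split is repeatable
    order = list(range(len(labeled)))
    rng.shuffle(order)
    train = [labeled[i] for i in order[:n_l]]   # set L, used for training
    test = [labeled[i] for i in order[n_l:]]    # remaining labeled data, kept for testing
    pool = rng.sample(unlabeled, n_u)           # set U, drawn without replacement
    return train, test, pool

# Toy data: each labeled sample is (feature_tuple, label).
labeled = [((i / 10.0,), i / 10.0) for i in range(10)]
unlabeled = [(i / 100.0,) for i in range(50)]
L, test_set, U = init_sets(labeled, unlabeled, n_l=5, n_u=20)
print(len(L), len(test_set), len(U))  # 5 5 20
```

Keeping the test set disjoint from L means that labels used for evaluation never take part in training.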

Configure Regressor

In this paper, the kNN regressor, which is simple but effective, is used as the base learner. The configuration of the regressor includes determining the neighbor number k and the distance metric. For an unlabeled sample Xu, the regressor finds its k nearest neighbors in the labeled set under the defined distance metric: X1,X2...Xk. Suppose their labels are Y1,Y2...Yk; then the label of Xu is estimated to be:

$$Y_u = \frac{Y_1 + Y_2 + \cdots + Y_k}{k} \tag{1}$$
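
As a minimal sketch of the estimate in Eq. (1), the following Python code averages the labels of the k nearest labeled samples. The helper names `minkowski` and `knn_predict`, the data layout and the toy values are assumptions made for illustration, not the authors' implementation.

```python
def minkowski(a, b, p=2):
    """Minkowski distance of order p (p=2 gives the Euclidean distance)."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

def knn_predict(train, x, k=3, p=2):
    """Estimate the label of x as the mean label of its k nearest labeled samples."""
    nearest = sorted(train, key=lambda s: minkowski(s[0], x, p))[:k]
    return sum(y for _, y in nearest) / len(nearest)

# Toy labeled set of (feature_tuple, label) pairs; the last sample is far away.
train = [((0.0,), 1.0), ((1.0,), 2.0), ((2.0,), 3.0), ((10.0,), 50.0)]
y_u = knn_predict(train, (1.1,), k=3)
print(y_u)  # 2.0, the mean of the labels 1.0, 2.0 and 3.0
```

The distant sample with label 50.0 does not influence the estimate, because only the k nearest neighbors contribute.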

Suppose there are some noisy labeled samples in L; as shown in Figure 2, C is a noisy sample. When only one kNN regressor is employed, suppose an unlabeled sample X1 is labeled and then put into L. A sample X2 which is very close to X1 will then suffer from the noise more seriously than X1.

Figure 2. Single regressor affected by noise.

But if two regressors with certain differences are employed and X1 is labeled by the other regressor, X2 may suffer from the noise only once. So it is wiser to use two regressors to reduce the effect of noise.

For a sample Xu and two kNN regressors K1 and K2, let Ω1={X11,X12...X1k} denote the set of k-nearest neighboring samples of Xu on K1, and Ω2={X21,X22,...X2k} denote the set on K2. If Ω1 and Ω2 are not the same, K1 and K2 differ on Xu. The disagreement level of the two regressors can be measured by this difference on the labeled set L.

Initialize L1 and L2 with L; they denote the labeled sets of K1 and K2 respectively.

Collaborative Training

In classification problems, classifiers can provide an estimated probability for every class. Suppose that the probability of a sample X1 belonging to class A is 0.7 and to class B is 0.3, while the probability of a sample X2 belonging to class A is 0.9 and to class B is 0.1; then X2 can obviously be labeled with more confidence.

Unfortunately, in regression problems the number of possible predictions is infinite. The definition of authenticity is the key to this algorithm. Samples with a high confidence level should be consistent with the trend of regression, so a sample with a high confidence level should make the mean squared error (MSE) of the regressor on the labeled sample set decrease. Since repeatedly measuring the MSE on the whole labeled set carries a great computational load, the following method is adopted as an approximation.

For an unlabeled sample Xa in U, use K1 to predict its label Ya. Let Ω={Xa1,Xa2...Xak} denote the set of k-nearest neighboring samples of Xa, and let their labels be Ya1,Ya2,...Yak. Let K1(X) denote the value estimated by K1 for X. Then the error of K1 on Ω is defined as:

$$E_a = \sum_{i=1}^{k} \left[ Y_{ai} - K_1(X_{ai}) \right]^2 \tag{2}$$

Let K1' denote the refined regressor which has utilized the information provided by (Xa,Ya). The error of K1' on Ω is:

$$E_a' = \sum_{i=1}^{k} \left[ Y_{ai} - K_1'(X_{ai}) \right]^2 \tag{3}$$

Then the confidence level of Xa can be defined as:

$$T_a = E_a - E_a' \tag{4}$$
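
The confidence computation of Eqs. (2)-(4) can be sketched as follows. This is a hedged illustration: the helper names and toy data are hypothetical, and it assumes that each labeled neighbor is predicted from the full labeled set (so a neighbor counts itself among its own k nearest samples), a detail the paper does not specify.

```python
def dist(a, b, p=2):
    """Minkowski distance of order p."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

def neighbors(train, x, k, p=2):
    """The k labeled samples nearest to x."""
    return sorted(train, key=lambda s: dist(s[0], x, p))[:k]

def predict(train, x, k, p=2):
    """kNN estimate: mean label of the k nearest labeled samples."""
    near = neighbors(train, x, k, p)
    return sum(y for _, y in near) / len(near)

def confidence(train, x_a, k=3, p=2):
    """Return (T_a, Y_a) for an unlabeled sample x_a."""
    y_a = predict(train, x_a, k, p)        # pseudo-label Y_a given by the regressor
    omega = neighbors(train, x_a, k, p)    # labeled neighbors of x_a (the set Omega)
    refined = train + [(x_a, y_a)]         # training set of the refined regressor
    e_old = sum((y - predict(train, x, k, p)) ** 2 for x, y in omega)    # Eq. (2)
    e_new = sum((y - predict(refined, x, k, p)) ** 2 for x, y in omega)  # Eq. (3)
    return e_old - e_new, y_a              # Eq. (4)

# Toy labeled set of (feature_tuple, label) pairs.
train = [((0.0,), 0.0), ((1.0,), 1.2), ((2.2,), 2.0), ((3.5,), 3.9), ((5.0,), 5.0)]
t_near, y_near = confidence(train, (2.0,))
t_far, y_far = confidence(train, (100.0,))
print(t_far)  # 0.0: the far sample is nobody's neighbor, so the error is unchanged
```

Only the k neighbors of the candidate are re-evaluated, which is what makes this cheaper than recomputing the MSE on the whole labeled set.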

The sample in U which maximizes Ta is the sample with the highest confidence. Let Xm denote this sample. K1 puts (Xm,Ym) into L2, and in the same iteration K2 puts the sample with its highest confidence into L1.

The following is an analysis of the feasibility of this criterion.

First, assume that Xm is among the k-nearest neighbors only of some samples in Ω. In this case, maximizing Ta evidently also makes the MSE of the regressor on the whole labeled set decrease the most.

Second, assume that Xm is not among the k-nearest neighbors of any sample in Ω. In this case, according to (4) its confidence level is zero, which contradicts its selection as the most confident sample, because a selected sample must make (4) positive.

Third, assume that Xm is among the k-nearest neighbors of some samples in Ω and of some other samples not in Ω. In this case it is hard to evaluate whether Xm makes the MSE of the regressor on the whole labeled set decrease the most. Nevertheless, experiments show that in most cases this method is effective.

Stopping Condition

Repeat the training process described in the Collaborative Training step until any of these conditions occurs:
1) The maximum number of iterations is reached.
2) The maximum amount of time is exceeded.
3) No sample that makes (4) positive exists in U.

The final output for a new sample X is:

$$Y = \frac{K_1(X) + K_2(X)}{2} \tag{5}$$
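
Putting the steps together, the whole training loop and the final prediction of Eq. (5) might look like the sketch below. It is an illustration under stated assumptions rather than the authors' code: all function names and the toy data are hypothetical, the two regressors are assumed to differ only in the order p of their distance metric, a fresh random pool is drawn from the unlabeled data in every iteration, and stopping condition 3 is checked on that pool only.

```python
import random
import time

def dist(a, b, p):
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

def neighbors(train, x, k, p):
    return sorted(train, key=lambda s: dist(s[0], x, p))[:k]

def predict(train, x, k, p):
    near = neighbors(train, x, k, p)
    return sum(y for _, y in near) / len(near)

def best_candidate(train, pool, k, p):
    """Return (T, x, y) for the pool sample with the largest confidence T."""
    best = None
    for x in pool:
        y = predict(train, x, k, p)
        omega = neighbors(train, x, k, p)
        refined = train + [(x, y)]
        e_old = sum((yi - predict(train, xi, k, p)) ** 2 for xi, yi in omega)
        e_new = sum((yi - predict(refined, xi, k, p)) ** 2 for xi, yi in omega)
        if best is None or e_old - e_new > best[0]:
            best = (e_old - e_new, x, y)
    return best

def co_train(labeled, unlabeled, k=3, p1=2, p2=5, max_iter=100,
             max_seconds=600.0, pool_size=100, seed=0):
    """Train two kNN regressors that label samples for each other."""
    rng = random.Random(seed)
    l1, l2 = list(labeled), list(labeled)      # L1 and L2 both start as L
    remaining = list(unlabeled)
    start = time.monotonic()
    for _ in range(max_iter):                  # condition 1: iteration limit
        if time.monotonic() - start > max_seconds or not remaining:
            break                              # condition 2: time limit (or no data left)
        pool = rng.sample(remaining, min(pool_size, len(remaining)))
        t1, x1, y1 = best_candidate(l1, pool, k, p1)
        t2, x2, y2 = best_candidate(l2, pool, k, p2)
        picked = set()
        if t1 > 0:                             # K1 labels its best sample for K2
            l2.append((x1, y1))
            picked.add(x1)
        if t2 > 0:                             # K2 labels its best sample for K1
            l1.append((x2, y2))
            picked.add(x2)
        if not picked:
            break                              # condition 3: no positive confidence
        for x in picked:
            remaining.remove(x)
    return l1, l2

def final_predict(l1, l2, x, k=3, p1=2, p2=5):
    """Average the two regressors, as in Eq. (5)."""
    return (predict(l1, x, k, p1) + predict(l2, x, k, p2)) / 2.0

# Toy problem (illustrative only): y = x1 + x2 with a few labeled points.
toy_rng = random.Random(1)
points = [(toy_rng.random(), toy_rng.random()) for _ in range(60)]
labeled = [(pt, pt[0] + pt[1]) for pt in points[:12]]
unlabeled = points[12:]
l1, l2 = co_train(labeled, unlabeled, max_iter=5, pool_size=10)
print(final_predict(l1, l2, (0.5, 0.5)))
```

Because every pseudo-label is an average of existing labels, the final prediction always stays within the range of the original labels.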

Experiments

The experimental data are from the actual production data of China Oilfield Services Limited in the Bohai area. Two sets of data from different wells are used in the experiment. Each dataset is made up of 300 labeled samples and 4700 unlabeled samples. Each sample has 7 attributes: diameter, neutron, acoustic time, gamma ray, deep resistivity, density, and photoelectric absorption index. All the input attributes have been normalized to [0.0, 1.0]. The output is porosity.
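
The paper does not state how the attributes were normalized. One common choice, shown here purely as an assumption, is per-attribute min-max scaling; the function name and the raw readings below are hypothetical.

```python
def min_max_normalize(rows):
    """Scale every attribute (column) of rows to the range [0.0, 1.0]."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    spans = [max(c) - min(c) for c in columns]
    return [
        tuple((v - lo) / sp if sp > 0 else 0.0   # constant column maps to 0.0 (assumption)
              for v, lo, sp in zip(row, lows, spans))
        for row in rows
    ]

# Hypothetical raw readings for three attributes at three depths.
raw = [(8.5, 0.20, 60.0), (9.0, 0.30, 90.0), (10.5, 0.25, 75.0)]
scaled = min_max_normalize(raw)
print(scaled[0])  # (0.0, 0.0, 0.0)
```

Bringing all attributes to the same range keeps any single attribute with large raw values from dominating the distance between samples.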

The k value of regressor K1 is set to 3, and the Euclidean distance is used:

$$D(X_a, X_b) = \left( \sum_{l=1}^{d} \left| X_{a,l} - X_{b,l} \right|^{2} \right)^{1/2} \tag{6}$$

The k value of regressor K2 is also set to 3, but the Minkowski distance is used and the p value is set to 5:

$$D(X_a, X_b) = \left( \sum_{l=1}^{d} \left| X_{a,l} - X_{b,l} \right|^{p} \right)^{1/p} \tag{7}$$
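
To illustrate how the two metrics create disagreement, the sketch below computes the neighbor sets of one query point under p=2 and p=5. The function names and the three points are hypothetical and chosen only so that the two neighbor sets differ.

```python
def minkowski(a, b, p):
    """Minkowski distance of order p between two attribute vectors, Eq. (7)."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

def neighbor_ids(points, x, k, p):
    """Indices of the k points nearest to x under the order-p metric."""
    order = sorted(range(len(points)), key=lambda i: minkowski(points[i], x, p))
    return set(order[:k])

print(minkowski((0.0, 0.0), (3.0, 4.0), 2))  # 5.0 (Euclidean case, Eq. (6))

# Hypothetical points: index 1 lies on the diagonal, index 2 on an axis.
points = [(0.1, 0.1), (0.6, 0.6), (0.8, 0.0)]
omega1 = neighbor_ids(points, (0.0, 0.0), k=2, p=2)   # neighbors seen by K1
omega2 = neighbor_ids(points, (0.0, 0.0), k=2, p=5)   # neighbors seen by K2
print(omega1, omega2)  # {0, 2} {0, 1}
```

A larger p gives more weight to the largest single-attribute difference, so the axis-aligned point looks farther away to K2 than the diagonal one, while K1 ranks them the other way round.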

The maximum number of iterations T is set to 100, and the maximum amount of time is set to 600 seconds. 50% of the labeled samples are used for training and the remaining 50% are kept as the test set. In each iteration, the unlabeled dataset U contains 100 samples which are randomly picked.

A nonlinear regression algorithm, which is usually adopted in manual log interpretation, is tested for comparison. Moreover, a back-propagation neural network is also tested as a representative of supervised learning.

The experiment was repeated five hundred times for each dataset. The results are shown in Table 1 and Table 2. Average relative error and correlation coefficient are employed to measure generalization performance. The data in Table 1 and Table 2 are the average values over the five hundred runs.
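
The paper does not give formulas for the two measures. A plausible reading, stated here as an assumption, is the mean of |prediction - truth| / |truth| for the average relative error and the Pearson coefficient for the correlation; the function names and sample values below are hypothetical.

```python
import math

def average_relative_error(y_true, y_pred):
    """Mean of |prediction - truth| / |truth| (assumes no true value is zero)."""
    return sum(abs(p - t) / abs(t) for t, p in zip(y_true, y_pred)) / len(y_true)

def correlation_coefficient(y_true, y_pred):
    """Pearson correlation between true and predicted values."""
    n = len(y_true)
    mt, mp = sum(y_true) / n, sum(y_pred) / n
    cov = sum((t - mt) * (p - mp) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mt) ** 2 for t in y_true)
    var_p = sum((p - mp) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)

# Hypothetical porosity values, for illustration only.
y_true = [0.10, 0.20, 0.25, 0.40]
y_pred = [0.11, 0.18, 0.25, 0.44]
print(average_relative_error(y_true, y_pred))
print(correlation_coefficient(y_true, y_pred))
```

A lower average relative error and a correlation coefficient closer to 1 both indicate better generalization on the test set.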

Table 1. Experiment result on set 1.
Algorithm                          Average Relative Error   Correlation Coefficient
Collaborative Regression           12.7%                    0.8576
Nonlinear Regression               20.4%                    0.7922
Back-propagation neural network    16.3%                    0.8254

Table 2. Experiment result on set 2.
Algorithm                          Average Relative Error   Correlation Coefficient
Collaborative Regression           8.6%                     0.9137
Nonlinear Regression               18.4%                    0.8126
Back-propagation neural network    13.2%                    0.8679

Table 1 and Table 2 show that Collaborative Regression has a lower average relative error and a higher correlation coefficient than the other two algorithms on both sets, which indicates that it can exploit unlabeled samples to improve generalization performance. Secondary interpretation of porosity in log interpretation is a typical application task where labeled samples are insufficient, so the generalization performance of supervised learning methods is likely to be limited. Disagreement-based collaborative regression performs better in such cases.

Conclusion

Secondary interpretation of porosity in log interpretation is a typical application task where labeled samples are insufficient. The generalization performance of supervised models is related to the number of labeled samples, so their accuracy is often limited. This paper proposes to use disagreement-based collaborative regression to solve this problem. The algorithm employs two kNN regressors with disagreement, and in every iteration each regressor labels the unlabeled sample with the highest confidence level for the other regressor. This algorithm can exploit unlabeled samples to improve regression estimates. The experimental results show that on both sets disagreement-based collaborative regression is superior to the other algorithms.

This method does not require rich experience or high cost, so it can avoid the shortcomings of manual interpretation and serve as a substitute for it. In actual production, especially in newly developed areas where labeled data is fairly scarce, it has certain application value.

References

[1] B.T. Sun, C.C. Zhou, and J.W. Zhao. Identification and Evaluation of petroleum reservoir logging, 1st ed. Petroleum Industry Press: Beijing, China, 2014, pp. 1-15.
[2] A. Dashti, E. Sefidari. Physical properties modeling of reservoirs in Mansuri oil field, Zagros region, Iran. Petroleum Exploration and Development, vol. 43, pp. 559-563, April 2016.
[3] R.B. Han, et al. Selection of Model Variables for Pattern Recognition Methods and Its Application in Low Resistivity Pay Identification. Well Logging Technology, vol. 41, pp. 171-175, February 2017.
[4] M. Li, K.G. Chen, Z. Yang, J.H. Zhang, X. Liu. Complicated Lithology Identification in Heavy Oil Reservoir Based on Pattern Recognition. Well Logging Technology, vol. 41, pp. 453-457, April 2017.
[5] J.Y. Liang, J.W. Gao, Y. Chang. Research Progress of Semi-supervised Learning. Journal of Shanxi University, vol.
[6] X.J. Zhu. Semi-supervised Learning Literature Survey. Madison: University of Wisconsin, 2008.
[7] Z.H. Zhou. Disagreement-based Semi-supervised Learning. Acta Automatica Sinica, vol. 39, pp. 1871-1878, November 2013.
[8] Z.H. Zhou, M. Li. Semi-supervised regression with co-training. International Joint Conference on Artificial Intelligence