• No results found

Regression on Probability Measures: A Simple and Consistent Algorithm

N/A
N/A
Protected

Academic year: 2021

Share "Regression on Probability Measures: A Simple and Consistent Algorithm"

Copied!
62
0
0

Loading.... (view fulltext now)

Full text

(1)

Regression on Probability Measures: A Simple and

Consistent Algorithm

Zolt´an Szab´o (Gatsby Unit, UCL)

Joint work with

◦Bharath K. Sriperumbudur (Department of Statistics, PSU),

◦Barnab´as P´oczos (ML Department, CMU),

◦Arthur Gretton (Gatsby Unit, UCL)

CRiSM Seminars Department of Statistics,

University of Warwick May 29, 2015

(2)

The task

Samples: {(xi,yi)}l

i=1. Goal: f(xi)≈yi, findf ∈H.

Distribution regression:

xi-s are distributions,

available only through samples: {xi,n}Nn=1i .

(3)

Example: aerosol prediction from satellite images

Bag := pixels of a multispectral satellite image over an area. Label of a bag := aerosol value.

Relevance: climate research.

Engineered methods [Wang et al., 2012]: 100×RMSE = 7.5−8.5.

(4)

Wider context

Context:

machine learning: multi-instance learning,

statistics: point estimation tasks (without analytical formula).

Applications:

computer vision: image = collection of patchvectors, network analysis: group of people = bag of friendshipgraphs, natural language processing: corpus = bag ofdocuments,

(5)

Several algorithmic approaches

1 Parametric fit: Gaussian, MOG, exp. family

[Jebara et al., 2004, Wang et al., 2009, Nielsen and Nock, 2012]. 2 Kernelized Gaussian measures:

[Jebara et al., 2004, Zhou and Chellappa, 2006]. 3 (Positive definite) kernels:

[Cuturi et al., 2005, Martins et al., 2009, Hein and Bousquet, 2005]. 4 Divergence measures (KL, R´enyi, Tsallis): [P´oczos et al., 2011]. 5 Set metrics: Hausdorff metric [Edgar, 1995]; variants

[Wang and Zucker, 2000, Wu et al., 2010, Zhang and Zhou, 2009, Chen and Wu, 2012].

(6)

Theoretical guarantee?

MIL dates back to [Haussler, 1999, G¨artner et al., 2002].

Sensible methods in regression: require density estimation [P´oczos et al., 2013, Oliva et al., 2014, Reddi and P´oczos, 2014] + assumptions:

1 compact Euclidean domain.

(7)

Kernel, RKHS

k :D×DRkernel on D, if

∃ϕ:DH(ilbert space) feature map,

(8)

Kernel, RKHS

k :D×DRkernel on D, if

∃ϕ:DH(ilbert space) feature map,

k(a,b) =hϕ(a), ϕ(b)iH (∀a,b∈D). Kernel examples: D=Rd (p >0,θ > 0) k(a,b) = (ha,bi+θ)p: polynomial, k(a,b) =e−ka−bk22/(2θ 2) : Gaussian, k(a,b) =e−θka−bk1: Laplacian. In the H=H(k) RKHS (∃!): ϕ(u) =k(·,u).

(9)

Kernel: example domains (

D

)

Euclidean space: D=Rd.

Graphs, texts, time series, dynamical systems.

(10)

Problem formulation (

Y

=

R

)

Given:

labelled bags ˆz={(ˆxi,yi)}ℓi=1,

ith bag: ˆx

i={xi,1, . . . ,xi,N}i

.i.d.

∼ xi ∈P(D),yi ∈R.

(11)

Problem formulation (

Y

=

R

)

Given:

labelled bags ˆz={(ˆxi,yi)}ℓi=1,

ith bag: ˆx

i={xi,1, . . . ,xi,N}i

.i.d.

∼ xi ∈P(D),yi ∈R.

Task: find aP(D)Rmapping based on ˆz.

Construction: distribution embedding (µx)

(12)

Problem formulation (

Y

=

R

)

Given:

labelled bags ˆz={(ˆxi,yi)}ℓi=1,

ith bag: ˆx

i={xi,1, . . . ,xi,N}i

.i.d.

∼ xi ∈P(D),yi ∈R.

Task: find aP(D)Rmapping based on ˆz.

Construction: distribution embedding (µx) +ridge regression

(13)

Problem formulation (

Y

=

R

)

Given:

labelled bags ˆz={(ˆxi,yi)}ℓi=1,

ith bag: ˆx

i={xi,1, . . . ,xi,N}i

.i.d.

∼ xi ∈P(D),yi ∈R.

Task: find aP(D)Rmapping based on ˆz.

Construction: distribution embedding (µx) +ridge regression

P(D)−−−−→µ=µ(k) X H =H(k)−−−−−−−→f∈H=H(K) R.

Our goal: risk bound compared to the regression function

fρ(µx) = Z

R

(14)

Goal in details

Expected risk:

R[f] =E(x,y)|f(µx)−y|2.

Contribution: analysis of the excess risk

(15)

Goal in details

Expected risk:

R[f] =E(x,y)|f(µx)−y|2.

Contribution: analysis of the excess risk

(16)

Goal in details

Expected risk:

R[f] =E(x,y)|f(µx)−y|2.

Contribution: analysis of the excess risk

E(fˆzλ,fρ) =R[fˆzλ]− R[fρ]≤g(ℓ,N,λ)→0 and rates, fˆzλ = arg min f∈H 1 ℓ ℓ X i=1 |f(µˆxi)−yi| 2 +λkfk2H, (λ >0).

(17)

Goal in details

Expected risk:

R[f] =E(x,y)|f(µx)−y|2.

Contribution: analysis of the excess risk

E(fˆzλ,fρ) =R[fˆzλ]− R[fρ]≤g(ℓ,N,λ)→0 and rates, fˆzλ = arg min f∈H 1 ℓ ℓ X i=1 |f(µˆxi)−yi| 2 +λkfk2H, (λ >0). We consider two settings:

1 well-specified case: fρ∈H,

2 misspecified case: fρ∈L2 ρX\H.

(18)

Step-1: mean embedding

k :D×DR kernel; canonical feature map: ϕ(u) =k(·,u).

Mean embedding of a distribution x,xiˆ ∈P(D):

µx = Z D k(·,u)dx(u)H(k), µxˆi = Z D k(·,u)dxˆi(u) = 1 N N X n=1 k(·,xi,n).

(19)

Step-1: mean embedding

k :D×DR kernel; canonical feature map: ϕ(u) =k(·,u).

Mean embedding of a distribution x,xiˆ ∈P(D):

µx = Z D k(·,u)dx(u)H(k), µxˆi = Z D k(·,u)dxˆi(u) = 1 N N X n=1 k(·,xi,n).

Linear K ⇒ set kernel:

K(µxˆi, µˆxj) = µxˆi, µˆxj H = 1 N2 N X n,m=1 k(xi,n,xj,m).

(20)

Step-1: mean embedding

k :D×DR kernel; canonical feature map: ϕ(u) =k(·,u).

Mean embedding of a distribution x,xiˆ P(D):

µx = Z D k(·,u)dx(u)H(k), µxˆi = Z D k(·,u)dxˆi(u) = 1 N N X n=1 k(·,xi,n). NonlinearK example: K(µxˆi, µˆxj) =e −kµˆxi−µˆxjk 2 H 2σ2 .

(21)

Step-2: ridge regression (analytical solution)

Given: training sample: ˆz, test distribution: t. Prediction on t: (fˆzλµ)(t) =k(K+ℓλIℓ)−1[y1;. . .;yℓ], K = [K(µˆxi, µˆxj)]∈R ℓ×ℓ, k= [K(µˆx1, µt), . . . ,K(µˆxℓ, µt)]∈R 1×ℓ. (1) (2) (3)
(22)

Blanket assumptions: both settings

D: separable, topological domain.

k:

bounded: supu∈Dk(u,u)≤Bk ∈(0,∞),

continuous.

K: bounded; H¨older continuous: L>0,h (0,1] such that

kK(·, µa)−K(·, µb)kH≤Lkµa−µbk

h H.

y: bounded.

(23)

Well-specified case: performance guarantee

Difficulty of the task:

fρis ’c-smooth’,

’b-decaying covariance operator’.

Contribution: If ℓλ−1b−1, then with high probability

E(fλ ˆz,fρ)≤ logh() Nhλ3 +λ c+ 1 ℓ2λ+ 1 ℓλ1b | {z } g(ℓ,N,λ) .

(24)

Well-specified case: performance guarantee

Difficulty of the task:

fρis ’c-smooth’,

’b-decaying covariance operator’.

Contribution: If ℓ≥λ−1b−1, then with high probability

E(fˆzλ,fρ)≤ log h() Nhλ3 + λc + 1 ℓ2λ+ 1 ℓλ1b | {z } g(ℓ,N,λ) . (4) ˆ xi c-smoothness

(25)

Well-specified case: example

Assume

b is ’large’ (1/b≈0, ’small’ effective input dimension),

h = 1 (K: Lipschitz), ✄ ✂1 =✁ ✄ ✂2 in (4)✁ ⇒λ;ℓ=N a (a>0),

t =ℓN: total number of samples processed.

Then 1 c = 2 (’smooth’fρ): E(fλ ˆz,fρ)≈t− 2 7 –faster convergence, 2 c = 1 (’non-smooth’fρ): E(fλ ˆ z ,fρ)≈t− 1 5 – slower.

(26)

Misspecified case: performance guarantee

Difficulty of the task:

fρis ’s-smooth’ (s>0).

Contribution:

IfL2ρX is separable and 1 λ2 ≤l,

then with high probability E(fλ ˆ z ,fρ)≤ logh2(l) Nh2λ 3 2 +1 lλ+ √ λmin(1,s) λ√l +λ min(1,s) | {z } g(ℓ,N,λ) .

(27)

Misspecified case: performance guarantee

Difficulty of the task:

fρis ’s-smooth’ (s>0).

Contribution: If

L2ρX is separable and 1 λ2 ≤l,

then with high probability E(fˆzλ,fρ)≤ logh2(l) Nh2λ32 + 1 √ lλ+ √ λmin(1,s) λ√l +λ min(1,s) | {z } g(ℓ,N,λ) . (5) ˆ xi s-smoothness

(28)

Misspecified case: example

Assume s 1,h= 1 (K: Lipschitz), ✄ ✂1 =✁ ✄ ✂3 in (5)✁ ⇒λ;ℓ=N a (a>0)

t =ℓN: total number of samples processed.

Then 1 s = 1 (’non-smooth’fρ): E(fλ ˆ z ,fρ)≈t−0.25 – slower, 2 s → ∞ (’smooth’fρ): E(fλ ˆ z ,fρ)≈t−0.5 – fasterconvergence.

(29)

Notes on the assumptions:

ρ,

X

∈ B

(

H

)

k: bounded, continuous

µ: (P(D),B(τw))(H,B(H)) measurable.

µmeasurable,X ∈ B(H)ρonX ×Y: well-defined.

If (*) :=Dis compact metric, k is universal, then

µis continuous, and X ∈ B(H).

(30)

Notes on the assumptions: H¨older

K

examples

In case of (*): KG Ke KC e− kµa−µbk2H 2θ2 e− kµa−µbkH 2θ2 1 +kµa−µbk2H/θ2 −1 h= 1 h= 12 h= 1 Kt Ki 1 +kµaµbkθH−1 kµaµbk2H+θ2− 1 2 h= θ 2 (θ≤2) h= 1
(31)

Notes on the assumptions: misspecified case

L2ρ

X: separable ⇔measure space with d(A,B) =ρX(A△B) is so

(32)

Vector-valued output:

Y

= separable Hilbert space

Objective function: fˆzλ = arg min f∈H 1 l l X i=1 kf(µxˆi)−yik 2 Y +λkfk 2 H, (λ >0). K(µa, µb)∈L(Y): operator-valued kernel, vector-valued RKHS.
(33)

Vector-valued output: analytical solution

Prediction on a new test distribution (t):

(fˆzλµ)(t) =k(K+lλIl)−1[y1;. . .;yl], K= [K(µˆxi, µxˆj)]∈L(Y) l×l, k= [K(µˆx1, µt), . . . ,K(µxˆl, µt)]∈L(Y) 1×l. (6) (7) (8) Specifically: Y =R⇒L(Y) =R;Y =Rd L(Y) =Rd.

(34)

Vector-valued output:

K

assumptions

Boundedness and H¨older continuity of K:

1 Boundedness: kKµak 2 HS =Tr Kµ∗aKµa ≤BK ∈(0,∞), (∀µa ∈X). 2 H¨older continuity: ∃L>0,h∈(0,1] such that

kKµa−KµbkL(Y,H)≤Lkµa−µbk h

(35)

Demo

Supervised entropy learning:

0 1 2 3 −2 −1 0 1 2 RMSE: MERR=0.75, DFDR=2.02 rotation angle (β) entropy true MERR DFDR MERR DFDR 1 2 3 4 RMSE

Aerosol prediction from satellite images:

State-of-the-art baseline: 7.58.5(±0.10.6). MERR:7.81 (±1.64).

(36)

Summary

Problem: distribution regression. Literature: large number of heuristics. Contribution:

a simple ridge solution is consistent,

specifically, the set kernel is so (15-year-old open question).

Simplified version [Y =R,fρ∈H]:

(37)

Summary – continued

Code in ITE, extended analysis (submitted to JMLR): https://bitbucket.org/szzoli/ite/ http://arxiv.org/abs/1411.2066. Closely related research directions (Bayesian world):

∞-dimensional exp. family fitting,

(38)

Thank you for the attention!

Acknowledgments: This work was supported by the Gatsby Charitable Foundation, and by NSF grants IIS1247658 and IIS1250350. The work was carried out while Bharath K. Sriperumbudur was a research fellow in

(39)

Appendix: contents

Topological definitions, separability. Prior definitions (ρ).

Universal kernel: definition, examples. Vector-valued RKHS.

Demos: further details. Hausdorff metric.

(40)

Topological space, open sets

Given: D6=set.

τ 2D is called atopology onDif:

1 ∅ ∈τ,Dτ.

2 Finite intersection: O1∈τ, O2∈τ ⇒O1∩O2∈τ. 3 Arbitrary union: Oi∈τ (i∈I)⇒ ∪iIOi∈τ.

(41)

Closed-, compact set, closure, dense subset, separability

Given: (D, τ). ADis

closed ifD\Aτ (i.e., its complement is open),

compact if for any family (Oi)i∈I of open sets with A⊆ ∪i∈IOi, ∃i1, . . . ,in∈I withA⊆ ∪nj=1Oij.

Closure of A⊆D: ¯ A:= \ A⊆C closed inD C. (9) A⊆Dis denseif ¯A=D.

(D, τ) is separableif countable, dense subset of D.

(42)

Prior (well-specified case):

ρ

∈ P

(

b

,

c

)

Let the T :HH covariance operator be

T =

Z X

K(·, µa)K∗(·, µa)dρX(µa)

with eigenvalues tn (n = 1,2, . . .).

Assumption: ρ∈ P(b,c) = set of distributions on X ×Y

α≤nbtn≤β (∀n≥1;α >0, β >0),

∃g ∈Hsuch that =Tc−1

2 g withkgk2

H≤R (R>0),

whereb (1,), c [1,2].

Intuition: 1/b – effective input dimension,c – smoothness of

(43)

Prior: misspecified case

Let ˜T be defined as:

SK∗ :H ֒L2 ρX, SK :L2ρX →H, (SKg)(µu) = Z X K(µu, µt)g(µt)dρX(µt), ˜ T =SK∗SK :L2ρX →L 2 ρX.

Our range space assumption on ρ: fρ∈Im

˜

(44)

Universal kernel: definition

Assume

D: compact, metric,

k :D×DRkernel is continuous.

Then

Def-1: k is universal ifH(k) is dense in (C(D),k·k

(45)

Universal kernel: definition

Assume

D: compact, metric,

k :D×DRkernel is continuous.

Then

Def-1: k is universal ifH(k) is dense in (C(D),k·k

∞).

Def-2: k is

(46)

Universal kernel: definition

Assume

D: compact, metric,

k :D×DRkernel is continuous.

Then

Def-1: k is universal ifH(k) is dense in (C(D),k·k

∞).

Def-2: k is

characteristic, ifµ:P(D)H(k) is injective.

(47)

Universal kernel: examples

On compact subsets of Rd k(a,b) =e−ka−bk 2 2 2σ2 , (σ >0) k(a,b) =e−σka−bk1, (σ >0) k(a,b) =eβha,bi,(β >0), or more generally k(a,b) =f(ha,bi), f(x) = ∞ X n=0 anxn (∀an>0).
(48)

Vector-valued RKHS:

H

=

H

(

K

)

Definition:

A HYX Hilbert space of functions is RKHS if

Aµx,y :f ∈H7→ hy,f(µx)iY ∈R (10)

is continuous for µx ∈X,y∈Y.

(49)

Vector-valued RKHS:

H

=

H

(

K

) – continued

Riesz representation theorem ⇒∃K(µx|y)∈H:

hy,f(µx)iY =hK(µx|y),fiH (∀f ∈H). (11)

K(µx|y): linear, bounded in y ⇒K(µx|y)=Kµx(y) with

(50)

Vector-valued RKHS:

H

=

H

(

K

) – continued

Riesz representation theorem ⇒∃K(µx|y)∈H:

hy,f(µx)iY =hK(µx|y),fiH (∀f ∈H). (11)

K(µx|y): linear, bounded in y ⇒K(µx|y)=Kµx(y) with

Kµx ∈L(Y,H). K construction: K(µx, µt)(y) = (Kµty)(µx), (∀µx, µt∈X), i.e., K(·, µt)(y) =Kµty, (12) H(K) =span{Kµ ty :µt ∈X,y ∈Y}. (13)

(51)

Vector-valued RKHS:

H

=

H

(

K

) – continued

Riesz representation theorem ⇒∃K(µx|y)∈H:

hy,f(µx)iY =hK(µx|y),fiH (∀f ∈H). (11)

K(µx|y): linear, bounded in y ⇒K(µx|y)=Kµx(y) with

Kµx ∈L(Y,H). K construction: K(µx, µt)(y) = (Kµty)(µx), (∀µx, µt∈X), i.e., K(·, µt)(y) =Kµty, (12) H(K) =span{Kµ ty :µt ∈X,y ∈Y}. (13) Shortly: K(µx, µt)∈L(Y) generalizesk(u,v)R.

(52)

Vector-valued RKHS – examples:

Y

=

R

d

1 Ki :X ×X →Rkernels (i = 1, . . . ,d). Diagonal kernel:

K(µa, µb) =diag(K1(µa, µb), . . . ,Kd(µa, µb)). (14) 2 Combination of Dj diagonal kernels [Dj(µa, µb)∈Rr×r,

Aj Rr×d]: K(µa, µb) = m X j=1 A∗jDj(µa, µb)Aj. (15)

(53)

Demo-1: supervised entropy learning

Problem: learn the entropy of the 1st coo. of (rotated)

Gaussians.

Baseline: kernel smoothing based distribution regression (applying density estimation) =: DFDR.

Performance: RMSE boxplot over 25 random experiments. Experience:

more precise than the only theoretically justified method, by avoiding density estimation.

(54)

Demo-2: aerosol prediction – selected kernels

Kernel definitions (p = 2,3): kG(a,b) =e− ka−bk22 2θ2 , ke(a,b) =e− ka−bk2 2θ2 , (16) kC(a,b) = 1 1 +ka−bk22 θ2 , kt(a,b) = 1 1 +kabkθ2, (17) kp(a,b) = (ha,bi+θ)p, kr(a,b) = 1 ka−bk 2 2 ka−bk22+θ , (18) ki(a,b) = 1 q ka−bk22+θ2 , (19) kM,3 2(a,b) = 1 + √ 3ka−bk2 θ ! e− √ 3ka−bk2 θ , (20) √ 5kabk2 5kabk22! √5ka−bk2
(55)

Existing methods: set metric based algorithms

Hausdorff metric [Edgar, 1995]:

dH(X,Y) = max ( sup x∈X inf y∈Yd(x,y),ysupYxinf∈Xd(x,y) ) . (22)

Metric on compact sets of metric spaces [(M,d);X,Y M]. ’Slight’ problem: highly sensitive to outliers.

(56)

Weak topology on

P

(

D

)

Def.: It is the weakest topology such that the

Lh : (P(D), τw)→R, Lh(x) =

Z

D

h(u)dx(u)

mapping is continuous for all hCb(D), where

(57)

Chen, Y. and Wu, O. (2012).

Contextual Hausdorff dissimilarity for multi-instance clustering.

InInternational Conference on Fuzzy Systems and Knowledge Discovery (FSKD), pages 870–873.

Cuturi, M., Fukumizu, K., and Vert, J.-P. (2005).

Semigroup kernels on measures.

Journal of Machine Learning Research, 6:11691198.

Edgar, G. (1995).

Measure, Topology and Fractal Geometry.

Springer-Verlag.

G¨artner, T., Flach, P. A., Kowalczyk, A., and Smola, A. (2002).

Multi-instance kernels.

InInternational Conference on Machine Learning (ICML), pages 179–186.

(58)

Haussler, D. (1999).

Convolution kernels on discrete structures.

Technical report, Department of Computer Science, University of California at Santa Cruz.

(http://cbse.soe.ucsc.edu/sites/default/files/

convolutions.pdf).

Hein, M. and Bousquet, O. (2005).

Hilbertian metrics and positive definite kernels on probability measures.

InInternational Conference on Artificial Intelligence and Statistics (AISTATS), pages 136–143.

Jebara, T., Kondor, R., and Howard, A. (2004).

Probability product kernels.

Journal of Machine Learning Research, 5:819–844.

(59)

Journal of Machine Learning Research, 10:935–975.

Nielsen, F. and Nock, R. (2012).

A closed-form expression for the Sharma-Mittal entropy of exponential families.

Journal of Physics A: Mathematical and Theoretical, 45:032003.

Oliva, J., P´oczos, B., and Schneider, J. (2013).

Distribution to distribution regression.

International Conference on Machine Learning (ICML; JMLR W&CP), 28:1049–1057.

Oliva, J. B., Neiswanger, W., P´oczos, B., Schneider, J., and Xing, E. (2014).

Fast distribution to real regression.

International Conference on Artificial Intelligence and Statistics (AISTATS; JMLR W&CP), 33:706–714.

(60)

International Conference on Artificial Intelligence and Statistics (AISTATS; JMLR W&CP), 31:507–515.

P´oczos, B., Xiong, L., and Schneider, J. (2011).

Nonparametric divergence estimation with applications to machine learning on distributions.

InUncertainty in Artificial Intelligence (UAI), pages 599–608.

Reddi, S. J. and P´oczos, B. (2014).

k-NN regression on functional data with incomplete observations.

InConference on Uncertainty in Artificial Intelligence (UAI).

Thomson, B. S., Bruckner, J. B., and Bruckner, A. M. (2008). Real Analysis.

Prentice-Hall.

Wang, F., Syeda-Mahmood, T., Vemuri, B. C., Beymer, D., and Rangarajan, A. (2009).

(61)

Medical Image Computing and Computer-Assisted Intervention, 12:648–655.

Wang, J. and Zucker, J.-D. (2000).

Solving the multiple-instance problem: A lazy learning approach.

InInternational Conference on Machine Learning (ICML), pages 1119–1126.

Wang, Z., Lan, L., and Vucetic, S. (2012).

Mixture model for multiple instance regression and applications in remote sensing.

IEEE Transactions on Geoscience and Remote Sensing, 50:2226–2237.

Wu, O., Gao, J., Hu, W., Li, B., and Zhu, M. (2010).

Identifying multi-instance outliers.

InSIAM International Conference on Data Mining (SDM), pages 430–441.

(62)

Multi-instance clustering with applications to multi-instance prediction.

Applied Intelligence, 31:47–68.

Zhou, S. K. and Chellappa, R. (2006).

From sample similarity to ensemble similarity: Probabilistic distance measures in reproducing kernel Hilbert space. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28:917–929.

https://bitbucket.org/szzoli/ite/ http://arxiv.org/abs/1411.2066. (http://cbse.soe.ucsc.edu/sites/default/files/

References

Related documents

In addition, it is also important to men- tion that although Islam-based parties tend to be “the vote-seek- ers” in the electoral arena, the fact demonstrated that Islam-based

Technology articles were reviewed that show arrays that may support such detection and have even proven effective in the detection of a specific virus, as well as clinical

Based on a primary survey of both urban and rural handloom weaver clusters in Ethiopia, one of the country’s most important rural nonfarm sectors, this paper examines the

When considering construction works on any part of the golf course you should: plan well in advance and, if possible, work at a time of year when ground. conditions will be

On March 11th, all CCI Interna- tional Students were recognized as Phi Theta Kappa members in an induction ceremony pre- sented by Phi Theta Kappa offi- cers and Lori Weyers, NTC

In light of the preceding literature review, the aims of the current study were twofold: first, to investigate the concurrent influences of parental and peer attachment and

Six items were included in the survey to establish whether lecturers had positive perceptions of m-learning in terms of: increased opportunities for feedback; use of laptops

Overall, we observe that (1) both vision and tactile sensing modalities can achieve a good recognition accuracy of more than 90% by using the proposed DMCA method; and (2),