
2.2 Relations Between Examples

2.2.3 The Use of Relations

In this subsection, we present the experimental settings considered in this work and discuss them alongside the settings on which existing related approaches are based. We compare these settings and stress their differences.

i) Autocorrelation over the target space

Many machine learning algorithms and statistical methods that deal with the autocorrelation phenomenon take into account the autocorrelation of the input space (descriptive attributes) (see for example (Appice et al, 2009; Malerba et al, 2005)). This is very intuitive, especially in spatial regression studies, where it is common to resample the study area until the input variables no longer exhibit statistically significant spatial autocorrelation.

In order to explicitly take autocorrelation into account, we need to define the spatial/network dimension of the data. For this purpose, in addition to the descriptive space and the target space, it is necessary to add information on the spatial/network structure of the data in order to be able to capture the spatial/network arrangement of the examples (e.g., the coordinates of the spatial examples involved in the analysis or the pairwise distances between them).

A naïve solution would consider both the descriptive and autocorrelation attributes together as input to the learning process. This has already been done in a number of studies (e.g., (Appice et al, 2009)) in different domains. However, this solution leads to models that are difficult to apply in the same domain but in a different spatial/network context.

Following (Ester et al, 1997), we do not consider spatial/network information together with the descriptive information in the learned models. This restriction of the search space allows us to obtain more general models, at the price of a possible loss in the predictive power of the induced models.

20 Definition of the Problem

In contrast to these studies, we are interested in accounting for autocorrelation related to the target (output) space, when predicting one or more (discrete and continuous, as well as structured) target variables, at the same time.
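To make the notion of autocorrelation over the target space concrete, the following minimal sketch computes global Moran's I, a standard measure of spatial autocorrelation, on the values of a single continuous target variable. The spatial dimension enters only through a weight matrix built from the coordinates of the examples. The function names, the inverse-distance weighting, and the `bandwidth` parameter are illustrative assumptions and are not taken from the dissertation.

```python
import math


def inverse_distance_weights(coords, bandwidth):
    """Symmetric weights: w_ij = 1/d_ij if 0 < d_ij <= bandwidth, else 0.

    The weighting scheme and the bandwidth are assumptions made for
    illustration; other schemes (binary, Gaussian, ...) are equally possible.
    """
    n = len(coords)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(coords[i], coords[j])
            if 0 < d <= bandwidth:
                w[i][j] = w[j][i] = 1.0 / d
    return w


def morans_i(y, w):
    """Global Moran's I of the target values y under the weight matrix w."""
    n = len(y)
    mean = sum(y) / n
    dev = [v - mean for v in y]
    total_w = sum(sum(row) for row in w)
    denom = sum(d * d for d in dev)
    if total_w == 0 or denom == 0:
        raise ValueError("Moran's I is undefined: no neighbors or constant target")
    num = sum(w[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    return (n / total_w) * (num / denom)


# Four examples on a line; only the target values and the coordinates are used.
coords = [(0, 0), (1, 0), (2, 0), (3, 0)]
w = inverse_distance_weights(coords, bandwidth=1.0)
clustered = morans_i([1.0, 1.0, 5.0, 5.0], w)    # similar values are neighbors
alternating = morans_i([1.0, 5.0, 1.0, 5.0], w)  # dissimilar values are neighbors
```

Note that the descriptive attributes do not appear anywhere in the computation: the measure depends only on the target values and on the spatial arrangement of the examples. Positive values indicate that neighboring examples tend to have similar target values, and negative values indicate the opposite.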

ii) Autocorrelation as a background knowledge

Autocorrelation can be taken into account within the learning process in many different ways. Various approaches, such as collective classification and typical network analysis (already mentioned in Section 2.2.1), consider different types of relations to exploit the phenomenon of autocorrelation within the learning process: from the synthesis of in-network features to the propagation of a response across the network.

In most of the collective classification studies (Gallagher et al, 2008), the connections/relations (edges in the network) between the data in the training/testing set are predefined for a particular instance and are used to generate the descriptive information associated with the nodes of the network. In this way, in-network features are created. However, such features can be a limiting factor in the predictive modeling process, since they can cause the models to lose their generality and general applicability.

In typical network studies, the general focus is on exploring the structure of a network by calculating its properties (e.g., the degrees of the nodes, the connectedness within the network, scalability, robustness, etc.). The network properties are then fitted into an already existing mathematical network model or a theoretical graph model (Steinhaeuser et al, 2011). In this case too, the created in-network features are in a tight, inseparable relation to the data, which limits the generality of the models.

Another limitation of most of the models is that they only consider the cases where training and testing data (nodes) belong to the same network. This means that the prediction phase requires complete knowledge of the network arrangement (e.g., connections to other nodes of the network) of any unlabeled node to be predicted.

In contrast to these studies, in this dissertation, the connections are not in a tight, inseparable relation to the data. In fact, they relate to the target space and not to the descriptive attributes. Moreover, different types of relations (explicit and implicit) can be used with the same data, as a tool to assess the quality of the relational data.

The network setting that we address in this work uses both the descriptive information (attributes) and the network structure during training, whereas in the testing phase it uses only the descriptive information and disregards the network structure.

More specifically, in the training phase we assume that all examples are labeled and that the given network is complete. In the testing phase all testing examples are unlabeled and the network is not given. Because of this setting, a key property of the proposed solution is that the existence of the network is not obligatory in the testing phase, where we only need the descriptive information. This can be very beneficial especially in cases where the prediction needs to be made for those examples for which connections to other examples are not known or need to be confirmed.

The setting in which a network with some labeled and some unlabeled nodes is given (Appice et al, 2009) can be mapped to our setting. In fact, we can always use the labeled nodes and the projection of the network onto these nodes for training, and only the unlabeled nodes, without network information, in the testing phase.

The existence of the relations is not obligatory in the testing set. This leads to the creation of general models.
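The mapping from a partially labeled network to this setting can be sketched as follows. This is a minimal illustration, assuming that the network is given as a list of undirected edges, that descriptive attributes are stored per node, and that unlabeled nodes carry the label `None`. The function name `split_partially_labeled` and the dictionary layout are hypothetical and are not part of the dissertation.

```python
def split_partially_labeled(attributes, labels, edges):
    """Map a partially labeled network to the train/test setting.

    attributes: dict node -> descriptive attribute vector
    labels:     dict node -> target value, or None if the node is unlabeled
    edges:      iterable of (u, v) pairs

    Training keeps the labeled nodes and the projection of the network onto
    them (edges with both endpoints labeled). Testing keeps only the
    descriptive attributes of the unlabeled nodes, with no network at all.
    """
    labeled = {n for n in attributes if labels.get(n) is not None}
    train = {
        "attributes": {n: attributes[n] for n in attributes if n in labeled},
        "labels": {n: labels[n] for n in attributes if n in labeled},
        "edges": [(u, v) for u, v in edges if u in labeled and v in labeled],
    }
    test = {
        "attributes": {n: attributes[n] for n in attributes if n not in labeled},
    }
    return train, test


# Toy network: a-b-c-d, where only a and b are labeled.
attributes = {"a": [0.1], "b": [0.2], "c": [0.3], "d": [0.4]}
labels = {"a": 1, "b": 0, "c": None, "d": None}
edges = [("a", "b"), ("b", "c"), ("c", "d")]
train, test = split_partially_labeled(attributes, labels, edges)
```

In the toy example, the edges that touch an unlabeled node are dropped from the training network, and the test part contains no edge information, which mirrors the requirement that the network need not exist at prediction time.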

iii) Non-stationary autocorrelation


[...] methods assume that autocorrelation dependencies are stationary (i.e., do not change) throughout the considered context space (time, space or network) (Angin and Neville, 2008). This means that possibly significant variability in autocorrelation dependencies throughout the space/network cannot be represented and modeled. However, such variability could be caused by a different underlying latent structure of the network (or, generally, of any dimension) that varies among its portions in terms of the properties of nodes or the associations between them. For example, different research communities may have different levels of cohesiveness and thus cite papers on other topics to varying degrees. As pointed out by Angin and Neville (2008), when autocorrelation varies significantly throughout a network, it may be more accurate to model the dependencies locally rather than globally.
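The difference between a global and a local view of autocorrelation can be illustrated with a small sketch. It uses a deliberately simple proxy for relational autocorrelation of a discrete target, namely the fraction of edges whose endpoints share the same label, and compares the value computed over the whole network with the values computed inside each community. The proxy, the function names, and the toy communities are assumptions made for illustration; this is not the measure used by Angin and Neville (2008) or in this dissertation.

```python
from collections import defaultdict


def edge_label_agreement(edges, labels):
    """Fraction of edges whose two endpoints carry the same label."""
    edges = list(edges)
    if not edges:
        raise ValueError("agreement is undefined for an empty edge set")
    same = sum(1 for u, v in edges if labels[u] == labels[v])
    return same / len(edges)


def agreement_by_community(edges, labels, community):
    """Edge label agreement computed separately inside each community.

    Only edges with both endpoints in the same community are counted;
    edges that cross communities are ignored in the local view.
    """
    groups = defaultdict(list)
    for u, v in edges:
        if community[u] == community[v]:
            groups[community[u]].append((u, v))
    return {c: edge_label_agreement(es, labels) for c, es in groups.items()}


# Community A is cohesive (all nodes share a topic); community B is not.
labels = {1: "x", 2: "x", 3: "x", 4: "x", 5: "y", 6: "x"}
community = {1: "A", 2: "A", 3: "A", 4: "B", 5: "B", 6: "B"}
edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (3, 4)]

global_agreement = edge_label_agreement(edges, labels)
local_agreement = agreement_by_community(edges, labels, community)
```

In this toy network, the single global value hides the fact that the two communities behave very differently, which is the situation in which modeling the dependencies locally may be more appropriate.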

To overcome this issue, in this work, we develop an approach for modeling non-stationary autocorrelation data, where the autocorrelation is related to the target (output) space.

Since we consider the PCT framework, the tree models obtained by the proposed algorithm allow us to obtain a hierarchical view of the network, where clusters can be employed to design a federation of hierarchically arranged networks. This can turn out to be useful, for instance, in wireless sensor networks, where a hierarchical structure is one of the possible ways to reduce the communication cost between the nodes (Li et al, 2007).

Moreover, it is possible to browse the generated clusters at different levels of the hierarchy, where each cluster can naturally consider different effects of the autocorrelation phenomenon on different portions of the network: at higher levels of the tree, clusters will be able to consider autocorrelation phenomena that are spread all over the network, while at lower levels of the tree, clusters will reasonably consider local effects of autocorrelation.

This gives us a way to consider non-stationary autocorrelation.

In document CONSIDERING AUTOCORRELATION IN PREDICTIVE MODELS. Daniela Stojanova (Page 35-37)