ABSTRACT
Biomass estimation performance from model-based polarimetric SAR interferometry (PolInSAR) using generic parametric and non-parametric regression methods is evaluated at L- and P-band frequencies over boreal forest. PolInSAR data are decomposed into ground and volume contributions, the vertical forest structure is estimated, and a set of the obtained parameters is used for biomass regression. The considered estimation methods include multiple linear regression, support vector machine and random forest. The biomass estimation performance is evaluated on DLR's airborne SAR data at L- and P-bands over Krycklan Catchment, a boreal forest test site in Northern Sweden. The combination of polarimetric indicators and estimated structure information has improved the root mean square error (RMSE) of biomass estimation by up to 28% at L-band and up to 46% at P-band. The cross-validated biomass RMSE was reduced to 20 tons/ha.
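As an illustrative sketch only (none of the data, features, or fitted models below come from the paper), a cross-validated biomass RMSE can be computed along these lines; the feature matrix and the ordinary least-squares fit are stand-ins for the PolInSAR-derived indicators and the regression methods compared in the abstract:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
# Hypothetical PolInSAR-derived features (e.g. ground/volume powers, height estimate).
X = rng.normal(size=(n, 3))
# Synthetic biomass target in tons/ha, driven mostly by the third feature.
biomass = 50.0 + 30.0 * X[:, 2] + 5.0 * rng.normal(size=n)

def cv_rmse(X, y, k=5):
    """k-fold cross-validated RMSE of an ordinary least-squares fit."""
    idx = np.arange(len(y))
    errs = []
    for fold in np.array_split(idx, k):
        train = np.setdiff1d(idx, fold)
        A = np.column_stack([np.ones(len(train)), X[train]])
        coef, *_ = np.linalg.lstsq(A, y[train], rcond=None)
        pred = np.column_stack([np.ones(len(fold)), X[fold]]) @ coef
        errs.append((y[fold] - pred) ** 2)
    return float(np.sqrt(np.mean(np.concatenate(errs))))

print(cv_rmse(X, biomass))  # close to the noise level of 5 tons/ha
```

Swapping the least-squares fit for a support vector or random forest regressor changes only the inner fit/predict step; the cross-validation protocol stays the same.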

At stand level, the accuracies obtained here were 7.4% RMSE (in percent of the surveyed mean) for tree height, 13.2% for stem volume and 11.4% for basal area. These accuracies are higher than, or in agreement with, results from previous studies using other operational techniques or methods for forest management planning data acquisition, such as ALS-based and subjective field methods (e.g., Ståhl, 1998; Næsset et al., 2004; Bohlin et al., 2012). Furthermore, the species-specific volume estimates showed marginally lower absolute accuracy compared to Packalén and Maltamo (2007) for spruce, but similar for pine and deciduous volume. Measured in relative terms, however, the results are less comparable, probably due to the large differences in field-sampled mean values. Furthermore, this pilot study was carried out using a simplified framework compared to the thorough study performed by Packalén et al. (2009). Here, estimation was performed using imputation, as a means to preserve the natural dependencies between estimated variables. That is, k-MSN was applied using k=1 rather than a larger value of k, which is expected to produce more accurate results. On the other hand, in order to limit the analysis, accuracy was here assessed without cross-validation, which is expected to underestimate errors. Further differences are the limited image data utilized here; only one flight strip of images was used due to the concentrated sampling design of the field plots, providing little data from multiple overlapping images, and the fact that these images were also spectrally adjusted by Lantmäteriet. These facts are possible causes of reduced species classification accuracy compared to the results reported by Packalén and Maltamo (2007). Finally, the dataset was heavily dominated by spruce forest, probably with insufficient variation in species compositions to accurately estimate species-specific forest variables.

related to a specific timber type. The distance is measured by creating a weight matrix derived by canonical correspondence analysis (Ohmann & Gregory, 2002). Similarly, MSN maps are created using a model that also integrates field plot data with satellite imagery. In contrast, MSN uses a canonical correlation analysis to derive a similarity function, with selected response variables, to impute data to pixels where no ground plots exist (Moeur & Stage, 1995). The k-MSN method uses the same methods as MSN, but takes an average of the k nearest neighbors of plots. The Random Forest (RF) imputation method creates a classification matrix and regression tree in order to find similarities between the explanatory and response variables (Crookston & Finley, 2008).
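A minimal sketch of the k-nearest-neighbour averaging step that k-MSN builds on, assuming a plain Euclidean distance in feature space (the actual methods weight the distance via canonical correlation/correspondence analysis, which is omitted here; all data are toy values):

```python
import numpy as np

def knn_impute(plot_feats, plot_resp, target_feats, k=3):
    """Impute each target as the mean response of its k nearest field plots."""
    # Pairwise Euclidean distances between targets and plots.
    d = np.linalg.norm(target_feats[:, None, :] - plot_feats[None, :, :], axis=2)
    nn = np.argsort(d, axis=1)[:, :k]   # indices of the k nearest plots
    return plot_resp[nn].mean(axis=1)   # k-MSN-style average (k=1 reduces to MSN)

plots = np.array([[0.0], [1.0], [10.0]])   # toy plot features
vols = np.array([0.0, 1.0, 10.0])          # toy response (e.g. stem volume)
print(knn_impute(plots, vols, np.array([[0.4]]), k=2))  # average of the two nearest plots
```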

ABSTRACT
Reliability of a product or system is the probability that the product performs its intended function adequately for a stated period of time under stated operating conditions. It is a function of time. The most widely used nano ceramic capacitors, C0G and X7R, are used in this reliability study to generate the time-to-failure (TTF) data. The time-to-failure data are identified by Accelerated Life Testing (ALT) and Highly Accelerated Life Testing (HALT). The tests are conducted at high stress levels to generate more failures within a short interval of time. The reliability methods used to convert accelerated to actual conditions are the parametric method and the non-parametric method. In this paper, a comparative study of parametric and non-parametric methods has been carried out to identify the failure data. The Weibull distribution is identified for the parametric method; the Kaplan–Meier and Simple Actuarial methods are identified for the non-parametric method. The time taken to identify the mean time to failure (MTTF) under accelerated conditions is the same for the parametric and non-parametric methods, within a relative deviation.
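For the parametric side, the Weibull MTTF has the closed form MTTF = η·Γ(1 + 1/β) for shape β and scale η. A small sketch (the parameter values below are hypothetical, not from this study):

```python
import math

def weibull_mttf(shape_beta, scale_eta):
    """Mean time to failure of a two-parameter Weibull: eta * Gamma(1 + 1/beta)."""
    return scale_eta * math.gamma(1.0 + 1.0 / shape_beta)

# Hypothetical ALT fit: shape beta = 2.0, scale eta = 1000 hours.
print(weibull_mttf(2.0, 1000.0))  # eta * Gamma(1.5) ~ 886.2 hours
```

The non-parametric counterpart would estimate MTTF from the Kaplan–Meier or actuarial survival curve instead of a fitted distribution.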

The role of hierarchical topic models in text modeling and natural language processing (NLP) is very important. The hierarchical modeling of topics allows the construction of more accurate and predictive models than those constructed by flat models. Models of the former type are more likely to predict unseen documents than the latter. In most text collections, such as web pages, a hierarchical model, for instance a web directory, is able to describe the structure and organization of the document collection more accurately than flat models. This, however, ultimately depends on the nature of the data set and the true generative process of the documents themselves. Assuming that the higher levels of the hierarchy capture generic topics of a particular domain, while lower-level ones focus on particular aspects of that domain, it is expected that a hierarchical probabilistic topic model would be able to "explain" or could have generated the data set. In other words, the likelihood of such a model given the data set would probably be higher than the likelihood of other flat models (Mimno et al., 2007; Li et al., 2007; Li and McCallum, 2006).


sample recorded on a Tecator Infratec Food and Feed Analyzer. Each spectrum was sampled at 100 wavelengths uniformly spaced in the range 850–1050 nm. The composition of each meat sample was determined by analytic chemistry, so percentages of moisture, fat, and protein were associated in this way to each spectrum. We focus on predicting the percentage of fat on the basis of the absorbency spectrum. This problem is more challenging than the ones in Section 3, where the data were generated to fulfill the conditions of the DBIC method. The data set was randomly split 100 times into training and test sets of approximately the same size. Table 4.2 reports the mean of the mean square error (MSE), and its standard deviation, over the 100 splits, for the DBIC and NWK methods.
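The repeated random-split protocol described above can be sketched as follows; the data here are synthetic stand-ins (a simple linear model plus noise, not the Tecator spectra), and the ordinary least-squares fit stands in for the DBIC and NWK methods:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 215                      # sample size only; the predictors below are NOT real spectra
X = rng.normal(size=(n, 5))  # stand-in predictors
y = X @ np.array([1.0, -1.0, 0.5, 0.0, 0.0]) + 0.3 * rng.normal(size=n)

mses = []
for _ in range(100):         # 100 random splits into halves, as in the protocol above
    perm = rng.permutation(n)
    tr, te = perm[: n // 2], perm[n // 2 :]
    A = np.column_stack([np.ones(len(tr)), X[tr]])
    coef, *_ = np.linalg.lstsq(A, y[tr], rcond=None)
    pred = np.column_stack([np.ones(len(te)), X[te]]) @ coef
    mses.append(float(np.mean((y[te] - pred) ** 2)))

print(np.mean(mses), np.std(mses))  # mean test MSE near the noise variance 0.09
```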


Globally, raingauges are widely used to measure the depths and rates of rainfall events. The original database consisted of 21 series of daily precipitation with different lengths, mostly concentrated in the northern part of the country. Some raingauge stations in our analysis, e.g. Balakot and Kotli, had observatories whose positions were shifted within the same locality. For these cases we created new series by merging the data of the observatories located at the same location. At some sites we faced a problem of missing rainfall values. To overcome this challenge we selected data series with less than 15% missing values [Karl et al., 1995] for a common record period of 50 years (1961-2010). The reason for adopting this strict criterion is to ensure that all of the time series are sufficiently long, so that they not only provide reliable estimates of extreme-event probabilities [Jones, 1997], but also cover the same record period to avoid variability in parameter estimation as a result of inter-annual climatic cycles. This led to a final database of 15 observatories.
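The 15%-missing screening rule can be sketched as a simple filter; the threshold comes from the text, while the function name and the use of `None` as the missing-value marker are our own illustrative choices:

```python
def passes_screen(series, max_missing_frac=0.15):
    """Keep a precipitation series only if fewer than 15% of daily values are missing."""
    n_missing = sum(1 for v in series if v is None)
    return n_missing / len(series) < max_missing_frac

print(passes_screen([1.2, 0.0, None, 3.4]))  # 25% missing -> rejected
```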


The function g is defined on the entire lattice, but when restricted to partial rankings of the same type G = {S_γπ : π ∈ S_n} ⊂ W̃_n, it constitutes a normalized probability distribution on G. Estimating and examining a restriction of g to a subset H ⊂ W̃_n (note that in general H may include more than one coset space G) rather than the function p is particularly convenient for large n, since H is often much smaller than the unwieldy S_n. In such cases it is tempting to specify the function g directly on H without referring to an underlying permutation model. However, doing so may lead to probabilistic contradictions such as g(S_γπ) < g(S_λσ) for S_λσ ⊂ S_γπ. To avoid these and other probabilistic contradictions, g needs to satisfy a set of linear constraints equivalent to the existence of an underlying permutation model. Figure 3 illustrates this problem for partial rankings with the same (left) and different (right) number of vertical bars. A simple way to avoid such contradictions and satisfy the constraints is to define g indirectly in terms of a permutation model p as in (9). Applied to the context of statistical estimation, we define the estimator ĝ in terms of an estimator p̂ of the underlying permutation model p.


The issue of exiting from fixed to flexible regimes lies at the junction of two rich bodies of literature: regime choice and currency crises. Levy-Yeyati et al. (2006) analyze the determinants of regime choice by testing three hypotheses: Optimum Currency Area (OCA), financial integration, and thirdly, political crutch. Countries may opt for greater exchange rate flexibility in order to cope with changing economic conditions. From that perspective, exits can be seen as voluntary decisions. OCA theory relates regime choice to economic structure. It predicts that a pegged regime is more desirable the higher the degree of trade integration, but harder to sustain when an economy is subject to large real shocks, i.e. terms-of-trade shocks in the absence of price/wage flexibility and labor mobility. The financial integration view highlights the consequences of financial integration for regime choice. According to this hypothesis, as financial integration increases, a higher occurrence of flexible regimes is likely. In the third framework, that of the political crutch, theory suggests that fixed regimes are more likely if the country lacks a good institutional track record, but less likely if the government is too weak to sustain them (ibid. p:5).


Simulation results
The results of the errors of the analysed methods for the best selected values of the parameters are presented in Appendices A, B, and C. The typical models are presented. For other data models, the accuracy analysis results are similar to the presented ones. For each method, the arithmetic mean of the error calculated using 100 simulations is presented in figures. Appendix A contains the density estimation results for the AKDE, PPDE, IFDE, LSDE, and SKDE methods when the preliminary data clustering is used. The data clustering was performed using automated clustering software (developed by the Institute of Mathematics and Informatics, Vilnius), which is based on the EM algorithm. Appendix B contains the density estimation results for the AKDE and PPDE methods, with the preliminary data clustering in use and without it. Appendix C contains the accuracy analysis results for the density estimators. The results obtained by means of AKDE and PPDE are similar to those obtained by J.N. Hwang, S.R. Lay and A. Lippman: in the case of small sample sizes and heavy tails (Cauchy samples), it is better to use the kernel density estimators; in the case of large data dimensions and large sample sizes (400 and more observations), or in the case of the Gaussian distribution, better results are obtained using the projection pursuit density estimator. In the case of the 5-dimensional Gaussian distribution, quite good results are


The term "survival analysis" pertains to a statistical approach designed to take into account the amount of time an experimental unit contributes to a study, i.e., it is the study of the time between entry into observation and a subsequent event. Survival times are data that measure follow-up time from a defined starting point to the occurrence of a given event, for example the time from the beginning to the end of a remission period, or the time from the diagnosis of a disease to death. Standard statistical techniques cannot usually be applied because the underlying distribution is rarely Normal and the data are often 'censored'.
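For censored survival data of this kind, the Kaplan–Meier product-limit estimator is the standard non-parametric tool; a compact sketch written from the textbook definition (not taken from this source):

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival curve; events[i] = 1 for an observed failure, 0 if censored."""
    data = sorted(zip(times, events))          # order subjects by follow-up time
    at_risk = len(data)
    surv = 1.0
    curve = []                                 # (time, S(t)) at each failure time
    i = 0
    while i < len(data):
        t = data[i][0]
        n_at_t = d_at_t = 0
        while i < len(data) and data[i][0] == t:   # group ties at time t
            n_at_t += 1
            d_at_t += data[i][1]
            i += 1
        if d_at_t:                             # survival drops only at failure times
            surv *= 1.0 - d_at_t / at_risk
            curve.append((t, surv))
        at_risk -= n_at_t                      # censored subjects simply leave the risk set
    return curve
```

Censored units leave the risk set without contributing a multiplicative factor, which is precisely how censoring is "taken into account".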

Finally, the results of this dissertation motivate open problems and generate several ideas for further research. First, it is of interest to investigate the finite-sample, not only asymptotic, performance of the newly proposed estimators. Second, the results of Chapter 6 are limited to complete, grouped, and exponentially distributed data, but they could be extended to more general situations and models. Third, additional analysis involving risk measures could be undertaken as well. Finally, all the data scenarios considered in this dissertation are based on the i.i.d. assumption, which is a reasonable assumption but not the only one to consider. Some of these problems are discussed in more detail in Section 7.2.


Figure 4. One example of inversion scenario for the SBPI and DBPI methods in the unit complex plane. (a) Image 1-2 as the first baseline B_1 with DBPI; (b) Image 1-3 as the first baseline B_1 with DBPI.

4.2. Cloude's DBPI vs. Modified DBPI
In this section, the performance of forest height inversion based on the modified DBPI method is presented, and the results from the DBPI method are reused for a comparative analysis. Figure 5a,b shows the forest height inversion results from the modified DBPI method with swapped choices of the first baseline. Both figures present similar trends and partial differences with respect to the results from the DBPI method (see Figure 2c,d). The estimated height difference is calculated by subtracting the DBPI results from the modified DBPI results, scaled between −5 m and 5 m, as shown in Figure 5c,d. From these figures, it is obvious that significant differences are detected at some specific locations. Figure 5e presents the range terrain slope over the test site calculated from the LIDAR DEM. As expected, the height difference maps present a high correlation with the slope map. When the slope is positive, the results from the modified DBPI method decrease the height estimates with respect to the results from the DBPI method. That is because positive slopes yield an overestimation of forest height, and the S-RVoG model is able to compensate for slope effects. Conversely, negative slopes lead to an underestimation of estimates. Figure 6 presents the corresponding cross-validation plots of PolInSAR forest height estimates against the LIDAR forest height. Figure 6a,b shows that the modified DBPI results present slight improvements, with an RMSE decrease of 7.10% on average. It must be noted that this small improvement is explained by the fact that the slopes of the forest stands are not large, with a mean value of 3.54°. In order to better understand the effect of the slope correction, 23 forest stands with relatively large range terrain slopes (|α| > 10°) are selected from all 362 stands, and both the


The development of NPTNs offers one such design aspect, i.e. learning non-parametric invariances and symmetries directly from data. Through our experiments, we found that NPTNs can indeed effectively learn general invariances without any a priori information. Further, they are effective and improve upon vanilla ConvNets even when applied to general vision data as presented in CIFAR10 and ETH-80 with complex unknown symmetries. This seems to be a critical requirement for any system that is aimed at taking a step towards general perception. Assuming detailed knowledge of symmetries in real-world data (not just visual) is impractical, and successful models would need to adapt accordingly. In all of our experiments, NPTNs were compared to vanilla ConvNet baselines with the same number of filters (and thereby more channels). Interestingly, the superior performance of NPTNs despite having fewer channels indicates that better modelling of invariances is a useful goal to pursue during design. Explicit and efficient modelling of invariances has the potential to improve many existing architectures. Indeed, we outperform several state-of-the-art algorithms on ETH-80. In our experiments, we also find that Capsule Networks which utilized NPTNs instead of vanilla ConvNets performed much better. This motivates and justifies more attention towards architectures and other solutions that efficiently model general invariances in deep networks. Such an endeavour might not only produce networks performing better in practice, it also promises to deepen our understanding of deep networks and perception in general.

The content of this chapter is directly inspired by Comte, Genon-Catalot, and Rozenholc (2006; 2007). We consider non-parametric estimation of the drift and diffusion coefficients of a one-dimensional diffusion process. The main assumption on the diffusion model is that it is ergodic and geometrically β-mixing. The sample path is assumed to be discretely observed with a small regular sampling interval ∆. The estimation method that we develop is based on a penalized mean square approach. This point of view is fully investigated for regression models in Comte and Rozenholc (2002, 2004). We adapt it to discretized diffusion models.
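A rough numerical caricature of a penalized mean-square approach, not the authors' estimator: the drift of a discretely observed diffusion is fit by piecewise-constant least squares on the increments, and the number of bins is chosen by penalizing the model dimension. The simulated Ornstein-Uhlenbeck path, the penalty constant, and the model collection are all illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(2)
dt, n = 0.01, 20000
x = np.empty(n)
x[0] = 0.0
for i in range(n - 1):   # Euler scheme for dX = -X dt + dW, i.e. drift b(x) = -x
    x[i + 1] = x[i] - x[i] * dt + np.sqrt(dt) * rng.normal()

y = np.diff(x) / dt      # noisy regression targets for the drift b
lo, hi = x.min(), x.max()

def penalized_contrast(bins):
    """Mean-square contrast of a piecewise-constant drift fit plus a dimension penalty."""
    edges = np.linspace(lo, hi, bins + 1)
    idx = np.clip(np.searchsorted(edges, x[:-1], side="right") - 1, 0, bins - 1)
    fit = np.zeros(bins)
    for b in range(bins):
        mask = idx == b
        if mask.any():
            fit[b] = y[mask].mean()   # least-squares fit is the bin mean
    resid = np.mean((y - fit[idx]) ** 2)
    # Penalty proportional to (noise variance / dt) * dimension / sample size.
    return resid + 2.0 * (1.0 / dt) * bins / len(y)

best_bins = min(range(1, 21), key=penalized_contrast)   # selected model dimension
```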


The dramatic rise in bank failures over the last decade has led to a search for leading indicators so that costly bailouts might be avoided. While the quality of a bank's management is generally acknowledged to be a key contributor to institutional collapse, it is usually excluded from early-warning models for lack of a metric. This paper describes a new approach for quantifying a bank's managerial efficiency, using a data-envelopment-analysis model that combines multiple inputs and outputs to compute a scalar measure of efficiency. This new metric captures an elusive, yet crucial, element of institutional success: management quality. New failure-prediction models for detecting a bank's troubled status which incorporate this explanatory variable have proven to be robust and accurate, as verified by in-depth empirical evaluations, cost sensitivity analyses, and comparisons with other published approaches.
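In the degenerate single-input, single-output case, the DEA efficiency score reduces to each unit's output/input ratio relative to the best observed ratio; this toy sketch (our own simplification, with hypothetical numbers) conveys the idea, while the paper's model handles multiple inputs and outputs via linear programming:

```python
def dea_single_ratio(inputs, outputs):
    """Efficiency scores in the single-input/single-output case:
    each unit's output/input ratio divided by the best observed ratio."""
    ratios = [o / i for i, o in zip(inputs, outputs)]
    best = max(ratios)
    return [r / best for r in ratios]

# Two hypothetical banks producing equal output with different input use.
print(dea_single_ratio([2.0, 4.0], [4.0, 4.0]))  # the leaner bank scores 1.0
```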

There are some intuitive reasons why this method requires careful consideration when estimating the pdf of variables with support [0, 1]. The foregoing kernel density estimator was developed mainly for estimating densities with unbounded support. For such densities, the main interest is typically in the middle region of the support: most of the observations lie in the middle region and f_e is consistent, although biased, so the bias of the estimator in the tail region has not been a serious issue. When the underlying density has support [0, 1], this kernel density estimator may suffer from serious boundary bias. The estimator f_e with fixed bandwidth would not be suitable because it would have support outside [0, 1]. Choosing the bandwidth sufficiently close to zero for values of x near the boundary 0, in order to avoid this, is a difficult task. A remedy may appear to be to transform the data from [0, 1] onto the entire real line, estimate the density on the real line using the large literature on this topic, and then transform the estimated density back to have support [0, 1]. However, this does not solve the problem as far as estimation of the probability density function of recovery rates is concerned. One of our main objectives is to estimate the pdf on [0, 1] with particular emphasis on the region near the lower boundary 0, because this is the region corresponding to a high risk of loss. A small point-wise bias in the estimator f_e over an interval in the lower tail region towards −∞ would translate into a large bias near the lower boundary 0 when transformed back to [0, 1]. These arguments suggest that the traditional large literature surrounding f_e in (1) may not be suitable for estimating the pdf of recovery rates with support [0, 1].
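The transformation remedy discussed above (and argued to be insufficient near the boundary) can be sketched with a logit transform: estimate a Gaussian kernel density on the transformed scale, then map it back with the change-of-variables Jacobian 1/(x(1−x)). The function name and bandwidth are illustrative:

```python
import math

def logit_kde(data, x, bandwidth=0.3):
    """Gaussian KDE on the logit scale, mapped back to (0, 1) via the Jacobian."""
    z = [math.log(d / (1.0 - d)) for d in data]        # logit-transformed sample
    t = math.log(x / (1.0 - x))                        # evaluation point on the real line
    k = sum(math.exp(-0.5 * ((t - zi) / bandwidth) ** 2) for zi in z)
    dens_z = k / (len(z) * bandwidth * math.sqrt(2.0 * math.pi))
    return dens_z / (x * (1.0 - x))                    # change of variables back to (0, 1)
```

As the passage argues, this back-transformed estimate can still be badly biased near 0: a small bias in the tail on the logit scale is amplified by the Jacobian factor.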


In light of increasing congestion in urban areas, monitoring and providing information about traffic conditions is critical for traffic management and effective transport policy. Travel time data may be collected from stationary automatic vehicle identification (AVI) sensors (automatic number plate recognition (ANPR) cameras, Bluetooth devices, etc.). AVI systems provide direct measurements of route travel times, but their spatial coverage is typically small and may not be representative of the network as a whole. Meanwhile, floating car data (FCD) collected from GPS devices installed in vehicle fleets or smart phones provide information from the entire network. Travel time estimation from FCD is often challenging because of the low penetration rate, which means that the number of available FCD observations from vehicles traveling along the route of interest may be low if it is not a common one.

The rest of this paper is organized as follows. Section 2 presents the most important practical issues related to the estimation of the service level, considering: (i) demand distribution selection and validation using sample data even when they are scarce, (ii) alternatives for the estimation of the demand distribution parameters, and (iii) calculation of the temporal aggregates of the demand. Section 3 is devoted to introducing our proposed non-parametric approach and providing some illustrative examples. Finally, conclusions and further research are presented in Section 4.

For these reasons, a parametric approach may be a promising alternative, in particular for panels with a small number of time periods. In this paper a vector error correction model (VECM) is employed to represent the dynamics of the system. Our framework can be seen as a panel analog of Johansen's cointegrated vector autoregression, where the short-run parameters are allowed to vary across countries and the long-run parameters are homogeneous. Unfortunately, in such a setup the ML estimator cannot be computed from solving a simple eigenvalue problem as in Johansen (1988). Instead, in section 2 we adopt a two-step estimation procedure that was suggested by Ahn and Reinsel (1990) and Engle and Yoo (1991) for the usual time series model. As in Levin and Lin (1993), the individual specific parameters are estimated in a first step, whereas in a second step the common long-run parameters are estimated from a pooled regression. The resulting estimator is asymptotically efficient and normally distributed. Furthermore, a number of test procedures based on the two-step approach are considered in section 4, and extensions to more general models are addressed in section 5. The results of a couple of Monte Carlo simulations presented in section 6 suggest that the two-step estimator performs better than the FM-OLS estimator in typical sample sizes. Some conclusions and suggestions for future work can be found in section 7.
