Sampling strategies for saturation recovery methods for myocardial T1-mapping have been optimized and validated experimentally. Improved precision may be achieved by using fixed saturation delays when considering native myocardium and post-contrast T1 ranges. The optimized TS was 591 ms, 193 ms, and 290 ms for the native, post-contrast, and wide myocardial T1 ranges, respectively. Pixel-wise estimates of T1 mapping errors have been formulated and validated for SR fitting methods. The ability to quantify the measurement error has the potential to determine the statistical significance of subtle abnormalities that arise from diffuse disease processes involving fibrosis and/or edema, and is useful both as a confidence metric for overall quality and in the optimization and comparison of imaging protocols.
Negative sampling strategies have been studied in many machine learning tasks. In computer vision, Faghri et al. (2017) study hard negatives and introduce a simple change to a common loss function for image-caption retrieval tasks. Guo et al. (2018) propose a fast negative sampler which chooses negative examples that are most likely to cause a ranking violation according to the latent factors of the image. In natural language processing, Kotnis and Nastase (2017) analyse the impact of negative sampling strategies on the performance of link prediction in knowledge graphs. Saeidi et al. (2017) study the effect of a tailored sampling strategy on the performance of a document retrieval task. Rao et al. (2016) use three negative sampling strategies to select the most informative negative samples for a pairwise ranking model for answer selection. Xu et al. (2015) introduce a straightforward negative sampling strategy to improve the assignment of subjects and objects in a convolutional neural network. To the best of our knowledge, this is the first empirical study of negative sampling strategies for learning matching models in multi-turn retrieval-based dialogue systems, and it may inform future work on the learning of retrieval-based dialogue systems.
In addition to the anti-correlation between antennae, we found a continuum of antennal bias in trail sampling. In a large fraction of ants (15 out of 22), one antenna overlapped significantly more with the trail than the other during a tracking run (Fig. 6C). In some ants the left antenna was strongly biased (8 of 15), and in others the right antenna was strongly biased (7 of 15), similar to studies showing a continuum of handedness in turning bias in flies (Buchanan et al., 2014). This bias was radically altered when the odor trails were curved rather than straight (Fig. 6E,F). When following curved trails (to either the right or the left), the bias was strongly towards the antenna ipsilateral to the inner curvature. This result could be interpreted in two ways: (1) any inherent bias in sampling (i.e. ‘handedness’) can be masked by trail features that impose a greater bias in antennal overlap; (2) the pattern of antennal sampling used by ants is not inherent but rather context dependent, such that ants use different sampling strategies in different situations. Regardless, the presence of a bias suggests lateral specialization within a pair of antennae that affects odor sampling on simple trails. This fits with evidence from a variety of studies that have shown lateralization of insect brains and behaviors (Buchanan et al., 2014; Wes and Bargmann, 2001; Letzkus et al., 2006; Rogers et al., 2013; Frasnelli et al., 2012).
Aim: We compared the following strategies for sampling comparison cohorts in matched cohort studies with respect to time to ischemic stroke and mortality: sampling without replacement in random order; sampling with replacement; and sampling without replacement in chronological order. Methods: We constructed index cohorts of individuals from the Danish general population with no particular trait, except being alive and without ischemic stroke on the index date. We also constructed index cohorts of persons aged >50 years from the general population. We then applied the sampling strategies to sample comparison cohorts (5:1 or 1:1) from the Danish general population and compared outcome risks between the index and comparison cohorts. Finally, we sampled comparison cohorts for a heart failure cohort using each strategy. Results: We observed increased outcome risks in comparison cohorts sampled 5:1 without replacement in random order compared to the index cohorts. However, these increases were minuscule unless index persons were aged >50 years. In this setting, sampling without replacement in chronological order failed to sample a sufficient number of comparators, and the mortality risks in these comparison cohorts were lower than in the index cohorts. Sampling 1:1 showed no systematic difference between comparison and index cohorts. When we sampled comparison cohorts for the heart failure patients, we observed a pattern similar to that when index persons were aged >50 years.
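The three comparator-sampling strategies can be sketched as follows. This is a hypothetical illustration, not the study's implementation: `pool` stands for the set of eligible comparators on a given index date, assumed here to be ordered by index date for the chronological strategy.

```python
import random

def sample_comparators(pool, k, strategy, rng=None):
    """Draw k comparators from the eligible pool under one of three strategies."""
    rng = rng or random.Random()
    if strategy == "with_replacement":
        # the same person may serve as comparator more than once
        return [rng.choice(pool) for _ in range(k)]
    if strategy == "without_replacement_random":
        # raises ValueError once the pool is exhausted (fewer than k left)
        return rng.sample(pool, k)
    if strategy == "without_replacement_chronological":
        # pool is assumed ordered by index date; the earliest entries are used
        # first, so late index persons may find too few comparators remaining
        return pool[:k]
    raise ValueError(f"unknown strategy: {strategy}")
```

The chronological variant makes the exhaustion failure mode described in the Results concrete: once early index persons have consumed the front of the pool, later index persons cannot be matched 5:1.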
The survey variables are assumed to be the realized values of random variables having a known probability distribution which involves both known and unknown parameters. In this approach, called the ‘super population model approach’, one tries to identify a sampling-estimating strategy that minimizes the average mean square error or variance, where the averaging is done with respect to the probability distribution of the underlying variables, namely Y_1, Y_2, ..., Y_N. In order to compare the efficiency of the sampling strategies proposed and discussed in detail in the previous sections, we need to find the average variance of the estimators under a super population model suitable for populations with a linear trend. The model is described as follows:
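The model itself is not reproduced in this excerpt. A standard specification for populations with a linear trend, as found in classical survey sampling texts, takes the following form (a sketch of the common case; the paper's exact model may specify a richer variance structure):

```latex
\[
  Y_i = \alpha + \beta\, i + e_i, \qquad i = 1, \dots, N,
\]
\[
  \mathcal{E}(e_i) = 0, \qquad \mathcal{V}(e_i) = \sigma^2, \qquad
  \operatorname{Cov}(e_i, e_j) = 0 \quad (i \neq j),
\]
```

where $\alpha$, $\beta$, and $\sigma^2$ are the superpopulation parameters, and the average variance of a strategy is then $\mathcal{E}\,V(\hat{Y})$ taken with respect to this model.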
The stability of word embedding algorithms, i.e., the consistency of the word representations they reveal when trained repeatedly on the same data set, has recently raised concerns. We here compare word embedding algorithms on three corpora of different sizes, and evaluate both their stability and accuracy. We find strong evidence that down-sampling strategies (used as part of their training procedures) are particularly influential for the stability of SVD-PPMI-type embeddings. This finding
nities at 5 m and 25 m depth separated by a distinct transition zone at 15 m depth. In the present study, multidimensional scaling showed overlap of the samples taken at 15 m with those taken both at 5 m and 25 m. Both ANOSIM R values (Table 3) and SIMPER dissimilarity values (Table 2) are consistent with these observations, although the latter are generally high, and similarity between 25 m samples is lower than for samples at shallower depths. The areas (transect lengths) used in the present study did not imply any significant variability in the density of the selected groups of macroinvertebrates. This might mean that the transect lengths considered were not sufficiently large to identify the minimum area for a sampling strategy with this kind of organism. Nevertheless, Table 3 indicates that sea urchin average abundance is stable across depth classes, which leads to the conclusion that the main effect of this interaction arises from the area factor (i.e. transect length). From Fig. 2d it was possible to attribute such high variability mainly to the transect lengths of 5 m and 10 m. Sea urchin density associated with longer transects (namely 15 m and 20 m) appeared to be quite stable, and these could therefore be considered the transect lengths to be used in future sampling strategies. In this context, given the diving time constraints, the need to combine sampling strategies for algae and invertebrates simultaneously in the same dive, and the need to minimize sampling effort, the area chosen for future macroinvertebrate sampling in the Azores was 15 × 1.5 m².
With sufficient prior knowledge of the surface, sampling precision and efficiency can be improved. Adaptive sampling is one such strategy. Many advanced sampling strategies have been proposed in the past, especially in the last 20 years. Systematic design, stratification, and adaptive sampling have been combined to develop advanced adaptive sampling methods. These advanced techniques can be classified into three types: optimal model-based strategies, sequential stopping sampling, and adaptive allocation (also known as adaptive stratified sampling). Different advanced sampling methods derive from different application areas, such as visual optimization, engineering measurement, and environmental monitoring (ocean, atmosphere, electromagnetic waves, etc.). A brief survey of these advanced sampling techniques is given below, and examples are listed in Table 1.
Abstract—Many real-world optimisation problems involve uncertainties, and in such situations it is often desirable to identify robust solutions that perform well over the possible future scenarios. In this paper, we focus on input uncertainty, such as in manufacturing, where the actual manufactured product may differ from the specified design but should still function well. Estimating a solution's expected fitness in such a case is challenging, especially if the fitness function is expensive to evaluate and its analytic form is unknown. One option is to average over a number of scenarios, but this is computationally expensive. The archive sample approximation method reduces the required number of fitness evaluations by re-using previous evaluations stored in an archive. The main challenge in the application of this method lies in determining the locations of additional samples drawn in each generation to enrich the information in the archive and reduce the estimation error. In this paper, we use the Wasserstein distance metric to approximate the possible benefit of a potential sample location on the estimation error, and propose new sampling strategies based on this metric. Contrary to previous studies, we consider a sample's contribution for the entire population, rather than inspecting each individual separately. This also allows us to dynamically adjust the number of samples to be collected in each generation. An empirical comparison with several previously proposed archive-based sample approximation methods demonstrates the superiority of our approaches.
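For one-dimensional empirical distributions with equal sample counts, the Wasserstein-1 distance used as a criterion above has a closed form: the mean absolute difference between order statistics. A minimal illustrative sketch (not the paper's implementation, which operates on archive samples in the design space):

```python
def wasserstein_1d(xs, ys):
    """Wasserstein-1 distance between two equal-size 1-D empirical distributions.

    For equal-size samples, W1 equals the mean absolute difference between
    the sorted values (order statistics) of the two samples."""
    if len(xs) != len(ys):
        raise ValueError("this closed form assumes equal sample sizes")
    xs, ys = sorted(xs), sorted(ys)
    return sum(abs(a - b) for a, b in zip(xs, ys)) / len(xs)
```

A candidate sample location can then be scored by how much adding it would shrink the distance between the archive's empirical distribution and the target disturbance distribution.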
Estimates of crown biomass for each stand condition are necessary to understand nutrient depletion and to evaluate the economic feasibility of crown utilization for energy production or forest products (Hepp and Brister 1982). Furthermore, estimates of crown biomass aid in fuel load assessments and fire management strategies (He et al. 2013) because crown biomass is one of the important input variables in most wildfire models (Saatchi et al. 2007). Much of the focus in estimating crown biomass has been on regression models and the selection of predictor variables rather than on methods of sample selection. In addition, comparisons of sampling strategies have been carried out mainly for foliar biomass rather than total crown (branch wood, bark, and foliage) biomass. Thus, the evaluation of different sampling designs and sample sizes in estimating crown biomass is an important aspect of aboveground biomass estimation.
The time and space coverage provided by oceanographic data sets is generally limited, and the optimization of data sampling, although a desirable task, is in practice difficult to achieve due to financial and logistic constraints. The objective of the present work is to assess and compare the usefulness of a number of sampling strategies involving the collection of temperature and salinity profiles using Observing System Simulation Experiment (OSSE) techniques. The OSSE approach was first adopted by the meteorological community to assess the impact of future (i.e. not yet available from current instruments) observations, in order to improve numerical weather predictions, and to assess the design of observing systems and observing networks (e.g. Arnold and Dey, 1986; Rohaly and Krishnamurti, 1993). Previous oceanographic applications to sampling strategy optimization, or assessment towards optimization, are reported by Kindle (1986), Barth and Wunsch (1990), Bennett (1990), Hernandez et al. (1994) and Hackert et al. (1998). OSSEs were also recently applied to observing system design assessment in the Atlantic Ocean, using statistical methods (Guinehut et al., 2002, 2004), in the Mediterranean Sea, using twin experiments (Raicich and Rampazzo, 2003; Griffa et al., 2006; Taillandier et al., 2006), and in the Baltic Sea and North Sea, in the Optimal Design of Observational Networks project (She et al., 2006).
truth mean with different sample sizes was mostly associated with snow depth variability at the plot scale. From the data obtained it was possible to infer a relationship between the degree of spatial autocorrelation and the mean standard error. However, this may have been a consequence of the relationship in this data set between the CV and the semivariogram range. A sensitivity analysis conducted with multiple simulations of snow depth for various autocorrelation ranges showed that the effect of autocorrelation on estimates of the mean was much lower than that of the standard deviation of the field. However, in the presence of spatial autocorrelation the sampling strategy became a relevant factor; snow depth estimates improved by maximizing the distance between sampling points within the plot and increasing the number of measurements. Specific configurations of the snow measurements did not make a significant difference to the quality of the estimates. Overall, our results suggest that snow sampling should prioritize collecting at least five snow depth measurements at a minimum 2 m spacing to represent a 10 × 10 m plot-sized area. The specific numbers presented here relating sample size and snow depth estimates are closely related to the topographic and climatic characteristics of the study area,
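The recommendation to maximize the distance between sampling points within a plot can be realized with a greedy max-min ("farthest point") design. The sketch below is a hypothetical illustration on a 1 m candidate grid over a 10 × 10 m plot, not the study's method:

```python
import math

def farthest_point_design(candidates, k):
    """Greedy max-min design: start from the first candidate, then repeatedly
    add the candidate whose nearest already-chosen point is farthest away."""
    chosen = [candidates[0]]
    while len(chosen) < k:
        nxt = max(candidates, key=lambda p: min(math.dist(p, c) for c in chosen))
        chosen.append(nxt)
    return chosen

# 1 m candidate grid over a 10 x 10 m plot; pick 5 well-spread points
grid = [(x, y) for x in range(11) for y in range(11)]
points = farthest_point_design(grid, 5)
```

On a square plot this tends to select the corners and then the centre, which automatically satisfies a minimum-spacing constraint such as the 2 m recommended above.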
In this work, we analyzed the effect of three parameters in the GA optimization in Multi-LZerD, namely the population size, whether to use the crossover operation, and the threshold value in the structural clustering (Figure 1). The results suggest that excessive sampling of the conformational space is not necessary in our multiple-docking procedure to find correct structure models.
Abstract: Sorting data is one of the most important problems in computer science, operations research, and many other fields. Many sorting algorithms are well studied, but the problem is not merely to find an algorithm that sorts elements; it is to sort them efficiently. The output is a stream of data in time, and it is a sorted data array. We are interested in this flow of data in order to establish a smart technique to sort elements with efficient complexity. Regarding the performance of such algorithms, there has been little research on their stochastic behavior and mathematical properties, such as existence and convergence properties. In this paper we study the mathematical behavior of several versions of sorting algorithms in the case when the size of the input is very large. This work also discusses the corresponding running time under different strategies, measured in terms of the number of comparisons and swaps. Here, we use an approach based on the weighted branching process to show the existence of the partial sorting process. This approach was inspired by the methods used for the analysis of Quickselect and Quicksort in the standard cases, where fixed-point equations on the càdlàg space were considered for the first time.
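As a concrete illustration of the comparison-counting view of running time, a simple Quicksort that tallies how many key comparisons it performs (a sketch of the classical algorithm, not the probabilistic framework of the paper):

```python
def quicksort(items, counter):
    """Return a sorted copy of items, accumulating the number of key
    comparisons in counter[0] (a one-element list used as a mutable cell)."""
    if len(items) <= 1:
        return items
    pivot, rest = items[0], items[1:]
    counter[0] += len(rest)  # each remaining element is compared to the pivot once
    smaller = [x for x in rest if x < pivot]
    larger = [x for x in rest if x >= pivot]
    return quicksort(smaller, counter) + [pivot] + quicksort(larger, counter)
```

For a random input of size n, the expected comparison count is about 2n ln n; it is exactly this kind of recursively defined cost that leads to the fixed-point equations mentioned above.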
To identify the reasons for the relatively worse performance of the original solutions of HMF and COSMIC at RB and WB, we compared these with calibration solutions for which we used the same single days, but with our model and calibration settings (in situ soil moisture data, COSMIC with both parameters N and α calibrated). The differences between the original and reference solutions of HMF seemed to have been caused by the different values of the HMF coefficients and the chosen sampling days. The main cause of the systematic underestimations by COSMIC was that Baatz et al. (2014) calibrated only parameter N, since our solutions using the same days performed clearly better (MAE val = 7.6 cph at RB; 5.0 cph at WB; compare to
When parental data are missing, a likelihood-ratio test is implemented by using the expectation-maximization (EM) algorithm in incompletely genotyped triads (Weinberg, 1999b). Clayton (1999) proposed a likelihood-ratio test for incomplete data in the computer software TRANSMIT. The family-based association test (FBAT) (Horvath et al., 2001; Laird et al., 2000; Rabinowitz and Laird, 2000) is a score test which treats the offspring genotype as a random variable, conditioning on observed traits and parental genotypes. If parental genotypes are missing, the FBAT is conditioned on sufficient statistics. These strategies free the FBAT from assumptions about the parental allele frequencies, the trait distribution, and the marker allele frequencies. It is therefore robust to population stratification and is applicable to many pedigree structures. The FBAT approach has been successfully applied in several recent studies (Smit et al., 2008).
tion variance if observations are not independent and the sample size is not large. An equivalent effect can be expected for the estimator of the within-day variance. This is a likely explanation of why within-day variance estimates were inaccurate for sampling strategies with larger block sizes, while they were not for strategies with block size 1, where observations will be (close to) unaffected by autocorrelation. Since variance components are partitions of the total (constant) variance present in the data, a negative bias in the within-day variance estimate propagates to the other variance components, in particular showing up as a positive bias in the between-days variance. When block size increases, the time span between the observations in the sample decreases. Hence, the sample will be more autocorrelated, which leads to a larger bias. We believe that increased autocorrelation explains the occasional larger bias of variance components estimated by strategies where a particular sampling time was distributed across four days rather than two. This will lead to a shorter sampling time per day and, if the block size is large, to a more dominant effect of autocorrelation.
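The mechanism described here, that closely spaced (blocked) observations are positively autocorrelated and therefore underestimate the variance, can be illustrated with a small AR(1) simulation. This is a hypothetical sketch, not the study's data or model:

```python
import random

def ar1_series(n, phi=0.9, seed=42):
    """AR(1) series x_t = phi * x_{t-1} + noise; phi near 1 gives strong
    positive autocorrelation between neighbouring observations."""
    rng = random.Random(seed)
    x, out = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out

def sample_variance(xs):
    m = sum(xs) / len(xs)
    return sum((v - m) ** 2 for v in xs) / (len(xs) - 1)

series = ar1_series(50_000)

# average variance over many blocks of 20 consecutive points (autocorrelated) ...
block_vars = [sample_variance(series[i:i + 20]) for i in range(0, 40_000, 200)]
# ... versus samples of 20 widely spaced points (approximately independent)
spread_vars = [sample_variance(series[i::2500][:20]) for i in range(0, 2500, 50)]

mean_block = sum(block_vars) / len(block_vars)
mean_spread = sum(spread_vars) / len(spread_vars)
```

With `phi = 0.9` the blocked estimates come out well below the widely spaced ones, mirroring the negative within-day bias that then reappears as a positive bias in the between-days component.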
With comparable sampling strategies in both studies, we found almost double the proportion of wells that had high arsenic levels. Although this could imply that arsenic levels are worsening over time and urgent action needs to be taken, it also highlights the difficulties in estimating the size of the population at risk. As with our study, previous studies in the region have demonstrated wide variability in arsenic levels between wells [13,15], even within the same geographical area. This poses a challenge in determining the true prevalence of arsenic exposure risk in the population, as well as in determining which non-sampled areas are most at risk, and more work needs to be done in establishing the factors that cause high arsenic levels in this region.