
2.3 A New SAS® Macro

2.3.3 Additional Inputs

In addition to the input files, the user must specify additional information prior to running the macro. This includes the following choices: number of latent classes, computational method, whether or not to fit correlation between the random effects, whether variances in D and/or R are permitted to vary by class, how to model class membership, and whether to calculate standard errors (and which method to use) or predictions of the random effects. Comments related to these choices are provided in the following sections.

Number of Latent Classes

For each run of the SAS® macro, the user is required to specify the number of latent classes to fit; this is entered in the macro variable ’NumClasses’. In some cases, the user may have a good idea of how many distinct classes are of interest. In other cases, the user may not be sure how many classes exist and would be best served by running the macro several times, each with a different

Computational Methods

In previous research, LCLMMs have almost exclusively been fit using the EM algorithm. However, given the slow convergence of EM and the complexity of LCLMMs, computation can be too slow for practical use. The new SAS® macro presented here allows the user to select from several gradient-based and Hessian-based methods, in addition to running the EM algorithm. A simulation study presented in Section 2.9 provides the basis for recommendations as to which algorithm works best for models of various sizes.

The user must choose from the following options. Note that the computational method is specified in the macro variable ’Method’:

EM − EM Algorithm

CG − Conjugate Gradient Algorithm

QN − Quasi-Newton Algorithm

NRA − Newton-Raphson Algorithm (without ridging)

NRR − Newton-Raphson Algorithm (with ridging)

The EM and Newton-Raphson algorithms are discussed in detail in Chapter 1. Additional details related to the other algorithms can be found in Chapter 4 of the SAS/OR® 9.2 User’s Guide [2008]. Note that when fitting only one latent class (generalizing the LMM), the user cannot specify ’Method’=’EM’.
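The constraints on ’NumClasses’ and ’Method’ described above can be summarized as a small pre-run check. The following is a minimal Python sketch for illustration only, not part of the macro; the function name `check_method` is hypothetical, and the only rules encoded are those stated in this section.

```python
# Illustrative sketch only: the macro itself is written in SAS.
# check_method is a hypothetical helper, not a macro component.
VALID_METHODS = {"EM", "CG", "QN", "NRA", "NRR"}


def check_method(num_classes: int, method: str) -> str:
    """Return the method code if the combination is allowed, else raise."""
    if num_classes < 1:
        raise ValueError("NumClasses must be a positive integer")
    if method not in VALID_METHODS:
        raise ValueError(f"Method must be one of {sorted(VALID_METHODS)}")
    # With a single latent class the model reduces to an LMM,
    # and the EM option is not available.
    if num_classes == 1 and method == "EM":
        raise ValueError("Method=EM cannot be used with only one latent class")
    return method


print(check_method(3, "QN"))   # allowed combination
print(check_method(1, "NRR"))  # single class with a Hessian-based method
```

Running such a check before a long fit catches an invalid combination immediately rather than partway through a run.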

Is there non-zero correlation between random effects?

As in the LMM, the user must determine whether to model a correlation structure for the random effects. If this is desired, then the associated macro variable in the latent class program, ’D_HasCorr_YN’, must be set to ’Y’. The structure of the file containing the linear correlation structure was discussed earlier.

Should variances be permitted to vary by latent class?

In many situations, the variances for one class may be very different from those for another class. If there is reason to believe this may be true, then the model should be run allowing variances to differ by latent class. This is controlled by setting the macro variables ’D_VarDiffByClass_YN’ and ’R_VarDiffByClass_YN’ to either ’Y’ or ’N’. In Chapter 1, it was observed that such a decision can have a noticeable effect on the determination of latent classes and the underlying LMMs.

Class Membership - Structured or Unstructured?

Class membership may be specified in several ways. In some cases, the statistician may simply want to generalize the distributional assumptions of the usual LMM to allow the underlying normal distributions to instead be mixtures of normal distributions. This would be specified by defining the file ’v’ to be a single column of 1’s and setting the macro variable ’PieMethod’ equal to ’STRUCTURED’. In more general situations where interest focuses on discovering unique groups in the data and fitting appropriate LMMs to these groups, class membership may be specified in two very different ways. When the statistician is interested in simply fitting the best K LMMs or is not sure of which factors may be most important in determining class membership, running the analysis with class membership determined solely by the relative fit of the underlying LMMs would be recommended - specify ’PieMethod’=’UNSTRUCTURED’. Alternatively, the statistician may already have a set of risk factors which are of interest, and one of the goals of the analysis might be to determine the relationship between the risk factors and latent class membership. In this situation, the file ’v’ would be specified with factors relevant to class membership, and ’PieMethod’ would be set to ’STRUCTURED’. Note that when fitting only one class (LMM), the user must specify ’PieMethod’=’UNSTRUCTURED’.
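To make the structured case concrete, the sketch below shows how a row of ’v’ could map to class membership probabilities. This section does not state the functional form used by the macro; the multinomial logit link, the function name `class_probabilities`, and all coefficient values here are assumptions chosen for illustration. The Python code is not taken from the macro.

```python
import math

# Illustrative sketch only. The multinomial-logit link and every number
# below are assumptions; the macro's actual parameterization may differ.


def class_probabilities(v_row, coefs):
    """Class membership probabilities for one subject.

    v_row: the subject's row of the file 'v' (first entry 1 for an intercept).
    coefs: one coefficient list per class; the last class is the reference
           class, with all coefficients fixed at zero.
    """
    scores = [sum(x * b for x, b in zip(v_row, beta)) for beta in coefs]
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


# 'v' as a single column of 1's: every subject gets the same probabilities,
# so the model is a plain mixture with constant mixing proportions.
intercept_only = [[0.5], [0.0]]
print(class_probabilities([1.0], intercept_only))

# 'v' with one hypothetical risk factor: probabilities vary by subject.
with_risk_factor = [[0.5, 1.2], [0.0, 0.0]]
print(class_probabilities([1.0, 0.0], with_risk_factor))
print(class_probabilities([1.0, 1.0], with_risk_factor))
```

Under this assumed link, an intercept-only ’v’ yields mixing proportions that are identical across subjects, which matches the mixture-of-normals generalization described above.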

Standard Errors and Predictions

The main algorithm does not, by default, compute standard errors for the parameters. If you would like to obtain parameter standard errors, set ’CalcParmStdErr’=’Y’ and specify a computational method for standard errors in the variable ’SEMethod’. For Hessian-based methods, the most efficient method is to compute the standard errors by making use of the computed Hessian - specify ’SEMethod’=’HES’. This is probably a good choice even when other methods are used for the main algorithm. The standard

these starting values to initiate the model. This number of base runs is specified in the macro variable ’HowManyBaseRuns’, and the seeds used to randomly assign latent classes to initiate those base runs are stored in the macro variables ’SeedForClasses_x’. These preliminary runs proceed for ten iterations with the random effect variances set to zero.

Second, since the derivations for the first and second derivatives are complex, a self-check has been programmed which allows the user to confirm that the method’s computations are working properly. If the macro variable ’PerformSelfCheck’ is set to ’Y’, then at several points the algorithm will compute quantities using finite differences and print these alongside the values computed using the derived first and second derivatives. This adds to the runtime, but close agreement between the two sets of values confirms that the derivatives have been implemented correctly.
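The idea behind such a self-check can be sketched on a much simpler likelihood. The Python example below is not the macro's code; it assumes an i.i.d. normal model with parameters (mu, log sigma), made-up data, and a central-difference step size of 1e-6, and it compares a hand-derived gradient with its finite-difference approximation.

```python
import math

# Illustrative sketch only: a toy normal likelihood stands in for the LCLMM.
# Data, parameter values, and step size are arbitrary assumptions.


def neg_loglik(theta, y):
    """Negative log-likelihood of i.i.d. normal data; theta = (mu, log_sigma)."""
    mu, log_sigma = theta
    sigma2 = math.exp(2.0 * log_sigma)
    ss = sum((yi - mu) ** 2 for yi in y)
    n = len(y)
    return 0.5 * n * math.log(2.0 * math.pi) + n * log_sigma + ss / (2.0 * sigma2)


def analytic_gradient(theta, y):
    """Hand-derived first derivatives of neg_loglik."""
    mu, log_sigma = theta
    sigma2 = math.exp(2.0 * log_sigma)
    ss = sum((yi - mu) ** 2 for yi in y)
    d_mu = -sum(yi - mu for yi in y) / sigma2
    d_log_sigma = len(y) - ss / sigma2
    return [d_mu, d_log_sigma]


def finite_difference_gradient(f, theta, h=1e-6):
    """Central-difference approximation to the gradient of f at theta."""
    grad = []
    for j in range(len(theta)):
        up, down = list(theta), list(theta)
        up[j] += h
        down[j] -= h
        grad.append((f(up) - f(down)) / (2.0 * h))
    return grad


y = [1.2, 0.7, 2.1, 1.6, 0.9]
theta = [1.0, 0.1]
exact = analytic_gradient(theta, y)
approx = finite_difference_gradient(lambda t: neg_loglik(t, y), theta)
for name, a, b in zip(("mu", "log_sigma"), exact, approx):
    print(f"{name}: analytic={a:.6f}  finite-difference={b:.6f}")
```

If a derived derivative were wrong, the two printed columns would disagree well beyond rounding error. The same comparison extends to second derivatives by differencing the gradient.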
