• No results found

Dimensionality Reduction I: Feature Selection

N/A
N/A
Protected

Academic year: 2021

Share "Dimensionality Reduction I: Feature Selection"

Copied!
28
0
0

Loading.... (view fulltext now)

Full text

(1)

Sebastian Raschka STAT 479: Machine Learning FS 2018

Dimensionality Reduction I:

Feature Selection

Lecture 13

1

STAT 479: Machine Learning, Fall 2018 Sebastian Raschka

http://stat.wisc.edu/~sraschka/teaching/stat479-fs2018/

(2)

Dimensionality Reduction

Feature Selection Feature Extraction

Today

(3)

Sebastian Raschka STAT 479: Machine Learning FS 2018 3

Dimensionality Reduction

Feature Extraction

Filter Methods Wrapper Methods Embedded Methods Feature Selection

(4)

Dimensionality Reduction

Feature Extraction

Filter Methods Wrapper Methods Embedded Methods

Feature SelectionInformation gain

Correlation with target

Pairwise correlation

Variance threshold

...

L1 (LASSO) regularization

Recursive Feature Elimination (RFE)

Sequential Feature Selection (SFS)

Permutation importance

...

(5)

Sebastian Raschka STAT 479: Machine Learning FS 2018 5

Variance Threshold (Filter)

• Compute the variance of each feature

• Assume that features with a higher variance may contain more useful information

• Select the subset of features based on a user-specified threshold 


("keep if greater or equal to x” or “keep the the top k features with largest variance”)

• Good: fast!

• Bad: does not take the relationship among features into account

(6)

Hyperparameters

parametric model: logistic regression

Hyperparameter

(7)

Sebastian Raschka STAT 479: Machine Learning FS 2018 7

Hyperparameter

L1 Regularization / LASSO (Embedded)

Least Absolute Shrinkage and Selection Operator

λ||w||

1

= λ

m

j

|w|

L1 norm:

(8)

Least Absolute Shrinkage and Selection Operator

L1 Regularization / LASSO (Embedded)

Cost minimum

(regularized estimate) w2

w1

Overall Goal:

Minimize cost + penalty Goal 1: Minimize cost

Goal 2) Minimize penalty

(9)

Sebastian Raschka STAT 479: Machine Learning FS 2018 9

Least Absolute Shrinkage and Selection Operator

Wine Dataset

https://archive.ics.uci.edu/ml/machine-learning-databases/wine/wine.data

'Class label' 'Alcohol' 'Malic acid' 'Ash' Alcalinity


of ash' 'Magnesium' 'Total phenols' 'Flavanoids' 'Nonflavanoid


phenols' 'Proanthocyanins' 'Color intensity' 'Hue' OD280/OD315 


of diluted wines' 'Proline'

1 14.23 1.71 2.43 15.6 127 2.8 3.06 0.28 2.29 5.64 1.04 3.92 1065

1 13.2 1.78 2.14 11.2 100 2.65 2.76 0.26 1.28 4.38 1.05 3.4 1050

1 13.16 2.36 2.67 18.6 101 2.8 3.24 0.3 2.81 5.68 1.03 3.17 1185

3 13.27 4.28 2.26 20 120 1.59 0.69 0.43 1.35 10.2 0.59 1.56 835

3 13.17 2.59 2.37 20 120 1.65 0.68 0.53 1.46 9.3 0.6 1.62 840

3 14.13 4.1 2.74 24.5 96 2.05 0.76 0.56 1.35 9.2 0.61 1.6 560

L1 Regularization / LASSO (Embedded)

(10)

Least Absolute Shrinkage and Selection Operator

Wine Dataset

https://archive.ics.uci.edu/ml/machine-learning-databases/wine/wine.data

LASSO Path

(don't forget to normalize/standardize features)

L1 Regularization / LASSO (Embedded)

(11)

Sebastian Raschka STAT 479: Machine Learning FS 2018 11

Recursive Feature Elimination (Wrapper)

Consider a (generalized) linear model:


1. Fit model to dataset

2. Eliminate feature with the smallest coefficient ("most unimportant") 3. Repeat steps 1-2 until desired number of features is reached

(12)

Random Forest Feature Importance

Usually measured as

impurity decrease (Gini, Entropy) for a given node/feature decision

weighted by number of examples at that node

averaged over all trees

then normalize so that sum of feature importances sum to 1

"Method A"

(used in scikit-learn)

(13)

Sebastian Raschka STAT 479: Machine Learning FS 2018 13

Permutation Test (Interlude)

A nonparametric test procedure to test the null hypothesis that two different groups come from the same distribution

Can be used for significance or hypothesis testing w/o requiring to make any assumptions about the sampling distribution (e.g., it doesn't require the samples to be normal distributed).

Under the null hypothesis (treatment = control), any permutations are equally likely

Note that there are (n+m)! permutations, where n is the number of records in the treatment sample, and m is the number of records in the control sample

For a two-sided test, we define the alternative hypothesis that the two samples are different (e.g., treatment != control)

(14)

Permutation Test

1. Compute the difference (here: mean) of sample x (size n) and sample y (size m) 2. Combine all measurements into a single dataset

3. Draw a permuted dataset from all possible permutations of the dataset in 2.

4. Divide the permuted dataset into two datasets x' and y' of size n and m, respectively

5. Compute the difference (here: mean) of sample x' and sample y' and record this difference 6. Repeat steps 3-5 until all permutations are evaluated

7. Return the p-value as the number of times the recorded differences were more extreme than the original difference from 1., then divide this number by the total number of permutations

Here, the p-value is defined as the probability, given the null hypothesis (no difference between the samples) is true, that we obtain results that are at least as extreme as the results we observed (i.e., the sample difference from 1.).

(15)

Sebastian Raschka STAT 479: Machine Learning FS 2018 15

Permutation Test

1. Compute the difference (here: mean) of sample x (size n) and sample y (size m) 2. Combine all measurements into a single dataset

3. Draw a permuted dataset from all possible permutations of the dataset in 2.

4. Divide the permuted dataset into two datasets x' and y' of size n and m, respectively

5. Compute the difference (here: mean) of sample x' and sample y' and record this difference 6. Repeat steps 3-5 until all permutations are evaluated

7. Return the p-value as the number of times the recorded differences were more extreme than the original difference from 1., then divide this number by the total number of permutations

Here, the p-value is defined as the probability, given the null hypothesis (no difference between the samples) is true, that we obtain results that are at least as extreme as the results we observed (i.e., the sample difference from 1.).

(16)

Permutation Test

Here, the p-value is defined as the probability, given the null hypothesis (no difference between the samples) is true, that we obtain results that are at least as extreme as the results we observed (i.e., the sample difference from 1.).

p(t > t0) = 1

(n + m)!

(n+m)!

j=1

I(tj > t0), where t0 is the observed value of the test statistic, and

t is the t-value, the statistic computed from the resamples, and I is the indicator function.

(17)

Sebastian Raschka STAT 479: Machine Learning FS 2018 17

Feature Importance Through Permutation (Wrapper)

1. Take a model that was fit to the training set

2. Estimate the predictive performance of the model on an

independent dataset (e.g., validation dataset) and record it as the baseline performance

3. For each feature i:

a. randomly permute feature column i in the original dataset

b. record the predictive performance of the model on the dataset with the permuted column

c. compute the feature importance as the difference between the baseline performance (step 2) and the performance on the

permuted dataset

Repeat a-c exhaustively (all combinations) or a large number of times and compute the feature importance as the average difference

intuitive & model-agnostic

(18)

Feature Importance Through Permutation (Wrapper)

intuitive & model-agnostic

Column-Drop variant:

For each feature column i:

1. temporarily remove column 2. fit model to reduced dataset

3. compute validation set performance and compare to before

(19)

Sebastian Raschka STAT 479: Machine Learning FS 2018 19

Random Forest Importance vs Permutation

from sklearn.datasets import make_classification from sklearn.ensemble import RandomForestClassifier

# Build a classification task using 3 informative features X, y = make_classification(n_samples=10000,

n_features=10, n_informative=3, n_redundant=0, n_repeated=0, n_classes=2, random_state=0, shuffle=False)

Permutation performance is a universal method, RF just shown as an example

Permutation performance much more expensive, and in case of RF, usually computationally wasteful but can be more robust

In the special case of RF, we can also use the OOB examples instead of a validation set (next slide)

(20)

Random Forest Feature Importance

"Method B"

Out-of-bag accuracy:

During training, for each tree, make prediction for OOB sample (1/3 of the training data)

Based on those predictions where example i was OOB, compute label via majority

voteThe proportion over all examples where the majority vote is wrong is the OOB accuracy estimate

Out-of-bag feature importance via permutation:

Count votes for correct class

Given feature i, permute this feature in OOB examples of a tree

Compute the number of correct votes after permutation from the number of votes before permutation for given tree

Repeat for all trees in the random forest and average the importance

(21)

Sebastian Raschka STAT 479: Machine Learning FS 2018 21

Sequential Forward Selection (Wrapper)

Sequential Feature Selector

Implementation of sequential feature algorithms (SFAs) -- greedy search algorithms -- that have been developed as a suboptimal solution to the computationally often not feasible exhaustive search.

from mlxtend.feature_selection import SequentialFeatureSelector

Overview

Sequential feature selection algorithms are a family of greedy search algorithms that are used to reduce an initial d-dimensional feature space to a k-dimensional feature subspace where k < d. The motivation behind feature selection algorithms is to automatically select a subset of features that is most relevant to the problem. The goal of feature selection is two-fold: We want to improve the computational efficiency and reduce the

generalization error of the model by removing irrelevant features or noise. A wrapper approach such as sequential feature selection is especially useful if embedded feature selection -- for example, a regularization penalty like LASSO -- is not applicable.

In a nutshell, SFAs remove or add one feature at the time based on the classifier performance until a feature subset of the desired size k is reached. There are 4 different flavors of SFAs available via the SequentialFeatureSelector:

1. Sequential Forward Selection (SFS) 2. Sequential Backward Selection (SBS)

3. Sequential Forward Floating Selection (SFFS) 4. Sequential Backward Floating Selection (SBFS)

The floating variants, SFFS and SBFS, can be considered as extensions to the simpler SFS and SBS algorithms. The floating algorithms have an additional exclusion or inclusion step to remove features once they were included (or excluded), so that a larger number of feature subset

combinations can be sampled. It is important to emphasize that this step is conditional and only occurs if the resulting feature subset is assessed as "better" by the criterion function after removal (or addition) of a particular feature. Furthermore, I added an optional check to skip the

conditional exclusion steps if the algorithm gets stuck in cycles.

How is this different from Recursive Feature Elimination (RFE) -- e.g., as implemented in sklearn.feature_selection.RFE? RFE is computationally less complex using the feature weight coefficients (e.g., linear models) or feature importance (tree-based algorithms) to eliminate features

recursively, whereas SFSs eliminate (or add) features based on a user-defined classifier/regression performance metric.

The SFAs are outlined in pseudo code below:

Sequential Forward Selection (SFS)

Input:

The SFS algorithm takes the whole -dimensional feature set as input.

Output: , where

SFS returns a subset of features; the number of selected features , where , has to be specified a priori.

Initialization: ,

Y = { , , . . . , }y1 y2 yd

d

= { | j = 1, 2, . . . , k; ∈ Y}

Xk xj xj k = (0, 1, 2, . . . , d)

k k < d

= ∅

X0 k = 0

We initialize the algorithm with an empty set ("null set") so that (where is the size of the subset).

Step 1 (Inclusion):

Go to Step 1

in this step, we add an additional feature, , to our feature subset .

is the feature that maximizes our criterion function, that is, the feature that is associated with the best classifier performance if it is added to .

We repeat this procedure until the termination criterion is satisfied.

Termination:

We add features from the feature subset until the feature subset of size contains the number of desired features that we specified a priori.

Sequential Backward Selection (SBS)

Input: the set of all features,

The SBS algorithm takes the whole feature set as input.

Output: , where

SBS returns a subset of features; the number of selected features , where , has to be specified a priori.

Initialization: ,

We initialize the algorithm with the given feature set so that the . Step 1 (Exclusion):

Go to Step 1

In this step, we remove a feature, from our feature subset .

is the feature that maximizes our criterion function upon re,oval, that is, the feature that is associated with the best classifier performance if it is removed from .

We repeat this procedure until the termination criterion is satisfied.

Termination:

We add features from the feature subset until the feature subset of size contains the number of desired features that we specified a priori.

Sequential Backward Floating Selection (SBFS)

Input: the set of all features,

The SBFS algorithm takes the whole feature set as input.

Output: , where

SBFS returns a subset of features; the number of selected features , where , has to be specified a priori.

Initialization: ,

k = 0 k

= arg max J( + x), where x ∈ Y −

x+ xk Xk

= +

Xk+ 1 Xk x+ k = k + 1

x+ Xk

x+ Xk

k = p

Xk k p

Y = { , , . . . , }y1 y2 yd

= { | j = 1, 2, . . . , k; ∈ Y}

Xk xj xj k = (0, 1, 2, . . . , d)

k k < d

= Y

X0 k = d

k = d

= arg max J( − x), where x ∈

x xk Xk

=

Xk−1 Xk x k = k − 1

x Xk

x

Xk

k = p

Xk k p

Y = { , , . . . , }y1 y2 yd

= { | j = 1, 2, . . . , k; ∈ Y}

Xk xj xj k = (0, 1, 2, . . . , d)

k k < d

= Y

X0 k = d

(22)

Sebastian Raschka STAT 479: Machine Learning FS 2018 22

Sequential Backward Selection

We initialize the algorithm with an empty set ("null set") so that (where is the size of the subset).

Step 1 (Inclusion):

Go to Step 1

in this step, we add an additional feature, , to our feature subset .

is the feature that maximizes our criterion function, that is, the feature that is associated with the best classifier performance if it is added to .

We repeat this procedure until the termination criterion is satisfied.

Termination:

We add features from the feature subset until the feature subset of size contains the number of desired features that we specified a priori.

Sequential Backward Selection (SBS)

Input: the set of all features,

The SBS algorithm takes the whole feature set as input.

Output: , where

SBS returns a subset of features; the number of selected features , where , has to be specified a priori.

Initialization: ,

We initialize the algorithm with the given feature set so that the . Step 1 (Exclusion):

Go to Step 1

In this step, we remove a feature, from our feature subset .

is the feature that maximizes our criterion function upon re,oval, that is, the feature that is associated with the best classifier performance if it is removed from .

We repeat this procedure until the termination criterion is satisfied.

Termination:

We add features from the feature subset until the feature subset of size contains the number of desired features that we specified a priori.

Sequential Backward Floating Selection (SBFS)

Input: the set of all features,

The SBFS algorithm takes the whole feature set as input.

k = 0 k

= arg max J( + x), where x ∈ Y −

x+ xk Xk

= +

Xk+ 1 Xk x+ k = k + 1

x+ Xk

x+ Xk

k = p

Xk k p

Y = { , , . . . , }y1 y2 yd

= { | j = 1, 2, . . . , k; ∈ Y}

Xk xj xj k = (0, 1, 2, . . . , d)

k k < d

= Y

X0 k = d

k = d

= arg max J( − x), where x ∈

x xk Xk

=

Xk−1 Xk x k = k − 1

x Xk

x

Xk

k = p

Xk k p

Y = { , , . . . , }y1 y2 yd

(23)

Sebastian Raschka STAT 479: Machine Learning FS 2018 23

Sequential Floating Forward Selection

We initialize the algorithm with the given feature set so that the . Step 1 (Exclusion):

Go to Step 2

In this step, we remove a feature, from our feature subset .

is the feature that maximizes our criterion function upon re,oval, that is, the feature that is associated with the best classifier performance if it is removed from .

Step 2 (Conditional Inclusion):

if J(x_k + x) > J(x_k + x):

Go to Step 1

In Step 2, we search for features that improve the classifier performance if they are added back to the feature subset. If such features exist, we add the feature for which the performance improvement is maximized. If or an improvement cannot be made (i.e., such feature

cannot be found), go back to step 1; else, repeat this step.

Termination:

We add features from the feature subset until the feature subset of size contains the number of desired features that we specified a priori.

Sequential Forward Floating Selection (SFFS)

Input: the set of all features,

The SFFS algorithm takes the whole feature set as input, if our feature space consists of, e.g. 10, if our feature space consists of 10 dimensions (d = 10).

Output: a subset of features, , where

The returned output of the algorithm is a subset of the feature space of a specified size. E.g., a subset of 5 features from a 10-dimensional feature space (k = 5, d = 10).

Initialization: ,

We initialize the algorithm with an empty set ("null set") so that the k = 0 (where k is the size of the subset)

Step 1 (Inclusion):

Go to Step 2

Step 2 (Conditional Exclusion):

k = d

= arg max J( − x), where x ∈

x xk Xk

=

Xk−1 Xk x k = k − 1

x Xk

x

Xk

= arg max J( + x), where x ∈ Y −

x+ xk Xk

= +

Xk+ 1 Xk x+ k = k + 1

x+ k = 2

x+

k = p

Xk k p

Y = { , , . . . , }y1 y2 yd

= { | j = 1, 2, . . . , k; ∈ Y}

Xk xj xj k = (0, 1, 2, . . . , d)

= Y

X0 k = d

= arg max J( + x), where x ∈ Y −

x+ xk Xk

= +

Xk+ 1 Xk x+ k = k + 1

= arg max J( − x), where x ∈

x xk Xk

:

Go to Step 1

In step 1, we include the feature from the feature space that leads to the best performance increase for our feature subset (assessed by the criterion function). Then, we go over to step 2

In step 2, we only remove a feature if the resulting subset would gain an increase in performance. If or an improvement cannot be made (i.e., such feature cannot be found), go back to step 1; else, repeat this step.

Steps 1 and 2 are repeated until the Termination criterion is reached.

Termination: stop when k equals the number of desired features

References

Ferri, F. J., Pudil P., Hatef, M., Kittler, J. (1994). "Comparative study of techniques for large-scale feature selection."

(https://books.google.com/books?

hl=en&lr=&id=sbajBQAAQBAJ&oi=fnd&pg=PA403&dq=comparative+study+of+techniques+for+large+scale&ots=KdIOYpA8wj&sig=hdOsBP1HX4hcDjx4RLg_chheojc#v=onepage&q=comparative%20study%20of%20techniques%20for%20large%20scale&f=false) Pattern Recognition in Practice IV : 403-413.

Pudil, P., Novovičová, J., & Kittler, J. (1994). "Floating search methods in feature selection."

(http://www.sciencedirect.com/science/article/pii/0167865594901279) Pattern recognition letters 15.11 (1994): 1119-1125.

Example 1 - A simple Sequential Forward Selection example

Initializing a simple classifier from scikit-learn:

from sklearn.neighbors import KNeighborsClassifier from sklearn.datasets import load_iris

iris = load_iris() X = iris.data y = iris.target

knn = KNeighborsClassifier(n_neighbors=4)

We start by selection the "best" 3 features from the Iris dataset via Sequential Forward Selection (SFS). Here, we set forward=True and

floating=False. By choosing cv=0, we don't perform any cross-validation, therefore, the performance (here: 'accuracy') is computed entirely on the training set.

from mlxtend.feature_selection import SequentialFeatureSelector as SFS sfs1 = SFS(knn,

k_features=3, forward=True, floating=False, verbose=2,

scoring='accuracy', cv=0)

sfs1 = sfs1.fit(X, y)

if J( − x) > J( − x)xk xk

=

Xk−1 Xk x k = k − 1

k = 2 x+

(24)

Sebastian Raschka STAT 479: Machine Learning FS 2018 24

Sequential Floating Backward Selection

We initialize the algorithm with an empty set ("null set") so that (where is the size of the subset).

Step 1 (Inclusion):

Go to Step 1

in this step, we add an additional feature, , to our feature subset .

is the feature that maximizes our criterion function, that is, the feature that is associated with the best classifier performance if it is added to .

We repeat this procedure until the termination criterion is satisfied.

Termination:

We add features from the feature subset until the feature subset of size contains the number of desired features that we specified a priori.

Sequential Backward Selection (SBS)

Input: the set of all features,

The SBS algorithm takes the whole feature set as input.

Output: , where

SBS returns a subset of features; the number of selected features , where , has to be specified a priori.

Initialization: ,

We initialize the algorithm with the given feature set so that the . Step 1 (Exclusion):

Go to Step 1

In this step, we remove a feature, from our feature subset .

is the feature that maximizes our criterion function upon re,oval, that is, the feature that is associated with the best classifier performance if it is removed from .

We repeat this procedure until the termination criterion is satisfied.

Termination:

We add features from the feature subset until the feature subset of size contains the number of desired features that we specified a priori.

Sequential Backward Floating Selection (SBFS)

Input: the set of all features,

The SBFS algorithm takes the whole feature set as input.

Output: , where

SBFS returns a subset of features; the number of selected features , where , has to be specified a priori.

Initialization: ,

k = 0 k

= arg max J( + x), where x ∈ Y −

x+ xk Xk

= +

Xk+ 1 Xk x+ k = k + 1

x+ Xk

x+ Xk

k = p

Xk k p

Y = { , , . . . , }y1 y2 yd

= { | j = 1, 2, . . . , k; ∈ Y}

Xk xj xj k = (0, 1, 2, . . . , d)

k k < d

= Y

X0 k = d

k = d

= arg max J( − x), where x ∈

x xk Xk

=

Xk−1 Xk x k = k − 1

x Xk

x

Xk

k = p

Xk k p

Y = { , , . . . , }y1 y2 yd

= { | j = 1, 2, . . . , k; ∈ Y}

Xk xj xj k = (0, 1, 2, . . . , d)

k k < d

= Y

X0 k = d

We initialize the algorithm with the given feature set so that the . Step 1 (Exclusion):

Go to Step 2

In this step, we remove a feature, from our feature subset .

is the feature that maximizes our criterion function upon re,oval, that is, the feature that is associated with the best classifier performance if it is removed from .

Step 2 (Conditional Inclusion):

if J(x_k + x) > J(x_k + x):

Go to Step 1

In Step 2, we search for features that improve the classifier performance if they are added back to the feature subset. If such features exist, we add the feature for which the performance improvement is maximized. If or an improvement cannot be made (i.e., such feature

cannot be found), go back to step 1; else, repeat this step.

Termination:

We add features from the feature subset until the feature subset of size contains the number of desired features that we specified a priori.

Sequential Forward Floating Selection (SFFS)

Input: the set of all features,

The SFFS algorithm takes the whole feature set as input, if our feature space consists of, e.g. 10, if our feature space consists of 10 dimensions (d = 10).

Output: a subset of features, , where

The returned output of the algorithm is a subset of the feature space of a specified size. E.g., a subset of 5 features from a 10-dimensional feature space (k = 5, d = 10).

Initialization: ,

We initialize the algorithm with an empty set ("null set") so that the k = 0 (where k is the size of the subset)

Step 1 (Inclusion):

Go to Step 2

k = d

= arg max J( − x), where x ∈

x xk Xk

=

Xk−1 Xk x k = k − 1

x Xk

x

Xk

= arg max J( + x), where x ∈ Y −

x+ xk Xk

= +

Xk+ 1 Xk x+ k = k + 1

x+ k = 2

x+

k = p

Xk k p

Y = { , , . . . , }y1 y2 yd

= { | j = 1, 2, . . . , k; ∈ Y}

Xk xj xj k = (0, 1, 2, . . . , d)

= Y

X0 k = d

= arg max J( + x), where x ∈ Y −

x+ xk Xk

= +

Xk+ 1 Xk x+ k = k + 1

(25)

Sebastian Raschka STAT 479: Machine Learning FS 2018 25 Pudil, P., Novovičová, J., & Kittler, J. (1994). "Floating search methods in feature selection." Pattern recognition letters

15.11 (1994): 1119-1125.

(26)

Ferri, F. J., Pudil P., Hatef, M., Kittler, J. (1994). "Comparative study of techniques for large-scale feature

(27)

Sebastian Raschka STAT 479: Machine Learning FS 2018 27 Ferri, F. J., Pudil P., Hatef, M., Kittler, J. (1994). "Comparative study of techniques for large-scale feature

selection." Pattern Recognition in Practice IV : 403-413.

(28)

Code Examples

https://github.com/rasbt/stat479-machine-learning-fs18/blob/master/13_feat-sele/13_feat- sele_code.ipynb

References

Related documents

In smokers without respiratory disease, an Alcohol- consumption pattern is associated with impaired lung function overall population, a Westernised pattern re- duced lung function

The tests as per IS standards were carried out to determine the variation in liquid limit, plastic limit, plasticity index, compaction characteristics and unconfined

Having identified gaps in exist- ing research, relating to the lack of semantic consideration and a syntactic approach that was too narrowly focused, we settled on a

Infant and young child feeding practices differ by ethnicity of Vietnamese mothers RESEARCH ARTICLE Open Access Infant and young child feeding practices differ by ethnicity of

Soliton, breather, lump and their interaction solutions of the (2+1$2+1$) dimensional asymmetrical Nizhnik?Novikov?Veselov equation Liu and Wen Advances in Difference Equations (2019)

appeared that the majority of cauliflower mitochondrial protein spots responsive to cold and heat 211.. stress, belonged to the class participating in carbohydrate

• First, the best single feature is selected (i.e., using some criterion). • Then, pairs of features are formed using one of the remaining features and this best feature, and the