One important objective of feature selection (FS) techniques is to avoid over-fitting and improve model performance, i.e., higher prediction accuracy for supervised classification and better cluster detection for unsupervised classification. Improving a model's performance also means providing a faster and more cost-effective model. Many FS techniques exist, and they differ in how each one copes with the feature space to form a feature subset. In the context of classification, FS techniques can be divided into three categories according to how they combine the feature selection search with the construction of the classification model: filter methods, wrapper methods, and embedded methods.
The Filter techniques select relevant features independently of the selection model. They measure the relevance of features using only the properties of the data, then order the features according to the calculated relevance score. The Filter techniques are simple and fast, but they neglect feature dependencies. Filter techniques are performed as a pre-processing step in the selection model and can be followed by one or more classification algorithms. The second category of FS techniques is the Wrapper techniques, where the selection of features depends on the selection model. They define feature subsets and evaluate each one using a classification algorithm. Then, they select the feature subset with the highest evaluation measure. The Wrapper techniques take feature dependencies into consideration, but they are slower and computationally intensive. In the third category, namely the Embedded techniques, feature selection is built into the search step performed by the classification algorithm. They consider feature dependencies but require less computation than Wrapper techniques. For more details about feature selection, the reader can refer to (Saeys et al., 2007).
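As an illustration of the Filter idea described above, the following minimal sketch scores each feature using only the data and then ranks the features by that score. The choice of absolute Pearson correlation as the relevance score, the function names (`pearson`, `filter_rank`, `select_top_k`), and the toy data are assumptions made for this example; they are not taken from the cited works.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no ranking information
    return cov / math.sqrt(var_x * var_y)


def filter_rank(rows, labels):
    """Score every feature column on its own (no classifier involved) and
    return (feature_index, score) pairs, most relevant first."""
    n_features = len(rows[0])
    scores = []
    for j in range(n_features):
        column = [row[j] for row in rows]
        # Assumed relevance score: absolute correlation with the class label.
        scores.append((j, abs(pearson(column, labels))))
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


def select_top_k(rows, labels, k):
    """Keep the indices of the k highest-scoring features."""
    return [j for j, _ in filter_rank(rows, labels)[:k]]


# Toy data (hypothetical): feature 0 tracks the label closely,
# feature 1 only weakly, and feature 2 is constant.
rows = [
    [0.1, 5.0, 1.0],
    [0.2, 3.0, 1.0],
    [0.8, 4.0, 1.0],
    [0.9, 6.0, 1.0],
]
labels = [0, 0, 1, 1]
selected = select_top_k(rows, labels, k=1)
```

Because each feature is scored in isolation, the sketch also shows the limitation noted above: two features that are only informative in combination would both receive low scores. A Wrapper technique would instead train and evaluate a classifier on candidate subsets, at a higher computational cost.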
Using a single FS technique does not guarantee obtaining a universally optimal feature subset (Yeung, Bumgarner, & Raftery, 2005). An ensemble FS approach therefore runs different FS techniques, each of which produces a separate feature subset, and then combines the resulting feature subsets to form a final feature subset as its outcome. Ensemble FS approaches differ from each other in how they combine features. They may average over multiple separate feature subsets (Levner, 2005; Li & Yang, 2002) that result from different runs of the same technique (for example, a genetic algorithm) to assess the importance of each feature (Li, Umbach, Terry, & Taylor, 2004; Li, Weinberg, Darden, & Pedersen, 2001), or they may use a collection of decision trees, as in random forests, to assess the relevance of each feature (Díaz-Uriarte & De Andres, 2006; Jiang et al., 2004). Ensemble FS approaches improve robustness, stability, and generality, but they require additional computation. The development of ensemble frameworks is a promising trend for improving gene selection and the feature selection process in general, because an ensemble framework is more flexible and efficient in dealing with high-dimensional data (Chin et al., 2015).
Genetic Algorithms (GAs), pioneered by John Holland during the 1970s, are a class of evolutionary algorithms motivated by the biological theory of evolution (Scrucca et al., 2013). A Genetic Algorithm (GA) is used in search and optimization problems and relies on the "survival of the fittest" concept known from evolutionary biology. A GA mimics the natural selection process by producing sets of solutions (a population). Each solution, called a chromosome, consists of a set of features (genes) and represents a candidate solution for the underlying problem. A GA repeatedly generates solutions, evaluates their fitness, and terminates when the given objective is achieved or when a stopping criterion is met. The genetic operators and the fitness function characterize the implementation of a genetic algorithm. The fitness function is considered the main guide for the selection of features. It is used to assign to each chromosome in the population a probability that reflects the quality of that chromosome and controls whether it is kept for the next generation. Genetic operators are used to explore the entire search space and to avoid local minima.
The commonly used operators are crossover and mutation. Crossover is a mechanism for swapping genes between two randomly chosen chromosomes, producing two new chromosomes for the next generation. Crossover can be performed on different kinds of representations (such as binary or floating-point encodings). It can also be performed at a single crossover point or at multiple crossover points between chromosomes (Coley, 1999).
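The GA loop described above can be sketched as follows, using a binary encoding in which gene `i` is 1 when feature `i` is selected. This is a minimal illustration under stated assumptions, not the implementation of any cited work: the toy fitness function (agreement with a hypothetical set of `relevant` feature indices), the use of elitism, the parameter defaults, and all function names are invented for the example. In a real feature selection setting the fitness would typically come from evaluating a classifier on the selected subset.

```python
import random


def fitness(chromosome, relevant):
    """Toy fitness (assumption): 1 plus the number of genes that agree with a
    hypothetical set of relevant feature indices. Always positive, so it can
    be used directly as a selection weight."""
    matches = sum(
        1 for i, gene in enumerate(chromosome) if (gene == 1) == (i in relevant)
    )
    return 1 + matches


def single_point_crossover(parent_a, parent_b, rng):
    """Swap the tails of two parents at one random cut point."""
    point = rng.randrange(1, len(parent_a))
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b


def mutate(chromosome, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - gene if rng.random() < rate else gene for gene in chromosome]


def run_ga(n_features, relevant, pop_size=20, generations=30,
           mutation_rate=0.05, seed=0):
    rng = random.Random(seed)
    population = [
        [rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)
    ]
    best = max(population, key=lambda c: fitness(c, relevant))
    for _ in range(generations):
        if fitness(best, relevant) == n_features + 1:
            break  # objective achieved: every gene matches
        weights = [fitness(c, relevant) for c in population]
        next_population = [best[:]]  # elitism (assumption): keep the best as-is
        while len(next_population) < pop_size:
            # Fitness-proportional choice of two parents.
            parent_a, parent_b = rng.choices(population, weights=weights, k=2)
            for child in single_point_crossover(parent_a, parent_b, rng):
                next_population.append(mutate(child, mutation_rate, rng))
        population = next_population[:pop_size]
        best = max(population, key=lambda c: fitness(c, relevant))
    return best


# Hypothetical example: 8 features, of which indices 0, 3 and 5 are relevant.
best = run_ga(n_features=8, relevant={0, 3, 5})
```

Single-point crossover is used here for brevity; a multi-point variant would cut the parents at several positions and alternate the segments. The fixed number of generations plays the role of the stopping criterion, and the early exit corresponds to the objective being achieved.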