SimSel - a new simulation feature selection method I
2007 (English). Report (Other scientific)
In pharmaceutical research there are datasets describing the interactions between proteins and molecules. The datasets include a huge number of independent variables (features) and the response variable is typically the binding strength. Thus, one of the most challenging problems is to find the features that have a real influence on the binding strength.
Here we present a feature selection method. The principle of the algorithm is to disturb each single feature by adding pseudo errors and to study the influence on the quality of the model fit.
The main idea is that perturbing an unimportant feature has no effect on how well the model explains the binding strength, whereas perturbing an important feature degrades the fit.
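The perturbation principle described above can be illustrated with a minimal sketch. This is not the SimSel procedure from the report, only a simplified illustration under stated assumptions: an ordinary least-squares model, Gaussian pseudo errors of fixed standard deviation, and the mean increase in the residual sum of squares as the importance score. The function names `rss` and `perturbation_scores`, the parameters `noise_sd` and `n_rep`, and the synthetic data are all hypothetical.

```python
import numpy as np


def rss(X, y):
    """Residual sum of squares of a least-squares fit with intercept."""
    A = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return float(resid @ resid)


def perturbation_scores(X, y, noise_sd=1.0, n_rep=50, seed=0):
    """Mean RSS increase when pseudo errors are added to one feature at a time.

    Assumption: Gaussian pseudo errors with standard deviation `noise_sd`,
    averaged over `n_rep` independent replications.
    """
    rng = np.random.default_rng(seed)
    base = rss(X, y)  # fit quality on the undisturbed data
    scores = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        total = 0.0
        for _ in range(n_rep):
            Xp = X.copy()
            Xp[:, j] += rng.normal(0.0, noise_sd, size=len(X))  # disturb feature j only
            total += rss(Xp, y) - base
        scores[j] = total / n_rep
    return scores


# Synthetic example: only features 0 and 2 influence the response.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 2] + rng.normal(scale=0.5, size=200)

scores = perturbation_scores(X, y)
print(np.round(scores, 2))
```

In this toy setting the scores for features 0 and 2 should be large, while those for the remaining features should stay close to zero, which mirrors the idea that disturbing an unimportant feature leaves the fit essentially unchanged.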
Place, publisher, year, edition, pages
Uppsala University, 2007.
U.U.D.M. Report, ISSN 1101-3591; 2007:65
variable selection, model choice, regression, simulation method, pseudoerrors, SIMEX, residual sum of squares
Pharmacology and Toxicology; Probability Theory and Statistics
Identifiers
URN: urn:nbn:se:uu:diva-14231
OAI: oai:DiVA.org:uu-14231
DiVA: diva2:42001