java.lang.Object
  weka.attributeSelection.ASSearch
      weka.attributeSelection.RerankingSearch
public class RerankingSearch
It first creates a univariate ranking of all
attributes in decreasing order of merit, using an information-theory-based
AttributeEvaluator. The ranking is then split into blocks of size B, and an
ASSearch is run on the first block. Given the attributes selected so far, the
rest of the ranking is re-ranked and the ASSearch is run again on the new first
block. The search stops when no attribute is selected in the current block. For
more information, see:
Pablo Bermejo et al. Fast wrapper feature subset selection in
high-dimensional datasets by means of filter re-ranking. Knowledge-Based
Systems. doi:10.1016/j.knosys.2011.01.015.
@article{BermejoRerank,
  author = "Pablo Bermejo and Luis de la Ossa and Jose A. Gamez and Jose M. Puerta",
  title = "Fast wrapper feature subset selection in high-dimensional datasets by means of filter re-ranking",
  journal = "Knowledge-Based Systems",
  number = "",
  pages = " - ",
  year = "2011",
  issn = "0950-7051",
  doi = "DOI: 10.1016/j.knosys.2011.01.015",
}
Valid options are:
-method
 Specifies the method used to re-rank attributes (default 0: CMIM)
-blockSize
 Specifies the size of the blocks over which search is performed (default 20)
-rankingMeasure
 Information-theory-based univariate attribute evaluator used to create the first ranking (default 0: Information Gain)
-search
 Class name of the ASSearch search algorithm to be used over blocks. Place any options of the search algorithm LAST on the command line following a "--". E.g.: -search weka.attributeSelection.GreedyStepwise ... -- -C
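The block-wise procedure described in the class summary can be sketched in plain Java, independent of Weka. This is a minimal illustration under stated assumptions, not the class's actual implementation: `BlockSearch` and `Reranker` are hypothetical stand-ins for the configured ASSearch and the re-ranking method, and the sketch assumes that the attributes of a searched block leave the ranking whether or not they were selected.

```java
import java.util.ArrayList;
import java.util.List;

final class BlockwiseRerankSketch {

    /** Hypothetical stand-in for the ASSearch run over one block. */
    interface BlockSearch {
        // Returns the attributes of 'block' that the search adds to 'selected'.
        List<Integer> select(List<Integer> selected, List<Integer> block);
    }

    /** Hypothetical stand-in for the re-ranking method (e.g. CMIM). */
    interface Reranker {
        // Returns 'remaining' reordered given the attributes selected so far.
        List<Integer> rerank(List<Integer> remaining, List<Integer> selected);
    }

    static List<Integer> search(List<Integer> ranking, int blockSize,
                                BlockSearch blockSearch, Reranker reranker) {
        List<Integer> selected = new ArrayList<>();
        List<Integer> remaining = new ArrayList<>(ranking);
        while (!remaining.isEmpty()) {
            int end = Math.min(blockSize, remaining.size());
            List<Integer> block = new ArrayList<>(remaining.subList(0, end));
            List<Integer> added = blockSearch.select(selected, block);
            if (added.isEmpty()) {
                break; // stop: no attribute was selected in the current block
            }
            selected.addAll(added);
            // Assumption: the whole block leaves the ranking, selected or not.
            remaining.subList(0, end).clear();
            remaining = new ArrayList<>(reranker.rerank(remaining, selected));
        }
        return selected;
    }
}
```

With a block size of 2 and a block search that never accepts anything from the second block, the loop ends after two blocks, which is the source of the speed-up over searching the full ranking.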
Field Summary | |
---|---|
protected static int |
CMIM
Fleuret's Conditional Mutual Information Maximization |
protected static int |
IG
|
protected weka.attributeSelection.ASEvaluation |
m_ASEval
attribute set evaluator applied in search |
protected double[] |
m_attributes_merits_globalIndexes
merit of each attribute evaluated by m_univariateEvaluator; position i refers to attribute i in training data |
protected int |
m_B
Block size |
protected int |
m_blocksSearched
Total number of blocks over which search was performed |
protected int |
m_informationBasedEvaluator
Univariate attribute evaluator with respect to the class. |
protected int[] |
m_ranking
ranking of attributes in decreasing order given m_univariateEvaluator |
protected double |
m_rerankingTime_ms
Time (ms) spent in re-ranking |
protected int |
m_rerankMethod
Re-rank method |
protected weka.attributeSelection.ASSearch |
m_searchAlgorithm
search algorithm applied over block to select attributes |
protected double |
m_searchTime_ms
Total time (ms) spent in search |
protected int[] |
m_selected
selected attributes |
protected static int |
MIFS
Battiti's Mutual Information-Based Feature Selection |
protected static int |
MRMR
Peng's Max-Relevance and Min-Redundancy |
(package private) static long |
serialVersionUID
for serialisation |
protected static int |
SU
|
static weka.core.Tag[] |
TAGS_INFORMATION_BASED_EVAL
|
static weka.core.Tag[] |
TAGS_RERANK
|
Constructor Summary | |
---|---|
RerankingSearch()
|
Method Summary | |
---|---|
java.lang.String |
bTipText()
Returns the tip text for this property. |
protected void |
createUnivariateRanking(weka.core.Instances data)
It creates a ranking of attributes in decreasing order of merit, given the chosen attribute evaluator. |
protected boolean |
different(int[] a,
int[] b)
check if two arrays contain the same values, in any order |
int |
getB()
get size of blocks over which search is performed |
int |
getBlocksSearched()
get number of blocks of attributes in ranking over which search has been performed. |
protected double |
getConditionalMutualInformation(int posX,
int posY,
int posZ,
weka.core.Instances data)
Computes I(X;Y|Z) |
protected int[] |
getGlobalIndexes(int[] relative,
weka.core.Instances relativeData,
weka.core.Instances globalData)
It converts the indexes of attributes in relative[], which refer to relativeData, to the indexes of the same attributes in globalData. |
weka.core.SelectedTag |
getInformationBasedEvaluator()
Get method used to create first univariate ranking |
protected double[][] |
getJointXY(int pX,
int pY,
weka.core.Instances data)
Compute joint probability for attributes pX and pY in data |
protected double[][][] |
getJointXYZ(int pX,
int pY,
int pZ,
weka.core.Instances data)
Compute joint probability for attributes pX, pY and pZ in data |
protected double[] |
getMarginalProb(int pos,
weka.core.Instances data)
Computes marginal probability for attribute with index pos, in data |
protected double |
getMutualInformation(int posX,
int posY,
weka.core.Instances data)
Computes I(X;Y) |
java.lang.String[] |
getOptions()
get a String[] describing the value set for all options |
double |
getRerankingTime_ms()
time in milliseconds spent during the search in re-ranking computations |
weka.core.SelectedTag |
getRerankMethod()
Get method used for re-ranking |
weka.attributeSelection.ASSearch |
getSearchAlgorithm()
Get a deep copy of the search algorithm used over blocks |
double |
getSearchTime_ms()
total time in milliseconds spent during the search |
int[] |
getSelected()
get list of attributes selected in search |
protected java.lang.String |
getStartSetString(int numberSelected)
Creates a string of attributes indexes separated by commas. |
java.lang.String |
globalInfo()
Returns a string describing this search method |
java.lang.String |
informationBasedEvaluatorTipText()
Returns the tip text for m_univariateEvaluator |
protected boolean |
isIn(int n,
int[] array)
Check if value n is in array |
java.util.Enumeration<weka.core.Option> |
listOptions()
Returns an enumeration describing the available options. |
protected int[] |
minMaxOrder(double[][] I_XiC_givenXj)
|
protected weka.core.Instances |
projectBlock(weka.core.Instances data)
It projects data over attributes selected up to now + first m_B attributes in ranking + the class attribute |
protected void |
rerank(weka.core.Instances data)
re-rank remaining attributes in m_ranking[] given m_selected[] |
protected void |
rerankCMIM(weka.core.Instances data)
|
java.lang.String |
rerankMethodTipText()
Returns the tip text for this property. |
protected void |
rerankMIFS_MRMR(weka.core.Instances data,
double factor)
|
void |
resetOptions()
reset all options to their default values |
int[] |
search(weka.attributeSelection.ASEvaluation ASEval,
weka.core.Instances data)
Performs search |
java.lang.String |
searchAlgorithmTipText()
Returns the tip text for this property. |
void |
setB(int B)
set size of blocks (cardinality) over which search is performed |
void |
setInformationBasedEvaluator(weka.core.SelectedTag newType)
set evaluator to create univariate ranking |
void |
setOptions(java.lang.String[] options)
Parses a given list of options. |
void |
setRerankMethod(weka.core.SelectedTag newType)
Set method to use for re-ranking |
void |
setSearchAlgorithm(weka.attributeSelection.ASSearch m_searchAlgorithm)
set the search algorithm to use over blocks in ranking |
java.lang.String |
toString()
Description of the search |
Methods inherited from class weka.attributeSelection.ASSearch |
---|
forName, getRevision, makeCopies |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
Field Detail |
---|
static final long serialVersionUID
protected int m_rerankMethod
protected int m_B
protected int m_informationBasedEvaluator
protected weka.attributeSelection.ASSearch m_searchAlgorithm
protected weka.attributeSelection.ASEvaluation m_ASEval
protected double m_searchTime_ms
protected double m_rerankingTime_ms
protected int m_blocksSearched
protected int[] m_selected
protected static final int CMIM
protected static final int MIFS
protected static final int MRMR
protected static final int IG
protected static final int SU
protected int[] m_ranking
protected double[] m_attributes_merits_globalIndexes
public static final weka.core.Tag[] TAGS_RERANK
public static final weka.core.Tag[] TAGS_INFORMATION_BASED_EVAL
Constructor Detail |
---|
public RerankingSearch()
Method Detail |
---|
public int[] search(weka.attributeSelection.ASEvaluation ASEval, weka.core.Instances data) throws java.lang.Exception
search
in class weka.attributeSelection.ASSearch
Parameters:
ASEval - the attribute evaluator to guide the search
data - the training instances.
Throws:
java.lang.Exception - if the search can't be completed
public void setOptions(java.lang.String[] options) throws java.lang.Exception
-method
 Specifies the method used to re-rank attributes (default 0: CMIM)
-blockSize
 Specifies the size of the blocks over which search is performed (default 20)
-rankingMeasure
 Information-theory-based univariate attribute evaluator used to create the first ranking (default 0: Information Gain)
-search
 Class name of the ASSearch search algorithm to be used over blocks. Place any options of the search algorithm LAST on the command line following a "--". E.g.: -search weka.attributeSelection.GreedyStepwise ... -- -C
setOptions
in interface weka.core.OptionHandler
Throws:
java.lang.Exception
public java.lang.String[] getOptions()
getOptions
in interface weka.core.OptionHandler
protected void createUnivariateRanking(weka.core.Instances data) throws java.lang.Exception
Parameters:
data - from which to build evaluator
Throws:
java.lang.Exception
protected weka.core.Instances projectBlock(weka.core.Instances data) throws java.lang.Exception
Parameters:
data - from which to project attributes to create a block
Throws:
java.lang.Exception
protected int[] getGlobalIndexes(int[] relative, weka.core.Instances relativeData, weka.core.Instances globalData)
Parameters:
relative - attribute indexes selected in previous block
relativeData - previous block
globalData - original data with all the attributes
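One way to picture the conversion from block-relative to global indexes is a lookup by attribute name. The sketch below is an assumption about how such a mapping could work, not the documented behaviour of `getGlobalIndexes`: `GlobalIndexSketch`, `toGlobal`, and the name lists are hypothetical, and plain string lists stand in for the attribute headers of `relativeData` and `globalData`.

```java
import java.util.List;

final class GlobalIndexSketch {

    /**
     * Maps indexes into relativeNames (the projected block) to indexes
     * into globalNames (the original data), matching attributes by name.
     */
    static int[] toGlobal(int[] relative, List<String> relativeNames,
                          List<String> globalNames) {
        int[] global = new int[relative.length];
        for (int i = 0; i < relative.length; i++) {
            String name = relativeNames.get(relative[i]);
            int g = globalNames.indexOf(name);
            if (g < 0) {
                throw new IllegalArgumentException("Unknown attribute: " + name);
            }
            global[i] = g;
        }
        return global;
    }
}
```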
public java.util.Enumeration<weka.core.Option> listOptions()
listOptions
in interface weka.core.OptionHandler
protected void rerank(weka.core.Instances data) throws java.lang.Exception
Parameters:
data - original training data
Throws:
java.lang.Exception
protected java.lang.String getStartSetString(int numberSelected)
Parameters:
numberSelected - cardinality of attributes selected in previous block
protected boolean different(int[] a, int[] b)
Parameters:
a - []
b - []
protected boolean isIn(int n, int[] array)
Parameters:
n -
array -
public weka.attributeSelection.ASSearch getSearchAlgorithm() throws java.lang.Exception
Throws:
java.lang.Exception
public void setSearchAlgorithm(weka.attributeSelection.ASSearch m_searchAlgorithm)
Parameters:
m_searchAlgorithm - new search algorithm to be used over blocks
public weka.core.SelectedTag getRerankMethod()
public void setRerankMethod(weka.core.SelectedTag newType) throws java.lang.Exception
Parameters:
newType - the type of re-rank method desired
Throws:
java.lang.Exception
public double getSearchTime_ms()
public double getRerankingTime_ms()
public int getBlocksSearched()
public int getB()
public void setB(int B)
Parameters:
B -
public weka.core.SelectedTag getInformationBasedEvaluator()
public void setInformationBasedEvaluator(weka.core.SelectedTag newType) throws java.lang.Exception
Parameters:
newType - the information-based univariate evaluation to create first ranking
Throws:
java.lang.Exception
public void resetOptions()
public java.lang.String rerankMethodTipText()
public int[] getSelected()
public java.lang.String bTipText()
public java.lang.String informationBasedEvaluatorTipText()
public java.lang.String searchAlgorithmTipText()
public java.lang.String globalInfo()
public java.lang.String toString()
toString
in class java.lang.Object
protected void rerankCMIM(weka.core.Instances data) throws java.lang.Exception
Parameters:
data - from which to compute conditional mutual information values
Modifies m_ranking, ordering attributes in decreasing order of Fleuret's
CMIM approximation of I(Xi;C|m_selected) for all Xi in m_ranking. CMIM
approximates this value with the formula max_Xi min_Xj I(Xi;C|Xj),
for all Xi in m_ranking and all Xj in m_selected.
Throws:
java.lang.Exception
protected void rerankMIFS_MRMR(weka.core.Instances data, double factor) throws java.lang.Exception
Parameters:
data - to compute mutual information values
factor - double value by which the mutual information terms are multiplied
Modifies m_ranking, ordering attributes in decreasing order of Battiti's
MIFS approximation of I(Xi;C|m_selected), for all Xi in m_ranking and all
Xj in m_selected. MIFS approximates this value with the formula I(Xi;C) -
(0.5 * sum I(Xi;Xj)); MRMR approximates it with the formula I(Xi;C) -
(1/|S| * sum I(Xi;Xj)), where S is m_selected and the sum runs over all Xj
in S. Thus, the only difference between the two methods is the
multiplication factor.
Throws:
java.lang.Exception
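The two formulas above differ only in the factor, which a short sketch makes concrete. It assumes the mutual information values are already computed: `relevance[i]` holds I(Xi;C) and `redundancy[i][j]` holds I(Xi;Xj) for the j-th selected attribute. The class and method names are hypothetical and are not part of the Weka API.

```java
final class MifsMrmrScoreSketch {

    /**
     * score[i] = relevance[i] - factor * sum_j redundancy[i][j].
     * Use factor = 0.5 for MIFS and factor = 1/|S| for MRMR.
     */
    static double[] scores(double[] relevance, double[][] redundancy, double factor) {
        double[] score = new double[relevance.length];
        for (int i = 0; i < relevance.length; i++) {
            double sum = 0.0;
            for (double r : redundancy[i]) {
                sum += r; // accumulate I(Xi;Xj) over all selected Xj
            }
            score[i] = relevance[i] - factor * sum;
        }
        return score;
    }

    /** MRMR factor for a selected set of the given size (must be > 0). */
    static double mrmrFactor(int numSelected) {
        return 1.0 / numSelected;
    }
}
```

With exactly two selected attributes the MRMR factor is 0.5, so both criteria yield the same scores. They diverge as the selected set grows, because MRMR averages the redundancy while MIFS keeps summing it.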
protected int[] minMaxOrder(double[][] I_XiC_givenXj)
Parameters:
I_XiC_givenXj - conditional mutual information I(Xi;C|Xj) for all Xi in
training data
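A max-min ordering of the kind CMIM uses can be sketched as follows. The source does not document the matrix layout or the return value of `minMaxOrder`, so the sketch makes both up: rows are assumed to index candidate attributes Xi, columns the selected attributes Xj, and the result is assumed to list row indexes by decreasing min_Xj I(Xi;C|Xj). `MinMaxOrderSketch` is a hypothetical name.

```java
import java.util.Arrays;

final class MinMaxOrderSketch {

    /** Orders candidate rows by decreasing minimum over their columns. */
    static int[] order(double[][] cmiGivenSelected) {
        int n = cmiGivenSelected.length;
        final double[] score = new double[n];
        for (int i = 0; i < n; i++) {
            double min = Double.POSITIVE_INFINITY;
            for (double v : cmiGivenSelected[i]) {
                min = Math.min(min, v); // worst case over the selected attributes
            }
            score[i] = min;
        }
        Integer[] idx = new Integer[n];
        for (int i = 0; i < n; i++) {
            idx[i] = i;
        }
        // Stable sort, highest score first; ties keep their original order.
        Arrays.sort(idx, (a, b) -> Double.compare(score[b], score[a]));
        int[] out = new int[n];
        for (int i = 0; i < n; i++) {
            out[i] = idx[i];
        }
        return out;
    }
}
```

Taking the minimum means a candidate ranks high only if it stays informative about the class given every attribute already selected.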
protected double getConditionalMutualInformation(int posX, int posY, int posZ, weka.core.Instances data) throws java.lang.Exception
Parameters:
posX - index of attribute X
posY - index of attribute Y
posZ - index of conditioning attribute Z
data - training data
Throws:
java.lang.Exception
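For reference, I(X;Y|Z) can be computed from a three-way joint distribution as the sum over x, y, z of p(x,y,z) * log[ p(z) p(x,y,z) / (p(x,z) p(y,z)) ]. The sketch below applies that definition to a plain array. It assumes base-2 logarithms and a `joint[x][y][z]` layout, neither of which is confirmed by the source, and `ConditionalMutualInformationSketch` is a hypothetical name.

```java
final class ConditionalMutualInformationSketch {

    /** joint[x][y][z] = P(X = x, Y = y, Z = z); returns I(X;Y|Z) in bits. */
    static double conditionalMutualInformation(double[][][] joint) {
        int nx = joint.length, ny = joint[0].length, nz = joint[0][0].length;
        double[] pz = new double[nz];
        double[][] pxz = new double[nx][nz];
        double[][] pyz = new double[ny][nz];
        // Marginalise the joint distribution.
        for (int x = 0; x < nx; x++) {
            for (int y = 0; y < ny; y++) {
                for (int z = 0; z < nz; z++) {
                    double p = joint[x][y][z];
                    pz[z] += p;
                    pxz[x][z] += p;
                    pyz[y][z] += p;
                }
            }
        }
        double cmi = 0.0;
        for (int x = 0; x < nx; x++) {
            for (int y = 0; y < ny; y++) {
                for (int z = 0; z < nz; z++) {
                    double p = joint[x][y][z];
                    if (p > 0.0) { // zero-probability cells contribute nothing
                        cmi += p * log2(p * pz[z] / (pxz[x][z] * pyz[y][z]));
                    }
                }
            }
        }
        return cmi;
    }

    private static double log2(double v) {
        return Math.log(v) / Math.log(2.0);
    }
}
```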
protected double getMutualInformation(int posX, int posY, weka.core.Instances data)
Parameters:
posX - index of attribute X
posY - index of attribute Y
data - training data
Throws:
java.lang.Exception
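I(X;Y) follows the same pattern with a two-way joint distribution: the sum over x, y of p(x,y) * log[ p(x,y) / (p(x) p(y)) ]. The sketch below is a minimal illustration, again assuming base-2 logarithms and a `joint[x][y]` layout. `MutualInformationSketch` is a hypothetical name, not the class's own code.

```java
final class MutualInformationSketch {

    /** joint[x][y] = P(X = x, Y = y); returns I(X;Y) in bits. */
    static double mutualInformation(double[][] joint) {
        int nx = joint.length, ny = joint[0].length;
        double[] px = new double[nx];
        double[] py = new double[ny];
        // Marginals are row and column sums of the joint table.
        for (int x = 0; x < nx; x++) {
            for (int y = 0; y < ny; y++) {
                px[x] += joint[x][y];
                py[y] += joint[x][y];
            }
        }
        double mi = 0.0;
        for (int x = 0; x < nx; x++) {
            for (int y = 0; y < ny; y++) {
                double p = joint[x][y];
                if (p > 0.0) { // zero-probability cells contribute nothing
                    mi += p * log2(p / (px[x] * py[y]));
                }
            }
        }
        return mi;
    }

    private static double log2(double v) {
        return Math.log(v) / Math.log(2.0);
    }
}
```

Two independent attributes give 0, and two identical uniform binary attributes give 1 bit.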
protected double[][] getJointXY(int pX, int pY, weka.core.Instances data)
protected double[][][] getJointXYZ(int pX, int pY, int pZ, weka.core.Instances data)
protected double[] getMarginalProb(int pos, weka.core.Instances data)
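The probability helpers above carry no description in this page, so the following is only a plausible reading: marginal and joint probabilities estimated by relative frequency. In the sketch, `rows` is a hypothetical `int[][]` stand-in for `weka.core.Instances`, with nominal values coded 0..n-1, no missing values, and no smoothing. The real methods may handle any of these differently.

```java
final class ProbabilityEstimationSketch {

    /** Relative frequency of each value of the attribute at index pos. */
    static double[] marginal(int[][] rows, int pos, int numValues) {
        double[] p = new double[numValues];
        for (int[] row : rows) {
            p[row[pos]] += 1.0;
        }
        for (int v = 0; v < numValues; v++) {
            p[v] /= rows.length;
        }
        return p;
    }

    /** Relative frequency of each value pair of the attributes at pX and pY. */
    static double[][] jointXY(int[][] rows, int pX, int pY, int numX, int numY) {
        double[][] p = new double[numX][numY];
        for (int[] row : rows) {
            p[row[pX]][row[pY]] += 1.0;
        }
        for (int x = 0; x < numX; x++) {
            for (int y = 0; y < numY; y++) {
                p[x][y] /= rows.length;
            }
        }
        return p;
    }
}
```

A three-way table in the style of `getJointXYZ` would extend `jointXY` with one more index.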