vikashraj luhaniwal
1 min read · Apr 19, 2020

The built-in mlxtend.feature_selection module does not provide a way to access the p-values of the significant features at each step. However, the score of the chosen evaluation measure (R-squared) at each step can be accessed using the get_metric_dict() method of the SequentialFeatureSelector object.

Alternatively, the p-values can be printed by adding a simple print statement to the significance-comparison block of the user-defined forward_selection() and backward_elimination() functions: print("p-value of added feature - {} is {}".format(new_pval.idxmin(), min_p_value)) in the former and print("p-value of dropped feature - {} is {}".format(p_values.idxmax(), max_p_value)) in the latter. For example:

import pandas as pd
import statsmodels.api as sm

def forward_selection(data, target, significance_level=0.05):
    initial_features = data.columns.tolist()
    best_features = []
    while len(initial_features) > 0:
        remaining_features = list(set(initial_features) - set(best_features))
        # p-value of each candidate feature when added to the current model
        new_pval = pd.Series(index=remaining_features, dtype=float)
        for new_column in remaining_features:
            model = sm.OLS(target, sm.add_constant(data[best_features + [new_column]])).fit()
            new_pval[new_column] = model.pvalues[new_column]
        min_p_value = new_pval.min()
        # If no features remain, min_p_value is NaN and the loop ends here
        if min_p_value < significance_level:
            print("p-value of added feature - {} is {}".format(new_pval.idxmin(), min_p_value))
            best_features.append(new_pval.idxmin())
        else:
            break
    return best_features

vikashraj luhaniwal

AI practitioner and technical consultant with five years of work experience in the field of data science, machine learning, big data, and programming.