Information Gain measures how much information entropy is reduced after splitting the dataset on an attribute. It helps identify the attribute that provides the most information about the target class.
Gain(D, A) = H(D) − Σ_{v ∈ Values(A)} (|D_v| / |D|) · H(D_v), where H is the information entropy, D is the dataset, and D_v is the subset of D in which attribute A takes the value v.
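The entropy and gain computations can be sketched in Python as follows; the function names `entropy` and `information_gain` and the list-based data representation are illustrative choices, not part of the original text:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy H of a sequence of class labels, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(labels, attribute_values):
    """Gain(D, A): entropy of the whole dataset minus the weighted
    entropy of the subsets produced by splitting on the attribute."""
    n = len(labels)
    subsets = {}
    for label, value in zip(labels, attribute_values):
        subsets.setdefault(value, []).append(label)
    weighted = sum(len(s) / n * entropy(s) for s in subsets.values())
    return entropy(labels) - weighted
```

A split that perfectly separates the classes removes all entropy, so its gain equals the dataset's original entropy; a split that leaves each subset as mixed as the original has a gain of zero.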
Key Idea: The larger the reduction in entropy after the split, the greater the Information Gain. Attributes with higher Information Gain are preferred for splitting.
Applications in Decision Trees: At each node, the algorithm (e.g., ID3) selects the attribute with the highest Information Gain to split the dataset.
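The per-node selection step can be sketched as below; the dict-of-attributes row format and the name `best_split_attribute` are hypothetical choices for illustration:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy H of a sequence of class labels, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_split_attribute(rows, target):
    """Return (attribute, gain) for the attribute with the highest
    Information Gain. `rows` is a list of {attribute: value} dicts and
    `target` is the parallel list of class labels."""
    base = entropy(target)
    n = len(target)
    best_attr, best_gain = None, -1.0
    for attr in rows[0]:
        subsets = {}
        for row, label in zip(rows, target):
            subsets.setdefault(row[attr], []).append(label)
        gain = base - sum(len(s) / n * entropy(s) for s in subsets.values())
        if gain > best_gain:
            best_attr, best_gain = attr, gain
    return best_attr, best_gain
```

A full decision-tree learner would call this at every node, split the rows on the returned attribute, and recurse on each subset until the labels are pure.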