Information Gain measures how much information entropy is reduced after splitting the dataset on an attribute. It helps identify the attribute that provides the most information about the target class.
For a dataset D, splitting on an attribute A with values v gives:

IG(D, A) = H(D) - Σ_v (|D_v| / |D|) · H(D_v)

where H is information entropy, D is the dataset, and D_v is the subset of D whose value for A is v.
Key Idea: The larger the reduction in entropy after the split, the greater the Information Gain. Attributes with higher Information Gain are preferred for splitting.
Applications in Decision Trees: At each node, the algorithm selects the attribute with the highest Information Gain to split the dataset.
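This selection step can be sketched in Python. The toy dataset, attribute layout, and function names below are illustrative, not part of any particular library:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy H(D) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def information_gain(rows, labels, attr_index):
    """IG(D, A) = H(D) - sum over v of |D_v|/|D| * H(D_v)."""
    total = len(labels)
    # Partition the labels by the attribute's value.
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attr_index], []).append(label)
    weighted = sum(len(subset) / total * entropy(subset)
                   for subset in partitions.values())
    return entropy(labels) - weighted

# Toy dataset: attributes = (outlook, windy), target = play
rows = [("sunny", False), ("sunny", True), ("rain", False),
        ("rain", True), ("overcast", False), ("overcast", True)]
labels = ["no", "no", "yes", "no", "yes", "yes"]

# As a decision tree would at each node, pick the attribute
# with the highest Information Gain.
best = max(range(2), key=lambda i: information_gain(rows, labels, i))
```

Here splitting on outlook separates the classes far better than splitting on windy, so outlook yields the higher Information Gain and would be chosen for the split.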