| title | Sets and Relations | ||||||
|---|---|---|---|---|---|---|---|
| sidebar_label | Sets & Relations | ||||||
| description | Exploring the fundamentals of Set Theory and Relations, and how these discrete structures underpin data categorization and recommendation systems in Machine Learning. | ||||||
| tags |
|
Discrete Mathematics deals with distinct, separated values rather than continuous ranges. Set Theory is the language we use to group these values, and Relations describe how these groups interact. In Machine Learning, these concepts are vital for everything from defining probability spaces to building database schemas for training data.
A Set is an unordered collection of distinct objects, called elements.
-
$A = {1, 2, 3}$ : A set containing numbers 1, 2, and 3. -
$x \in A$ :$x$ is an element of set$A$ . -
$\emptyset$ : An empty set. -
$\mathbb{R}, \mathbb{Z}, \mathbb{N}$ : Sets of Real numbers, Integers, and Natural numbers.
:::tip Common Sets in ML
-
$\mathbb{R}$ (Real Numbers): Used for continuous features like height, price, or weight. -
$\mathbb{Z}$ (Integers): Used for count-based data (e.g., number of clicks). -
${0, 1}$ (Binary Set): The standard output set for binary classification. -
${C_1, C_2, \dots, C_k}$ (Categorical Set): The labels for multi-class classification. :::
The interaction between sets is often visualized using Venn Diagrams.
-
Union (
$A \cup B$ ): Elements in$A$ , or$B$ , or both. (Equivalent to a logicalOR). -
Intersection (
$A \cap B$ ): Elements present in both$A$ and$B$ . (Equivalent to a logicalAND). -
Difference (
$A \setminus B$ ): Elements in$A$ that are not in$B$ . -
Complement (
$A^c$ ): Everything in the universal set that is not in$A$ .
:::tip Sets in ML
In classification tasks, the Label Space is a set. For a cat/dog classifier, the set of possible outputs is
The Cartesian Product of two sets
If
A Relation
In ML, we specifically look for certain properties in relations:
- Reflexive: Every element is related to itself.
-
Symmetric: If
$a$ is related to$b$ , then$b$ is related to$a$ (e.g., "Similarity" in clustering). -
Transitive: If
$a \to b$ and$b \to c$ , then$a \to c$ .
Relations are often represented as Directed Graphs. If
When we perform "One-Hot Encoding" or handle categorical variables, we are mapping elements from a discrete set of categories into a numerical space.
Modern AI often uses Knowledge Graphs (like those powering Google Search). These are massive sets of entities connected by relations (e.g., (Paris, is_capital_of, France)).
Sets and relations form the basis of predicate logic, which is used in "Symbolic AI" and for defining constraints in optimization problems.
Now that we can group objects into sets and relate them, we need to understand the logic that allows us to make valid inferences from these groups.