tutorial/docs/machine-learning/mathematics-for-ml/discrete-mathematics/sets-and-relations.mdx at 28173a1e3390e88593ab25c8f4f64bb3780bc54a · codeharborhub/tutorial

title

Sets and Relations

sidebar_label

Sets & Relations

description

Exploring the fundamentals of Set Theory and Relations, and how these discrete structures underpin data categorization and recommendation systems in Machine Learning.

1. Set Theory Fundamentals

A Set is an unordered collection of distinct objects, called elements.

Notation

$A = {1, 2, 3}$ : A set containing numbers 1, 2, and 3.
$x \in A$ : $x$ is an element of set $A$.
$\emptyset$ : An empty set.
$\mathbb{R}, \mathbb{Z}, \mathbb{N}$ : Sets of Real numbers, Integers, and Natural numbers.

:::tip Common Sets in ML

$\mathbb{R}$ (Real Numbers): Used for continuous features like height, price, or weight.
$\mathbb{Z}$ (Integers): Used for count-based data (e.g., number of clicks).
${0, 1}$ (Binary Set): The standard output set for binary classification.
${C_1, C_2, \dots, C_k}$ (Categorical Set): The labels for multi-class classification. :::

Key Operations

The interaction between sets is often visualized using Venn Diagrams.

Union ($A \cup B$): Elements in $A$, or $B$, or both. (Equivalent to a logical OR).
Intersection ($A \cap B$): Elements present in both $A$ and $B$. (Equivalent to a logical AND).
Difference ($A \setminus B$): Elements in $A$ that are not in $B$.
Complement ($A^c$): Everything in the universal set that is not in $A$.

:::tip Sets in ML In classification tasks, the Label Space is a set. For a cat/dog classifier, the set of possible outputs is $Y = {\text{Cat}, \text{Dog}}$. When evaluating models, we often look at the Intersection of predicted labels and true labels to calculate accuracy. :::

2. Cartesian Products

The Cartesian Product of two sets $A$ and $B$, denoted $A \times B$, is the set of all possible ordered pairs $(a, b)$.

$$ A \times B = { (a, b) \mid a \in A \text{ and } b \in B } $$

If $A$ represents "Users" and $B$ represents "Movies," $A \times B$ represents every possible interaction between every user and every movie. This is the foundation of Utility Matrices in Recommender Systems.

3. Relations

A Relation $R$ from set $A$ to set $B$ is simply a subset of the Cartesian product $A \times B$. It defines a relationship between elements of the two sets.

Types of Relations

In ML, we specifically look for certain properties in relations:

Reflexive: Every element is related to itself.
Symmetric: If $a$ is related to $b$, then $b$ is related to $a$ (e.g., "Similarity" in clustering).
Transitive: If $a \to b$ and $b \to c$, then $a \to c$.

Binary Relations and Graphs

Relations are often represented as Directed Graphs. If $(a, b) \in R$, we draw an arrow from node $a$ to node $b$.

4. Why this matters in Machine Learning

A. Data Preprocessing

When we perform "One-Hot Encoding" or handle categorical variables, we are mapping elements from a discrete set of categories into a numerical space.

B. Knowledge Graphs

Modern AI often uses Knowledge Graphs (like those powering Google Search). These are massive sets of entities connected by relations (e.g., (Paris, is_capital_of, France)).

C. Formal Logic in AI

Sets and relations form the basis of predicate logic, which is used in "Symbolic AI" and for defining constraints in optimization problems.

Now that we can group objects into sets and relate them, we need to understand the logic that allows us to make valid inferences from these groups.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

1. Set Theory Fundamentals

Notation

Key Operations

2. Cartesian Products

3. Relations

Types of Relations

Binary Relations and Graphs

4. Why this matters in Machine Learning

A. Data Preprocessing

B. Knowledge Graphs

C. Formal Logic in AI

Uh oh!

FilesExpand file tree

sets-and-relations.mdx

Latest commit

History

sets-and-relations.mdx

File metadata and controls

1. Set Theory Fundamentals

Notation

Key Operations

2. Cartesian Products

3. Relations

Types of Relations

Binary Relations and Graphs

4. Why this matters in Machine Learning

A. Data Preprocessing

B. Knowledge Graphs

C. Formal Logic in AI