Here is the detailed flow of how the Python ML Service (FastAPI) processes media structurally and visually to detect piracy.
sequenceDiagram
participant Clients as Backend / Scraper
participant API as FastAPI (Routes)
participant ML_Models as ML Models (CNN, OpenCV)
participant Similarity as Similarity Engine
%% --------------------------------
%% Flow A: Generate Fingerprint (No Target Provided)
%% --------------------------------
rect rgb(20, 20, 20)
note right of Clients: Flow A: Upload Mode (Generate Profile)
Clients->>API: POST /compare/image or /compare/video (File)
activate API
API->>ML_Models: Extract Hashes & Embeddings
activate ML_Models
ML_Models-->>API: Returns (pHash, dHash, Embedding / Video Fingerprint)
deactivate ML_Models
API-->>Clients: 200 OK (Returns Data dictionary)
deactivate API
end
%% --------------------------------
%% Flow B: Compare with Target (Piracy Detection)
%% --------------------------------
rect rgb(30, 30, 50)
note right of Clients: Flow B: Scraper Mode (Compare Profile)
Clients->>API: POST /compare/image (File + target_phash + target_embedding)
activate API
API->>ML_Models: Extract Hashes & Embeddings for incoming file
activate ML_Models
ML_Models-->>API: Returns Incoming Data
deactivate ML_Models
API->>Similarity: Calculate Hamming Distance & Cosine Similarity
activate Similarity
Similarity-->>API: Returns (Distance, CosSim)
deactivate Similarity
alt Match condition met
note right of API: Is Cosine >= 0.85 OR Hamming <= 10?
API-->>Clients: 200 OK (Match: True, Similarity Score)
else No match
API-->>Clients: 200 OK (Match: False)
end
deactivate API
end
Role: Exposes FastAPI endpoints for incoming files.
- /compare/image: Acts in one of two modes, depending on the parameters:
  - Generation Mode: If only a file is sent, it returns `phash`, `dhash`, and a ResNet `embedding`.
  - Comparison Mode: If `target_phash` and `target_embedding` are included, it computes similarity.
- /compare/video: Extracts frames and generates a list of hashes. Compares against the `target_fingerprint` by finding the minimum distance across frame matches.
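The video comparison described above can be sketched in plain Python. This is a minimal illustration, not the service's code: it assumes a video fingerprint is a list of hex-encoded frame hashes and that "minimum distance" means the smallest Hamming distance over every pair of incoming and target frames. The function names and sample hashes are hypothetical; the threshold of 15 is the video threshold this document gives for Hamming distance.

```python
from itertools import product

VIDEO_HAMMING_THRESHOLD = 15  # video threshold stated in this document


def hamming_hex(a: str, b: str) -> int:
    """Number of differing bits between two equal-length hex hash strings."""
    if len(a) != len(b):
        raise ValueError("hashes must have the same length")
    return bin(int(a, 16) ^ int(b, 16)).count("1")


def min_frame_distance(incoming: list[str], target: list[str]) -> int:
    """Smallest Hamming distance over every (incoming, target) frame pair."""
    if not incoming or not target:
        raise ValueError("both fingerprints need at least one frame hash")
    return min(hamming_hex(a, b) for a, b in product(incoming, target))


# Hypothetical 64-bit frame hashes written as 16 hex digits.
target_fingerprint = ["ffff0000ffff0000", "00ff00ff00ff00ff"]
incoming_fingerprint = ["0123456789abcdef", "ffff0000ffff0001"]

distance = min_frame_distance(incoming_fingerprint, target_fingerprint)
is_match = distance <= VIDEO_HAMMING_THRESHOLD
print(distance, is_match)
```

Comparing every pair is quadratic in the number of frames, which is acceptable for short fingerprints; a longer fingerprint would probably call for sampling or an index.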
Role: Understand and map visual features into mathematical arrays.
- Image Hashing (`services/image_hash.py`): Uses the `imagehash` library to generate `pHash` (Perceptual Hash) and `dHash` (Difference Hash). This catches exact duplicates and resized images.
- Deep Embeddings (`models/cnn_model.py`): Uses a Convolutional Neural Network (likely PyTorch/ResNet or equivalent) to generate a high-dimensional vector representing the semantic meaning of the image.
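To make the idea behind a difference hash concrete, here is a small pure-Python sketch. It is not the `imagehash` implementation that the service uses. It assumes the image has already been shrunk to a tiny grayscale grid, and the helper names and pixel values are made up for illustration. Each bit records whether a pixel is brighter than its left neighbour, so the hash depends on gradients and not on absolute brightness.

```python
def dhash_bits(gray: list[list[int]]) -> list[int]:
    """Difference hash of a small grayscale grid.

    Each row of width W yields W - 1 bits: 1 where a pixel is brighter
    than its left neighbour, else 0.
    """
    bits = []
    for row in gray:
        for left, right in zip(row, row[1:]):
            bits.append(1 if right > left else 0)
    return bits


def bits_to_hex(bits: list[int]) -> str:
    """Pack bits (most significant first) into a zero-padded hex string."""
    value = int("".join(map(str, bits)), 2)
    width = (len(bits) + 3) // 4
    return f"{value:0{width}x}"


# Hypothetical 4x5 grid standing in for an image already shrunk to grayscale.
original = [
    [10, 20, 30, 40, 50],
    [50, 40, 30, 20, 10],
    [10, 50, 10, 50, 10],
    [30, 30, 30, 30, 30],
]
# A uniformly brighter copy: gradients, and therefore the hash, are unchanged.
brighter = [[p + 25 for p in row] for row in original]

print(bits_to_hex(dhash_bits(original)))                # f0a0
print(dhash_bits(original) == dhash_bits(brighter))     # True
```

A real pipeline would resize and convert the image before hashing, which is why resized copies of the same picture tend to produce the same or nearly the same hash.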
Role: Decide whether two mathematical representations refer to the same visual asset.
- Hamming Distance: Counts the number of differing bits between two hashes. For images, a distance `<= 10` is considered a match; for videos, the threshold is `<= 15`.
- Cosine Similarity: Measures the cosine of the angle between two embedding vectors. A score `>= 0.85` (85%) means the model considers the images visually equivalent, even if crops or watermarks have broken the simple hashes.
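The two measures and the image match rule (cosine `>= 0.85` OR Hamming `<= 10`) can be sketched as follows. This is an illustrative version using only the standard library; the function names, the hex-string hash representation, and the sample vectors are assumptions, and the actual service may well use NumPy or the `imagehash` subtraction operator instead.

```python
import math

HAMMING_THRESHOLD = 10   # image threshold stated in this document
COSINE_THRESHOLD = 0.85  # embedding threshold stated in this document


def hamming_distance(hash_a: str, hash_b: str) -> int:
    """Count differing bits between two equal-length hex hash strings."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    return bin(int(hash_a, 16) ^ int(hash_b, 16)).count("1")


def cosine_similarity(u: list[float], v: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    if len(u) != len(v):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def is_match(hash_a: str, hash_b: str,
             emb_a: list[float], emb_b: list[float]) -> bool:
    """Match if EITHER signal clears its threshold."""
    return (cosine_similarity(emb_a, emb_b) >= COSINE_THRESHOLD
            or hamming_distance(hash_a, hash_b) <= HAMMING_THRESHOLD)


# Hashes differ in all 64 bits, but the embeddings are nearly parallel.
print(is_match("ffffffffffffffff", "0000000000000000",
               [1.0, 2.0, 3.0], [1.1, 1.9, 3.2]))   # True
# Neither signal agrees: orthogonal embeddings and distant hashes.
print(is_match("ffffffffffffffff", "0000000000000000",
               [1.0, 0.0], [0.0, 1.0]))             # False
```

Because the rule is an OR, either signal alone is enough to flag a match: the hash catches near-exact copies cheaply, while the embedding covers edits that change many hash bits.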
Because FastAPI receives uploads as streamed multipart form data, the service first saves each incoming file to a local `/uploads` directory under a `uuid4()` name, passes that file path to the ML models, and always calls `os.remove(temp_filepath)` afterwards so that temporary files do not accumulate on disk.
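One common way to guarantee that cleanup is a `try`/`finally` block, sketched below with the standard library only. Whether the service is structured this way is not stated here, so treat it as an assumption. `process_upload`, `fake_analyze`, and the relative `uploads` path are hypothetical names for illustration; in the real service the bytes would come from the uploaded file and the callback would be the ML extraction step.

```python
import os
import uuid
from pathlib import Path
from typing import Callable

UPLOAD_DIR = Path("uploads")  # assumed location of the uploads directory


def process_upload(data: bytes, suffix: str,
                   analyze: Callable[[str], dict]) -> dict:
    """Save bytes under a uuid4 name, run `analyze` on the path, always clean up."""
    UPLOAD_DIR.mkdir(parents=True, exist_ok=True)
    temp_filepath = str(UPLOAD_DIR / f"{uuid.uuid4().hex}{suffix}")
    try:
        with open(temp_filepath, "wb") as fh:
            fh.write(data)
        return analyze(temp_filepath)
    finally:
        # Runs on success and on error, so no temp file is left behind.
        if os.path.exists(temp_filepath):
            os.remove(temp_filepath)


# Hypothetical stand-in for the ML step: it only reports the file size.
seen = {}


def fake_analyze(path: str) -> dict:
    seen["path"] = path
    return {"size": os.path.getsize(path)}


result = process_upload(b"not a real image", ".jpg", fake_analyze)
print(result, os.path.exists(seen["path"]))
```

The random name keeps concurrent uploads from overwriting each other, and the `finally` clause removes the file even when the analysis step raises.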