graph LR
    base_data_labeler["base_data_labeler"]
    base_model["base_model"]
    character_level_cnn_model["character_level_cnn_model"]
    regex_model["regex_model"]
    column_name_model["column_name_model"]
    data_processing["data_processing"]
    data_labelers["data_labelers"]
    base_data_labeler -- "delegates to" --> data_processing
    base_data_labeler -- "interacts with" --> character_level_cnn_model
    base_data_labeler -- "interacts with" --> regex_model
    base_data_labeler -- "interacts with" --> column_name_model
    base_model -- "is inherited by" --> character_level_cnn_model
    base_model -- "is inherited by" --> regex_model
    base_model -- "is inherited by" --> column_name_model
    data_labelers -- "utilizes" --> base_data_labeler

Details

The Data Labeling Module identifies and classifies sensitive or otherwise targeted data elements end to end. It orchestrates data preparation, model execution (deep learning, regex, or column-name based), and processing of the results.

base_data_labeler

Acts as the primary entry point and orchestrator for the entire data labeling pipeline. It manages the lifecycle of data labelers, including loading, saving, parameter validation, and coordinating pre-processing, model execution, and post-processing steps.
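To make the coordination role concrete, here is a minimal sketch of a labeler that chains a preprocessor, a model, and a postprocessor. Every name in it (`SketchLabeler`, `strip_whitespace`, `digit_model`, `to_label_names`, `LABELS`) is hypothetical and chosen for illustration; it is not the module's actual API.

```python
from typing import Callable, List

# Hypothetical label table used only by this sketch.
LABELS = {0: "UNKNOWN", 1: "INTEGER"}


class SketchLabeler:
    """Illustrative orchestrator: preprocess -> model -> postprocess."""

    def __init__(
        self,
        preprocess: Callable[[List[str]], List[str]],
        model: Callable[[List[str]], List[int]],
        postprocess: Callable[[List[int]], List[str]],
    ) -> None:
        self.preprocess = preprocess
        self.model = model
        self.postprocess = postprocess

    def predict(self, samples: List[str]) -> List[str]:
        prepared = self.preprocess(samples)   # clean the raw input
        raw = self.model(prepared)            # numeric label indices
        return self.postprocess(raw)          # human-readable labels


def strip_whitespace(samples: List[str]) -> List[str]:
    return [s.strip() for s in samples]


def digit_model(samples: List[str]) -> List[int]:
    # Toy stand-in for a real model: index 1 if the sample is all digits.
    return [1 if s.isdigit() else 0 for s in samples]


def to_label_names(raw: List[int]) -> List[str]:
    return [LABELS[i] for i in raw]


labeler = SketchLabeler(strip_whitespace, digit_model, to_label_names)
result = labeler.predict([" 123 ", "abc"])
```

Keeping the three stages as separate callables means any one of them can be swapped without touching the orchestrator, which is the property the description above attributes to this component.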

Related Classes/Methods:

base_model

Serves as the abstract base class for all data labeling models. It provides common functionalities such as managing label mappings, validating parameters, and registering subclasses, ensuring a consistent interface for various model implementations.
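One way to picture these responsibilities is an abstract class that validates a label mapping and records each subclass in a registry. The sketch below assumes a `str -> int` label mapping and uses hypothetical names (`SketchBaseModel`, `AlwaysUnknownModel`, `get_class`); the real base class may be organized differently.

```python
from abc import ABC, abstractmethod


class SketchBaseModel(ABC):
    """Illustrative base class: label mapping, validation, subclass registry."""

    _registry = {}

    def __init_subclass__(cls, **kwargs):
        # Record every subclass by name so it can be looked up later.
        super().__init_subclass__(**kwargs)
        SketchBaseModel._registry[cls.__name__] = cls

    def __init__(self, label_mapping):
        self._validate_label_mapping(label_mapping)
        self.label_mapping = dict(label_mapping)

    @staticmethod
    def _validate_label_mapping(label_mapping):
        if not isinstance(label_mapping, dict) or not label_mapping:
            raise ValueError("label_mapping must be a non-empty dict")
        for name, index in label_mapping.items():
            if not isinstance(name, str) or not isinstance(index, int):
                raise ValueError("label_mapping must map str -> int")

    @property
    def reverse_label_mapping(self):
        # Index -> label name, for turning predictions back into text.
        return {index: name for name, index in self.label_mapping.items()}

    @classmethod
    def get_class(cls, name):
        return cls._registry.get(name)

    @abstractmethod
    def predict(self, data):
        """Return one label index per input item."""


class AlwaysUnknownModel(SketchBaseModel):
    # Trivial subclass used only to exercise the base class.
    def predict(self, data):
        return [self.label_mapping["UNKNOWN"] for _ in data]


model_cls = SketchBaseModel.get_class("AlwaysUnknownModel")
model = model_cls({"UNKNOWN": 0, "INTEGER": 1})
```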

Related Classes/Methods:

character_level_cnn_model

Implements a deep learning model (Character-Level CNN) for data labeling. It handles model construction, training, and prediction over character embeddings, and is suited to complex pattern recognition.
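Before characters can be embedded, each one has to become an integer index. The sketch below shows only that encoding step, not the network itself. The vocabulary (`string.printable`), the padding and out-of-vocabulary indices, and the `max_length` default are assumptions made for illustration.

```python
import string

PAD, OOV = 0, 1  # assumed reserved indices for padding and unknown characters

# Hypothetical vocabulary: every printable ASCII character gets an index >= 2.
VOCAB = {ch: i + 2 for i, ch in enumerate(string.printable)}


def encode(text, max_length=8):
    """Map a string to a fixed-length list of character indices."""
    ids = [VOCAB.get(ch, OOV) for ch in text[:max_length]]
    # Right-pad so every sample has the same length.
    return ids + [PAD] * (max_length - len(ids))


encoded = encode("abc")
```

A fixed length lets samples be stacked into one batch; an embedding layer would then map each index to a dense vector.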

Related Classes/Methods:

regex_model

Provides a rule-based data labeling mechanism using regular expressions. It validates its configuration parameters and applies the configured regex patterns to classify data.
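As a rough illustration of rule-based labeling, the sketch below validates a pattern dictionary at construction time and assigns each sample the first label whose pattern matches the whole string. The class name `SketchRegexModel`, the configuration shape, and the example patterns are assumptions, not the module's real configuration format.

```python
import re


class SketchRegexModel:
    """Illustrative regex labeler: validate patterns, then classify samples."""

    def __init__(self, patterns, default_label="UNKNOWN"):
        if not isinstance(patterns, dict) or not patterns:
            raise ValueError("patterns must be a non-empty dict")
        self.default_label = default_label
        self._compiled = {}
        for label, regexes in patterns.items():
            try:
                # Compiling up front surfaces bad patterns early.
                self._compiled[label] = [re.compile(r) for r in regexes]
            except re.error as exc:
                raise ValueError(
                    f"invalid pattern for label {label!r}: {exc}"
                ) from exc

    def predict(self, samples):
        labels = []
        for sample in samples:
            assigned = self.default_label
            for label, regexes in self._compiled.items():
                if any(r.fullmatch(sample) for r in regexes):
                    assigned = label
                    break  # first matching label wins
            labels.append(assigned)
        return labels


regex_model = SketchRegexModel({
    "SSN": [r"\d{3}-\d{2}-\d{4}"],
    "EMAIL": [r"[^@\s]+@[^@\s]+\.[^@\s]+"],
})
predictions = regex_model.predict(["123-45-6789", "a@b.com", "hello"])
```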

Related Classes/Methods:

column_name_model

Implements a data labeling model that leverages column names for classification. It performs comparisons and predictions based on column name patterns, useful for structured data.
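The source does not say how column names are compared, so the following is only one plausible approach: normalize the name, then pick the closest known name by string similarity using `difflib.SequenceMatcher` from the standard library. The `KNOWN_COLUMNS` table and the `0.8` threshold are assumptions for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical reference names and the labels they imply.
KNOWN_COLUMNS = {
    "ssn": "SSN",
    "email_address": "EMAIL",
    "phone_number": "PHONE",
}


def normalize(name):
    """Lowercase the name and unify separators."""
    return name.strip().lower().replace(" ", "_").replace("-", "_")


def label_column(name, threshold=0.8):
    """Return the label of the most similar known column, or UNKNOWN."""
    cleaned = normalize(name)
    best_label, best_score = "UNKNOWN", 0.0
    for known, label in KNOWN_COLUMNS.items():
        score = SequenceMatcher(None, cleaned, known).ratio()
        if score > best_score:
            best_label, best_score = label, score
    return best_label if best_score >= threshold else "UNKNOWN"


column_labels = [label_column(c) for c in ["SSN", "Email Address", "favorite_color"]]
```

Because only the header is inspected, this kind of model is cheap to run but only applies to structured data that actually has column names.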

Related Classes/Methods:

data_processing

Functions as a versatile preprocessor and postprocessor for the data labeling pipeline. It handles data transformations, format conversions (e.g., to NER format, structured/unstructured), and prediction result processing.
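As an example of the postprocessing side, per-character predictions can be collapsed into NER-style `(start, end, label)` spans. The function below is a sketch under the assumption that the model emits one label per character and that `"UNKNOWN"` marks background text; the name `char_labels_to_spans` is hypothetical.

```python
def char_labels_to_spans(labels, background="UNKNOWN"):
    """Collapse per-character labels into (start, end, label) spans.

    `end` is exclusive, so text[start:end] recovers the labeled substring.
    """
    spans = []
    start = None
    current = background
    for i, label in enumerate(labels):
        if label != current:
            # Close the span that just ended, unless it was background.
            if current != background:
                spans.append((start, i, current))
            start, current = i, label
    # Close a span that runs to the end of the input.
    if current != background:
        spans.append((start, len(labels), current))
    return spans


text = "call 555-1234 now"
char_labels = ["UNKNOWN"] * 5 + ["PHONE"] * 8 + ["UNKNOWN"] * 4
spans = char_labels_to_spans(char_labels)
```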

Related Classes/Methods:

data_labelers

Provides a higher-level facade for starting labeling runs, so callers do not have to instantiate and drive base_data_labeler directly.
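A facade of this kind is often just a factory that maps a short type name to a concrete labeler class. The sketch below assumes two labeler kinds, structured and unstructured, and uses hypothetical names throughout (`make_labeler`, `StructuredSketchLabeler`, `UnstructuredSketchLabeler`).

```python
class StructuredSketchLabeler:
    """Placeholder for a labeler configured for tabular data."""
    kind = "structured"


class UnstructuredSketchLabeler:
    """Placeholder for a labeler configured for free text."""
    kind = "unstructured"


# Hypothetical lookup table from type name to labeler class.
_LABELER_TYPES = {
    "structured": StructuredSketchLabeler,
    "unstructured": UnstructuredSketchLabeler,
}


def make_labeler(labeler_type="structured"):
    """Return a labeler instance so callers never name the class directly."""
    try:
        return _LABELER_TYPES[labeler_type]()
    except KeyError:
        valid = ", ".join(sorted(_LABELER_TYPES))
        raise ValueError(
            f"unknown labeler_type {labeler_type!r}; expected one of: {valid}"
        ) from None


default_labeler = make_labeler()
text_labeler = make_labeler("unstructured")
```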

Related Classes/Methods: