(Enhancement): Analytics - Updated regex for entity-detection logic of phone_number v2 #557

Open
ganeshhaptik wants to merge 1 commit into develop from AN-4218-update-regex-entity-change-v2

@ganeshhaptik ganeshhaptik commented Sep 24, 2024

JIRA Ticket Number AN-4218

JIRA TICKET: https://hello-haptik.atlassian.net/browse/AN-4218

Description of change

  • Load-tested the old regex against a simple and a complex alternative using the script below:
import re
import timeit


def older_regex():
    phone_number_format_regex = r'[-(),.+\s{}]+'
    test_numbers = [
        "(123) 456-7890",
        "+1 123-456-789",
        "123456789",
        "1234567890123"
    ]
    for number in test_numbers:
        re.match(phone_number_format_regex, number)


def simple_regex():
    phone_number_format_regex = r'[0-9\-\(\)\.\+\s]{9,12}'
    test_numbers = [
        "(123) 456-7890",
        "+1 123-456-789",
        "123456789",
        "1234567890123"
    ]
    for number in test_numbers:
        re.match(phone_number_format_regex, number)


def complex_regex():
    phone_number_format_regex = r'^(?:\+?\d{1,3})?[-.\s]?\(?\d{1,4}\)?[-.\s]?\d{1,4}[-.\s]?\d{1,4}$'
    test_numbers = [
        "(123) 456-7890",
        "+1 123-456-789",
        "123456789",
        "1234567890123"
    ]
    for number in test_numbers:
        re.match(phone_number_format_regex, number)

# Benchmark all three regex patterns (1,000,000 runs each)
old_time = timeit.timeit(older_regex, number=1000000)
simple_time = timeit.timeit(simple_regex, number=1000000)
complex_time = timeit.timeit(complex_regex, number=1000000)

print(f"Old regex time: {old_time}")
print(f"Simple regex time: {simple_time}")
print(f"Complex regex time: {complex_time}")

Screenshot 2024-09-24 at 12 01 43 PM
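
The benchmark above times `re.match` but discards its result, so it does not show what each pattern actually accepts. The following is a minimal sketch, not part of the PR, that reports the text each pattern consumes for the same four test numbers. The names `PATTERNS`, `TEST_NUMBERS`, `matched_text`, and `match_report` are hypothetical helpers introduced for illustration; the patterns themselves are copied from the script above.

```python
import re

# Patterns copied verbatim from the benchmark script.
PATTERNS = {
    "old": r'[-(),.+\s{}]+',
    "simple": r'[0-9\-\(\)\.\+\s]{9,12}',
    "complex": r'^(?:\+?\d{1,3})?[-.\s]?\(?\d{1,4}\)?[-.\s]?\d{1,4}[-.\s]?\d{1,4}$',
}

# Same inputs as the benchmark script.
TEST_NUMBERS = [
    "(123) 456-7890",
    "+1 123-456-789",
    "123456789",
    "1234567890123",
]


def matched_text(pattern, number):
    """Return the text re.match consumed from the start of number, or None."""
    match = re.match(pattern, number)
    return match.group(0) if match else None


def match_report():
    """Map each pattern name to the text it matched for every test number."""
    return {
        name: [matched_text(pattern, number) for number in TEST_NUMBERS]
        for name, pattern in PATTERNS.items()
    }


if __name__ == "__main__":
    for name, results in match_report().items():
        print(f"{name}: {results}")
```

Because `re.match` anchors only at the start of the string, the unanchored old and simple patterns can succeed on a prefix: the old pattern consumes just the leading `(` or `+`, and the simple pattern stops after 12 characters. Only the complex pattern, which ends in `$`, has to consume the whole input. The three timings therefore measure different amounts of work, which may be worth keeping in mind when comparing them.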

@haptik-deployment

👍 No lint errors found.

@sonarqubecloud

Please retry analysis of this Pull-Request directly on SonarCloud

@haptik-deployment

UNIT TESTS HAVE PASSED... Good To Merge

@codecov

codecov bot commented Sep 24, 2024

Codecov Report

❌ Patch coverage is 11.11111% with 8 lines in your changes missing coverage. Please review.
✅ Project coverage is 40.46%. Comparing base (e9734de) to head (6e90ab4).
⚠️ Report is 14 commits behind head on develop.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...ors/pattern/phone_number/phone_number_detection.py | 11.11% | 8 Missing ⚠️ |
Additional details and impacted files
@@             Coverage Diff             @@
##           develop     #557      +/-   ##
===========================================
- Coverage    40.48%   40.46%   -0.02%     
===========================================
  Files           81       81              
  Lines         9468     9475       +7     
===========================================
+ Hits          3833     3834       +1     
- Misses        5635     5641       +6     
Flag Coverage Δ
40.46% <11.11%> (-0.02%) ⬇️
