fix: prefer company over uni on overlap during affiliation (CM-1216)#4188

Open
skwowet wants to merge 7 commits into main from fix/cm-1216-email-domain-affiliations
@skwowet skwowet commented Jun 10, 2026

Collaborator

Summary

Affiliation logic now treats the email on an activity as the primary signal for determining which organization should be credited for that activity. When an activity does not contain an organization email, overlapping university and company affiliations are resolved in favor of the company.

This behavior now applies consistently during both ingest (data_sink_worker) and bulk refresh (refreshMemberOrganizationAffiliations).

Context

Member activities can be affiliated to an organization based on work history (memberOrganizations), manual segment rules, and other fallback logic. That generally worked well for simple timelines, but broke down in a couple of cases:

  • A member had overlapping university and company roles (e.g. student + internship).
  • An activity contained an explicit organization email (user@company.com, user@school.edu), but date-based affiliation logic resolved the activity to a different organization.

The email used on an activity is first-party evidence of which organization the activity was performed on behalf of. It should take precedence over enrichment-derived timelines and generic tie-breakers (stint length, org size, source rank), unless a manual affiliation override has been configured.

What changed

Ingest (findAffiliation)

Affiliation resolution now follows this order:

  1. Manual segment affiliations (unchanged).
  2. Activity email domain — if the activity username contains an organization email, affiliate to the memberOrganization whose organization has a matching verified primary-domain.
  3. Date-based matching — for activities without an organization email (plain handles, public inbox domains), affiliate using the organization active at that timestamp. When a company and university overlap, the company is preferred.
  4. Existing undated fallbacks (unchanged).

Only the email present on the activity username is considered. Member profile emails are not used as a substitute when the activity itself does not contain an email.
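
The resolution order above can be sketched as a single function. This is a minimal illustration and not the actual `findAffiliation` code: the `MemberOrg` and `AffiliationInput` shapes, the `isUniversity` flag, and the idea that the manual and undated results arrive precomputed are all assumptions made for the example. Only the four-step ordering comes from the description.

```typescript
// Hypothetical shapes for illustration; only the four-step ordering is from the PR description.
interface MemberOrg {
  organizationId: string
  isUniversity: boolean
  verifiedPrimaryDomains: string[] // assumed to be lowercased already
  dateStart: Date | null
  dateEnd: Date | null
}

interface AffiliationInput {
  manualSegmentOrgId: string | null // step 1 result, if a manual rule applies
  activityOrgEmailDomain: string | null // null for plain handles and public inbox domains
  timestamp: Date
  memberOrgs: MemberOrg[]
  undatedFallbackOrgId: string | null // step 4 result, computed elsewhere
}

// A stint with no dates at all is left to the undated fallback.
function isActiveAt(org: MemberOrg, ts: Date): boolean {
  if (!org.dateStart && !org.dateEnd) return false
  const t = ts.getTime()
  const start = org.dateStart ? org.dateStart.getTime() : -Infinity
  const end = org.dateEnd ? org.dateEnd.getTime() : Infinity
  return start <= t && t <= end
}

function resolveAffiliation(input: AffiliationInput): string | null {
  // 1. Manual segment affiliation wins outright.
  if (input.manualSegmentOrgId) return input.manualSegmentOrgId

  // 2. Activity email domain: match against verified primary domains.
  const domain = input.activityOrgEmailDomain
  if (domain) {
    const match = input.memberOrgs.find((o) => o.verifiedPrimaryDomains.includes(domain))
    if (match) return match.organizationId
  }

  // 3. Date-based matching, preferring a company over a university on overlap.
  const active = input.memberOrgs.filter((o) => isActiveAt(o, input.timestamp))
  const companies = active.filter((o) => !o.isUniversity)
  const winner = companies[0] ?? active[0]
  if (winner) return winner.organizationId

  // 4. Existing undated fallbacks.
  return input.undatedFallbackOrgId
}
```

Note that an email domain with no matching member organization falls through to step 3 rather than returning early, which matches the timeline-based pass described for bulk refresh.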

Bulk refresh (prepareMemberOrganizationAffiliationTimeline)

Bulk refresh now mirrors the same behavior used during ingest.

To avoid activities being processed twice, refresh runs in two passes:

| Pass | Activities | Logic |
| --- | --- | --- |
| Email-based | Username matches a member organization's verified domain | Affiliate directly to that organization |
| Timeline-based | No email, or no matching member organization | Use date-based affiliation logic, preferring companies over universities when overlaps exist |

Manual segment affiliations continue to take precedence through the existing segment-scoped timeline and skipManualAffiliationSegments guards.
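
One way to picture the two disjoint passes is as a partition of the member's activities. The sketch below is an assumption-laden illustration: `ActivityRow`, `RefreshPasses`, and `partitionForRefresh` are hypothetical names, and the real refresh most likely partitions in SQL rather than in memory. It only shows why no activity can be processed twice.

```typescript
// Hypothetical in-memory model of the two refresh passes.
interface ActivityRow {
  id: string
  username: string
}

interface RefreshPasses {
  emailPass: Map<string, string[]> // organizationId -> activity ids
  timelinePass: string[] // activity ids left for date-based logic
}

function partitionForRefresh(
  activities: ActivityRow[],
  domainToOrgId: Map<string, string>, // verified domain -> member organization id
): RefreshPasses {
  const emailPass = new Map<string, string[]>()
  const timelinePass: string[] = []

  for (const activity of activities) {
    const at = activity.username.lastIndexOf('@')
    const domain = at > 0 ? activity.username.slice(at + 1).toLowerCase() : ''
    const orgId = domainToOrgId.get(domain)

    if (orgId !== undefined) {
      // Email-based pass: the username domain matches a member organization.
      const ids = emailPass.get(orgId) ?? []
      ids.push(activity.id)
      emailPass.set(orgId, ids)
    } else {
      // Timeline-based pass: no email, or no matching member organization.
      timelinePass.push(activity.id)
    }
  }

  return { emailPass, timelinePass }
}
```

Because each activity takes exactly one branch of the `if`, the two passes cover every activity and never overlap.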

Expected behavior

| Activity signal | Result |
| --- | --- |
| Username contains an organization email (@company.com, @school.edu) | Affiliate to the organization with the matching verified domain |
| No organization email, company + university overlap | Company wins |
| Manual segment affiliation | Manual rule wins for that segment |

skwowet added 3 commits June 10, 2026 15:42
Signed-off-by: Yeganathan S <63534555+skwowet@users.noreply.github.com>
Signed-off-by: Yeganathan S <63534555+skwowet@users.noreply.github.com>
Signed-off-by: Yeganathan S <63534555+skwowet@users.noreply.github.com>
@skwowet skwowet self-assigned this Jun 10, 2026
Copilot AI review requested due to automatic review settings June 10, 2026 10:45
@skwowet skwowet changed the title fix: prefer company over uni on overlap during affiliation fix: prefer company over uni on overlap during affiliation (CM-1216) Jun 10, 2026
@skwowet skwowet requested a review from ulemons June 10, 2026 10:46

Copilot AI left a comment

Contributor

Pull request overview

This PR adjusts member activity affiliation to prioritize the email domain present on an activity’s username as the strongest routing signal, aiming to prevent “university vs company” overlap cases from incorrectly affiliating to universities when there’s no email-domain evidence.

Changes:

  • Added DAL support for fetching verified primary domains per organization and introduced a heuristic to prefer company affiliations over university affiliations during timeline overlaps.
  • Updated affiliation resolution to (a) route by activity email domain when present, and (b) otherwise fall back to timestamp-based member organization timelines with a company-over-university preference.
  • Updated ingest-time activity processing to derive the affiliation domain from the activity username (when it is a valid email) rather than from the member’s verified email identities.
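
The last bullet, deriving the domain from the activity username only when it is a valid email, could look roughly like the following. The function name and the deliberately simple regular expression are assumptions for illustration; the PR's actual validation may be stricter, and filtering of public inbox domains is left out here.

```typescript
// Hypothetical helper: returns the lowercased domain if the username looks like
// an email address, otherwise null (plain handles, malformed values).
const SIMPLE_EMAIL_PATTERN = /^[^\s@]+@([^\s@]+\.[^\s@]+)$/

function domainFromActivityUsername(username: string): string | null {
  const match = SIMPLE_EMAIL_PATTERN.exec(username.trim())
  const domain = match?.[1]
  return domain ? domain.toLowerCase() : null
}
```

For example, `domainFromActivityUsername('User@Company.COM')` yields `'company.com'`, while a plain handle such as `'octocat'` yields `null`, so the member's profile emails are never consulted as a substitute.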

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| services/libs/data-access-layer/src/organizations/identities.ts | Adds org-domain fetch + overlap preference helper used by affiliation logic. |
| services/libs/data-access-layer/src/members/segments.ts | Changes work-experience query to enforce verified-domain matching when an activity domain is provided. |
| services/libs/data-access-layer/src/member-organization-affiliation/types.ts | Extends timeline items with domain-based routing fields. |
| services/libs/data-access-layer/src/member-organization-affiliation/index.ts | Adds domain-partitioned refresh passes to keep email-domain activities disjoint from timeline-based passes. |
| services/libs/common_services/src/services/common.member.service.ts | Updates per-activity affiliation decision flow to prioritize activity email domain and apply company-over-university preference. |
| services/apps/data_sink_worker/src/service/activity.service.ts | Switches ingest-time domain extraction to use the activity username (when it’s a valid email). |

Comment thread services/libs/data-access-layer/src/organizations/identities.ts
Comment thread services/libs/common_services/src/services/common.member.service.ts Outdated
Comment thread services/libs/common_services/src/services/common.member.service.ts
skwowet added 2 commits June 10, 2026 17:09
Signed-off-by: Yeganathan S <63534555+skwowet@users.noreply.github.com>
Signed-off-by: Yeganathan S <63534555+skwowet@users.noreply.github.com>
Copilot AI review requested due to automatic review settings June 10, 2026 11:43

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

Comment on lines +167 to +176
SELECT
  oi."organizationId" AS "orgId",
  lower(o."displayName") AS "displayName",
  array_agg(DISTINCT lower(oi.value)) AS domains
FROM "organizationIdentities" oi
JOIN organizations o ON o.id = oi."organizationId"
WHERE oi."organizationId" IN ($(organizationIds:csv))
  AND oi.type = 'primary-domain'
  AND oi.verified = true
GROUP BY oi."organizationId", o."displayName"
Comment on lines +182 to +186
/**
 * Prioritizes company affiliations over universities when a member has overlapping timelines.
 * Without an activity email to prove a school stint, universities can falsely outrank real employers.
 * Returns the original candidates untouched if no overlap exists.
 */
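
The contract in this doc comment can be illustrated with a small sketch. Everything here is hypothetical (the `Candidate` shape, the function names, and the choice to move overlapping universities to the end rather than drop them); the PR's helper may behave differently. The part taken from the comment is that companies are prioritized on overlap and the input is returned untouched otherwise.

```typescript
// Hypothetical candidate shape; null dates are treated as open-ended.
interface Candidate {
  organizationId: string
  isUniversity: boolean
  dateStart: Date | null
  dateEnd: Date | null
}

function stintsOverlap(a: Candidate, b: Candidate): boolean {
  const start = (d: Date | null) => (d ? d.getTime() : -Infinity)
  const end = (d: Date | null) => (d ? d.getTime() : Infinity)
  return start(a.dateStart) <= end(b.dateEnd) && start(b.dateStart) <= end(a.dateEnd)
}

function preferCompaniesOnOverlap(candidates: Candidate[]): Candidate[] {
  const companies = candidates.filter((c) => !c.isUniversity)
  const overlapsCompany = (c: Candidate) =>
    c.isUniversity && companies.some((company) => stintsOverlap(company, c))

  // No university overlaps a company: return the same array, untouched.
  if (!candidates.some(overlapsCompany)) return candidates

  // Assumption: demote overlapping universities instead of removing them.
  return [
    ...candidates.filter((c) => !overlapsCompany(c)),
    ...candidates.filter(overlapsCompany),
  ]
}
```

Returning the original array reference in the no-overlap case keeps the helper cheap to call unconditionally before the generic tie-breakers run.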
Comment on lines +206 to +209
const isUniversity = org
  ? (org.displayName?.includes('university') ?? false) ||
    org.domains.some((d) => d.endsWith('.edu'))
  : false

Contributor

Maybe we can extend this to other domains. I'm thinking of my university, for example (unibo.it), or "mit" or something. Should we have a separate list for this that we can enrich later?
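
A rough sketch of what such a separate, enrichable list might look like, purely as a discussion aid: the constant names, the `.ac.uk` suffix entry, and the split between substring keywords and exact short names are assumptions, not part of the PR. Exact matching for short names like "mit" is used because a substring check would also match unrelated names such as "summit".

```typescript
// Hypothetical, extensible lists; entries beyond '.edu' and 'university' are examples only.
const ACADEMIC_DOMAIN_SUFFIXES: string[] = ['.edu', '.ac.uk']
const ACADEMIC_DOMAINS = new Set<string>(['unibo.it'])
const ACADEMIC_NAME_KEYWORDS: string[] = ['university']
const ACADEMIC_EXACT_NAMES = new Set<string>(['mit'])

interface OrgDomainInfo {
  displayName: string | null
  domains: string[]
}

function isUniversityOrg(org: OrgDomainInfo | undefined): boolean {
  if (!org) return false
  const name = (org.displayName ?? '').trim().toLowerCase()

  // Name signals: substring match for long keywords, exact match for short names.
  const nameMatches =
    ACADEMIC_NAME_KEYWORDS.some((keyword) => name.includes(keyword)) ||
    ACADEMIC_EXACT_NAMES.has(name)

  // Domain signals: known suffixes plus an explicit allow-list of full domains.
  const domainMatches = org.domains.some((raw) => {
    const domain = raw.toLowerCase()
    return (
      ACADEMIC_DOMAINS.has(domain) ||
      ACADEMIC_DOMAIN_SUFFIXES.some((suffix) => domain.endsWith(suffix))
    )
  })

  return nameMatches || domainMatches
}
```

Keeping the lists as plain constants means they could later be moved to configuration or a database table without changing the call site.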
