Commit 9787652

Adds confidence to suggested pattern and resource
Why are these changes being introduced:

* Storing values in the models rather than in the code better aligns with confidence from detectors

Relevant ticket(s):

* https://mitlibraries.atlassian.net/browse/TCO-91

How does this address that need:

* Adds confidence fields for SuggestedPattern and SuggestedResource

Document any side effects to this change:

* It feels clunky to assign confidence to each Resource or Pattern. From a data model perspective it feels okay. From a practical standpoint I'm nervous the people managing Resources and Patterns may not know what value is appropriate.
1 parent 4d5ed34 commit 9787652

16 files changed

Lines changed: 132 additions & 9 deletions

app/dashboards/suggested_pattern_dashboard.rb

Lines changed: 3 additions & 0 deletions
@@ -12,6 +12,7 @@ class SuggestedPatternDashboard < Administrate::BaseDashboard
   ATTRIBUTE_TYPES = {
     id: Field::Number,
     category: Field::BelongsTo,
+    confidence: Field::Number.with_options(decimals: 2),
     pattern: Field::String,
     shortcode: Field::String,
     title: Field::String,
@@ -37,6 +38,7 @@ class SuggestedPatternDashboard < Administrate::BaseDashboard
   SHOW_PAGE_ATTRIBUTES = %i[
     id
     category
+    confidence
     pattern
     shortcode
     title
@@ -50,6 +52,7 @@ class SuggestedPatternDashboard < Administrate::BaseDashboard
   # on the model's form (`new` and `edit`) pages.
   FORM_ATTRIBUTES = %i[
     category
+    confidence
     pattern
     shortcode
     title

app/dashboards/suggested_resource_dashboard.rb

Lines changed: 4 additions & 1 deletion
@@ -12,6 +12,7 @@ class SuggestedResourceDashboard < Administrate::BaseDashboard
   ATTRIBUTE_TYPES = {
     id: Field::Number,
     category: Field::BelongsTo,
+    confidence: Field::Number.with_options(decimals: 2),
     fingerprints: Field::HasMany,
     terms: Field::HasMany,
     title: Field::String,
@@ -30,14 +31,15 @@ class SuggestedResourceDashboard < Administrate::BaseDashboard
     title
     url
     terms
+    category
   ].freeze
 
   # SHOW_PAGE_ATTRIBUTES
   # an array of attributes that will be displayed on the model's show page.
   SHOW_PAGE_ATTRIBUTES = %i[
     id
     category
-    fingerprints
+    confidence
     terms
     title
     url
@@ -50,6 +52,7 @@ class SuggestedResourceDashboard < Administrate::BaseDashboard
   # on the model's form (`new` and `edit`) pages.
   FORM_ATTRIBUTES = %i[
     category
+    confidence
     title
     url
   ].freeze

app/models/detector/suggested_resource.rb

Lines changed: 9 additions & 2 deletions
@@ -27,9 +27,10 @@ def self.full_term_match(phrase)
     # @note Multiple detections are irrelevant for this method. If _any_ match is found, a Detection record is created.
     #       The uniqueness contraint on Detection records would make multiple detections irrelevant.
     #
-    # @return Category
+    # @return Hash with keys :category and :confidence (or nil)
     def self.record(term)
       result = full_term_match(term.phrase)
+
       return unless result.any?
 
       Detection.find_or_create_by(
@@ -40,7 +41,13 @@ def self.record(term)
 
       return if result.empty?
 
-      result.first.category
+      # If a category hasn't been set, nil is better than a confidence with no category
+      return if result.first.category.blank?
+
+      {
+        category: result.first.category,
+        confidence: result.first.confidence
+      }
     end
   end
 end
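
The new return contract of `record` can be illustrated outside Rails. This is a minimal sketch, not the application's code: `DemoCategory`, `DemoResource`, and `detection_summary` are hypothetical stand-ins for the ActiveRecord models and the tail of `record`, the sample values are illustrative, and `nil?` is used where the real method relies on ActiveSupport's `blank?`.

```ruby
# Hypothetical stand-ins for the ActiveRecord models; only the fields used here.
DemoCategory = Struct.new(:id, :name)
DemoResource = Struct.new(:title, :category, :confidence)

# Mirrors the shape of the new return value: nil when nothing matched or the
# first match has no category, otherwise a hash with :category and :confidence.
def detection_summary(matches)
  first = matches.first
  return if first.nil? || first.category.nil?

  { category: first.category, confidence: first.confidence }
end

# Illustrative data, not taken from the application's fixtures.
sample_category = DemoCategory.new(3, 'Sample')
with_category = DemoResource.new('Resource A', sample_category, 0.9)
without_category = DemoResource.new('Resource B', nil, 0.9)

categorized = detection_summary([with_category])
uncategorized = detection_summary([without_category])
no_match = detection_summary([])
```

Callers that previously received a bare category now need to read `[:category]` and `[:confidence]` from the hash, and must still handle `nil`.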

app/models/detector/suggested_resource_pattern.rb

Lines changed: 10 additions & 3 deletions
@@ -28,7 +28,8 @@ def check_patterns(phrase)
           shortcode: sp.shortcode,
           title: sp.title,
           url: sp.url,
-          category: sp.category
+          category: sp.category,
+          confidence: sp.confidence
         }
         @detections = sps
       end
@@ -40,7 +41,7 @@ def check_patterns(phrase)
     # @note There are multiple patterns within SuggestedPattern records. Each check is capable of generating
     #       a separate Detection record.
     #
-    # @return Category
+    # @return Hash with keys :category and :confidence (or nil)
     def self.record(term)
       sp = Detector::SuggestedResourcePattern.new(term.phrase)
 

@@ -53,7 +54,13 @@ def self.record(term)
       end
       return if sp.detections.empty?
 
-      sp.detections.first[:category]
+      # If a category hasn't been set, nil is better than a confidence with no category
+      return if sp.detections.first[:category].blank?
+
+      {
+        category: sp.detections.first[:category],
+        confidence: sp.detections.first[:confidence]
+      }
     end
   end
 end

app/models/suggested_pattern.rb

Lines changed: 1 addition & 0 deletions
@@ -12,6 +12,7 @@
 #  created_at  :datetime         not null
 #  updated_at  :datetime         not null
 #  category_id :integer
+#  confidence  :float            default(0.9)
 #
 class SuggestedPattern < ApplicationRecord
   validates :title, presence: true

app/models/suggested_resource.rb

Lines changed: 1 addition & 0 deletions
@@ -10,6 +10,7 @@
 #  created_at  :datetime         not null
 #  updated_at  :datetime         not null
 #  category_id :integer
+#  confidence  :float            default(0.9)
 #
 class SuggestedResource < ApplicationRecord
   has_many :terms, dependent: :nullify

app/models/term.rb

Lines changed: 14 additions & 2 deletions
@@ -129,9 +129,21 @@ def retrieve_detection_scores
     # The detections.scores method returns data like [{3=>0.91}, {1=>0.1}] and [{3=>0.95}]
     raw = detections.current.flat_map(&:scores)
     # raw looks like [{3=>0.91}, {1=>0.1}, {3=>0.95}]
-    raw << { @suggested_pattern_category.id => 0.9 } if @suggested_pattern_category.present?
-    raw << { @suggested_resource_category.id => 0.9 } if @suggested_resource_category.present?
+
+    raw << suggested_pattern_score if @suggested_pattern_category.present?
+
+    raw << suggested_resource_score if @suggested_resource_category.present?
 
     raw.group_by { |h| h.keys.first }.map { |k, v| { k => v.map { |h| h.values.first } } }
   end
+
+  # Retrieve category and score for suggested patterns
+  def suggested_pattern_score
+    { @suggested_pattern_category[:category].id => @suggested_pattern_category[:confidence] }
+  end
+
+  # Retrieve category and score for suggested resources
+  def suggested_resource_score
+    { @suggested_resource_category[:category].id => @suggested_resource_category[:confidence] }
+  end
 end
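
The grouping step at the end of `retrieve_detection_scores` may be easier to follow with concrete data. The sketch below runs the same `group_by`/`map` expression in plain Ruby on the sample scores from the code comments, plus one entry standing in for a suggested pattern. The `suggested_pattern` hash and its values are assumptions made for illustration only.

```ruby
# Scores as returned by detections (sample data from the comments in term.rb).
raw = [{ 3 => 0.91 }, { 1 => 0.1 }, { 3 => 0.95 }]

# Hypothetical suggested-pattern result: category id 3 with confidence 0.9.
suggested_pattern = { category_id: 3, confidence: 0.9 }
raw << { suggested_pattern[:category_id] => suggested_pattern[:confidence] }

# Same grouping expression as in retrieve_detection_scores: collect every
# score under its category id, keeping first-seen order of the ids.
grouped = raw.group_by { |h| h.keys.first }
             .map { |k, v| { k => v.map { |h| h.values.first } } }

puts grouped.inspect
```

Each category id ends up with the list of every score recorded for it, so a suggested pattern or resource contributes one more value to its category's list rather than replacing the detector scores.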
Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+class AddConfidenceToSuggestedPatternsAndResources < ActiveRecord::Migration[7.2]
+  def change
+    add_column :suggested_patterns, :confidence, :float, default: 0.9
+    add_column :suggested_resources, :confidence, :float, default: 0.9
+  end
+end
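
The column default of 0.9 equals the value that was previously hard-coded in `term.rb`, so records left at the default should score as they did before this commit. A minimal plain-Ruby sketch of that equivalence follows. `ScoreCategory`, `old_score`, and `new_score` are hypothetical names that restate the before and after expressions from the diff, and the 0.5 value is illustrative.

```ruby
# Matches the column default set in the migration.
DEFAULT_CONFIDENCE = 0.9

# Hypothetical stand-in for the Category model; only `id` is needed here.
ScoreCategory = Struct.new(:id)

# Score entry as built before this commit: confidence hard-coded in the code.
def old_score(category)
  { category.id => 0.9 }
end

# Score entry as built after this commit: confidence read from the detection hash.
def new_score(detection)
  { detection[:category].id => detection[:confidence] }
end

category = ScoreCategory.new(3)
at_default = { category: category, confidence: DEFAULT_CONFIDENCE }
tuned = { category: category, confidence: 0.5 } # illustrative edited value

unchanged = old_score(category) == new_score(at_default)
differs = old_score(category) != new_score(tuned)
```

Scores only diverge from the old behaviour once someone edits a record's confidence in the dashboard.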

db/schema.rb

Lines changed: 3 additions & 1 deletion
Some generated files are not rendered by default.

test/fixtures/fingerprints.yml

Lines changed: 3 additions & 0 deletions
@@ -57,3 +57,6 @@ nobel_laureate:
 
 astm:
   value: 'astm 1'
+
+suggested_resource_with_category:
+  value: 'are categories cool'
