Commit ed5c7ab

replace faraday by http.rb
add demo for thread pool
add benchmark class
add more comment partially generated by CodeGemma running locally
1 parent c470b80 commit ed5c7ab

File tree

17 files changed

+367
-230
lines changed

Gemfile

Lines changed: 7 additions & 10 deletions
@@ -3,12 +3,10 @@ source 'https://rubygems.org'
 gemspec

 group :development, :production do
-  gem 'faraday', '~> 2.13'
+  gem 'http', '~> 0.13.3'
 end

 group :test, :development do
-  # HTTP client implementation
-  gem 'faraday-httpclient'
   # code coloring for yard
   gem 'redcarpet'
   # documentation generation
@@ -20,12 +18,11 @@ group :test, :development do
   # code coverage to monitor rspec tests
   gem 'simplecov'

-  # for client_async_spec.rb
-  gem 'async-http-faraday'
+  # Thread pool example file: thread_pool_spec.rb
+  # https://github.com/mperham/connection_pool/tree/main
+  gem 'connection_pool'
+
+  # Benchmark is no longer included in Ruby 3.5
+  gem 'benchmark'

-  # platforms :mri do
-  #   gem "byebug"
-  #   gem "pry"
-  #   gem "pry-byebug"
-  # end
 end
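
The Gemfile now pulls in `connection_pool` for the thread pool demo and `benchmark` for timing. The idea can be sketched with Ruby's standard library alone. This is a rough illustration under stated assumptions: `FakeClient` is a hypothetical stand-in for `SerpApi::Client` (no network access), and `TinyPool` is a `Queue`-backed toy that only mimics what a connection pool does; it is not the `connection_pool` gem's API.

```ruby
require 'benchmark'

# Hypothetical stand-in for SerpApi::Client: no network access, just a short sleep.
class FakeClient
  def search(q:)
    sleep 0.01
    { search_parameters: { q: q }, search_metadata: { status: 'Success' } }
  end
end

# Tiny Queue-backed pool. It only illustrates what a connection pool provides;
# it is not the connection_pool gem's API.
class TinyPool
  def initialize(size)
    @clients = Queue.new
    size.times { @clients << yield }
  end

  # Check a client out, yield it, and always return it to the pool.
  def with
    client = @clients.pop
    yield client
  ensure
    @clients << client if client
  end
end

pool = TinyPool.new(2) { FakeClient.new }
queries = %w[meta amazon apple netflix google]
results = nil

# Time five searches sharing two pooled clients across five threads.
elapsed = Benchmark.realtime do
  threads = queries.map do |q|
    Thread.new { pool.with { |client| client.search(q: q) } }
  end
  results = threads.map(&:value)
end

puts "#{results.size} searches in #{elapsed.round(3)}s"
```

Threads block in `with` until a client is free, so at most two searches run at once regardless of how many threads are started.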

LICENSE

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 MIT License

-Copyright (c) 2018-2023 SerpApi
+Copyright (c) 2018-2025 SerpApi

 Permission is hereby granted, free of charge, to any person obtaining a copy
 of this software and associated documentation files (the "Software"), to deal

README.md

Lines changed: 37 additions & 45 deletions
@@ -160,47 +160,6 @@ pp client.account

 It prints your account information.

-### Change HTTP client implementation
-
-The internal HTTP client is based on the [Faraday](https://lostisland.github.io/faraday/) library.
-You can change the HTTP client by passing the adapter as an argument to the `SerpApi::Client.new` method.
-
-It requires to install the adapter gem.
-```sh
-gem install faraday-httpclient
-```
-
-```ruby
-# see faraday documentation for adapter description.
-client = SerpApi::Client.new(engine: 'google', api_key: ENV['API_KEY'], timeout: 10, adapter: :httpclient)
-data = client.search(q: 'Coffee', location: 'Austin, TX')
-pp data
-```
-
-or a more [advanced parallel requests example](https://lostisland.github.io/faraday/#/advanced/parallel-requests)
-
-```ruby
-# install and load
-require 'async/http/faraday'
-
-# initialize the client
-client = SerpApi::Client.new(adapter: :async_http, api_key: ENV['API_KEY'], timeout: 10)
-
-# run same query in parallel
-Async do
-  Async do
-    data = client.search(engine: 'google', q: 'Coffee', location: 'Austin, TX')
-    expect(data.keys.size).to be > 5
-    expect(data.dig(:search_metadata,:id)).not_to be_nil
-  end
-  Async do
-    data = client.search(engine: 'youtube', search_query: 'Coffee', location: 'Austin, TX')
-    expect(data.keys.size).to be > 5
-    expect(data.dig(:search_metadata,:id)).not_to be_nil
-  end
-end
-```
-
 ## Basic example per search engine

 Here is how to call the APIs.
@@ -841,15 +800,47 @@ Search API enables `async` search.
 Here is an example of asynchronous searches using Ruby
 ```ruby
 require 'serpapi'
+# The code snippet aims to improve the efficiency of searching using the SerpApi client async function. It
+# targets companies in the MAANG (Meta, Amazon, Apple, Netflix, Google) group.
+#
+# **Process:**
+# 1. **Request Queue:** The company list is iterated over, and each company is queried using the SerpApi client. Requests
+# are stored in a queue to avoid blocking the main thread.
+#
+# 2. **Client Retrieval:** After each request, the code checks the status of the search result. If it's cached or
+# successful, the company name is printed, and the request is skipped. Otherwise, the result is added to the queue for
+# further processing.
+#
+# 3. **Queue Processing:** The queue is processed until it's empty. In each iteration, the last result is retrieved and
+# its client ID is extracted.
+#
+# 4. **Archived Client Retrieval:** Using the client ID, the code retrieves the archived client and checks its status. If
+# it's cached or successful, the company name is printed, and the client is skipped. Otherwise, the result is added back
+# to the queue for further processing.
+#
+# 5. **Completion:** The queue is closed, and a message is printed indicating that the process is complete.
+#
+# * **Asynchronous Requests:** The `async: true` option ensures that search requests are processed in parallel, improving
+# efficiency.
+# * **Queue Management:** The queue allows requests to be processed asynchronously without blocking the main thread.
+# * **Status Checking:** The code checks the status of each search result before processing it, avoiding unnecessary work.
+# * **Queue Processing:** The queue ensures that all requests are processed in the order they were submitted.
+
+# **Overall, the code snippet demonstrates a well-structured approach to improve the efficiency of searching for company
+# information using SerpApi.**
+
+# load serpapi library
+require 'serpapi'
+
 # target MAANG companies
 company_list = %w(meta amazon apple netflix google)
-client = SerpApi::Client.new({engine: 'google', async: true, api_key: ENV['API_KEY']})
+client = SerpApi::Client.new(engine: 'google', async: true, persistent: true, api_key: ENV['API_KEY'])
 search_queue = Queue.new
 company_list.each do |company|
   # store request into a search_queue - no-blocker
   result = client.search({q: company})
   if result[:search_metadata][:status] =~ /Cached|Success/
-    puts "#{company}: client done"
+    puts "#{company}: search results found in cache for: #{company}"
     next
   end

@@ -864,16 +855,17 @@ while !search_queue.empty?
   search_id = result[:search_metadata][:id]

   # retrieve client from the archive - blocker
-  search_archived = client.2(search_id)
+  search_archived = client.search_archive(search_id)
   if search_archived[:search_metadata][:status] =~ /Cached|Success/
-    puts "#{search_archived[:search_parameters][:q]}: client done"
+    puts "#{search_archived[:search_parameters][:q]}: search results found in archive"
     next
   end

   # add results to the client queue
   search_queue.push(result)
 end

+# destroy the queue
 search_queue.close
 puts 'done'```

README.md.erb

Lines changed: 0 additions & 41 deletions
@@ -179,47 +179,6 @@ pp client.account

 It prints your account information.

-### Change HTTP client implementation
-
-The internal HTTP client is based on the [Faraday](https://lostisland.github.io/faraday/) library.
-You can change the HTTP client by passing the adapter as an argument to the `SerpApi::Client.new` method.
-
-It requires to install the adapter gem.
-```sh
-gem install faraday-httpclient
-```
-
-```ruby
-# see faraday documentation for adapter description.
-client = SerpApi::Client.new(engine: 'google', api_key: ENV['API_KEY'], timeout: 10, adapter: :httpclient)
-data = client.search(q: 'Coffee', location: 'Austin, TX')
-pp data
-```
-
-or a more [advanced parallel requests example](https://lostisland.github.io/faraday/#/advanced/parallel-requests)
-
-```ruby
-# install and load
-require 'async/http/faraday'
-
-# initialize the client
-client = SerpApi::Client.new(adapter: :async_http, api_key: ENV['API_KEY'], timeout: 10)
-
-# run same query in parallel
-Async do
-  Async do
-    data = client.search(engine: 'google', q: 'Coffee', location: 'Austin, TX')
-    expect(data.keys.size).to be > 5
-    expect(data.dig(:search_metadata,:id)).not_to be_nil
-  end
-  Async do
-    data = client.search(engine: 'youtube', search_query: 'Coffee', location: 'Austin, TX')
-    expect(data.keys.size).to be > 5
-    expect(data.dig(:search_metadata,:id)).not_to be_nil
-  end
-end
-```
-
 ## Basic example per search engine

 Here is how to call the APIs.
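
The asynchronous example in README.md submits every search with `async: true`, then polls the search archive until each search reports `Cached` or `Success`. That submit-then-poll flow can be sketched without an API key. The sketch assumes a hypothetical `FakeAsyncClient` that returns the same hash shape the README reads (`:search_metadata`, `:id`, `:status`, `:search_parameters`) and reports success on the second poll; the real client's timing will differ.

```ruby
# Hypothetical stand-in for SerpApi::Client with async: true. It returns the same
# hash shape the README example reads; no network call is made.
class FakeAsyncClient
  def initialize
    @polls = Hash.new(0)
    @queries = {}
  end

  # Submit a search: returns immediately with a "Processing" status.
  def search(q:)
    id = "id-#{q}"
    @queries[id] = q
    { search_metadata: { id: id, status: 'Processing' }, search_parameters: { q: q } }
  end

  # Poll the archive: reports "Success" from the second poll onwards.
  def search_archive(id)
    @polls[id] += 1
    status = @polls[id] >= 2 ? 'Success' : 'Processing'
    { search_metadata: { id: id, status: status }, search_parameters: { q: @queries[id] } }
  end
end

client = FakeAsyncClient.new
pending = Queue.new
done = []

# 1. submit every search without waiting for results
%w[meta amazon apple netflix google].each do |company|
  result = client.search(q: company)
  if result[:search_metadata][:status] =~ /Cached|Success/
    done << company
  else
    pending.push(result)
  end
end

# 2. poll the archive until every pending search has completed
until pending.empty?
  result = pending.pop
  archived = client.search_archive(result[:search_metadata][:id])
  if archived[:search_metadata][:status] =~ /Cached|Success/
    done << archived[:search_parameters][:q]
  else
    pending.push(result) # not ready yet, try again later
  end
end

pending.close
puts "completed: #{done.join(', ')}"
```

Reading the query back from the archived result, instead of relying on a loop variable from the submit phase, keeps the polling loop independent of the code that filled the queue.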

Rakefile

Lines changed: 2 additions & 0 deletions
@@ -63,6 +63,8 @@ end
 desc 'run demo example'
 task :demo do
   sh 'ruby oobt/demo.rb'
+  sh 'ruby oobt/demo_async.rb'
+  sh 'ruby oobt/demo_thread_pool.rb'
 end

 desc 'release the gem to the public rubygems.org'

lib/serpapi.rb

Lines changed: 2 additions & 2 deletions
@@ -4,8 +4,8 @@ module SerpApi
   # see serpapi for implementation
 end

-# load faraday HTTP lib
-require 'faraday'
+# load HTTP
+require 'http'

 # load
 require 'json'

lib/serpapi/client.rb

Lines changed: 58 additions & 30 deletions
@@ -4,15 +4,18 @@ module SerpApi
   # Client for SerpApi.com
   #
   class Client
-    include Errors

     # Backend service URL
     BACKEND = 'serpapi.com'.freeze

     # HTTP timeout requests
     attr_reader :timeout,
                 # Query parameters
-                :params
+                :params,
+                # HTTP persistent
+                :persistent,
+                # HTTP.rb client
+                :socket

     # Constructor
     #
@@ -30,21 +33,32 @@ class Client
     # key can be either a symbol or a string.
     #
     # @param [Hash] params default for the search
-    def initialize(params = {}, _adapter = :net_http)
-      # set default read timeout
-      @timeout = params[:timeout] || params['timeout'] || 120
+    def initialize(params = {})
+      # store client HTTP request timeout
+      @timeout = params[:timeout] || 120
       @timeout.freeze

+      # enable HTTP persistent mode (defaults to true, an explicit false is honored)
+      @persistent = params.fetch(:persistent, true)
+      @persistent.freeze
+
       # delete this client only configuration keys
-      params.delete('timeout') if params.key? 'timeout'
-      params.delete(:timeout) if params.key? :timeout
+      %i(timeout persistent).each do |option|
+        params.delete(option) if params.key?(option)
+      end

-      # set default params safely in memory
+      # set default serpapi related parameters
       @params = params.clone || {}
+
+      # track ruby library as a client for statistic purpose
+      @params[:source] = 'serpapi-ruby:' << SerpApi::VERSION
+
       @params.freeze

-      # setup connection socket
-      @socket = Faraday.new(url: "https://#{BACKEND}")
+      # create connection socket
+      if persistent?
+        @socket = HTTP.persistent("https://#{BACKEND}")
+      end
     end

     # perform a search using SerpApi.com
@@ -83,7 +97,7 @@ def location(params = {})
     # @param [Symbol] format :json or :html (default: json, optional)
     # @return [String|Hash] raw html or JSON / Hash
     def search_archive(search_id, format = :json)
-      raise SerpApiException, 'format must be json or html' unless [:json, :html].include?(format)
+      raise SerpApiError, 'format must be json or html' unless [:json, :html].include?(format)

       get("/searches/#{search_id}.#{format}", format)
     end
@@ -105,46 +119,60 @@ def api_key
       @params[:api_key]
     end

+    # close open connection
+    def close
+      @socket.close if @socket
+    end
+
     private

     # @return [Hash] query parameter
     def query(params)
       # merge default params with custom params
       q = @params.merge(params)

-      # set ruby client
-      q[:source] = 'serpapi-ruby:' << SerpApi::VERSION
-
       # delete empty key/value
       q.compact
     end

+    # @return [Boolean] HTTP session persistent enabled
+    def persistent?
+      persistent
+    end
+
     # get HTTP query formatted results
     #
     # @param [String] endpoint HTTP service uri
     # @param [Symbol] decoder type :json or :html
     # @param [Hash] params custom search inputs
     # @param [Boolean] symbolize_names if true, convert JSON keys to symbols
-    # @return decoded payload as JSON / Hash or String
+    # @return decoded response as JSON / Hash or String
     def get(endpoint, decoder = :json, params = {}, symbolize_names = true)
-      payload = @socket.get(endpoint) do |req|
-        req.params = query(params)
-        req.options.timeout = timeout
+      # execute get via open socket
+      if persistent?
+        response = @socket.get(endpoint, params: query(params))
+      else
+        response = HTTP.timeout(timeout).get("https://#{BACKEND}#{endpoint}", params: query(params))
       end
-      # read http response
-      data = payload.body
-      # decode payload using JSON native parser
-      if decoder == :json
-        data = JSON.parse(data, symbolize_names: symbolize_names)
-        if data.instance_of?(Hash) && data.key?('error')
-          raise SerpApiException, "get failed with error: #{data['error']} from url: #{endpoint}, params: #{params}, decoder: #{decoder}, http status: #{payload.status} "
+
+      # decode response using JSON native parser
+      case decoder
+      when :json
+        # read http response
+        data = JSON.parse(response.body, symbolize_names: symbolize_names)
+        if data.instance_of?(Hash) && data.key?(:error)
+          raise SerpApiError, "HTTP request failed with error: #{data[:error]} from url: https://#{BACKEND}#{endpoint}, params: #{params}, decoder: #{decoder}, response status: #{response.status}"
+        elsif response.status != 200
+          raise SerpApiError, "HTTP request failed with response status: #{response.status} response: #{data} on get url: https://#{BACKEND}#{endpoint}, params: #{params}, decoder: #{decoder}"
         end
-        raise SerpApiException, "get failed with response status: #{payload.status} reponse: #{data} on get url: #{endpoint}, params: #{params}, decoder: #{decoder}" if payload.status != 200
+
+        # discard response body
+        response.flush if persistent?
+
+        return data
+      else
+        return response.body
       end
-      # return raw HTML
-      data
-    rescue Faraday::Error => e
-      raise SerpApiException, "fail: get url: #{endpoint} caused by #{e.class} : #{e.message} (params: #{params}, decoder: #{decoder})"
     end
   end
 end
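
The constructor separates client-only options (`timeout`, `persistent`) from the default search parameters. One detail worth illustrating is the boolean default: an expression of the form `value || true` can never evaluate to `false`, whereas `Hash#fetch` keeps an explicit `false`. The sketch below uses a hypothetical helper, `split_options`, which is not part of the gem; it also avoids mutating the caller's hash, a design choice of this sketch rather than of the client.

```ruby
# Hypothetical helper (not part of the gem): separate client-only options from
# the search parameters without mutating the caller's hash.
CLIENT_OPTIONS = %i[timeout persistent].freeze

def split_options(params)
  options = {
    timeout: params.fetch(:timeout, 120),
    # fetch keeps an explicit `false`; `params[:persistent] || true` would not
    persistent: params.fetch(:persistent, true)
  }
  search_params = params.reject { |key, _| CLIENT_OPTIONS.include?(key) }.freeze
  [options, search_params]
end

options, search_params = split_options(engine: 'google', persistent: false, timeout: 10)
p options
p search_params
```

With no options supplied, the same helper falls back to a 120 second timeout and a persistent connection.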

lib/serpapi/error.rb

Lines changed: 8 additions & 9 deletions
@@ -1,12 +1,11 @@
 # Module includes SerpApi exception
-#
-module SerpApi
-  # Errors module contains all the exceptions used in the SerpApi client.
-  #
-  module Errors
-    # SerpApiException wraps anything related to the SerpApi client errors.
-    #
-    class SerpApiException < StandardError
-    end
+module SerpApi
+  # SerpApiError wraps any errors related to the SerpApi client.
+  class SerpApiError < StandardError
+    # List the specific types of errors handled by the Error class.
+    # - Missing API key
+    # - Credit limit
+    # - Incorrect query
+    # - more ...
   end
 end
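
With the `Errors` module gone, callers rescue `SerpApi::SerpApiError` directly. A minimal sketch of that, runnable without the gem: the `SerpApi` module below is a local stand-in mirroring the class from this diff, and `decode` is a hypothetical helper whose messages are illustrative, not the client's actual wording.

```ruby
require 'json'

# Local stand-in mirroring the class introduced in lib/serpapi/error.rb,
# defined here only so the sketch runs without the gem installed.
module SerpApi
  class SerpApiError < StandardError; end
end

# Hypothetical decoder: parse a JSON body and raise when the API reports an
# error or when the HTTP status is not 200.
def decode(body, status)
  data = JSON.parse(body, symbolize_names: true)
  raise SerpApi::SerpApiError, "request failed: #{data[:error]}" if data.is_a?(Hash) && data.key?(:error)
  raise SerpApi::SerpApiError, "unexpected HTTP status: #{status}" unless status == 200

  data
end

message = nil
begin
  decode('{"error": "Invalid API key"}', 401)
rescue SerpApi::SerpApiError => e
  message = e.message
  puts "search failed: #{message}"
end
```

Because the class derives from `StandardError`, a bare `rescue` also catches it; rescuing the specific class keeps unrelated failures visible.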
