
Commit 8175f60

DavidTeju and claude committed
Update docs and benchmarks for Startpage backend
- Update README search engines section to document Startpage as default
- Update README example to use startpage instead of google
- Add Startpage to HTML parsing benchmarks
- Fix copy-paste doc error in Google scraper (said "duckduckgo")

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 3943b1e commit 8175f60

3 files changed

Lines changed: 18 additions & 11 deletions


README.md

Lines changed: 10 additions & 8 deletions
@@ -34,8 +34,8 @@ $ so how do i reverse a list in python
 # search for a latex solution
 $ so --site tex how to put tilde over character
 
-# use google to search stackoverflow.com, askubuntu.com, and unix.stackexchange.com
-$ so -e google -s askubuntu -s stackoverflow -s unix how do i install linux
+# search stackoverflow.com, askubuntu.com, and unix.stackexchange.com via startpage
+$ so -e startpage -s askubuntu -s stackoverflow -s unix how do i install linux
 ```
 
 ## installation
@@ -169,15 +169,17 @@ StackExchange API with no key up to 300 times per day per IP, which I imagine is
 fine for most users.
 
 ### search engines
-The available search engines are StackExchange, DuckDuckGo, and Google.
+The available search engines are StackExchange, Startpage, DuckDuckGo, and Google.
 StackExchange will always be the fastest to search because it doesn't require an
 additional request or any HTML parsing; however, it is also very primitive.
-~~DuckDuckGo is in second place for speed, as its response HTML is much smaller
-than Google's. I've found that it performs well for my queries, so it is the
-default search engine.~~
 
-DuckDuckGo [sometimes blocks requests](https://github.com/samtay/so/issues/16), so
-it is no longer the default.
+**Startpage** is the default search engine. It proxies Google search results and
+serves them as static HTML, providing high quality results without requiring
+JavaScript.
+
+Google and DuckDuckGo now require JavaScript execution for search results, making
+them unreliable from a terminal client. They are still available via `-e google`
+or `-e duckduckgo` but may not return results.
 
 ### multi-site searching
 As stated in the [docs](https://api.stackexchange.com/docs/throttle),
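
The README change above relies on Startpage returning results as static HTML, so result links can be read straight from the response body without running JavaScript. A minimal Rust sketch of that idea follows. It is not the project's `Startpage.parse` implementation: `extract_site_links` and `host_of` are hypothetical names, and the naive `href` scan is an assumption standing in for whatever selector logic the real scraper uses.

```rust
/// Returns the host part of an absolute http(s) URL, or `None` for anything else.
/// Hypothetical helper, for illustration only.
fn host_of(url: &str) -> Option<&str> {
    let rest = url
        .strip_prefix("https://")
        .or_else(|| url.strip_prefix("http://"))?;
    rest.split('/').next()
}

/// Collects up to `limit` distinct `href` values whose host is in `allowed_hosts`.
/// This is a naive string scan over static HTML, not a real HTML parser.
fn extract_site_links(html: &str, allowed_hosts: &[&str], limit: usize) -> Vec<String> {
    const ATTR: &str = "href=\"";
    let mut links: Vec<String> = Vec::new();
    let mut rest = html;
    while links.len() < limit {
        // Find the next href attribute; stop when there are no more.
        let start = match rest.find(ATTR) {
            Some(i) => i + ATTR.len(),
            None => break,
        };
        let after = &rest[start..];
        // The attribute value ends at the next double quote.
        let end = match after.find('"') {
            Some(i) => i,
            None => break,
        };
        let href = &after[..end];
        let allowed = host_of(href).map_or(false, |h| allowed_hosts.contains(&h));
        if allowed && !links.iter().any(|l| l.as_str() == href) {
            links.push(href.to_string());
        }
        rest = &after[end + 1..];
    }
    links
}
```

Calling `extract_site_links(html, &["stackoverflow.com", "askubuntu.com"], 10)` on a results page would keep only links to those two hosts, in document order. A production scraper would want a proper HTML parser, since attribute quoting and ordering vary in real markup.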

benches/html_parsing.rs

Lines changed: 7 additions & 1 deletion
@@ -1,5 +1,5 @@
 use criterion::{criterion_group, criterion_main, BenchmarkId, Criterion, Throughput};
-use so::stackexchange::scraper::{DuckDuckGo, Google, Scraper};
+use so::stackexchange::scraper::{DuckDuckGo, Google, Scraper, Startpage};
 use std::collections::HashMap;
 use std::time::Duration;
 

@@ -46,6 +46,12 @@ fn bench_html_parsers(c: &mut Criterion) {
         |b, html| b.iter(|| DuckDuckGo.parse(html, &sites, limit)),
     );
 
+    group.bench_with_input(
+        BenchmarkId::new("Startpage.parse", "exit-vim"),
+        include_str!("../test/startpage/exit-vim.html"),
+        |b, html| b.iter(|| Startpage.parse(html, &sites, limit)),
+    );
+
     let mut sites = HashMap::new();
     sites.insert(
         String::from("stackoverflow"),

src/stackexchange/scraper.rs

Lines changed: 1 addition & 2 deletions
@@ -90,8 +90,7 @@ impl Scraper for Google {
         parse_with_selector(anchors, html, sites, limit)
     }
 
-    /// Creates duckduckgo search url given sites and query
-    /// See https://duckduckgo.com/params for more info
+    /// Creates google search url given sites and query
     fn get_url<'a, I>(&self, query: &str, sites: I) -> Url
     where
         I: IntoIterator<Item = &'a String>,
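
For context on the corrected doc comment, a site-restricted search URL could be built roughly as sketched below. This is an illustration only, not the body of `get_url`: it returns a `String` rather than a `Url` so that it stays within the standard library, and the names `site_restricted_url` and `encode_query_value`, as well as the `(site:a OR site:b) query` layout, are assumptions rather than what the scraper actually emits.

```rust
/// Percent-encodes a string for use as a URL query value.
/// RFC 3986 unreserved characters pass through; spaces become `+`.
/// Hypothetical helper; the real code presumably relies on its URL type instead.
fn encode_query_value(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for byte in input.bytes() {
        match byte {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'.' | b'_' | b'~' => {
                out.push(byte as char)
            }
            b' ' => out.push('+'),
            _ => out.push_str(&format!("%{:02X}", byte)),
        }
    }
    out
}

/// Builds `base?q=...` where the query is prefixed with a `site:` filter
/// covering every host in `sites` (assumed layout, see lead-in).
fn site_restricted_url<'a, I>(base: &str, query: &str, sites: I) -> String
where
    I: IntoIterator<Item = &'a String>,
{
    let filter = sites
        .into_iter()
        .map(|host| format!("site:{}", host))
        .collect::<Vec<_>>()
        .join(" OR ");
    // With no sites given, fall back to the bare query.
    let full_query = if filter.is_empty() {
        query.to_string()
    } else {
        format!("({}) {}", filter, query)
    };
    format!("{}?q={}", base, encode_query_value(&full_query))
}
```

Passing `https://www.google.com/search` as `base` with two site hosts yields a single `q` parameter containing both `site:` terms joined by `OR`, followed by the user's query.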
