extracts all the products in Sales and saves a CSV file,
18
+
which contains:
19
+
20
+
- Product name
21
+
- Product detail page URL
22
+
- Price
23
+
```
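For orientation, the CSV part of such a generated program might resemble the minimal sketch below. It is an illustration only: the sample `products` array and the `escapeCsvValue` and `toCsv` helpers are hypothetical names, and the code ChatGPT actually produces will likely differ and will also download and parse the page.

```javascript
// Hypothetical sample data; a real scraper would extract these from the page.
const products = [
  { name: 'Example Speaker, Black', url: 'https://example.com/products/speaker', price: '$74.95' },
  { name: 'Example 32" TV', url: 'https://example.com/products/tv', price: '$1,398.00' },
];

// Quote a value for CSV: wrap it in double quotes and double any quotes inside.
function escapeCsvValue(value) {
  return `"${String(value).replace(/"/g, '""')}"`;
}

// Turn an array of product objects into CSV text with a header row.
function toCsv(items) {
  const header = ['name', 'url', 'price'];
  const rows = items.map((item) => header.map((key) => escapeCsvValue(item[key])).join(','));
  return [header.join(','), ...rows].join('\n');
}

console.log(toCsv(products));
```

Quoting every value keeps commas and quotation marks inside product names or prices from breaking the columns. To save the result to a file, the returned text could be passed to Node's `fs.writeFileSync`.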
24
+
25
+
Try it! While the code generated will most likely work out of the box, the resulting program will have a few caveats. Some are usability issues:
26
+
27
+
- _User-operated:_ We have to run the scraper ourselves. If we're tracking price trends, we'd need to remember to run it daily. And if we want alerts for big discounts, manually running the program isn't much better than just checking the site in a browser every day.
28
+
- _Manual data management:_ Tracking prices over time means figuring out how to organize the exported data ourselves. Processing the data could also be tricky since different analysis tools often require different formats.
29
+
30
+
Some are technical challenges:
31
+
32
+
- _No monitoring:_ Even if we knew how to set up a server or home installation so that our scraper runs regularly, we'd have little insight into whether it ran successfully, what errors or warnings occurred, how long it took, or what resources it used.
33
+
- _Anti-scraping risks:_ If the target website detects our scraper, they can rate-limit or block us. Sure, we could run it from a coffee shop's Wi-Fi, but eventually they'd block that too, and we'd risk seriously annoying our barista.
34
+
35
+
To address all of these, we'll use the [Apify](https://apify.com/) platform, where it's possible to deploy any program, as long as it's structured as a so-called Actor. We'll thank ourselves later if we build our program as an Actor from the very beginning.
36
+
37
+
First, we'll use a few commands to set up an Actor template, and then we'll prompt ChatGPT to generate the code necessary for scraping that [Sales page](https://warehouse-theme-metal.myshopify.com/collections/sales) from the prompt above.
38
+
39
+
:::info The Warehouse store
40
+
41
+
In this course, we'll scrape a real e-commerce site instead of artificial playgrounds or sandboxes. Shopify, a major e-commerce platform, has a demo store at [warehouse-theme-metal.myshopify.com](https://warehouse-theme-metal.myshopify.com/). It strikes a good balance between being realistic and stable enough for a tutorial.
42
+
43
+
:::
44
+
45
+
## Creating an Actor
10
46
11
-
The first lesson should use ChatGPT. It's mainstream. For chatting, it's free with limits, but many people might already pay for it, and for those it's gonna be without perceivable limits.
47
+
First, let's head to the [Installation page](https://docs.apify.com/cli/docs/installation) of the Apify CLI, a command-line program that works as a remote control for the Apify platform.
12
48
13
-
The implementation language can be JS or Python, it doesn't matter. However, the apify create template for Python is outdated, so I think (with a tear in my eye) that we should generate JavaScript – compared to outdated Python stack, I expect less issues to run it locally.
49
+
On the page, choose an installation method suitable for you and run the necessary command(s) in your Terminal (macOS/Linux) or Command Prompt (Windows).
14
50
15
-
Start by explaining why running a scraper as an Actor makes sense (execution, storage, scheduling, history), and then install the CLI and use the template with those benefits in mind.
51
+
If you don't know what to do or get stuck, [instruct ChatGPT to read the installation page](https://chatgpt.com/?prompt=Read%20from%20https%3A%2F%2Fdocs.apify.com%2Fcli%2Fdocs%2Finstallation%20so%20I%20can%20ask%20questions%20about%20it.) and let it help you. Verify that you've successfully installed the tool by running this:
16
52
17
-
In lesson 1, students would work inside the template by copying/pasting from ChatGPT and focus on getting their first scraper working, without worrying too much about anything else.
53
+
```text
54
+
apify --version
55
+
```
56
+
57
+
You are ready if it prints something like the following:
58
+
59
+
```text
60
+
apify-cli/0.0.0 (1a2b3c4) running on ... with node-0.0.0, installed via ...
61
+
```
62
+
63
+
<!--
64
+
TODO Now let's set up the Actor… Find a suitable folder and run `apify create`
18
65
-->
19
66
20
67
:::note Course under construction
21
68
22
-
This page hasn't been written yet. Come later, please!
69
+
This section hasn't been written yet. Please come back later!
23
70
24
71
:::
25
72
73
+
## Running code
26
74
27
75
<!--
28
-
#### Creating first scraper
29
-
Prompt ChatGPT to get a simple JavaScript program which downloads https://warehouse-theme-metal.myshopify.com/collections/sales and lists the product names:
76
+
Save it to the template, set up a Node/npm environment, run it, get results. If the student gets stuck setting up Node/npm, they ask ChatGPT. Roughly explain what the program does and establish basic terms.
77
+
-->
78
+
79
+
:::note Course under construction
80
+
81
+
This section hasn't been written yet. Please come back later!
30
82
31
-
> Create a scraper in JavaScript which downloads https://warehouse-theme-metal.myshopify.com/collections/sales, extracts all the products in Sales and saves a CSV file, which contains:
32
-
> - Product name
33
-
> - Product detail page URL
34
-
> - Price
83
+
:::
35
84
36
-
#### Running code
37
-
Save it as a scraper.js, setup Node/npm environment, run it, get results. If the student gets stuck setting up Node/npm, they ask ChatGPT. Roughly explaining what the program does, establishing basic terms.
38
85
#### Scraping stock units
86
+
87
+
<!--
39
88
Prompt ChatGPT to modify the program so that it scrapes stock units. Technically, modifying the program like this proves to be cumbersome, but doable. Run the program again, get better results.
40
89
41
-
Teaser: In the next lesson we'll get rid of copying and pasting and updating the files ourselves.
90
+
Teaser: In one of the next lessons we'll get rid of copying and pasting and updating the files ourselves, but first, let's see how we can deploy the scraper and run it periodically.
42
91
-->
92
+
93
+
:::note Course under construction
94
+
95
+
This section hasn't been written yet. Please come back later!
sources/academy/platform/building_actors_with_ai/02_using_platform.md
+1 -1 (1 addition & 1 deletion)
@@ -6,7 +6,7 @@ unlisted: true
6
6
---
7
7
8
8
<!--
9
-
Deploying to Apify and reaping the benefits of the platform. Running the scraper periodically, adding support for proxies...
9
+
Deploying to Apify and reaping the benefits of the platform. Running the scraper periodically, adding support for proxies... execution, storage, scheduling, history
10
10
11
11
In lesson 2, students could already push the Actor to Apify and start seeing some of the platform benefits.
sources/academy/platform/building_actors_with_ai/index.md
+1 -1 (1 addition & 1 deletion)
@@ -14,7 +14,7 @@ import DocCardList from '@theme/DocCardList';
14
14
15
15
---
16
16
17
-
In this course we'll use AI assistants to create an application for watching prices. It'll be able to scrape all product pages of an e-commerce website and record prices. Data from several runs of such program would be useful for seeing trends in price changes, detecting discounts, etc.
17
+
In this course, we'll use AI assistants to create an application for watching prices. It'll be able to scrape product pages of an e-commerce website and record prices. Data from several runs of such a program would be useful for seeing trends in price changes, detecting discounts, etc.
18
18
19
19
Unlike carelessly vibe-coded programs, the end product will reach a level of quality that allows for further extensibility and comfortable maintenance, so that it can be published to [Apify Store](https://apify.com/store).