We can select files and, if we do so, browse and modify their content, just like in the Web IDE. But in addition, we now have an integrated AI agent which we can prompt to make whatever changes we need to the code at hand.

Finally, onto some agentic coding!
## Modifying code with Cursor
First, let's simplify how we can run the Actor. This will be our prompt:

```text
Change the default input URL of the Actor
to https://warehouse-theme-metal.myshopify.com/collections/sales
```
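What actually changes on disk is typically a single value. In the Crawlee + Cheerio template, the default start URL usually lives in the Actor's input schema, `.actor/input_schema.json`. After the agent's edit, the relevant part might look roughly like this (a hedged sketch — the exact file layout and field names depend on the template version):

```json
{
    "title": "Input schema",
    "type": "object",
    "schemaVersion": 1,
    "properties": {
        "startUrls": {
            "title": "Start URLs",
            "type": "array",
            "editor": "requestListSources",
            "prefill": [
                { "url": "https://warehouse-theme-metal.myshopify.com/collections/sales" }
            ]
        }
    }
}
```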

After we submit the prompt, the agent will start reading the code, planning, and working on completing the task. Before it runs any commands, it'll ask us to approve them.

When done, it'll print a summary of its work, and we'll be able to review all the changes it made.

We'll approve all changes and go to the command line to try out whether the Actor now works as expected:

```text
apify run
```

We should see scraper output like before, including the following line:

```text
INFO CheerioCrawler: Processing page: https://warehouse-theme-metal.myshopify.com/collections/sales
```

That's our first successful change to the Actor made with an AI agent, and without any back-and-forth between the IDE and an AI chat like ChatGPT. Now, before pushing this change back to Apify, let's make one more improvement to the scraper.

## Scraping prices

In the previous lesson, we noticed that the prices in our resulting dataset are in a rather raw shape:

| name | url | price |
| --- | --- | --- |
| JBL Flip 4 Waterproof Portable Bluetooth Speaker | https://warehouse-theme-metal.myshopify.com/products/jbl-flip-4-waterproof-portable-bluetooth-speaker | Sale price$74.95 |
| Sony XBR-950G BRAVIA 4K HDR Ultra HD TV | https://warehouse-theme-metal.myshopify.com/products/sony-xbr-65x950g-65-class-64-5-diag-bravia-4k-hdr-ultra-hd-tv | Sale priceFrom $1,398.00 |
| Sony SACS9 10" Active Subwoofer | https://warehouse-theme-metal.myshopify.com/products/sony-sacs9-10-inch-active-subwoofer | Sale price$158.00 |

Let's change that. We'll prompt the agent like this, with a clear example of what we want:

```text
Change the code so that the Actor saves prices as numbers.
Because some prices are "from", let's call the "price" field
"minPrice" instead, as in minimum price. Example follows.

Before:
Sale price$74.95
Sale priceFrom $1,398.00
Sale price$158.00

After:
74.95
1398.00
158.00
```
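For the curious, the underlying transformation is small. Here's a sketch of how such parsing might look in plain JavaScript (hypothetical — the agent may structure its solution differently):

```javascript
// Hypothetical sketch of the parsing the agent might produce.
// Turns strings like "Sale priceFrom $1,398.00" into 1398.
function parseMinPrice(text) {
    // Grab the first dollar amount, e.g. "$1,398.00"
    const match = text.match(/\$([\d,]+(?:\.\d+)?)/);
    if (!match) return null;
    // Drop thousands separators before converting to a number
    return parseFloat(match[1].replace(/,/g, ''));
}

console.log(parseMinPrice('Sale price$74.95'));         // 74.95
console.log(parseMinPrice('Sale priceFrom $1,398.00')); // 1398
console.log(parseMinPrice('Sale price$158.00'));        // 158
```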

When the agent is done, we'll approve the changes and verify in the command line that the Actor runs locally:

```text
apify run
```

It runs, which is nice! But looking at the output, we can't really verify what exactly gets scraped. While we're at it, let's change that with another prompt:

```text
In the output of the scraper I want to see
what the items being saved look like.
```
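A likely implementation is a single log statement next to the code that saves each item. A hypothetical sketch in plain JavaScript (the helper name is our invention, not the template's):

```javascript
// Hypothetical sketch: format each item the way we expect to see it
// in the crawler's INFO output, right before it's saved to the dataset.
function describeItem(item) {
    return `Saving dataset item ${JSON.stringify(item)}`;
}

console.log(describeItem({ name: 'JBL Flip 4', minPrice: 74.95 }));
// Saving dataset item {"name":"JBL Flip 4","minPrice":74.95}
```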

We'll approve all changes and go to the command line again:

```text
apify run
```
309
265
310
-
If all goes well, the output should be similar to this:
266
+
Now, the output of the scraper contains the actual items being scraped and we can verify we've been successful with changing the format of the prices (they appear at the very end of each line):

```text
...
INFO CheerioCrawler: Processing page: https://warehouse-theme-metal.myshopify.com/collections/sales
INFO CheerioCrawler: Saving dataset item {"name":"JBL Flip 4 Waterproof Portable Bluetooth Speaker","url":"https://warehouse-theme-metal.myshopify.com/products/jbl-flip-4-waterproof-portable-bluetooth-speaker","minPrice":74.95}
INFO CheerioCrawler: Saving dataset item {"name":"Sony XBR-950G BRAVIA 4K HDR Ultra HD TV","url":"https://warehouse-theme-metal.myshopify.com/products/sony-xbr-65x950g-65-class-64-5-diag-bravia-4k-hdr-ultra-hd-tv","minPrice":1398}
INFO CheerioCrawler: Saving dataset item {"name":"Sony SACS9 10\" Active Subwoofer","url":"https://warehouse-theme-metal.myshopify.com/products/sony-sacs9-10-inch-active-subwoofer","minPrice":158}
INFO CheerioCrawler: Saving dataset item {"name":"Sony PS-HX500 Hi-Res USB Turntable","url":"https://warehouse-theme-metal.myshopify.com/products/sony-ps-hx500-hi-res-usb-turntable","minPrice":398}
...
```

Now let's push the changes back to Apify, so that our scheduled scraping on the platform can benefit from the improvements we've made locally.

:::tip Automatically approving changes

If you grow tired of approvals, you can enable _auto-keep_. Go to **Cursor** → **Settings…** → **Cursor Settings** → **Agents** → **Applying Changes** and turn off **Inline Diffs**.

:::

## Pushing Actor to Apify

To replace the Actor files living on the Apify platform with the ones we have locally, we can run the following command:

```text
apify push
```

The command can take a while to finish, because it also immediately triggers a build. Once it's done, the new version of the Actor is ready to be run. The output of the command ends with these lines:

```text
...
Actor detail https://console.apify.com/actors/EL7U7aNddXOzwEJ66
Success: Actor was deployed to Apify cloud and built there.
```

We'll follow the link in our browser, and in the Apify interface we'll click the **Start** button. Soon we should see items popping up in the **Output** section. For a full overview, let's switch to **All fields** again:

We've done it, the prices are now saved as numbers!

:::tip Specifying output schema

If we don't want to always click **All fields** to see full items, we need to specify an [output schema](https://docs.apify.com/platform/actors/development/actor-definition/output-schema) so that the platform knows what it can expect and how it should display it in the interface. With Cursor, such a change is just a single prompt away:

```text
Change the output schema of the Actor
so that it represents the items being
saved the best way in the Apify interface.
```

:::
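For reference, an output schema of this kind is a dataset view definition. Here's a hedged sketch of what the agent might generate (the structure follows Apify's dataset schema format, but the exact fields, labels, and formats are assumptions):

```json
{
    "actorSpecification": 1,
    "fields": {},
    "views": {
        "overview": {
            "title": "Overview",
            "transformation": {
                "fields": ["name", "url", "minPrice"]
            },
            "display": {
                "component": "table",
                "properties": {
                    "name": { "label": "Name", "format": "text" },
                    "url": { "label": "URL", "format": "link" },
                    "minPrice": { "label": "Min price", "format": "number" }
                }
            }
        }
    }
}
```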

## Wrapping up

We've been installing and setting up a lot, but once we got our environment ready, we could reap the benefits of making fast changes to our scraper.

With a single prompt, we tackled a significant change in how our app stores prices. And we still didn't need to know any coding.

To improve our project further, we ask the agent to perform a change, review and approve its work, then execute `apify run` in the command line to verify that it works, and finally `apify push` to upload our Actor files to Apify.

In the next lesson, we'll take a look at how we can develop our scraper by documenting how it should behave, instead of prompting the AI agent feature by feature without keeping a record of our intentions.