The one that feels like cheating
Get the data a site loads behind the scenes
This is my favorite one, because it feels like cheating. A lot of websites don't actually keep their data in the page you see. They quietly download it in the background, in a neat and tidy form, and then dress it up into the nice-looking page. If you can catch that tidy version, you skip all the messy parts and get perfect data. Developers call this reverse engineering an API. You can just call it the shortcut.
No copying. No fighting with the layout. You just find the thing the website already grabs for itself, and grab the same thing.
You don't need to be technical for this
Every browser has a built-in panel, called the developer tools, that shows you what a website is doing behind the curtain. It sounds intimidating, but for this you only need to do two things: open it and watch.
1. Open the panel and watch
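On the website, press F12 (Cmd+Option+I on a Mac), or right-click and choose "Inspect," then click the tab usually called "Network." If the list is empty, reload the page with the panel open. Now use the page: search, scroll, click a filter. You'll see a list of everything the page is fetching. To cut the noise, click the "Fetch/XHR" filter (just "XHR" in Firefox). One of the entries left is the actual data you want, usually in a clean, tidy format called JSON.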
2. Look at what it grabbed
Click the one that looks like data (its preview shows rows of records, not an image, a font, or a script) and you'll see it laid out neatly: a list of records, each with the same fields. Names, cities, categories, all already separated out. This is the website's own private, organized copy of its data. It's cleaner than anything you could pull off the visible page.
3. Hand it to Claude to grab the rest
Once you've found where the tidy data comes from, Claude can fetch it for you over and over, page after page, until it has the whole set. The website itself only ever loads one page at a time. Claude just keeps asking for the next page until there are none left.
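If you're curious what "keeps asking" looks like in step 3, here is a rough sketch in Python. Everything specific in it is an assumption for illustration: the `example.com` address, the `page` and `per_page` settings, and the `items` key are made up, and the real names are whatever you saw in the Network tab.

```python
import json
import time
import urllib.parse
import urllib.request
from typing import Callable, Dict, List

# Hypothetical endpoint: swap in the address you spotted in the Network tab.
BASE_URL = "https://example.com/api/listings"


def fetch_page(page: int, per_page: int = 50) -> Dict:
    """Fetch one page of JSON from the assumed endpoint."""
    # "page" and "per_page" are assumed parameter names; sites differ.
    query = urllib.parse.urlencode({"page": page, "per_page": per_page})
    with urllib.request.urlopen(f"{BASE_URL}?{query}", timeout=30) as resp:
        return json.load(resp)


def fetch_all(get_page: Callable[[int], Dict],
              delay_seconds: float = 1.0,
              max_pages: int = 1000) -> List[Dict]:
    """Ask for page 1, 2, 3... and stop at the first empty page."""
    records: List[Dict] = []
    for page in range(1, max_pages + 1):
        payload = get_page(page)
        items = payload.get("items", [])  # "items" is an assumed key name
        if not items:
            break  # nothing left to collect
        records.extend(items)
        time.sleep(delay_seconds)  # pause between requests to go easy on the site
    return records
```

Once `BASE_URL` points at a real address, `fetch_all(fetch_page)` would collect every record. The `max_pages` cap is only a safety net so the loop can't run forever if a site never returns an empty page.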
The workhorse play exists because most pages are messy. This play is for the happy times when the data is already perfectly organized and you just have to notice it.
| Field | Example | Why it's nice |
|---|---|---|
| name | Northwind Co | clean, no clutter |
| city | Austin | already separated out |
| category | wholesale | the site's own labels |
| total count | 3,700 | tells you how much there is |
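To make the table concrete, here is a small sketch of what such a response might look like once it's read into Python. The key names (`total_count`, `items`) and the page size of 50 are assumptions; a real site will use its own names. The values are the examples from the table.

```python
import json
import math

# A made-up response shaped like the table above; real key names will differ.
sample = """
{
  "total_count": 3700,
  "items": [
    {"name": "Northwind Co", "city": "Austin", "category": "wholesale"}
  ]
}
"""

payload = json.loads(sample)
first = payload["items"][0]

per_page = 50  # assumed page size
pages_needed = math.ceil(payload["total_count"] / per_page)

print(first["name"], first["city"], first["category"])
print(pages_needed)  # 3700 records at 50 per page is 74 pages
```

This is why the total count is so handy: you know up front how many pages there are, and you can check at the end that nothing went missing.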
One bit of manners: this is the same data path the website uses for every visitor, so be polite. Go slow, don't hammer it, and respect anything the site clearly asks you not to do.
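One common place sites post their requests is a small file called `robots.txt`. As a hedged sketch, Python's built-in `urllib.robotparser` can read one and tell you whether an address is off limits and how long the site asks you to wait between requests. The rules and the `example.com` addresses below are invented for the example, and a site's terms of use may say more than its `robots.txt` does.

```python
import urllib.robotparser

# Invented rules for illustration; a real site publishes its own at /robots.txt.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 2
"""


def build_rules(robots_text: str) -> urllib.robotparser.RobotFileParser:
    """Parse robots.txt text that you have already downloaded."""
    parser = urllib.robotparser.RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


def polite_delay(parser: urllib.robotparser.RobotFileParser,
                 user_agent: str = "*",
                 default_seconds: float = 1.0) -> float:
    """Use the site's requested delay, or a default if it names none."""
    delay = parser.crawl_delay(user_agent)
    return float(delay) if delay is not None else default_seconds


rules = build_rules(ROBOTS_TXT)
allowed = rules.can_fetch("*", "https://example.com/api/listings")
blocked = rules.can_fetch("*", "https://example.com/private/data")
wait = polite_delay(rules)
```

Here `allowed` comes back true, `blocked` comes back false, and `wait` is the two seconds the site asked for.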
Can't find the tidy data?
Send me the site. I'll find it.
If a website loads its data behind the scenes, I can usually catch it and pull the whole thing.
Free to do yourself. All the plays are right here.