Commit c0c2425

feat: add Lightpanda guide and examples
1 parent 531614a commit c0c2425

File tree

6 files changed

+126
-0
lines changed

apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam._index/route.tsx

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -715,6 +715,7 @@ function HelpfulInfoHasTasks({ onClose }: { onClose: () => void }) {
715715
isExternal
716716
/>
717717
<LinkWithIcon to={docsPath("/examples/puppeteer")} description="Puppeteer" isExternal />
718+
<LinkWithIcon to={docsPath("/examples/lightpanda")} description="Lightpanda" isExternal />
718719
<LinkWithIcon to={docsPath("/examples/react-pdf")} description="React to PDF" isExternal />
719720
<LinkWithIcon
720721
to={docsPath("/examples/resend-email-sequence")}

docs/docs.json

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -359,6 +359,7 @@
359359
"guides/examples/fal-ai-realtime",
360360
"guides/examples/ffmpeg-video-processing",
361361
"guides/examples/firecrawl-url-crawl",
362+
"guides/examples/lightpanda",
362363
"guides/examples/libreoffice-pdf-conversion",
363364
"guides/examples/open-ai-with-retrying",
364365
"guides/examples/pdf-to-image",
Lines changed: 122 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,122 @@
1+
---
2+
title: "Get a webpage's content using Lightpanda browser"
3+
sidebarTitle: "Lightpanda"
4+
description: "In these examples, we will show you how to crawl using Lightpanda browser and Trigger.dev."
5+
---
6+
7+
## Overview
8+
9+
Lightpanda is a purpose-built browser for AI and automation workflows. It is 10x faster and uses 10x less RAM than headless Chrome.
10+
11+
Here are a couple of examples of how to use Lightpanda with Trigger.dev.
12+
13+
<Warning>
14+
When using Lightpanda, we recommend that you respect robots.txt files and avoid sending requests to websites at high frequency.
15+
Small infrastructures can be overwhelmed quickly, with an effect similar to a DDoS attack.
16+
</Warning>
17+
18+
## Prerequisites
19+
20+
- A project with [Trigger.dev initialized](/quick-start)
21+
- A [Lightpanda](https://lightpanda.io/) cloud token (for the 1st example)
22+
23+
## Example \#1 - Get links from a website using Lightpanda cloud & Puppeteer
24+
25+
In this task, we use Lightpanda browser to get links from a provided URL.
26+
You will have to pass the URL as a payload when triggering the task.
27+
28+
```ts trigger/lightpanda-cloud-puppeteer.ts
29+
import { task } from "@trigger.dev/sdk/v3"
import puppeteer from "puppeteer"
30+
31+
export const lightpandaCloudPuppeteer = task({
32+
id: "lightpanda-cloud-puppeteer",
33+
run: async (payload: { url: string }) => {
34+
const { url } = payload
35+
36+
const browser = await puppeteer.connect({
37+
browserWSEndpoint: "wss://cloud.lightpanda.io/ws?browser=lightpanda&token=TOKEN",
38+
})
39+
const context = await browser.createBrowserContext()
40+
const page = await context.newPage()
41+
42+
// Dump all the links from the page.
43+
await page.goto(url)
44+
45+
const links = await page.evaluate(() => {
46+
return Array.from(document.querySelectorAll('a')).map(row => {
47+
return row.getAttribute('href')
48+
})
49+
})
50+
51+
await page.close()
52+
await context.close()
53+
await browser.disconnect()
54+
55+
return {
56+
links,
57+
}
58+
},
59+
})
60+
```
61+
### Proxies
62+
63+
Proxies can be used with your browser via the `proxy` query string parameter. By default, the proxy used is `datacenter`, which is a pool of shared datacenter IPs.
64+
`datacenter` accepts an optional `country` query string parameter, an [ISO 3166-1 alpha-2](https://en.wikipedia.org/wiki/ISO_3166-1_alpha-2) country code.
65+
66+
_Example using a German IP:_
67+
68+
```wss://cloud.lightpanda.io/ws?proxy=datacenter&country=de&token=TOKEN```
69+
70+
71+
72+
### Session
73+
A session stays alive until you close it or the connection is closed. The maximum duration of a session is 15 minutes.
74+
75+
76+
## Example \#2 - Launch and use a Lightpanda CDP server
77+
78+
This task initialises a Lightpanda CDP server to allow you to scrape directly via Trigger.dev.
79+
80+
### Configuration
81+
82+
To use this example, you will need to add these build settings to your `trigger.config.ts` file:
83+
84+
```ts trigger.config.ts
85+
import { defineConfig } from "@trigger.dev/sdk/v3";
86+
import { lightpanda } from "@trigger.dev/build/extensions/lightpanda";
87+
88+
export default defineConfig({
89+
project: "<project ref>",
90+
// Your other config settings...
91+
build: {
92+
// This is required to use the Lightpanda browser
93+
extensions: [lightpanda()],
94+
},
95+
});
96+
```
97+
This sets a `LIGHTPANDA_BROWSER_PATH` environment variable, which you will need to locate the Lightpanda binary.
98+
99+
### Task
100+
101+
Your task has to launch Lightpanda as a child process so that its WebSocket endpoint is available for Puppeteer to connect to.
102+
103+
```ts trigger/lightpandaLaunch.ts
104+
import { task } from "@trigger.dev/sdk/v3";
import puppeteer from "puppeteer";
105+
106+
export const lightpandaLaunch = task({
107+
id: "lightpanda-launch",
108+
run: async (payload: { url: string }) => {
109+
110+
// Use browserWSEndpoint to pass the address of Lightpanda's CDP server.
111+
const browser = await puppeteer.connect({
112+
browserWSEndpoint: "ws://127.0.0.1:9222",
113+
})
114+
115+
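const page = await browser.newPage();
await page.goto(payload.url);
// Collect the href of every link on the page.
const scrapeResult = await page.evaluate(() =>
Array.from(document.querySelectorAll("a")).map((a) => a.getAttribute("href"))
);
await page.close();
await browser.disconnect();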
116+
117+
return {
118+
data: scrapeResult,
119+
};
120+
},
121+
});
122+
```
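
The task above connects to `ws://127.0.0.1:9222` but does not show how the CDP server gets started. One possible way to do that is sketched below, under stated assumptions: the helper names (`serveArgs`, `startLightpanda`, `waitForPort`) are hypothetical, and the `serve --host --port` arguments are assumed from the Lightpanda CLI, so check them against the version of the binary you are using.

```ts
import { spawn, type ChildProcess } from "node:child_process";
import { connect } from "node:net";

// Assumed CLI arguments for starting Lightpanda's CDP server.
export function serveArgs(host: string, port: number): string[] {
  return ["serve", "--host", host, "--port", String(port)];
}

// Spawn the binary pointed to by LIGHTPANDA_BROWSER_PATH (set by the build extension).
export function startLightpanda(host = "127.0.0.1", port = 9222): ChildProcess {
  const binary = process.env.LIGHTPANDA_BROWSER_PATH;
  if (!binary) {
    throw new Error("LIGHTPANDA_BROWSER_PATH is not set");
  }
  return spawn(binary, serveArgs(host, port), { stdio: "inherit" });
}

// Resolve once a TCP connection to host:port succeeds; reject after timeoutMs.
export function waitForPort(host: string, port: number, timeoutMs = 5000): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  return new Promise((resolve, reject) => {
    const attempt = () => {
      const socket = connect({ host, port });
      socket.once("connect", () => {
        socket.destroy();
        resolve();
      });
      socket.once("error", () => {
        socket.destroy();
        if (Date.now() >= deadline) {
          reject(new Error(`Timed out waiting for ${host}:${port}`));
        } else {
          setTimeout(attempt, 100); // Retry shortly; the server may still be starting.
        }
      });
    };
    attempt();
  });
}
```

Inside the task's `run` function you could call `startLightpanda()`, await `waitForPort("127.0.0.1", 9222)`, and only then call `puppeteer.connect`. Wrapping the scraping in `try`/`finally` and calling `kill()` on the returned process in the `finally` branch avoids leaving the server running after the task ends.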

docs/guides/introduction.mdx

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -69,6 +69,7 @@ Task code you can copy and paste to use in your project. They can all be extende
6969
| [FFmpeg video processing](/guides/examples/ffmpeg-video-processing) | Use FFmpeg to process a video in various ways and save it to Cloudflare R2. |
7070
| [Firecrawl URL crawl](/guides/examples/firecrawl-url-crawl) | Learn how to use Firecrawl to crawl a URL and return LLM-ready markdown. |
7171
| [LibreOffice PDF conversion](/guides/examples/libreoffice-pdf-conversion) | Convert a document to PDF using LibreOffice. |
72+
| [Lightpanda](/guides/examples/lightpanda) | Use Lightpanda browser (or cloud version) to get a webpage's content. |
7273
| [OpenAI with retrying](/guides/examples/open-ai-with-retrying) | Create a reusable OpenAI task with custom retry options. |
7374
| [PDF to image](/guides/examples/pdf-to-image) | Use `MuPDF` to turn a PDF into images and save them to Cloudflare R2. |
7475
| [Puppeteer](/guides/examples/puppeteer) | Use Puppeteer to generate a PDF or scrape a webpage. |

docs/images/intro-lightpanda.jpg

11.6 KB

docs/introduction.mdx

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -83,6 +83,7 @@ We provide everything you need to build and manage background tasks: a CLI and S
8383
<Card title="Supabase" img="/images/intro-supabase.jpg" href="/guides/examples/supabase-database-operations"/>
8484
<Card title="DALL•E" img="/images/intro-openai.jpg" href="/guides/examples/dall-e3-generate-image"/>
8585
<Card title="Firecrawl" img="/images/intro-firecrawl.jpg" href="/guides/examples/firecrawl-url-crawl"/>
86+
<Card title="Lightpanda" img="/images/intro-lightpanda.jpg" href="/guides/examples/lightpanda"/>
8687
</CardGroup>
8788

8889
## Explore by build extension
