Introduction
This tool helps you crawl documentation websites incrementally, extract their content, and create a search index in Upstash Search.
Usage
It is available both as a CLI tool and a library.
CLI Usage
You can run the CLI directly using npx (no installation required). It needs the following inputs:
- Your Upstash Search URL
- Your Upstash Search token
- (Optional) Custom index name
- The documentation URL to crawl
What the Tool Does
- Discover all internal documentation links
- Crawl each page and extract content
- Track new or obsolete data
- Upsert the new records into your Upstash Search index
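The "track new or obsolete data" step above can be pictured as a set difference between what is already indexed and what the latest crawl found. The following is a minimal sketch under that assumption, not the tool's actual implementation; `planSync`, `SyncPlan`, and the use of page paths as record IDs are hypothetical names chosen for illustration.

```typescript
// Illustrative sketch only: planSync and SyncPlan are hypothetical names,
// not part of the tool's API.

interface SyncPlan {
  newIds: string[];      // found by the crawl, not yet in the index
  obsoleteIds: string[]; // in the index, no longer found by the crawl
}

// Compare the IDs already in the index with the IDs seen in the latest crawl.
function planSync(indexedIds: string[], crawledIds: string[]): SyncPlan {
  const indexed = new Set(indexedIds);
  const crawled = new Set(crawledIds);
  return {
    newIds: crawledIds.filter((id) => !indexed.has(id)),
    obsoleteIds: indexedIds.filter((id) => !crawled.has(id)),
  };
}
```

Under this reading, only the records in `newIds` would need to be upserted on each run, which is what makes the crawl incremental.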
Library Usage
You can also use this as a library in your own code.
Obtaining Upstash Credentials
- Go to your Upstash Console.
- Select your Search index. (See How to Create Search Index)
- Under the Details section, copy your UPSTASH_SEARCH_REST_URL and UPSTASH_SEARCH_REST_TOKEN.
- --upstash-url corresponds to UPSTASH_SEARCH_REST_URL
- --upstash-token corresponds to UPSTASH_SEARCH_REST_TOKEN
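If you keep the credentials in environment variables, the mapping above can be applied in a small wrapper script. This is a sketch under assumptions: `buildCredentialArgs` and the `Env` type are hypothetical helpers, and only the two flags named above are used.

```typescript
// Hypothetical helper: maps the two environment variables named above
// to their corresponding CLI flags. Not part of the tool itself.
type Env = Record<string, string | undefined>;

function buildCredentialArgs(env: Env): string[] {
  const url = env["UPSTASH_SEARCH_REST_URL"];
  const token = env["UPSTASH_SEARCH_REST_TOKEN"];
  // Fail early with a clear message if either credential is missing.
  if (!url || !token) {
    throw new Error(
      "Set UPSTASH_SEARCH_REST_URL and UPSTASH_SEARCH_REST_TOKEN first."
    );
  }
  return ["--upstash-url", url, "--upstash-token", token];
}
```

In a Node.js script you could call `buildCredentialArgs(process.env)` and append the result to the argument list you pass to the CLI.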