# Async HTTP Queue Fetch URLs

Lightweight, parallel HTTP fetching library for Emacs using `url-retrieve` with configurable concurrency limits.

## Why Use This Library?

While Emacs has several HTTP libraries, `async-http-queue-fetch-urls` fills a specific need: **high-level batch HTTP fetching with controlled concurrency**.

### Comparison with Existing Solutions

**Built-in `url-queue-retrieve`**

- ❌ Low-level API: Requires manual callback management per URL
- ❌ No batch processing: Must write your own loop and aggregation
- ❌ Global configuration: Concurrency and timeout are set through the global variables `url-queue-parallel-processes` and `url-queue-timeout` rather than per-call parameters
- ❌ Order not preserved: Results arrive in completion order, not request order

**Third-party libraries ([plz.el](https://github.com/alphapapa/plz.el), [request.el](https://github.com/tkf/emacs-request))**

- ❌ Single-request focused: Designed for one URL at a time
- ❌ No built-in queuing: Manual implementation needed for batch operations
- 🟡 External dependencies: Some require the `curl` executable (which is typically faster than `url-retrieve`)

**This library (`async-http-queue-fetch-urls`)**

- ✅ High-level batch API: One function call for multiple URLs
- ✅ Order preservation: Results vector matches input URL order
- ✅ Per-call configuration: Keyword arguments instead of global state
- ✅ Configurable parser: JSON by default, customizable or raw text
- ✅ Progress tracking: Automatic messages for large batches
- ✅ No external dependencies: Only built-in `url-retrieve`
- ✅ Clean callback pattern: Single callback with all results

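To make the `url-queue-retrieve` comparison concrete, here is a rough sketch of the bookkeeping a caller has to write by hand to get ordered, aggregated results. All `my-manual-*` names are hypothetical, the state lives in global variables (so only one batch can run at a time), and the header-skipping step is a simplification, not a robust HTTP parser.

```el
(require 'url-queue)

(defvar my-manual-results nil
  "Vector of response bodies, indexed by request position (hypothetical).")

(defvar my-manual-pending 0
  "Number of requests that have not called back yet (hypothetical).")

(defun my-manual-store (status index)
  "Store the body of the current response buffer at INDEX.
STATUS is the plist that `url-retrieve' passes to its callbacks."
  (unless (plist-get status :error)
    (goto-char (point-min))
    ;; Skip the HTTP headers: the body starts after the first blank line.
    (when (re-search-forward "\r?\n\r?\n" nil t)
      (aset my-manual-results index
            (buffer-substring-no-properties (point) (point-max)))))
  (setq my-manual-pending (1- my-manual-pending))
  (when (zerop my-manual-pending)
    (message "Fetched %d/%d URLs"
             (length (delq nil (append my-manual-results nil)))
             (length my-manual-results))))

(defun my-manual-fetch (urls)
  "Fetch URLS with `url-queue-retrieve', keeping results in request order."
  (setq my-manual-results (make-vector (length urls) nil)
        my-manual-pending (length urls))
  (let ((index 0))
    (dolist (url urls)
      ;; The index travels through CBARGS, so no closure is needed.
      (url-queue-retrieve url #'my-manual-store (list index) t)
      (setq index (1+ index)))))
```

Even this small version has to track positions, count outstanding requests, and decide what to do if a request never calls back, which is the work the batch API is meant to take over.
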
### When to Use This Library

Use `async-http-queue-fetch-urls` when you need to:

- Fetch multiple URLs in parallel (API endpoints, RSS feeds, web scraping)
- Control concurrency to avoid overwhelming servers
- Maintain result order corresponding to input URLs
- Get all results in a single callback with simple error handling
- Parse responses consistently (JSON, XML, or custom formats)

For single requests or curl-based performance, consider [plz.el](https://github.com/alphapapa/plz.el) or [request.el](https://github.com/tkf/emacs-request) instead.

## Features

- Parallel downloads with configurable concurrency (default: 5)
- Automatic timeout handling (default: 10 seconds)
- Custom parser support (default: `json-parse-buffer`)
- Progress tracking for large batches
- Error handling per request
- Maintains original URL order in results

## Requirements

Emacs 28.1 or later.

## Installation

### use-package with :vc (Emacs 30+)

```el
(use-package async-http-queue-fetch-urls
  :vc (:url "https://git.andros.dev/andros/async-http-queue-fetch-urls-el"
       :rev :newest))
```

### use-package with :load-path

```el
(use-package async-http-queue-fetch-urls
  :load-path "/path/to/async-http-queue-fetch-urls-el")
```

### Manual

Clone the repository and add it to your `load-path`:

```bash
git clone https://git.andros.dev/andros/async-http-queue-fetch-urls-el.git
```

Then in your config:

```el
(add-to-list 'load-path "/path/to/async-http-queue-fetch-urls-el")
(require 'async-http-queue-fetch-urls)
```

## Usage

### Basic JSON API Example

```el
(async-http-queue-fetch-urls
 '("https://api.example.com/posts/1"
   "https://api.example.com/posts/2"
   "https://api.example.com/posts/3")
 :callback (lambda (results)
             (message "Got %d results" (length results))
             ;; `mapc' accepts both lists and vectors; `dolist' would not.
             (mapc (lambda (result)
                     ;; Failed requests are nil.  `alist-get' assumes the
                     ;; parser returned an alist.
                     (when result
                       (message "Title: %s" (alist-get 'title result))))
                   results)))
```

### Custom Concurrency and Timeout

```el
(async-http-queue-fetch-urls
 my-url-list
 :max-concurrent 10
 :timeout 20
 :callback (lambda (results)
             (message "Fetched %d URLs" (length results))))
```

### Raw Text Instead of JSON

```el
(async-http-queue-fetch-urls
 '("https://example.com/page1.html"
   "https://example.com/page2.html")
 :parser nil  ; Return raw text
 :callback (lambda (results)
             ;; `mapc' works whether RESULTS is a list or a vector.
             (mapc (lambda (html)
                     (when html
                       (message "Page length: %d chars" (length html))))
                   results)))
```

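Raw-text results can be post-processed with ordinary string functions. The sketch below pulls the `<title>` out of each page; it assumes `:parser nil` delivers each body as a string (as the `length` call above suggests), and the `my-fetch-*` names are hypothetical. The regular expression is deliberately naive and is not a substitute for a real HTML parser.

```el
(require 'subr-x)  ; `string-trim' lives here on Emacs 28

(defun my-fetch-html-title (html)
  "Return the trimmed contents of the first <title> element in HTML, or nil.
HTML is a raw-text result; nil (a failed request) yields nil."
  (let ((case-fold-search t))
    (when (and (stringp html)
               (string-match "<title[^>]*>\\([^<]*\\)</title>" html))
      (string-trim (match-string 1 html)))))

(defun my-fetch-print-titles (urls)
  "Fetch URLS as raw text and report the title of each page."
  (async-http-queue-fetch-urls
   urls
   :parser nil
   :callback (lambda (results)
               ;; `mapcar' accepts lists and vectors alike.
               (message "Titles: %S"
                        (mapcar #'my-fetch-html-title results)))))
```
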
### Custom Parser

```el
(async-http-queue-fetch-urls
 '("https://example.com/data.xml")
 :parser (lambda ()
           (libxml-parse-xml-region (point) (point-max)))
 :callback (lambda (results)
             (message "Parsed XML: %S" results)))
```

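A custom parser is also a way to pin down the shape of JSON results. Called without arguments, the stock `json-parse-buffer` returns JSON objects as hash tables, and this README does not state which options the library passes to its default parser. The following sketch therefore requests alists explicitly so that `alist-get` works on the results. It assumes, as the XML example does, that the parser is called with no arguments and with point at the start of the response body; `my-fetch-json-alist-parser` is a hypothetical name.

```el
(defun my-fetch-json-alist-parser ()
  "Parse the JSON value at point, with objects as alists and arrays as lists."
  (json-parse-buffer :object-type 'alist
                     :array-type 'list))
```

It can then be passed as `:parser #'my-fetch-json-alist-parser`. Object keys come back as symbols, so a `"title"` field is read with `(alist-get 'title result)`.
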
### Error Handling

```el
(async-http-queue-fetch-urls
 my-urls
 :callback (lambda (results)
             (let ((successful (seq-filter #'identity results)))
               (message "Successfully fetched %d/%d URLs"
                        (length successful)
                        (length results))))
 :error-callback (lambda (url)
                   (message "Failed to fetch: %s" url)))
```

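Because failed requests are reported as `nil` and results keep the input order, the failed URLs can also be recovered inside `:callback` without an `:error-callback`, for example to retry them. This is a minimal sketch with a hypothetical helper name; it assumes a successful parse never yields `nil`, which may not hold for every custom parser.

```el
(defun my-fetch-failed-urls (urls results)
  "Return the elements of URLS whose entry in RESULTS is nil.
RESULTS may be a list or a vector, in the same order as URLS."
  (let ((remaining (append results nil))  ; copy RESULTS into a list
        (failed '()))
    (dolist (url urls)
      (unless (car remaining)
        (push url failed))
      (setq remaining (cdr remaining)))
    (nreverse failed)))
```
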
## API

### async-http-queue-fetch-urls

```
(async-http-queue-fetch-urls URLS &key CALLBACK ERROR-CALLBACK MAX-CONCURRENT TIMEOUT PARSER)
```

Fetch URLS asynchronously in parallel and call CALLBACK with results.

**Parameters:**

- `URLS` - List of URL strings to fetch
- `:callback` - Function called with a vector of results when all requests have finished. Failed requests are represented as `nil`
- `:error-callback` - Optional function called once for each failed URL, with the URL as its argument
- `:max-concurrent` - Maximum number of parallel downloads (default: 5)
- `:timeout` - Maximum time in seconds per request (default: 10)
- `:parser` - Function to parse response bodies (default: `json-parse-buffer`). Set to `nil` for raw text

**Returns:** Immediately (non-blocking). Results are delivered via `:callback`.

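As an illustration of the signature, the sketch below spells out every documented keyword and uses named functions instead of lambdas, which keeps the callbacks easy to test on their own. The `my-feed-*` names and the chosen values are hypothetical.

```el
(defun my-feed-handle-results (results)
  "Report how many entries of RESULTS are non-nil."
  (message "Fetched %d of %d feeds"
           (length (delq nil (append results nil)))
           (length results)))

(defun my-feed-handle-error (url)
  "Log that URL could not be fetched."
  (message "Feed failed: %s" url))

(defun my-feed-fetch (urls)
  "Fetch URLS as raw text with every documented keyword spelled out."
  (async-http-queue-fetch-urls
   urls
   :callback #'my-feed-handle-results
   :error-callback #'my-feed-handle-error
   :max-concurrent 3   ; fewer parallel requests than the default of 5
   :timeout 15         ; seconds per request, up from the default of 10
   :parser nil))       ; raw text instead of parsed JSON
```
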
## Performance

The library uses `url-retrieve` with controlled concurrency to avoid overwhelming servers or network connections. The default of 5 concurrent requests is a reasonable starting point for most APIs.

For fast, reliable APIs, you can increase concurrency:

```el
:max-concurrent 10  ; or higher
```

For rate-limited APIs, decrease concurrency:

```el
:max-concurrent 2
```

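When choosing these values, a back-of-the-envelope bound on batch duration can help. The sketch below assumes that a slot is freed only when its request finishes or times out, so in the worst case each slot works through its share of the URLs one timeout at a time. The library's actual scheduling may differ, and the function name is hypothetical.

```el
(defun my-fetch-worst-case-seconds (url-count max-concurrent timeout)
  "Estimate the longest a batch can take if every request hits TIMEOUT.
Assumes each of the MAX-CONCURRENT slots handles its share of the
URL-COUNT requests one after another."
  ;; `ceiling' with a divisor rounds URL-COUNT / MAX-CONCURRENT up.
  (* (ceiling url-count max-concurrent) timeout))
```

With the defaults (5 concurrent requests, 10-second timeout), 26 URLs give 6 rounds of 10 seconds, so the estimate is 60 seconds.
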
## Contributing

Contributions are welcome! Please see the [contribution guidelines](https://git.andros.dev/andros/contribute) for instructions on how to submit issues or pull requests.

## License

Copyright (C) 2025 Andros Fenollosa

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

See the LICENSE file for details.