Add README with features, setup, usage, and aria2 RPC details
parent 9986140f26
commit bc89a52d74
1 changed file with 130 additions and 2 deletions
README.md: 132 lines changed
@@ -1,3 +1,131 @@
# archive-org-link-grabber
# Archive.org Link Grabber (Firefox WebExtension)

Add-on for Firefox that enhances archive.org download pages (`https://archive.org/download/*`) so you can filter file links (by type, name, size, or date), copy them to the clipboard (one URL per line), or send them directly to an aria2 RPC server.

> Status: README/spec first. Implementation details below describe the intended behavior and structure.

---

## Features

- Filter by file attributes: type/extension, name, size range, and date range.
- String or regex matching: name and type filters accept plain text (e.g., `mp4`) or regular expressions (e.g., `^(movie|clip).*\.mp4$`).
- Copy to clipboard: export filtered links, one per line.
- Send to aria2: forward links to a configured aria2 JSON-RPC endpoint using a secret token.
- Per-site parsing: targets archive.org collection download listings under `/download/*`.
- Persistent settings: stores filter presets and aria2 config in extension storage.

## Demo Workflow

1. Open an archive.org collection’s download page at `https://archive.org/download/<identifier>`.
2. Open the extension popup or page action.
3. Filter:
   - Type/Name: enter plain strings (e.g., `mp4`, `subtitle`) or regex.
   - Size: set min/max (e.g., `>= 100MB`, `<= 2GB`).
   - Date: set from/to (uses the timestamp shown on the page when available).
4. Review the results list and count.
5. Choose an action:
   - Copy: copies selected URLs to the clipboard, one per line.
   - Send to aria2: pushes to your configured aria2 RPC server using `aria2.addUri`.

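The Copy action in the workflow above could be sketched roughly as follows. The names `formatForClipboard` and `copyLinks` are hypothetical, and dropping blank or duplicate URLs is an assumption rather than specified behavior.

```javascript
// Hypothetical helper: build the clipboard text, one URL per line.
// Assumption: blank entries and exact duplicates are dropped.
function formatForClipboard(urls) {
  const seen = new Set();
  const lines = [];
  for (const raw of urls) {
    const url = String(raw).trim();
    if (url && !seen.has(url)) {
      seen.add(url);
      lines.push(url);
    }
  }
  return lines.join("\n");
}

// Meant to be called from a button click handler in the popup, because
// clipboard writes usually require a user gesture.
async function copyLinks(urls) {
  await navigator.clipboard.writeText(formatForClipboard(urls));
}
```

Keeping the formatting step separate from the clipboard call makes it easy to check without a browser.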
## Regex and Matching

- Plain strings: match anywhere in the value (case-insensitive by default, configurable).
- Regex: either toggle a “Use regex” option or enter values wrapped with `/.../` (optional flags like `i`, `m`).
- Type vs name: “type” typically refers to file extension; “name” is the full filename.

Examples:

- Type contains `mp4`
- Name regex `/^(movie|clip).*\.mp4$/i`
- Type regex `/^(mp4|mkv)$/`

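One way the matching rules above might be implemented is sketched below. `buildMatcher`, `escapeRegExp`, and the option names `useRegex` and `caseSensitive` are hypothetical; the sketch assumes that a `/.../flags` value always wins over the toggle and that the `g` and `y` flags are ignored because they make `RegExp.prototype.test` stateful.

```javascript
// Recognizes values written as /pattern/flags.
const REGEX_LITERAL = /^\/(.+)\/([a-z]*)$/s;

// Escape regex metacharacters so plain strings match literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Returns a predicate for one filter field (type or name).
// Throws SyntaxError on an invalid pattern; the UI would need to catch that.
function buildMatcher(input, { useRegex = false, caseSensitive = false } = {}) {
  const defaultFlags = caseSensitive ? "" : "i";
  const literal = REGEX_LITERAL.exec(input);
  let regex;
  if (literal) {
    // Explicit /.../flags syntax: use the flags exactly as written (minus g/y).
    regex = new RegExp(literal[1], literal[2].replace(/[gy]/g, ""));
  } else if (useRegex) {
    regex = new RegExp(input, defaultFlags);
  } else {
    // Plain string: substring match anywhere in the value.
    regex = new RegExp(escapeRegExp(input), defaultFlags);
  }
  return (value) => regex.test(value);
}
```

With this shape, `buildMatcher("mp4")` behaves as a case-insensitive "contains" filter, while `buildMatcher("/^(mp4|mkv)$/")` is case-sensitive because no `i` flag was given.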
## Aria2 Integration

- RPC method: `aria2.addUri`, with `token:<SECRET>` passed as the first parameter.
- Batching: sends multiple links either individually or in small batches.
- Options (optional): directory, headers, and per-item output name can be supported via the UI.

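A minimal sketch of the request payloads described above, assuming JSON-RPC 2.0 batch arrays are used for batching. `buildAddUriRequest`, `buildBatches`, the `grab-N` request ids, and the default batch size of 20 are all hypothetical. Each call carries a single URL because the `uris` array of `aria2.addUri` is meant for mirrors of the same file.

```javascript
// Build one aria2.addUri call for a single download URL.
function buildAddUriRequest(url, secret, options = {}, id = url) {
  const params = [];
  // The token is only sent when a secret is configured.
  if (secret) params.push(`token:${secret}`);
  params.push([url]);
  if (Object.keys(options).length > 0) params.push(options);
  return { jsonrpc: "2.0", id, method: "aria2.addUri", params };
}

// Split the URL list into JSON-RPC batch arrays of at most batchSize calls.
function buildBatches(urls, secret, options = {}, batchSize = 20) {
  const batches = [];
  for (let i = 0; i < urls.length; i += batchSize) {
    const chunk = urls.slice(i, i + batchSize);
    batches.push(
      chunk.map((url, j) => buildAddUriRequest(url, secret, options, `grab-${i + j}`))
    );
  }
  return batches;
}
```

Each inner array can be serialized with `JSON.stringify` and posted as one HTTP request; sending links individually is the special case `batchSize = 1`.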
Aria2 must be running with RPC enabled, for example:

```bash
aria2c \
  --enable-rpc \
  --rpc-listen-port=6800 \
  --rpc-listen-all=false \
  --rpc-secret=YOUR_SECRET_TOKEN
```

Extension settings include:

- RPC endpoint: protocol, host, port, path (default `/jsonrpc`).
- Secret token: the `--rpc-secret` value (stored in extension storage).
- Optional defaults: download directory and additional aria2 options.

Security notes:

- Keep your secret token private; do not commit it.
- If using a remote aria2, enable TLS/HTTPS and restrict access.

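As a rough illustration of how the endpoint settings above might be combined and used from the background script, the sketch below assembles the URL and posts a payload. `buildEndpointUrl` and `sendToAria2` are hypothetical names, the defaults mirror the `aria2c` example, and `fetchImpl` is injectable only so the function can be exercised outside the browser.

```javascript
// Combine protocol, host, port, and path into the RPC endpoint URL.
// Assumption: host is a plain hostname or IPv4 address.
function buildEndpointUrl({ protocol = "http", host = "localhost", port = 6800, path = "/jsonrpc" } = {}) {
  const url = new URL(`${protocol}://${host}`);
  url.port = String(port); // a scheme's default port is dropped by the URL API
  url.pathname = path;
  return url.toString();
}

// POST a JSON-RPC payload (single call or batch array) and return the parsed reply.
// Callers should inspect the reply for an `error` member before using `result`.
async function sendToAria2(endpointUrl, payload, fetchImpl = fetch) {
  const response = await fetchImpl(endpointUrl, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(payload),
  });
  return response.json();
}
```

For example, `buildEndpointUrl()` yields `http://localhost:6800/jsonrpc`, matching the default listen port and path.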
## Permissions

The extension will require:

- `https://archive.org/*` host permission to read and parse download pages.
- `storage` to persist settings and presets.
- `clipboardWrite` to copy links.
- Host permission for your aria2 endpoint (e.g., `http://localhost:6800/*` or your remote URL). Optional permissions may be requested at runtime.

## Parsing Strategy

- A content script runs on `https://archive.org/download/*` pages.
- It scrapes the file listing table/DOM and builds a dataset with name, URL, size, and date.
- Type is derived from filename extension (and may use content-type hints if available on the page).
- The popup UI queries this dataset, applies filters, and displays the results.
- A background script handles aria2 RPC calls to avoid CORS issues and keep secrets out of content scope.

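The dataset step above could look something like the following sketch. It deliberately leaves out the DOM scraping, since the listing markup is not specified here, and only shows how one scraped row might be normalized. `deriveType`, `toFileRecord`, and the row fields `name`, `href`, `sizeText`, and `dateText` are hypothetical.

```javascript
// Derive the "type" from the filename extension, lowercased.
// Names without a usable extension (e.g. "README", ".hidden") get "".
function deriveType(name) {
  const dot = name.lastIndexOf(".");
  return dot > 0 && dot < name.length - 1 ? name.slice(dot + 1).toLowerCase() : "";
}

// Normalize one scraped row into a dataset record.
// pageUrl is the listing URL; it needs a trailing slash for relative
// hrefs to resolve inside the item directory.
function toFileRecord(row, pageUrl) {
  return {
    name: row.name,
    url: new URL(row.href, pageUrl).href,
    type: deriveType(row.name),
    sizeText: row.sizeText ?? "", // kept as shown on the page
    dateText: row.dateText ?? "", // kept as shown on the page
  };
}
```

In a content script, `anchor.href` is already absolute, so the `new URL` call mainly matters when rows are built from raw attribute values.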
## Installation (Temporary in Firefox)

1. Clone this repository.
2. In Firefox, open `about:debugging#/runtime/this-firefox`.
3. Click “Load Temporary Add-on…” and select the `manifest.json` in this repo.
4. Navigate to an archive.org download page and open the extension’s popup.

For development, make changes and click “Reload” in `about:debugging` to pick them up.

## Usage Tips

- Clipboard: copying usually requires a user gesture (click). The UI is designed to perform copy on button press.
- Case sensitivity: string filters default to case-insensitive; enable case-sensitive mode in settings if needed.
- Sizes: common suffixes such as `KB`, `MB`, and `GB` are supported. Ranges accept comparisons like `>= 100MB`.
- Dates: when the page provides timestamps, filters accept `YYYY-MM-DD` values and ranges.

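The size comparisons mentioned above might be parsed along these lines. `parseSize` and `parseSizeFilter` are hypothetical names, and treating the suffixes as binary multiples (1 KB = 1024 bytes) is an assumption; the README does not say which convention applies.

```javascript
// Assumption: binary multiples.
const UNIT_BYTES = { B: 1, KB: 1024, MB: 1024 ** 2, GB: 1024 ** 3, TB: 1024 ** 4 };

// Convert text such as "100MB" or "1.5 gb" to a byte count.
function parseSize(text) {
  const m = /^\s*(\d+(?:\.\d+)?)\s*([KMGT]?B)?\s*$/i.exec(text);
  if (!m) throw new Error(`Unrecognized size: ${text}`);
  const unit = (m[2] || "B").toUpperCase();
  return Math.round(parseFloat(m[1]) * UNIT_BYTES[unit]);
}

// Turn an expression such as ">= 100MB" into a predicate over byte counts.
// A bare size with no operator is treated as an exact match.
function parseSizeFilter(expr) {
  const m = /^\s*(>=|<=|>|<|=)?\s*(.+)$/.exec(expr);
  if (!m) throw new Error(`Unrecognized size filter: ${expr}`);
  const limit = parseSize(m[2]);
  const compare = {
    ">=": (a, b) => a >= b,
    "<=": (a, b) => a <= b,
    ">": (a, b) => a > b,
    "<": (a, b) => a < b,
    "=": (a, b) => a === b,
  }[m[1] || "="];
  return (bytes) => compare(bytes, limit);
}
```

A min/max range is then just two predicates combined, for example `parseSizeFilter(">= 100MB")` and `parseSizeFilter("<= 2GB")`.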
## Troubleshooting

- No links found: ensure you are on a `/download/*` page (not the item overview). Try reloading after the page finishes loading.
- RPC errors: verify `aria2c` is running with `--enable-rpc` and that the secret/token matches. Check endpoint URL and port.
- CORS/Network: extensions can call cross-origin endpoints with host permission. If using HTTPS with a self-signed cert, allow it in Firefox or use a valid cert.
- Clipboard blocked: confirm that the browser allowed the clipboard write; click the button again and make sure the popup or page has focus.

## Roadmap

- Optional per-file aria2 options (e.g., `out` for renaming).
- Smart batching and retry logic.
- Save/load named filter presets.
- Export/import settings.
- Support additional archive.org views if needed.

## Development Notes

- Tech stack: standard WebExtension (Manifest V3 where Firefox supports it; otherwise V2), with a content script, a background script (service worker or event page, depending on manifest version and browser support), and a popup UI.
- Storage: `browser.storage.local` for settings and aria2 configs; no analytics.
- Code style: keep dependencies minimal; prefer modern, framework-light UI for the popup.

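Settings persistence with `browser.storage.local` might be sketched as below. The `settings` key, the shape of `DEFAULT_SETTINGS`, and the helper names are hypothetical; the `storage` parameter is injectable only so the merge logic can be exercised outside the browser.

```javascript
// Hypothetical defaults; the aria2 values mirror the aria2c example.
const DEFAULT_SETTINGS = {
  caseSensitive: false,
  presets: [],
  aria2: { protocol: "http", host: "localhost", port: 6800, path: "/jsonrpc", secret: "" },
};

// Overlay stored values on the defaults so newly added keys get a value.
function mergeSettings(stored = {}) {
  return {
    ...DEFAULT_SETTINGS,
    ...stored,
    aria2: { ...DEFAULT_SETTINGS.aria2, ...(stored.aria2 || {}) },
  };
}

// Read settings from extension storage (promise-based WebExtension API).
async function loadSettings(storage = browser.storage.local) {
  const { settings } = await storage.get("settings");
  return mergeSettings(settings);
}

// Write the full settings object back under a single key.
async function saveSettings(settings, storage = browser.storage.local) {
  await storage.set({ settings });
}
```

Storing everything under one key keeps export/import simple, at the cost of rewriting the whole object on each save.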
## Contributing

Issues and PRs are welcome. If proposing new filters or aria2 options, please include example pages and expected behaviors.

## Disclaimer

This project is not affiliated with archive.org or aria2. Use responsibly and respect site terms of service.

Designed to grab multiple links from an archive.org page for passing to an aria2 download manager.