# Archive.org Link Grabber (Firefox WebExtension)

Add-on for Firefox that enhances archive.org download pages (https://archive.org/download/*) so you can filter file links (by type, name, size, or date), copy them to the clipboard (one URL per line), or send them directly to an aria2 RPC server.

> Status: README/spec first. Implementation details below describe the intended behavior and structure.

---

## Features

- Filter by file attributes: type/extension, name, size range, and date range.
- String or regex matching: name and type filters accept plain text (e.g., `mp4`) or regular expressions (e.g., `^(movie|clip).*\.mp4$`).
- Copy to clipboard: export filtered links, one per line.
- Send to aria2: forward links to a configured aria2 JSON-RPC endpoint using a secret token.
- Per-site parsing: targets archive.org item download listings under `/download/*`.
- Persistent settings: stores filter presets and aria2 config in extension storage.

## Demo Workflow

1. Open an archive.org item’s download page at `https://archive.org/download/<identifier>`.
2. Open the extension popup or page action.
3. Filter:
   - Type/Name: enter plain strings (e.g., `mp4`, `subtitle`) or regex.
   - Size: set min/max (e.g., `>= 100MB`, `<= 2GB`).
   - Date: set from/to (uses the timestamp shown on the page when available).
4. Review the results list and count.
5. Choose an action:
   - Copy: copies selected URLs to the clipboard, one per line.
   - Send to aria2: pushes to your configured aria2 RPC server using `aria2.addUri`.

## Regex and Matching

- Plain strings: match anywhere in the value (case-insensitive by default, configurable).
- Regex: either toggle a “Use regex” option or enter values wrapped with `/.../` (optional flags like `i`, `m`).
- Type vs name: “type” typically refers to file extension; “name” is the full filename.

Examples:

- Type contains `mp4`
- Name regex `/^(movie|clip).*\.mp4$/i`
- Type regex `/^(mp4|mkv)$/`

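
The matching rules above could be sketched roughly as follows. This is an illustration under assumptions, not the extension's actual code: the `buildMatcher` name and its `useRegex` / `caseSensitive` options are hypothetical, and the choice to drop the `g` and `y` flags (so repeated `test()` calls stay stateless) is a design assumption.

```javascript
// Hypothetical helper: turn a filter input into a predicate over strings.
// Input wrapped as /pattern/flags is treated as a regex; with useRegex set,
// the raw input is compiled as a regex; otherwise it is a substring match.
function buildMatcher(input, { useRegex = false, caseSensitive = false } = {}) {
  const wrapped = /^\/(.+)\/([dgimsuvy]*)$/s.exec(input);
  if (wrapped) {
    // Drop g/y so RegExp.prototype.test does not carry lastIndex between calls.
    const flags = wrapped[2].replace(/[gy]/g, "");
    const re = new RegExp(wrapped[1], flags);
    return (value) => re.test(value);
  }
  if (useRegex) {
    const re = new RegExp(input, caseSensitive ? "" : "i");
    return (value) => re.test(value);
  }
  const needle = caseSensitive ? input : input.toLowerCase();
  return (value) =>
    (caseSensitive ? value : value.toLowerCase()).includes(needle);
}

// Usage: the three examples from this section.
const typeContainsMp4 = buildMatcher("mp4");
const nameRegex = buildMatcher("/^(movie|clip).*\\.mp4$/i");
const typeRegex = buildMatcher("/^(mp4|mkv)$/");
```

An invalid pattern makes `new RegExp` throw a `SyntaxError`, so a UI built on this sketch would likely want to catch that and show the message next to the filter field.
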
## Aria2 Integration

- RPC method: `aria2.addUri` with `token:<SECRET>`.
- Batching: sends multiple links either individually or in small batches.
- Options (optional): directory, headers, and per-item output name can be supported via the UI.

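
As a rough sketch of what such a request might look like: the URI array passed to `aria2.addUri` lists mirrors of a single resource, so each file needs its own call, and several calls can be grouped into one JSON-RPC batch array. The function names (`buildAddUriBatch`, `chunk`, `sendToAria2`), the `grab-N` request ids, and the default batch size of 20 are assumptions for illustration only.

```javascript
// Build one aria2.addUri request per download URL (hypothetical helper).
function buildAddUriBatch(urls, secret, options = {}) {
  return urls.map((url, i) => ({
    jsonrpc: "2.0",
    id: `grab-${i}`,
    method: "aria2.addUri",
    // params: secret token, URIs for ONE download, per-download options
    params: [`token:${secret}`, [url], options],
  }));
}

// Split a list into groups of at most `size` items.
function chunk(items, size) {
  const out = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// POST each group as a JSON-RPC batch; intended to run in the background script.
async function sendToAria2(endpoint, secret, urls, { batchSize = 20, options = {} } = {}) {
  const results = [];
  for (const group of chunk(urls, batchSize)) {
    const response = await fetch(endpoint, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(buildAddUriBatch(group, secret, options)),
    });
    if (!response.ok) {
      throw new Error(`aria2 RPC failed: HTTP ${response.status}`);
    }
    results.push(...(await response.json()));
  }
  return results;
}
```

A call might look like `sendToAria2("http://localhost:6800/jsonrpc", secret, urls, { options: { dir: "/downloads" } })`, where `dir` is a standard aria2 option and the path is only an example.
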
Aria2 must be running with RPC enabled, for example:

```bash
aria2c \
  --enable-rpc \
  --rpc-listen-port=6800 \
  --rpc-listen-all=false \
  --rpc-secret=YOUR_SECRET_TOKEN
```

Extension settings include:

- RPC endpoint: protocol, host, port, path (default `/jsonrpc`).
- Secret token: the `--rpc-secret` value (stored in extension storage).
- Optional defaults: download directory and additional aria2 options.

Security notes:

- Keep your secret token private; do not commit it.
- If using a remote aria2, enable TLS/HTTPS and restrict access.

## Permissions

The extension will require:

- `https://archive.org/*` host permission to read and parse download pages.
- `storage` to persist settings and presets.
- `clipboardWrite` to copy links.
- Host permission for your aria2 endpoint (e.g., `http://localhost:6800/*` or your remote URL). Optional permissions may be requested at runtime.

## Parsing Strategy

- A content script runs on `https://archive.org/download/*` pages.
- It scrapes the file listing table/DOM and builds a dataset with name, URL, size, and date.
- Type is derived from filename extension (and may use content-type hints if available on the page).
- The popup UI queries this dataset, applies filters, and displays the results.
- A background script handles aria2 RPC calls to avoid CORS issues and keep secrets out of content scope.

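
The data side of this strategy might look something like the sketch below. It deliberately leaves out the DOM scraping itself, since the listing markup is not specified here. The record shape and the names `typeFromName`, `toEntry`, and `applyFilters` are assumptions for illustration.

```javascript
// Derive the "type" from the filename extension (lowercased, without the dot).
// Names with no extension, or only a leading dot, yield an empty type.
function typeFromName(name) {
  const dot = name.lastIndexOf(".");
  return dot > 0 && dot < name.length - 1
    ? name.slice(dot + 1).toLowerCase()
    : "";
}

// Normalise one scraped row into the record the popup would filter.
// In a content script, `url` could come from an anchor's `href` property,
// which the DOM already resolves to an absolute URL.
function toEntry({ name, url, sizeBytes = null, date = null }) {
  return { name, url, type: typeFromName(name), sizeBytes, date };
}

// Keep only entries that satisfy every predicate (name, type, size, date...).
function applyFilters(entries, predicates) {
  return entries.filter((entry) => predicates.every((p) => p(entry)));
}

// Usage with two hand-written rows standing in for scraped data.
const entries = [
  toEntry({ name: "clip-01.MP4", url: "https://archive.org/download/item/clip-01.MP4" }),
  toEntry({ name: "clip-01.srt", url: "https://archive.org/download/item/clip-01.srt" }),
];
const videos = applyFilters(entries, [(e) => e.type === "mp4"]);
```

Keeping these functions free of DOM and `browser.*` calls means they can be shared between the content script and the popup and tested outside the browser.
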
## Installation (Temporary in Firefox)

1. Clone this repository.
2. In Firefox, open `about:debugging#/runtime/this-firefox`.
3. Click “Load Temporary Add-on…” and select the `manifest.json` in this repo.
4. Navigate to an archive.org download page and open the extension’s popup.

For development, make changes and click “Reload” in `about:debugging` to pick them up.

## Usage Tips

- Clipboard: copying usually requires a user gesture (click). The UI is designed to perform copy on button press.
- Case sensitivity: string filters default to case-insensitive; enable case-sensitive mode in settings if needed.
- Sizes: common suffixes such as `KB`, `MB`, and `GB` are supported. Ranges accept comparisons like `>= 100MB`.
- Dates: when the page provides timestamps, filters accept `yyyy-mm-dd` dates and from/to ranges.

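
A size or date filter of this kind might be parsed along these lines. This is only a sketch: the README does not say whether units are binary or decimal, so the 1024-based multipliers are an assumption, as are the function names and the acceptance of short suffixes like `K` or `M`.

```javascript
// ASSUMPTION: binary multiples (1 KB = 1024 bytes). Swap in 1000 for decimal.
const UNIT_FACTORS = { "": 1, K: 1024, M: 1024 ** 2, G: 1024 ** 3, T: 1024 ** 4 };

// Parse "100MB", "1.5 GB", "12K", or "2048" (bytes) into a byte count.
function parseSize(text) {
  const m = /^\s*(\d+(?:\.\d+)?)\s*([KMGT]?)B?\s*$/i.exec(text);
  if (!m) throw new Error(`Unrecognised size: ${text}`);
  return Math.round(Number(m[1]) * UNIT_FACTORS[m[2].toUpperCase()]);
}

// Parse ">= 100MB" into a predicate over byte counts. No operator means "=".
function parseSizeFilter(expr) {
  const m = /^\s*(>=|<=|>|<|=)?\s*(.+)$/.exec(expr);
  if (!m) throw new Error(`Unrecognised size filter: ${expr}`);
  const bytes = parseSize(m[2]);
  const tests = {
    ">=": (n) => n >= bytes,
    "<=": (n) => n <= bytes,
    ">": (n) => n > bytes,
    "<": (n) => n < bytes,
    "=": (n) => n === bytes,
  };
  return tests[m[1] || "="];
}

// yyyy-mm-dd strings sort lexicographically, so plain comparison is enough.
// An empty bound means "unbounded" on that side.
function inDateRange(isoDate, from = "", to = "") {
  return (!from || isoDate >= from) && (!to || isoDate <= to);
}
```

For example, `parseSizeFilter(">= 100MB")` returns a function that can be applied to each entry's size in bytes, and `inDateRange("2021-03-15", "2021-01-01", "2021-12-31")` checks a single date against a from/to pair.
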
## Troubleshooting

- No links found: ensure you are on a `/download/*` page (not the item overview). Try reloading after the page finishes loading.
- RPC errors: verify `aria2c` is running with `--enable-rpc` and that the secret/token matches. Check endpoint URL and port.
- HTTPS-Only Mode: if your aria2 endpoint is `http://` on a non-local host, Firefox may upgrade it to `https://` and the request will fail. Use HTTPS on the aria2 RPC (preferred), add a site exception for the host, or disable HTTPS-Only Mode while testing.
- Host permissions: the extension needs host permission to reach non-local RPC endpoints. The manifest includes wildcard host permissions; if you self-package, ensure your manifest allows your RPC host(s).
- CORS/Network: extensions can call cross-origin endpoints with host permission. If using HTTPS with a self-signed cert, allow it in Firefox or use a valid cert.
- Clipboard blocked: confirm the browser allowed the clipboard write; try clicking the button again, and check that the popup or page has focus.

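
A check for the HTTPS-Only case could be as small as the sketch below, for instance when testing the endpoint from the options page. The function name, the hint wording, and the set of hosts treated as local (`localhost`, `*.localhost`, `127.x.x.x`, `[::1]`) are all assumptions, not behavior confirmed by this README or by Firefox documentation.

```javascript
// Return a hint string when an RPC endpoint is plain http:// on a host that
// is not obviously local, otherwise null. ASSUMPTION: the "local" host list.
function httpsOnlyHint(endpoint) {
  const url = new URL(endpoint);
  if (url.protocol !== "http:") return null;

  const host = url.hostname; // IPv6 literals keep their brackets, e.g. "[::1]"
  const isLocal =
    host === "localhost" ||
    host.endsWith(".localhost") ||
    host === "[::1]" ||
    /^127\.\d+\.\d+\.\d+$/.test(host);
  if (isLocal) return null;

  return (
    `The endpoint ${url.origin} uses http:// on a non-local host. ` +
    "If HTTPS-Only Mode is enabled, Firefox may upgrade the request and it " +
    "will fail. Consider HTTPS for the RPC, or add a site exception."
  );
}
```

Showing the hint only after a failed test request, rather than on every `http://` endpoint, would keep it from nagging users whose setup already works.
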
## Roadmap

- Optional per-file aria2 options (e.g., `out` for renaming).
- Smart batching and retry logic.
- Save/load named filter presets.
- Export/import settings.
- Support additional archive.org views if needed.

## Development Notes

- Tech stack: Standard WebExtension (manifest v3 when supported in Firefox; otherwise v2), with content script + background/service worker + popup UI.
- Storage: `browser.storage.local` for settings and aria2 configs; no analytics.
- Code style: keep dependencies minimal; prefer modern, framework-light UI for the popup.

## Contributing

Issues and PRs are welcome. If proposing new filters or aria2 options, please include example pages and expected behaviors.

## Disclaimer

This project is not affiliated with archive.org or aria2. Use responsibly and respect site terms of service.