# Archive.org Link Grabber (Firefox WebExtension)

Add-on for Firefox that enhances archive.org download pages (`https://archive.org/download/*`) so you can filter file links (by type, name, size, or date), copy them to the clipboard (one URL per line), or send them directly to an aria2 RPC server.

> Status: README/spec first. Implementation details below describe the intended behavior and structure.

---

## Features

- Filter by file attributes: type/extension, name, size range, and date range.
- String or regex matching: name and type filters accept plain text (e.g., `mp4`) or regular expressions (e.g., `^(movie|clip).*\.mp4$`).
- Copy to clipboard: export filtered links, one per line.
- Send to aria2: forward links to a configured aria2 JSON-RPC endpoint using a secret token.
- Per-site parsing: targets archive.org collection download listings under `/download/*`.
- Persistent settings: stores filter presets and aria2 config in extension storage.

## Demo Workflow

1. Open an archive.org collection’s download page at `https://archive.org/download/<identifier>`.
2. Open the extension popup or page action.
3. Filter:
   - Type/Name: enter plain strings (e.g., `mp4`, `subtitle`) or regex.
   - Size: set min/max (e.g., `>= 100MB`, `<= 2GB`).
   - Date: set from/to (uses the timestamp shown on the page when available).
4. Review the results list and count.
5. Choose an action:
   - Copy: copies selected URLs to the clipboard, one per line.
   - Send to aria2: pushes to your configured aria2 RPC server using `aria2.addUri`.

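The Copy action in step 5 could be sketched roughly as follows. This is a minimal sketch rather than the extension's actual code: `toClipboardText` and `copyLinks` are hypothetical names, and it assumes the popup calls the standard `navigator.clipboard.writeText` API from a button click handler.

```javascript
// Join URLs into clipboard text: one URL per line, blank entries dropped.
function toClipboardText(urls) {
  return urls
    .map((url) => String(url).trim())
    .filter((url) => url.length > 0)
    .join("\n");
}

// Hypothetical click handler for the popup's Copy button. Running it inside
// the click keeps the clipboard write within a user gesture.
async function copyLinks(urls) {
  const text = toClipboardText(urls);
  await navigator.clipboard.writeText(text);
  return text;
}
```

Keeping the formatting in a pure function makes the one-URL-per-line rule easy to check without a browser.
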
## Regex and Matching

- Plain strings: match anywhere in the value (case-insensitive by default, configurable).
- Regex: either toggle a “Use regex” option or enter values wrapped with `/.../` (optional flags like `i`, `m`).
- Type vs name: “type” typically refers to file extension; “name” is the full filename.

Examples:

- Type contains `mp4`
- Name regex `/^(movie|clip).*\.mp4$/i`
- Type regex `/^(mp4|mkv)$/`

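One way the matching rules above might be turned into a predicate is sketched below. The names `escapeRegExp` and `compileMatcher`, and the `caseSensitive` / `useRegex` option names, are assumptions made for illustration. The sketch also assumes that a `/.../flags` value uses exactly the flags the user typed, and it drops the `g` and `y` flags because they would make repeated `test()` calls stateful.

```javascript
// Escape regex metacharacters so plain text is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build a predicate from a filter string.
// "/pattern/flags" is treated as a regex; anything else is a substring match
// unless the (hypothetical) "Use regex" toggle is on.
function compileMatcher(input, { caseSensitive = false, useRegex = false } = {}) {
  const wrapped = /^\/(.+)\/([dgimsuvy]*)$/s.exec(input);
  const defaultFlags = caseSensitive ? "" : "i";
  let regex;
  if (wrapped) {
    // Remove g/y so the regex keeps no lastIndex state between calls.
    regex = new RegExp(wrapped[1], wrapped[2].replace(/[gy]/g, ""));
  } else if (useRegex) {
    regex = new RegExp(input, defaultFlags);
  } else {
    regex = new RegExp(escapeRegExp(input), defaultFlags);
  }
  return (value) => regex.test(value);
}

// Usage with the examples from this section.
const isMp4 = compileMatcher("mp4");
const isMovieClip = compileMatcher("/^(movie|clip).*\\.mp4$/i");
console.log(isMp4("Movie.MP4"), isMovieClip("trailer.mp4"));
```

An invalid pattern makes `new RegExp` throw a `SyntaxError`, which a UI would presumably catch and show next to the filter field.
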
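## Aria2 Integration

- RPC method: `aria2.addUri`, with `token:<SECRET>` passed as the first parameter.
- Batching: sends links either one request at a time or in small batches. Each file should get its own `aria2.addUri` call, because aria2 treats multiple URIs in a single call as mirrors of the same file.
- Options (optional): download directory (`dir`), headers (`header`), and per-item output name (`out`) can be supported via the UI.
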
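The request shape described above could look roughly like this. It is a sketch under stated assumptions: `buildAddUriCalls` and `chunk` are hypothetical helper names, the `grabber-` request ids are arbitrary, and batching is done with JSON-RPC 2.0 batch arrays. Each call carries a single-element URI list, since aria2 treats several URIs in one `aria2.addUri` call as sources for the same file.

```javascript
// Build one aria2.addUri request object per URL.
// `options` is an optional aria2 options object such as { dir: "/downloads" }.
function buildAddUriCalls(urls, secret, options = {}) {
  return urls.map((url, index) => {
    const params = [];
    if (secret) params.push(`token:${secret}`); // the secret goes first
    params.push([url]); // one URI per call, so each file is its own download
    if (Object.keys(options).length > 0) params.push(options);
    return { jsonrpc: "2.0", id: `grabber-${index}`, method: "aria2.addUri", params };
  });
}

// Split request objects into batches of at most `size` entries; each batch
// can be sent as one JSON array in a single HTTP POST.
function chunk(items, size) {
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}
```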
Aria2 must be running with RPC enabled, for example:

```bash
aria2c \
  --enable-rpc \
  --rpc-listen-port=6800 \
  --rpc-listen-all=false \
  --rpc-secret=YOUR_SECRET_TOKEN
```

Extension settings include:

- RPC endpoint: protocol, host, port, path (default `/jsonrpc`).
- Secret token: the `--rpc-secret` value (stored in extension storage).
- Optional defaults: download directory and additional aria2 options.

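As an illustration of how these settings might be combined, the sketch below assembles the endpoint URL and posts a JSON-RPC payload from the background script. Everything named here is an assumption: the settings field names (`protocol`, `host`, `port`, `path`), the `aria2` storage key, and the helper names are hypothetical, and the defaults simply mirror the `aria2c` example above.

```javascript
// Assemble the RPC URL from the stored endpoint fields.
function buildEndpointUrl({ protocol = "http", host = "localhost", port = 6800, path = "/jsonrpc" } = {}) {
  const url = new URL(`${protocol}://${host}`);
  url.port = String(port);
  url.pathname = path.startsWith("/") ? path : `/${path}`;
  return url.toString();
}

// Load settings saved under a hypothetical "aria2" key in extension storage.
async function loadAria2Settings() {
  const { aria2 = {} } = await browser.storage.local.get("aria2");
  return aria2;
}

// POST a JSON-RPC payload (single request or batch array) to aria2.
async function postRpc(settings, payload) {
  const response = await fetch(buildEndpointUrl(settings), {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(payload),
  });
  if (!response.ok) throw new Error(`aria2 RPC returned HTTP ${response.status}`);
  return response.json();
}
```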
Security notes:

- Keep your secret token private; do not commit it.
- If using a remote aria2, enable TLS/HTTPS and restrict access.

## Permissions

The extension will require:

- `https://archive.org/*` host permission to read and parse download pages.
- `storage` to persist settings and presets.
- `clipboardWrite` to copy links.
- Host permission for your aria2 endpoint (e.g., `http://localhost:6800/*` or your remote URL). Optional permissions may be requested at runtime.

## Parsing Strategy

- A content script runs on `https://archive.org/download/*` pages.
- It scrapes the file listing table/DOM and builds a dataset with name, URL, size, and date.
- Type is derived from filename extension (and may use content-type hints if available on the page).
- The popup UI queries this dataset, applies filters, and displays the results.
- A background script handles aria2 RPC calls to avoid CORS issues and keep secrets out of content scope.

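The dataset-building step could be sketched along these lines. The row shape (`href`, `sizeText`, `dateText`), the function names, and especially the DOM selectors and column order in `readRows` are assumptions for illustration; the real listing markup has to be checked against a live archive.org page.

```javascript
// Derive a lowercase "type" from the filename extension ("" when there is none).
function deriveType(name) {
  const dot = name.lastIndexOf(".");
  return dot > 0 && dot < name.length - 1 ? name.slice(dot + 1).toLowerCase() : "";
}

// Turn raw row data into dataset entries with absolute URLs.
function buildDataset(rows, pageUrl) {
  // Relative hrefs only resolve inside the item when the base ends with "/".
  const base = pageUrl.endsWith("/") ? pageUrl : `${pageUrl}/`;
  return rows
    .filter((row) => row.href && !row.href.endsWith("/")) // skip folders and parent links
    .map((row) => {
      const url = new URL(row.href, base).href;
      const name = decodeURIComponent(url.split("/").pop());
      return { name, url, type: deriveType(name), sizeText: row.sizeText, dateText: row.dateText };
    });
}

// Hypothetical DOM extraction for the content script. The selectors and the
// date/size column positions are guesses, not confirmed page structure.
function readRows(doc) {
  return Array.from(doc.querySelectorAll("table tr"), (tr) => {
    const link = tr.querySelector("a[href]");
    const cells = tr.querySelectorAll("td");
    return {
      href: link ? link.getAttribute("href") : "",
      dateText: cells[1] ? cells[1].textContent.trim() : "",
      sizeText: cells[2] ? cells[2].textContent.trim() : "",
    };
  });
}
```

In a content script this would be called as `buildDataset(readRows(document), location.href)`, with the result handed to the popup on request.
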
## Installation (Temporary in Firefox)

1. Clone this repository.
2. In Firefox, open `about:debugging#/runtime/this-firefox`.
3. Click “Load Temporary Add-on…” and select the `manifest.json` in this repo.
4. Navigate to an archive.org download page and open the extension’s popup.

For development, make changes and click “Reload” in `about:debugging` to pick them up.

## Usage Tips

- Clipboard: copying usually requires a user gesture (click). The UI is designed to perform the copy on button press.
- Case sensitivity: string filters default to case-insensitive; enable case-sensitive mode in settings if needed.
- Sizes: common suffixes such as `KB`, `MB`, and `GB` are supported. Ranges accept comparisons like `>= 100MB`.
- Dates: when the page provides timestamps, filters accept `yyyy-mm-dd` values and ranges.

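The size and date rules above might be parsed as in the following sketch. The helper names are hypothetical, and two choices are assumptions rather than documented behavior: suffixes are read as binary multiples (1 KB = 1024 bytes), and a filter without an operator means an exact match.

```javascript
// Byte multipliers; binary units are an assumption of this sketch.
const UNITS = { B: 1, KB: 1024, MB: 1024 ** 2, GB: 1024 ** 3, TB: 1024 ** 4 };

// Parse "100MB", "1.5 GB", or a bare byte count into a number of bytes.
function parseSize(text) {
  const match = /^\s*(\d+(?:\.\d+)?)\s*([KMGT]?B)?\s*$/i.exec(text);
  if (!match) throw new Error(`Unrecognized size: ${text}`);
  const unit = (match[2] || "B").toUpperCase();
  return Math.round(Number(match[1]) * UNITS[unit]);
}

// Parse a comparison such as ">= 100MB" into a predicate over byte counts.
function parseSizeFilter(text) {
  const match = /^\s*(>=|<=|>|<|=)?\s*(.+)$/.exec(text);
  if (!match) throw new Error(`Unrecognized size filter: ${text}`);
  const limit = parseSize(match[2]);
  const tests = {
    ">=": (bytes) => bytes >= limit,
    "<=": (bytes) => bytes <= limit,
    ">": (bytes) => bytes > limit,
    "<": (bytes) => bytes < limit,
    "=": (bytes) => bytes === limit,
  };
  return tests[match[1] || "="];
}

// ISO yyyy-mm-dd strings sort correctly as plain text, so a string
// comparison is enough; an empty bound means "no limit" on that side.
function inDateRange(isoDate, from = "", to = "") {
  return (!from || isoDate >= from) && (!to || isoDate <= to);
}
```

The page's own size and date text would first need converting to bytes and `yyyy-mm-dd` before these predicates apply.
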
## Troubleshooting

- No links found: ensure you are on a `/download/*` page (not the item overview). Try reloading after the page finishes loading.
- RPC errors: verify `aria2c` is running with `--enable-rpc` and that the secret token matches. Check the endpoint URL and port.
- CORS/Network: extensions can call cross-origin endpoints when they hold the host permission. If using HTTPS with a self-signed cert, allow it in Firefox or use a valid cert.
- Clipboard blocked: confirm the browser allowed the clipboard write; try clicking the button again and make sure the popup still has focus.

## Roadmap

- Optional per-file aria2 options (e.g., `out` for renaming).
- Smart batching and retry logic.
- Save/load named filter presets.
- Export/import settings.
- Support additional archive.org views if needed.

## Development Notes

- Tech stack: standard WebExtension (Manifest V3 where Firefox supports it, otherwise V2) with a content script, a background script or service worker, and a popup UI.
- Storage: `browser.storage.local` for settings and aria2 configs; no analytics.
- Code style: keep dependencies minimal; prefer a modern, framework-light UI for the popup.

## Contributing

Issues and PRs are welcome. If proposing new filters or aria2 options, please include example pages and expected behaviors.

## Disclaimer

This project is not affiliated with archive.org or aria2. Use responsibly and respect site terms of service.