Archive.org Link Grabber (Firefox WebExtension)
Add-on for Firefox that enhances archive.org download pages (https://archive.org/download/*) so you can filter file links (by type, name, size, or date), copy them to the clipboard (one URL per line), or send them directly to an aria2 RPC server. It also adds context‑menu quick actions on archive.org pages.
Download
Grab the latest build from my self-hosted feed at add-ons.jordanwages.com.
Permissions
This add-on asks for only the permissions it truly needs. It runs on archive.org download pages so it can read the list of files, lets you copy links to your clipboard, and can send them to an aria2 server only when you ask. Your settings (including any aria2 secret) stay on your computer in the browser’s local storage. The add-on does not track you or read data on other sites.
- https://archive.org/*: needed to read and parse archive.org download pages. The content script only runs on https://archive.org/download/*, and the background uses this to show the tab badge and context menu on those pages.
- storage: saves your aria2 endpoint/secret and preferences in browser.storage.local.
- clipboardWrite: lets the popup and quick actions copy filtered links to your clipboard when you click Copy.
- activeTab: allows on‑demand injection of the small collector script into the current tab when you open the popup or use the context menu. It does not run on other sites unless you interact.
- contextMenus: adds the “Collect links…” entry and dynamic quick actions (“Download/Copy All” and top file types) to the page context menu on archive.org pages.
- Host permissions for aria2 RPC endpoints:
- http://localhost:6800/* and https://localhost:6800/*: defaults for local aria2 instances.
- http://*/* and https://*/*: enables connecting to non‑local aria2 endpoints you configure in Options. No network request is made unless you click Send or Test; your secret stays in extension storage, and RPC calls originate from the background script.
Features
- Filter by file attributes: type/extension, name, size range, and date range.
- String or regex matching: name and type filters accept plain text (e.g., mp4) or regular expressions (e.g., ^(movie|clip).*\.mp4$).
- Copy to clipboard: export filtered links, one per line.
- Send to aria2: forward links to a configured aria2 JSON-RPC endpoint using a secret token.
- Per-site parsing: targets archive.org collection download listings under /download/*.
- Persistent settings: stores filter presets and aria2 config in extension storage.
- Context menu: "Collect links…" with dynamic quick actions on archive.org pages. It offers "Download/Copy All" and type‑specific actions for the most common file types on the page.
Demo Workflow
- Open an archive.org collection’s download page at https://archive.org/download/<identifier>.
- Open the extension popup or page action.
- Filter:
  - Type/Name: enter plain strings (e.g., mp4, subtitle) or regex.
  - Size: set min/max (e.g., >= 100MB, <= 2GB).
  - Date: set from/to (uses the timestamp shown on the page when available).
- Review the results list and count.
- Choose an action:
  - Copy: copies selected URLs to the clipboard, one per line.
  - Send to aria2: pushes to your configured aria2 RPC server using aria2.addUri.
- Or use the page’s context menu: right‑click → "Collect links…" for quick "All" or top‑type actions.
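The filter-then-copy steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the extension's actual code: the entry shape ({ name, url, size, date }) and the function names filterEntries and toClipboardText are hypothetical.

```javascript
// Hypothetical entry shape: { name, url, size (bytes), date (Date) }.
// Each filter is optional; an omitted filter matches everything.
function filterEntries(entries, filters = {}) {
  const {
    nameContains = "",
    minSize = 0,
    maxSize = Infinity,
    from = null,
    to = null,
  } = filters;
  const needle = nameContains.toLowerCase();
  return entries.filter((entry) => {
    if (needle && !entry.name.toLowerCase().includes(needle)) return false;
    if (entry.size < minSize || entry.size > maxSize) return false;
    if (from && entry.date < from) return false;
    if (to && entry.date > to) return false;
    return true;
  });
}

// One URL per line, as described for the Copy action.
function toClipboardText(entries) {
  return entries.map((entry) => entry.url).join("\n");
}
```

In a popup, the resulting string could be handed to navigator.clipboard.writeText from a click handler, which satisfies the user-gesture requirement for clipboard writes.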
Tip: The browser action badge is per‑tab and reactive: on archive.org download pages it defaults to the total number of links; when you adjust filters in the popup, it shows the filtered count for that tab. It updates as you switch tabs.
Regex and Matching
- Plain strings: match anywhere in the value (case-insensitive by default, configurable).
- Regex: either toggle a “Use regex” option or enter values wrapped with /.../ (optional flags like i, m).
- Type vs name: “type” typically refers to file extension; “name” is the full filename.
Examples:
- Type contains mp4
- Name regex /^(movie|clip).*\.mp4$/i
- Type regex /^(mp4|mkv)$/
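One way the matching rules above might be implemented is sketched below. The function name buildMatcher and its option names are hypothetical, and the choice to let a /.../flags value use only its own flags (ignoring the case-sensitivity setting) is an assumption, not documented behavior.

```javascript
// Returns a predicate (value) => boolean for a filter input.
// Hypothetical helper; option names are assumptions for illustration.
function buildMatcher(input, { useRegex = false, caseSensitive = false } = {}) {
  // Detect the /pattern/flags form, e.g. "/^(mp4|mkv)$/i".
  const wrapped = /^\/(.+)\/([a-z]*)$/.exec(input);
  let re = null;
  if (wrapped) {
    // Drop g and y: they make RegExp.prototype.test stateful across calls.
    re = new RegExp(wrapped[1], wrapped[2].replace(/[gy]/g, ""));
  } else if (useRegex) {
    re = new RegExp(input, caseSensitive ? "" : "i");
  }
  if (re) return (value) => re.test(value);

  // Plain strings match anywhere in the value.
  const needle = caseSensitive ? input : input.toLowerCase();
  return (value) =>
    (caseSensitive ? value : value.toLowerCase()).includes(needle);
}
```

An invalid pattern or flag makes the RegExp constructor throw a SyntaxError, so a UI built on this would likely want to catch that and show a validation message.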
Aria2 Integration
- RPC method: aria2.addUri with token:<SECRET>.
- Batching: sends multiple links either individually or in small batches.
- Options (optional): directory, headers, and per-item output name can be supported via the UI.
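As a rough sketch of what these calls look like on the wire, the helpers below build aria2.addUri JSON-RPC 2.0 requests and group them into batches. The function names buildAddUriRequest and chunkRequests and the default batch size are hypothetical; the extension's own helpers may be structured differently.

```javascript
// Builds one aria2.addUri JSON-RPC 2.0 request.
// When a secret is configured, "token:<SECRET>" must be the first parameter.
function buildAddUriRequest(url, secret, options = {}, id = "1") {
  const params = [];
  if (secret) params.push(`token:${secret}`);
  // The URI array holds mirrors of ONE download, so each file gets its own call.
  params.push([url], options);
  return { jsonrpc: "2.0", id, method: "aria2.addUri", params };
}

// Splits a list of URLs into arrays of requests; each inner array can be
// sent as a single JSON-RPC batch. The batch size of 10 is an assumption.
function chunkRequests(urls, secret, options = {}, batchSize = 10) {
  const batches = [];
  for (let i = 0; i < urls.length; i += batchSize) {
    const batch = urls
      .slice(i, i + batchSize)
      .map((url, j) => buildAddUriRequest(url, secret, options, String(i + j)));
    batches.push(batch);
  }
  return batches;
}
```

Each inner array can be serialized with JSON.stringify and POSTed as one request body to the RPC endpoint. An options object such as { dir: "/downloads" } would carry the optional download directory.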
Aria2 must be running with RPC enabled, for example:
aria2c \
--enable-rpc \
--rpc-listen-port=6800 \
--rpc-listen-all=false \
--rpc-secret=YOUR_SECRET_TOKEN
Extension settings include:
- RPC endpoint: protocol, host, port, path (default /jsonrpc).
- Secret token: the --rpc-secret value (stored in extension storage).
- Optional defaults: download directory and additional aria2 options.
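The endpoint settings above could be combined into a single URL along these lines. This is only a sketch: buildEndpointUrl is a hypothetical name, and the http and localhost defaults are assumptions chosen to match the local aria2 example (port 6800 and /jsonrpc come from this README).

```javascript
// Combines the endpoint settings into one URL string.
// Defaults other than port and path are assumptions for illustration.
function buildEndpointUrl({
  protocol = "http",
  host = "localhost",
  port = 6800,
  path = "/jsonrpc",
} = {}) {
  // Tolerate a path entered without its leading slash.
  const normalizedPath = path.startsWith("/") ? path : `/${path}`;
  // The URL constructor throws a TypeError on malformed input.
  return new URL(`${protocol}://${host}:${port}${normalizedPath}`).toString();
}
```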
Security notes:
- Keep your secret token private; do not commit it.
- If using a remote aria2, enable TLS/HTTPS and restrict access.
Permissions
The extension will require:
- https://archive.org/* host permission to read and parse download pages.
- storage to persist settings and presets.
- clipboardWrite to copy links.
- Host permission for your aria2 endpoint (e.g., http://localhost:6800/* or your remote URL). The manifest currently includes wildcard entries to support non‑local RPC during development; when repackaging, consider scoping to intended hosts.
Parsing Strategy
- A content script runs on https://archive.org/download/* pages.
- It scrapes the file listing table/DOM and builds a dataset with name, URL, size, and date.
- Type is derived from filename extension (and may use content-type hints if available on the page).
- The popup UI queries this dataset, applies filters, and displays the results.
- The background script handles aria2 RPC calls to avoid CORS issues and keep secrets out of content scope.
- If the content script isn’t present yet, the background injects it on demand and retries collection.
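The dataset-building step might look something like the sketch below. It assumes the content script has already pulled a raw row object ({ href, sizeText, dateText }) out of each listing row; that row shape and the names deriveType and toEntry are hypothetical, and the actual DOM selectors are deliberately left out.

```javascript
// Derives a lowercase type from the filename extension ("" if none).
function deriveType(name) {
  const dot = name.lastIndexOf(".");
  const hasExtension = dot > 0 && dot < name.length - 1;
  return hasExtension ? name.slice(dot + 1).toLowerCase() : "";
}

// Turns one scraped row into a dataset entry.
// row: { href, sizeText, dateText } (hypothetical shape).
// pageUrl: URL of the listing page, used to resolve relative links.
function toEntry(row, pageUrl) {
  const url = new URL(row.href, pageUrl).toString();
  const name = decodeURIComponent(url.split("/").pop());
  return {
    name,
    url,
    type: deriveType(name),
    sizeText: row.sizeText || "",
    dateText: row.dateText || "",
  };
}
```

Keeping the raw size and date text on the entry leaves the parsing of those fields to the filter layer, which is one possible split of responsibilities rather than a documented one.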
Options
Open the Options page (from the popup link or about:addons):
- Default Action: choose which action the quick actions perform: Download via aria2 or Copy to clipboard.
- Include metadata in “All”: include archive.org’s metadata files in the “All” quick action.
- aria2 endpoint and secret: persisted to browser.storage.local.
- Test Connection: calls aria2.getVersion and surfaces guidance for Firefox HTTPS‑Only Mode when using non‑local HTTP endpoints.
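A connection test along the lines described above could be sketched as follows. Both function names are hypothetical, and the list of hosts treated as local is an assumption; the extension may decide this differently.

```javascript
// Builds the aria2.getVersion request used to probe the endpoint.
function buildGetVersionRequest(secret) {
  return {
    jsonrpc: "2.0",
    id: "test",
    method: "aria2.getVersion",
    params: secret ? [`token:${secret}`] : [],
  };
}

// True when the endpoint is plain HTTP on a non-local host, the case in
// which HTTPS-Only Mode guidance is worth showing.
function needsHttpsOnlyHint(endpoint) {
  const { protocol, hostname } = new URL(endpoint);
  // Assumed set of local hosts; IPv6 hostnames keep their brackets.
  const localHosts = ["localhost", "127.0.0.1", "[::1]"];
  return protocol === "http:" && !localHosts.includes(hostname);
}
```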
Installation (Temporary in Firefox)
- Clone this repository.
- In Firefox, open about:debugging#/runtime/this-firefox.
- Click “Load Temporary Add-on…” and select the manifest.json in this repo.
- Navigate to an archive.org download page and open the extension’s popup.
For development, make changes and click “Reload” in about:debugging to pick them up.
Usage Tips
- Clipboard: copying usually requires a user gesture (click). The UI is designed to perform copy on button press.
- Case sensitivity: string filters default to case-insensitive; enable case-sensitive mode in settings if needed.
- Sizes: support common suffixes like KB, MB, GB. Ranges accept comparisons like >= 100MB.
- Dates: when the page provides timestamps, filters accept yyyy-mm-dd and ranges.
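Size inputs such as >= 100MB could be parsed roughly as shown here. This sketch assumes binary units (1 KB = 1024 bytes), which this README does not confirm, and the names parseSize and parseSizeComparison are hypothetical.

```javascript
// Assumption: binary units. The extension may use decimal units instead.
const UNIT_BYTES = { B: 1, KB: 1024, MB: 1024 ** 2, GB: 1024 ** 3 };

// "100MB" -> bytes, or null when the text is not a recognizable size.
function parseSize(text) {
  const m = /^\s*(\d+(?:\.\d+)?)\s*(B|KB|MB|GB)?\s*$/i.exec(text);
  if (!m) return null;
  const unit = (m[2] || "B").toUpperCase();
  return Math.round(Number(m[1]) * UNIT_BYTES[unit]);
}

// ">= 100MB" -> predicate over a byte count, or null when unparseable.
// A bare size with no operator is treated as an exact match.
function parseSizeComparison(text) {
  const m = /^\s*(>=|<=|>|<|=)?\s*(.+)$/.exec(text);
  if (!m) return null;
  const bytes = parseSize(m[2]);
  if (bytes === null) return null;
  const predicates = {
    ">=": (n) => n >= bytes,
    "<=": (n) => n <= bytes,
    ">": (n) => n > bytes,
    "<": (n) => n < bytes,
    "=": (n) => n === bytes,
  };
  return predicates[m[1] || "="];
}
```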
Troubleshooting
- No links found: ensure you are on a /download/* page (not the item overview). Try reloading after the page finishes loading.
- RPC errors: verify aria2c is running with --enable-rpc and that the secret/token matches. Check endpoint URL and port.
- HTTPS-Only Mode: if your aria2 endpoint is http:// on a non-local host, Firefox may upgrade it to https:// and the request will fail. Use HTTPS on the aria2 RPC (preferred), add a site exception for the host, or disable HTTPS-Only Mode while testing.
- Host permissions: the extension needs host permission to reach non-local RPC endpoints. The manifest includes wildcard host permissions; if you self-package, ensure your manifest allows your RPC host(s).
- CORS/Network: Extensions can call cross-origin endpoints with host permission. If using HTTPS with a self-signed cert, allow it in Firefox or use a valid cert.
- Clipboard blocked: confirm the browser allowed clipboard write; try clicking the button again or check site focus.
Roadmap
- Optional per-file aria2 options (e.g., out for renaming).
- Smart batching and retry logic.
- Save/load named filter presets.
- Export/import settings.
- Support additional archive.org views if needed.
Development Notes
- Tech stack: Standard WebExtension. Currently Manifest V2 in Firefox, with content script + background script + popup UI.
- Storage: browser.storage.local for settings and aria2 configs; no analytics.
- Code style: keep dependencies minimal; prefer modern, framework-light UI for the popup.
Architecture specifics
- Background is a non‑module MV2 script. src/lib/aria2-bg.js exposes addUri, addUrisBatch, and getVersion on globalThis for background use.
- Context menu parent: "Collect links…" appears on archive.org /download/* pages; submenu is rebuilt dynamically for top file types per page.
- BrowserAction badges are per‑tab: default to total links on archive pages; switch to filtered counts when the popup is used on that tab. They also show counts and state after quick actions (copy/send) scoped to the current tab.
Contributing
Issues and PRs are welcome. If proposing new filters or aria2 options, please include example pages and expected behaviors.
Disclaimer
This project is not affiliated with archive.org or aria2. Use responsibly and respect site terms of service.
Release Workflow
- Stable ID: using applications.gecko.id = "archive-org-link-grabber@jordanwages.com". If you self-host updates, applications.gecko.update_url points to https://add-ons.jordanwages.com/archive-org-link-grabber/updates.json.
- Prepare (version bump + sync):
  - Patch: npm run release:prepare:patch
  - Minor: npm run release:prepare:minor
  - Major: npm run release:prepare:major
- Lint (Firefox): npm run lint:fx
- Dev ZIP: npm run build:dev → output in dist/
- Sign (unlisted):
  - Set environment secrets locally (do not commit): AMO_JWT_ISSUER=... AMO_JWT_SECRET=...
  - Run: npm run release:sign
  - Artifacts land in releases/<version>/
- Push (self-hosted): npm run release:push uploads the signed artifacts in releases/<version>/ and prepares/uploads releases/updates.json automatically (deriving the update_link from the manifest’s applications.gecko.update_url). It also uploads a stable alias archive-org-link-grabber-latest.xpi that always points to the most recent XPI. After upload, it commits releases/<version>/, releases/updates.json, and updated icons/ (if changed). Set GIT_PUSH=true to push that commit.
  - Ensure FTP env vars are set (see .env.example).
Notes: Keep AMO secrets local (see .env.example). CI is optional. You can tag releases with git tag vX.Y.Z and push tags if desired.
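For orientation, the updates.json that the push step prepares follows Firefox's update manifest layout (addons → add-on ID → updates). The sketch below shows one way update_link could be derived from update_url; buildUpdatesManifest is a hypothetical name, and both the XPI file name and the assumption that XPIs sit in the same directory as updates.json are illustrative, not taken from the release scripts.

```javascript
// Builds a Firefox update manifest object for a self-hosted add-on.
// releases: [{ version, file }] where file is the XPI file name (assumed).
function buildUpdatesManifest(addonId, updateUrl, releases) {
  // Directory that contains updates.json; XPIs are assumed to sit beside it.
  const base = new URL(".", updateUrl);
  return {
    addons: {
      [addonId]: {
        updates: releases.map(({ version, file }) => ({
          version,
          update_link: new URL(file, base).toString(),
        })),
      },
    },
  };
}
```

Serializing the result with JSON.stringify(manifest, null, 2) would give a file suitable for upload alongside the signed artifacts.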