From 832a0a49dde1611dad0f39f8cb6221249dc9823b Mon Sep 17 00:00:00 2001
From: Jordan Wages
Date: Sat, 19 Jul 2025 04:31:13 -0500
Subject: [PATCH] docs: refresh README and agent guide

---
 AGENTS.md |  12 +++++
 README.md | 141 ++++++++++++++++++++++--------------------------------
 2 files changed, 70 insertions(+), 83 deletions(-)

diff --git a/AGENTS.md b/AGENTS.md
index 4cdfa62..7e7d3c5 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -24,6 +24,9 @@ This document outlines general practices and expectations for AI agents assistin
 The `run-import.sh` script can initialize this environment automatically.
 Always activate the virtual environment before running scripts or tests.
 
+* Before committing code run `black` for consistent formatting and execute
+  the test suite with `pytest`. All tests should pass.
+
 * Dependency management: Use `requirements.txt` or `pip-tools`
 * Use standard libraries where feasible (e.g., `sqlite3`, `argparse`, `datetime`)
 * Adopt `typer` for CLI command interface (if CLI ergonomics matter)
@@ -89,6 +92,14 @@ ngxstat/
 
 If uncertain, the agent should prompt the human for clarification before making architectural assumptions.
 
+## Testing
+
+Use `pytest` for automated tests. Run the suite from an activated virtual environment and ensure all tests pass before committing:
+
+```bash
+pytest -q
+```
+
 ---
 
 ## Future Capabilities
@@ -106,3 +117,4 @@ As the project matures, agents may also:
 
 * **2025-07-17**: Initial version by Jordan + ChatGPT
 * **2025-07-17**: Expanded virtual environment usage guidance
+
diff --git a/README.md b/README.md
index acb1055..f641d96 100644
--- a/README.md
+++ b/README.md
@@ -1,11 +1,16 @@
 # ngxstat
-Per-domain Nginx log analytics with hybrid static reports and live insights.
 
-## Generating Reports
+`ngxstat` is a lightweight log analytics toolkit for Nginx. It imports access
+logs into an SQLite database and renders static dashboards so you can explore
+per-domain metrics without running a heavy backend service.
 
-Use the `generate_reports.py` script to build aggregated JSON and HTML snippet files from `database/ngxstat.db`.
+## Requirements
 
-Create a virtual environment and install dependencies:
+* Python 3.10+
+* Access to the Nginx log files (default: `/var/log/nginx`)
+
+The helper scripts create a virtual environment on first run, but you can also
+set one up manually:
 
 ```bash
 python3 -m venv .venv
@@ -13,118 +18,88 @@ source .venv/bin/activate
 pip install -r requirements.txt
 ```
 
-Then run one or more of the interval commands:
-
-```bash
-python scripts/generate_reports.py hourly
-python scripts/generate_reports.py daily
-python scripts/generate_reports.py weekly
-python scripts/generate_reports.py monthly
-```
-
-Each command accepts optional flags to generate per-domain reports. Use
-`--domain ` to limit output to a specific domain or `--all-domains`
-to generate a subdirectory for every domain found in the database:
-
-```bash
-# Hourly reports for example.com only
-python scripts/generate_reports.py hourly --domain example.com
-
-# Weekly reports for all domains individually
-python scripts/generate_reports.py weekly --all-domains
-```
-
-Reports are written under the `output/` directory. Each command updates the corresponding `.json` file and writes one HTML snippet per report. These snippets are loaded dynamically by the main dashboard using Chart.js and DataTables.
-
-### Configuring Reports
-
-Report queries are defined in `reports.yml`. Each entry specifies the `name`,
-optional `label` and `chart` type, and a SQL `query` that must return `bucket`
-and `value` columns. The special token `{bucket}` is replaced with the
-appropriate SQLite `strftime` expression for each interval (hourly, daily,
-weekly or monthly) so that a single definition works across all durations.
-When `generate_reports.py` runs, every definition is executed for the requested
-interval and creates `output//.json` plus a small HTML snippet
-`output//.html` used by the dashboard.
-
-Example snippet:
-
-```yaml
-- name: hits
-  chart: bar
-  query: |
-    SELECT {bucket} AS bucket,
-           COUNT(*) AS value
-    FROM logs
-    GROUP BY bucket
-    ORDER BY bucket
-```
-
-Add or modify entries in `reports.yml` to tailor the generated metrics.
-
 ## Importing Logs
 
-Use the `run-import.sh` script to set up the Python environment if needed and import the latest Nginx log entries into `database/ngxstat.db`.
+Run the importer to ingest new log entries into `database/ngxstat.db`:
 
 ```bash
 ./run-import.sh
 ```
 
-This script is suitable for cron jobs as it creates the virtual environment on first run, installs dependencies and reuses the environment on subsequent runs.
+Rotated logs are processed in order and only entries newer than the last
+imported timestamp are added.
 
-The importer handles rotated logs in order from oldest to newest so entries are
-processed exactly once. If you rerun the script, it only ingests records with a
-timestamp newer than the latest one already stored in the database, preventing
-duplicates.
+## Generating Reports
 
-## Cron Report Generation
-
-Use the `run-reports.sh` script to run all report intervals in one step. The script sets up the Python environment the same way as `run-import.sh`, making it convenient for automation via cron.
+To build the HTML dashboard and JSON data files use `run-reports.sh` which runs
+all intervals in one go:
 
 ```bash
 ./run-reports.sh
 ```
 
-Running this script will create or update the hourly, daily, weekly and monthly reports under `output/`. It also detects all unique domains found in the database and writes per-domain reports to `output/domains//` alongside the aggregate data. After generation, open `output/index.html` in your browser to browse the reports.
+The script calls `scripts/generate_reports.py` internally to create hourly,
+daily, weekly and monthly reports. Per-domain reports are written under
+`output/domains/` alongside the aggregate data. Open
+`output/index.html` in a browser to view the dashboard.
 
+If you prefer to run individual commands you can invoke the generator directly:
 
-## Log Analysis
+```bash
+python scripts/generate_reports.py hourly
+python scripts/generate_reports.py daily --all-domains
+```
 
-The `run-analysis.sh` script runs helper routines that inspect the database. It
-creates or reuses the virtual environment and then executes a set of analysis
-commands to spot missing domains, suggest cache rules and detect potential
-threats.
+## Analysis Helpers
+
+`run-analysis.sh` executes additional utilities that examine the database for
+missing domains, caching opportunities and potential threats. The JSON output is
+saved under `output/analysis` and appears in the "Analysis" tab of the
+dashboard.
 
 ```bash
 ./run-analysis.sh
 ```
-The JSON results are written under `output/analysis` and can be viewed from the
-"Analysis" tab in the generated dashboard.
 
-## Serving Reports with Nginx
 
-To expose the generated HTML dashboards and JSON files over HTTP you can use a
-simple Nginx server block. Point the `root` directive to the repository's
-`output/` directory and optionally restrict access to your local network.
+## Serving the Reports
+
+The generated files are static. You can serve them with a simple Nginx block:
 
 ```nginx
 server {
     listen 80;
     server_name example.com;
-
-    # Path to the generated reports
     root /path/to/ngxstat/output;
 
     location / {
         try_files $uri $uri/ =404;
     }
-
-    # Allow access only from private networks
-    allow 192.0.0.0/8;
-    allow 10.0.0.0/8;
-    deny all;
 }
 ```
 
-With this configuration the generated static files are served directly by
-Nginx while connections outside of `192.*` and `10.*` are denied.
+Restrict access if the reports should not be public.
+## Running Tests
+
+Install the development dependencies and execute the suite with `pytest`:
+
+```bash
+pip install -r requirements.txt
+pytest -q
+```
+
+All tests must pass before submitting changes.
+
+## Acknowledgements
+
+ngxstat uses the following third‑party resources:
+
+* [Chart.js](https://www.chartjs.org/) for charts
+* [DataTables](https://datatables.net/) and [jQuery](https://jquery.com/) for table views
+* [Bulma CSS](https://bulma.io/) for styling
+* Icons from [Free CC0 Icons](https://cc0-icons.jonh.eu/) by Jon Hicks (CC0 / MIT)
+* [Typer](https://typer.tiangolo.com/) for the command-line interface
+* [Jinja2](https://palletsprojects.com/p/jinja/) for templating
+
+The project is licensed under the GPLv3. Icon assets remain in the public domain
+via the CC0 license.