<!-- markdown-toc start - Don't edit this section. Run M-x markdown-toc-refresh-toc -->
**Table of Contents**

- [test suite](#test-suite)
- [performance reviews](#performance-reviews)
    - [git-all-revs](#git-all-revs)
    - [strict](#strict)
    - [releases](#releases)
- [more performance notes](#more-performance-notes)

<!-- markdown-toc end -->

# test suite

This is the asncounter test suite. It is mostly held in a single file,
`test_asncounter.py`, and is driven by [pytest](https://docs.pytest.org/).

To run the test suite, try:

    python -m pytest

... from the parent directory.

The rest of the page describes the more complicated performance tests.

# performance reviews

This directory also holds a hodge-podge collection of tools, tests
and results I've collected over time from testing asncounter, mostly
by hand, mostly with [hyperfine](https://github.com/sharkdp/hyperfine).

Those tests need you to copy the fake AS data into a `cache` directory
before proceeding:

    cd test/data
    mkdir ../cache
    cp asnames.json ipasn_YYYYMMDD.HHMM.dat.gz rib.YYYYMMDD.HHMM.sux ../cache

This is necessary to test older releases, where the `test/data`
directory didn't exist or was in a different location. Starting some
time after 0.4.0 (52568c6), this is no longer necessary, and you can
run performance tests with `--cache-dir=test/data` directly.

Note that the benchmarks here were performed on a Framework 13" Intel
"12th generation" laptop ("i5-1240P"). There was a bug in the
Framework BIOS that severely throttled the CPU and influenced the
benchmarks significantly, so results from before and after the BIOS
upgrade are not comparable with each other. In general, when you see
timings above 10 seconds for 10m packets, that's before the upgrade,
and below, after.

## git-all-revs

This is a synthetic performance test of parsing 10 million IP
addresses with the "line" input format, over the entire history of the
Git repository, between 3e9ff16 and 2a59e30, essentially.

This was generated with:

    hyperfine -i -w 3 --input test/data/sample-10m.ips \
      -L ref $(git log --reverse --format=ref asncounter.py | awk '{print $1}' | paste -sd, ) \
      --export-json results/mass.json --export-markdown results/mass.md \
      "git checkout {ref} && ./asncounter.py --cache-dir=test/cache"

Note that the files were renamed from `mass` to `git-all-revs`, and
the table below and JSON files were tweaked to only show the rev
(e.g. `3e9ff16`) instead of the full command that was run (e.g. `git
checkout 3e9ff16 && ./asncounter.py --cache-dir=test/cache`).
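That tweak can be scripted. Here's a minimal sketch, not necessarily how the files here were actually edited: it assumes the usual hyperfine JSON export layout (a top-level `results` list whose entries carry a `command` string), and the `shorten_commands` helper is a hypothetical name.

```python
import json
import re

# Matches the "git checkout <rev> && ..." prefix used by the benchmark commands.
CHECKOUT_RE = re.compile(r"^git checkout (\S+) &&")


def shorten_commands(report: dict) -> dict:
    """Return a copy of a hyperfine report with each command replaced by its rev."""
    results = []
    for result in report.get("results", []):
        match = CHECKOUT_RE.match(result["command"])
        # Leave the command untouched when it does not look like a checkout.
        command = match.group(1) if match else result["command"]
        results.append({**result, "command": command})
    return {**report, "results": results}


if __name__ == "__main__":
    # Made-up one-entry report, only to show the shape of the transformation.
    sample = {
        "results": [
            {"command": "git checkout 3e9ff16 && ./asncounter.py --cache-dir=test/cache"}
        ]
    }
    print(json.dumps(shorten_commands(sample), indent=2))
```

In practice you would `json.load()` the exported file, pass it through the helper, and `json.dump()` it back.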

Here is the graphic, and the full [markdown table](git-all-revs.md):

![](git-all-revs.png)

There you can see that the earlier implementations of asncounter were
running at around 23 seconds for 10M packets, or 435 kpps, then c742541
(skip missing ASN checks more noisily and earlier, 2025-05-23)
optimized that quite a bit, down to 16 seconds or 625 kpps. Then it
slowly creeps up to 18 seconds, with a huge jump to ~30 seconds while
I tested debug logging, before settling back down to about 20
seconds. But there was a regression somewhere in there...

Other points of regression to investigate in the future could include:

- 56d1a75 (isort, 2025-05-26), likely red herring, but that starts a
  trend of gaining about one second over the course of many commits
- 9bf01a8 (simplify display_results loop, 2025-05-28), another second gained
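The kpps figures quoted in this section are just the packet count divided by the run time. As a quick sanity check (the `kpps` helper is only an illustration, not part of asncounter):

```python
def kpps(packets: int, seconds: float) -> float:
    """Thousands of packets processed per second."""
    return packets / seconds / 1000


# Run times quoted above for the 10 million packet sample.
for seconds in (23, 16):
    print(f"{seconds} s -> {kpps(10_000_000, seconds):.0f} kpps")
```

This prints 435 kpps for 23 seconds and 625 kpps for 16 seconds.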

## strict

The above found a regression with the 4f38f63 (mypy --strict,
2025-06-03) commit, at which point a more specific benchmark was
crafted to test fix 1f83168 (remove cast() in hot loops, 2025-06-15):

    hyperfine -i -w 3 --input test/perf/data/sample-10m.ips \
        -L ref c742541,12032e5,4f38f63,2099ff6,92728fb,eeaf697 \
        --export-json test/perf/strict.json --export-markdown test/perf/strict.md \
        "git checkout {ref} && ./asncounter.py --cache-dir=test/cache"

The fix brings performance back to a level similar to before the regression:

```
anarcat@angela:asncounter$ hyperfine -w 3 --input test/perf/data/sample-10m.ips         -L ref c742541,12032e5,4f38f63,2099ff6,92728fb,eeaf697         --export-json test/perf/strict2.json --export-markdown test/perf/strict2.md         "git checkout {ref} && ./asncounter.py --cache-dir=test/cache"
Benchmark 1: git checkout c742541 && ./asncounter.py --cache-dir=test/cache
  Time (mean ± σ):     23.567 s ±  0.985 s    [User: 23.508 s, System: 0.054 s]
  Range (min … max):   22.298 s … 25.416 s    10 runs
 
Benchmark 2: git checkout 12032e5 && ./asncounter.py --cache-dir=test/cache
  Time (mean ± σ):     19.143 s ±  0.595 s    [User: 19.037 s, System: 0.100 s]
  Range (min … max):   18.256 s … 19.959 s    10 runs
 
Benchmark 3: git checkout 4f38f63 && ./asncounter.py --cache-dir=test/cache
  Time (mean ± σ):     24.518 s ±  0.351 s    [User: 24.412 s, System: 0.098 s]
  Range (min … max):   24.230 s … 25.442 s    10 runs
 
Benchmark 4: git checkout 2099ff6 && ./asncounter.py --cache-dir=test/cache
  Time (mean ± σ):     29.694 s ±  0.852 s    [User: 29.567 s, System: 0.119 s]
  Range (min … max):   28.657 s … 31.353 s    10 runs
 
Benchmark 5: git checkout 92728fb && ./asncounter.py --cache-dir=test/cache
  Time (mean ± σ):     29.460 s ±  1.069 s    [User: 29.340 s, System: 0.113 s]
  Range (min … max):   27.893 s … 31.206 s    10 runs
 
Benchmark 6: git checkout eeaf697 && ./asncounter.py --cache-dir=test/cache
  Time (mean ± σ):     24.579 s ±  1.947 s    [User: 24.407 s, System: 0.151 s]
  Range (min … max):   23.270 s … 29.747 s    10 runs
 
Summary
  git checkout 12032e5 && ./asncounter.py --cache-dir=test/cache ran
    1.23 ± 0.06 times faster than git checkout c742541 && ./asncounter.py --cache-dir=test/cache
    1.28 ± 0.04 times faster than git checkout 4f38f63 && ./asncounter.py --cache-dir=test/cache
    1.28 ± 0.11 times faster than git checkout eeaf697 && ./asncounter.py --cache-dir=test/cache
    1.54 ± 0.07 times faster than git checkout 92728fb && ./asncounter.py --cache-dir=test/cache
    1.55 ± 0.07 times faster than git checkout 2099ff6 && ./asncounter.py --cache-dir=test/cache
```

The markdown table is in [strict.md](strict.md), here's a plot of the runs:

![](strict.png)

The commits here are:

* c742541 (skip missing ASN checks more noisily and earlier,
  2025-05-23), 23 seconds, early history, as a reference
* 12032e5 (asndb returns sets, 2025-06-03), 19 seconds, optimization
  of the above
* 4f38f63 (mypy --strict, 2025-06-03), 24 seconds, performance
  regression we're testing for
* 2099ff6 (switch to python -O and __debug__ checks, 2025-06-15),
  another baseline after lots of commits, 29 seconds, a separate
  regression we need to improve on
* 92728fb (support ip address on cli through a special mode,
  2025-06-13), last baseline before fix, still at 29 seconds
* eeaf697 (remove cast() in hot loops, 2025-06-15), fix, bringing
  the run time down to 25 seconds, or 400 kpps
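The ratios in the hyperfine summary can be recomputed from the means alone, which is handy when comparing a handful of commits by hand. A small sketch, with the means copied from the output above and a hypothetical `slowdowns` helper:

```python
# Mean run times in seconds, copied from the hyperfine output above.
MEANS = {
    "c742541": 23.567,
    "12032e5": 19.143,
    "4f38f63": 24.518,
    "2099ff6": 29.694,
    "92728fb": 29.460,
    "eeaf697": 24.579,
}


def slowdowns(means: dict[str, float]) -> dict[str, float]:
    """Ratio of each mean to the fastest mean (1.0 is the fastest commit)."""
    fastest = min(means.values())
    return {ref: mean / fastest for ref, mean in means.items()}


# Print commits from fastest to slowest.
for ref, ratio in sorted(slowdowns(MEANS).items(), key=lambda item: item[1]):
    print(f"{ref}: {ratio:.2f}x")
```

The rounded ratios match the summary: 1.23 for c742541, 1.28 for both 4f38f63 and eeaf697, and about 1.55 for the two slowest commits.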

## releases

This generates a benchmark of all releases, from the first
"benchmarkable" commit (3e9ff16, retroactively tagged 0.0.1), to the
latest release at the time of writing:

    hyperfine -w 3 --input test/data/sample-10m.ips \
        -L ref $(git tag | paste -sd,) \
        --export-json test/releases.json \
        --export-markdown test/releases.md \
        "git checkout {ref} && ./asncounter.py --cache-dir=test/cache"

This is not automatically updated and might become out of date as new
releases trickle out.

The markdown table is in [releases.md](releases.md), here's a plot of the runs:

![](releases.png)

Here again we see the regressions introduced in 0.3.0, and the
improvements made afterwards, which don't quite recover the
performance we had lost. 0.4.0 and 0.5.0 still brought significant
performance improvements.

## scalability with packet counts

I have also performed tests with various packet counts, testing with
250, 1 thousand, 1 million, and 10 million packets:

    hyperfine -w 3 -L size 250,1k,1m,10m \
      --export-json test/sizes.json \
      --export-markdown test/sizes.md \
      "./asncounter.py --cache-dir=test/data --input=test/data/sample-{size}.ips"

The results show that we have roughly linear scalability with the
packet counts:

| Command                                                                  |       Mean [s] | Min [s] | Max [s] |     Relative |
|:-------------------------------------------------------------------------|---------------:|--------:|--------:|-------------:|
| `./asncounter.py --cache-dir=test/data --input=test/data/sample-250.ips` |  1.813 ± 0.165 |   1.677 |   2.160 |         1.00 |
| `./asncounter.py --cache-dir=test/data --input=test/data/sample-1k.ips`  |  1.911 ± 0.256 |   1.674 |   2.341 |  1.05 ± 0.17 |
| `./asncounter.py --cache-dir=test/data --input=test/data/sample-1m.ips`  |  3.820 ± 0.146 |   3.708 |   4.180 |  2.11 ± 0.21 |
| `./asncounter.py --cache-dir=test/data --input=test/data/sample-10m.ips` | 22.383 ± 0.500 |  21.852 |  23.482 | 12.35 ± 1.16 |

The first two show that we have a pretty solid 1.7s overhead,
regardless of the packet count. Then, going from 1k to 1m packets, we
only add 2 seconds, which we could argue matches a roughly 500k
packets per second rate.

Then, we shift another order of magnitude (1m to 10m packets) and we
go from 3.7s to 21.8s. If we discount a 1.7s overhead, this maps
pretty neatly to the same "2 seconds per million packets" rate
established with the 1m benchmark.

Most other tests on this page were done with 10 million packets
because it reduces the impact of that 1.7s startup overhead.
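The "fixed overhead plus linear cost" reading of the table can be checked with a few lines of arithmetic. This is a rough sketch using the minimum run times from the table and the 1.7s overhead as an assumed constant, not a proper regression:

```python
OVERHEAD_S = 1.7  # assumed fixed startup cost, from the 250 and 1k runs

# (packets, minimum run time in seconds), copied from the table above.
RUNS = {"1m": (1_000_000, 3.708), "10m": (10_000_000, 21.852)}


def seconds_per_million(packets: int, seconds: float, overhead: float = OVERHEAD_S) -> float:
    """Processing time per million packets once the fixed overhead is discounted."""
    return (seconds - overhead) / (packets / 1_000_000)


for name, (packets, seconds) in RUNS.items():
    rate = seconds_per_million(packets, seconds)
    print(f"{name}: {rate:.2f} s per million packets")
```

Both samples come out at roughly 2 seconds per million packets, which is what we'd expect if the cost is linear beyond the startup overhead.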

# more performance notes

At 400kpps, we can't actually saturate a "fast ethernet" (100 Mbps)
link with the smallest packets (20 bytes):

```
> 400000/s * (20byte) to megabit/s

  (400000 / second) × (20 bytes) = 64 megabits/s
```

We'd need to raise the processing rate by half for that, to nearly
600kpps, assuming we reach saturation at 95 Mbps:

```
> 95 megabit/s = x / s * (20 byte)

  (95 megabits/second) = ((x / second) × (20 bytes))

  x = 593750
```

During normal operations, however, packets are likely to be larger.
Here's the best case scenario, at 1500 bytes per packet, where we
could keep up with almost five gigabits per second, well past the
saturation point of a gigabit link:

```
> 400000/s * (1500byte) to megabit/s

  (400000 / second) × (1500 bytes) = 4800 megabits/s
```

Our ideal target would likely be gigabit saturation in DDOS
conditions, which is about 6.25 million packets per second:

```
> 1 gigabit/s / 20byte

  (1 gigabit/second) / (20 bytes) = 6.25 MHz
```

... more than an order of magnitude faster than what we can currently
pull off. We could trim off some of that because of the overhead, of
course, but we should look at an objective of something around
2-5Mpps.
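The unit conversions above can also be reproduced in plain Python, if you don't have a unit-aware calculator handy. The helper names here are made up for illustration; the only real assumption is decimal megabits (one million bits):

```python
def megabits_per_second(pps: float, packet_bytes: int) -> float:
    """Bandwidth used by `pps` packets per second of `packet_bytes` each."""
    return pps * packet_bytes * 8 / 1_000_000


def packets_per_second(link_megabits: float, packet_bytes: int) -> float:
    """Packet rate needed to fill a link with packets of `packet_bytes` each."""
    return link_megabits * 1_000_000 / (packet_bytes * 8)


print(megabits_per_second(400_000, 20))    # 64.0
print(megabits_per_second(400_000, 1500))  # 4800.0
print(packets_per_second(95, 20))          # 593750.0
print(packets_per_second(1000, 20))        # 6250000.0
```

The last figure divided by our current 400kpps gives a factor of about 15, hence the "more than an order of magnitude" above.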

Also note that the above benchmarks include the time `hyperfine` takes
to run (1) a shell and (2) `git checkout`. Those are extremely fast,
however: they introduce at *most* 30ms, with very little variance
(±4ms), so they don't impact the results here. Neighbor noise is more
likely a factor, as I've had trouble reproducing exactly the same
results across multiple runs.

But at this point, our objective is *not* to optimize any further, as
we'd likely need to review the entire architecture of this (and
probably rewrite it in Rust), but to keep stupidly large performance
regressions from kicking in.

Note that the 20 and 1500 bytes figures for packet sizes come from
[this article from APNIC](https://labs.apnic.net/index.php/2024/10/04/the-size-of-packets/).

# other ideas

This ad-hoc test suite works well for our purposes, but could be
improved. There are tools for pytest that enable benchmarking code
directly which we could use, for example:

- [pytest-benchmark](https://github.com/ionelmc/pytest-benchmark) (in Debian)
- [pytest-performancetotal](https://pypi.org/project/pytest-performancetotal/)
- [pytest-performance](https://pypi.org/project/pytest-performance/)
- [pytest-perf](https://pypi.org/project/pytest-perf/)

There are [other pytest plugins](https://docs.pytest.org/en/stable/reference/plugin_list.html) that I might not have spotted yet
as well.

We could also test the examples from the manual page directly, by
testing the embedded calls:

- [mdtest](https://github.com/elucent/mdtest/) has a fairly simple [sh runner](https://github.com/elucent/mdtest/blob/master/runners/sh) that we could take
  inspiration from
- [mktestdocs](https://github.com/koaning/mktestdocs) supports parsing Python code from Markdown, but what
  we want is more likely shell, not Python, as we test the manual page
  EXAMPLES
- [pytest-markdown-docs](https://pypi.org/project/pytest-markdown-docs/), similar
- [markdown-pytest](https://pypi.org/project/markdown-pytest/), similar
- [pytest-markdown](https://pypi.org/project/pytest-markdown/), similar
- [pytest-examples](https://github.com/pydantic/pytest-examples), similar, come on people, talk to each other already
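Short of adopting one of those, a home-grown version might not need much code. The sketch below assumes the examples live in fenced blocks tagged `sh` in a Markdown source, which may well not match how the manual page is actually written; `shell_examples` and `run_example` are hypothetical names:

```python
import re
import subprocess

FENCE = "`" * 3  # built at runtime so this sketch can itself live in Markdown
# Assumption: examples are fenced blocks tagged `sh`.
EXAMPLE_RE = re.compile(rf"^{FENCE}sh\n(.*?)^{FENCE}$", re.MULTILINE | re.DOTALL)


def shell_examples(markdown: str) -> list[str]:
    """Return the body of every fenced `sh` block found in a Markdown string."""
    return [match.group(1).strip() for match in EXAMPLE_RE.finditer(markdown)]


def run_example(script: str) -> int:
    """Run one example through sh and return its exit status."""
    return subprocess.run(["sh", "-c", script], check=False).returncode


if __name__ == "__main__":
    doc = f"Some text\n\n{FENCE}sh\necho hello\n{FENCE}\n"
    for example in shell_examples(doc):
        print(example)
```

A pytest test could then parametrize over `shell_examples()` and assert that `run_example()` returns zero for each one.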
