Conversation

📝 Walkthrough

Transforms testapp into a KV store rollup reference: adds stress-test tooling, an HTTP /stats endpoint, executor metrics, updated defaults, and comprehensive README/CLI docs for running and benchmarking the KV server.

Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (inconclusive)
The latest Buf updates on your PR. Results from workflow CI / buf-check (pull_request).

Claude finished @julienrbrt's task in 1m 44s — View job — Code Review
Codecov Report

✅ All modified and coverable lines are covered by tests.

```
@@           Coverage Diff           @@
##             main    #3214   +/-   ##
=======================================
  Coverage   61.41%   61.41%
=======================================
  Files         120      120
  Lines       12474    12474
=======================================
  Hits         7661     7661
  Misses       3953     3953
  Partials      860      860
```

Flags with carried forward coverage won't be shown. ☔ View full report in Codecov by Sentry.
Actionable comments posted: 7
🧹 Nitpick comments (1)
apps/testapp/kv/kvexecutor.go (1)
45-54: Unexport `ExecutorStats` and `GetStats()` since they're package-private.
`ExecutorStats` and `GetStats()` are only consumed internally in `http_server.go` within the same package. Unexporting them to `executorStats` and `getStats()` keeps the API surface minimal. If they need to remain exported, they require Go doc comments per the coding guidelines.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/testapp/kv/kvexecutor.go` around lines 45 - 54, Rename the exported types and methods to unexported identifiers: change ExecutorStats to executorStats and GetStats() to getStats() in KVExecutor, update the return type and constructor usages accordingly, and update all internal references (e.g., in http_server.go) to use executorStats and getStats(); ensure method receiver signature remains the same (func (k *KVExecutor) getStats() executorStats) and run `go build`/tests to catch any missed references.
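A minimal sketch of the suggested rename, assuming the stats are plain atomic counters (the field names here are illustrative, not the actual `KVExecutor` layout):

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// executorStats is the unexported snapshot type the nitpick suggests.
// Field names are illustrative, not the real KVExecutor layout.
type executorStats struct {
	ExecutedTxs uint64
	Blocks      uint64
}

// KVExecutor keeps live counters; only the snapshot accessor is package-private.
type KVExecutor struct {
	executedTxs atomic.Uint64
	blocks      atomic.Uint64
}

// getStats returns a point-in-time copy, mirroring the proposed unexported API.
func (k *KVExecutor) getStats() executorStats {
	return executorStats{
		ExecutedTxs: k.executedTxs.Load(),
		Blocks:      k.blocks.Load(),
	}
}

func main() {
	k := &KVExecutor{}
	k.executedTxs.Add(3)
	k.blocks.Add(1)
	fmt.Println(k.getStats().ExecutedTxs, k.getStats().Blocks) // 3 1
}
```

Since both the type and the accessor stay inside package `kv`, callers in `http_server.go` need only the lowercase names.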
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@apps/testapp/cmd/init.go`:
- Line 38: The init command currently unconditionally sets cfg.Node.BlockTime =
rollconf.DurationWrapper{Duration: 100 * time.Millisecond}, overwriting any
value rollconf.Load(cmd) parsed (e.g., --evnode.node.block_time=1s); change this
so you only apply 100ms when the user did not supply the flag or instead set the
flag's default to 100ms. Locate the init command logic around rollconf.Load(cmd)
and cfg.Node.BlockTime and either (a) detect whether the flag was provided (use
the command/flag API to check if evnode.node.block_time was set) and only assign
the 100ms default when not provided, or (b) set the CLI flag default for
evnode.node.block_time to 100ms and remove the unconditional assignment to
cfg.Node.BlockTime.
In `@apps/testapp/kv/bench/main.go`:
- Around line 173-181: The fetchStats function currently swallows HTTP/JSON
errors and returns an all-zero serverStats which can cause underflow when
callers subtract uint64 deltas; change fetchStats to return (serverStats, error)
instead of serverStats, propagate errors from http.Get and json.Decode
(including non-200 responses), and have callers check the error and skip
server-side delta calculations/reporting when a snapshot cannot be trusted (use
the presence of an error to avoid subtracting snapshots or to re-sample). Ensure
you update the signature and all call sites accordingly (references: fetchStats,
serverStats).
- Line 28: The flag targetRPS (variable targetRPS) is only reported and not
enforced; either implement pacing or rename it to avoid confusion — to implement
pacing, add a token-bucket or ticker driven rate limiter in the request send
path (e.g., inside the send loop function used to launch workers / sendRequests)
that consumes tokens at the rate specified by *targetRPS and blocks or sleeps
when tokens are exhausted, or alternatively rename the flag (and all uses in
banner/logging) to something like target-rps-goal or desired-rps to make it
clear it’s only informational; update the worker launch/send loop (the function
that iterates and issues requests referenced at lines 56-57 and 103-108) to
either use the limiter or to remove/rename the misleading references so behavior
matches the flag name.
- Around line 121-149: The worker currently uses net.DialTimeout and then
performs conn.Write and http.ReadResponse without timeouts, which can block
forever; replace the dial with a context-aware dial (use net.Dialer.DialContext
with ctx) and before each network operation set per-operation deadlines on conn
(via conn.SetDeadline(time.Now().Add(timeout))) so the conn.Write(rawReq) and
http.ReadResponse(br, nil) calls will time out and return errors that you handle
by calling failures.Add(1), closing conn, and breaking the loop; ensure the
ctx.Done() check remains and choose a sensible timeout value (e.g., a short
per-op timeout) so workers can exit promptly on server stalls or context
cancellation.
In `@apps/testapp/kv/bench/README.md`:
- Around line 16-20: The README uses an incorrect startup command: it shows
./build/testapp but the build artifacts are ./stress-test and ./testapp; update
the Terminal 1 command to start the built testapp binary (use ./testapp start
--kv-endpoint localhost:9090) so it matches the documented build output and the
Terminal 2 stress-test command; ensure the symbols shown (./build/testapp,
./testapp, ./stress-test, --kv-endpoint) are updated consistently in the README
entry.
In `@apps/testapp/kv/http_server.go`:
- Line 112: The injectedTxs counter is incremented unconditionally at
hs.injectedTxs.Add(1) even when KVExecutor.InjectTx drops transactions; change
the flow so InjectTx (in apps/testapp/kv/kvexecutor.go) returns a success
boolean or error indicating whether the tx was enqueued, then in the HTTP
handler only call hs.injectedTxs.Add(1) and return 202 when InjectTx reports
success—otherwise return an error status (e.g., 429/500) and do not increment
injectedTxs so dropped transactions are not counted.
In `@apps/testapp/kv/kvexecutor.go`:
- Line 25: The txChannelBufferSize constant is far too large and will
preallocate gigabytes when used in make(chan []byte, txChannelBufferSize) inside
NewKVExecutor; change txChannelBufferSize to a much smaller default (e.g., 1_000
or similar) and make it configurable (via constructor param or config/env) so
callers can tune it, or replace the large buffered channel with a bounded/ring
buffer implementation or an unbuffered channel plus worker backpressure; update
NewKVExecutor to accept the buffer size parameter (or config) and use it instead
of the hardcoded 100_000_000, and add a comment about memory implications.
---
Nitpick comments:
In `@apps/testapp/kv/kvexecutor.go`:
- Around line 45-54: Rename the exported types and methods to unexported
identifiers: change ExecutorStats to executorStats and GetStats() to getStats()
in KVExecutor, update the return type and constructor usages accordingly, and
update all internal references (e.g., in http_server.go) to use executorStats
and getStats(); ensure method receiver signature remains the same (func (k
*KVExecutor) getStats() executorStats) and run `go build`/tests to catch any
missed references.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 5360d9f5-19ba-4c64-bdd9-9121413b1041
📒 Files selected for processing (8)

- apps/testapp/README.md
- apps/testapp/cmd/init.go
- apps/testapp/examples/passphrase.txt
- apps/testapp/kv/bench/README.md
- apps/testapp/kv/bench/main.go
- apps/testapp/kv/http_server.go
- apps/testapp/kv/http_server_test.go
- apps/testapp/kv/kvexecutor.go
```go
// we use load in order to parse all the flags
cfg, _ := rollconf.Load(cmd)
cfg.Node.Aggregator = aggregator
cfg.Node.BlockTime = rollconf.DurationWrapper{Duration: 100 * time.Millisecond}
```
Don't overwrite the parsed block-time value.
Line 38 clobbers whatever rollconf.Load(cmd) just read, so testapp init --evnode.node.block_time=1s still writes 100ms. If the goal is a faster default for this command, set the flag default or only apply 100ms when the user did not pass the flag; otherwise this also drifts from apps/testapp/cmd/init_test.go, which still expects block_time: 1s.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@apps/testapp/cmd/init.go` at line 38, The init command currently
unconditionally sets cfg.Node.BlockTime = rollconf.DurationWrapper{Duration: 100
* time.Millisecond}, overwriting any value rollconf.Load(cmd) parsed (e.g.,
--evnode.node.block_time=1s); change this so you only apply 100ms when the user
did not supply the flag or instead set the flag's default to 100ms. Locate the
init command logic around rollconf.Load(cmd) and cfg.Node.BlockTime and either
(a) detect whether the flag was provided (use the command/flag API to check if
evnode.node.block_time was set) and only assign the 100ms default when not
provided, or (b) set the CLI flag default for evnode.node.block_time to 100ms
and remove the unconditional assignment to cfg.Node.BlockTime.
```go
addr := flag.String("addr", "localhost:9090", "server host:port")
duration := flag.Duration("duration", 10*time.Second, "test duration")
workers := flag.Int("workers", 1000, "concurrent workers")
targetRPS := flag.Uint64("target-rps", 10_000_000, "target requests per second")
```
-target-rps is only reported, not enforced.
The new flag looks like a rate-control knob, but nothing in the worker launch or send loop uses it to pace requests. Right now it only changes the pass/fail banner, so runs with -target-rps=100 and -target-rps=10000000 generate the same load. Either add pacing/token-bucket logic or rename the flag so it is clearly just a goal.
Also applies to: 56-57, 103-108
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@apps/testapp/kv/bench/main.go` at line 28, The flag targetRPS (variable
targetRPS) is only reported and not enforced; either implement pacing or rename
it to avoid confusion — to implement pacing, add a token-bucket or ticker driven
rate limiter in the request send path (e.g., inside the send loop function used
to launch workers / sendRequests) that consumes tokens at the rate specified by
*targetRPS and blocks or sleeps when tokens are exhausted, or alternatively
rename the flag (and all uses in banner/logging) to something like
target-rps-goal or desired-rps to make it clear it’s only informational; update
the worker launch/send loop (the function that iterates and issues requests
referenced at lines 56-57 and 103-108) to either use the limiter or to
remove/rename the misleading references so behavior matches the flag name.
```diff
-		case <-ticker.C:
-			// Send a batch of transactions
-			for i := 0; i < txPerInterval; i++ {
-				wg.Add(1)
-				go func() {
-					defer wg.Done()
-					var currentTxData string
-					// Generate random key-value pair
-					key := randomString(8)
-					value := randomString(16)
-					currentTxData = fmt.Sprintf("%s=%s", key, value)
-					success := sendTransaction(url, currentTxData)
-					if success {
-						atomic.AddUint64(&successCount, 1)
-					} else {
-						atomic.AddUint64(&failureCount, 1)
-					}
-				}()
-			}
-		return
+		conn, err := net.DialTimeout("tcp", addr, time.Second)
+		if err != nil {
+			failures.Add(1)
+			continue
+		}
+
+		br := bufio.NewReaderSize(conn, 512)
+
+		for {
+			select {
+			case <-ctx.Done():
+				conn.Close()
+				return
+			default:
+			}
+
+			if _, err := conn.Write(rawReq); err != nil {
+				failures.Add(1)
+				conn.Close()
+				break
+			}
+
+			resp, err := http.ReadResponse(br, nil)
+			if err != nil {
+				failures.Add(1)
+				conn.Close()
+				break
+			}
+			io.Copy(io.Discard, resp.Body)
```
🧩 Analysis chain

🏁 Script executed:

```shell
cat -n apps/testapp/kv/bench/main.go | head -200
```

Repository: evstack/ev-node — length of output: 5905
Set deadlines on TCP operations to prevent workers from blocking indefinitely.
Line 121 only limits the connection establishment time. After that, conn.Write and http.ReadResponse have no timeouts and can block forever. If the server is slow or stalled, workers get stuck in these operations, never checking ctx.Done(), and the main goroutine hangs waiting for them to drain the done channel.
Use DialContext to respect context cancellation during connection, and set per-operation deadlines with SetDeadline so Write and ReadResponse calls are bounded:
Suggested direction:

```diff
- conn, err := net.DialTimeout("tcp", addr, time.Second)
+ d := net.Dialer{Timeout: time.Second}
+ conn, err := d.DialContext(ctx, "tcp", addr)
  if err != nil {
  	failures.Add(1)
  	continue
  }
@@
+ if err := conn.SetDeadline(time.Now().Add(time.Second)); err != nil {
+ 	failures.Add(1)
+ 	conn.Close()
+ 	break
+ }
  if _, err := conn.Write(rawReq); err != nil {
  	failures.Add(1)
  	conn.Close()
  	break
  }
  resp, err := http.ReadResponse(br, nil)
```

📝 Committable suggestion
‼️ IMPORTANT: Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

```go
d := net.Dialer{Timeout: time.Second}
conn, err := d.DialContext(ctx, "tcp", addr)
if err != nil {
	failures.Add(1)
	continue
}

br := bufio.NewReaderSize(conn, 512)

for {
	select {
	case <-ctx.Done():
		conn.Close()
		return
	default:
	}

	if err := conn.SetDeadline(time.Now().Add(time.Second)); err != nil {
		failures.Add(1)
		conn.Close()
		break
	}

	if _, err := conn.Write(rawReq); err != nil {
		failures.Add(1)
		conn.Close()
		break
	}

	resp, err := http.ReadResponse(br, nil)
	if err != nil {
		failures.Add(1)
		conn.Close()
		break
	}
	io.Copy(io.Discard, resp.Body)
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@apps/testapp/kv/bench/main.go` around lines 121 - 149, The worker currently
uses net.DialTimeout and then performs conn.Write and http.ReadResponse without
timeouts, which can block forever; replace the dial with a context-aware dial
(use net.Dialer.DialContext with ctx) and before each network operation set
per-operation deadlines on conn (via conn.SetDeadline(time.Now().Add(timeout)))
so the conn.Write(rawReq) and http.ReadResponse(br, nil) calls will time out and
return errors that you handle by calling failures.Add(1), closing conn, and
breaking the loop; ensure the ctx.Done() check remains and choose a sensible
timeout value (e.g., a short per-op timeout) so workers can exit promptly on
server stalls or context cancellation.
```go
func fetchStats(addr string) serverStats {
	resp, err := http.Get("http://" + addr + "/stats")
	if err != nil {
		return serverStats{}
	}
	defer resp.Body.Close()
	var s serverStats
	json.NewDecoder(resp.Body).Decode(&s)
	return s
}
```
Don't collapse /stats failures into zero counters.
Line 176 turns any fetch failure into an all-zero snapshot, and the caller later subtracts those snapshots as uint64. If the initial sample succeeds and the final sample fails, or the server restarts between samples, the delta math underflows and the summary prints enormous bogus block/tx counts. Return an error from fetchStats and skip server-side delta reporting when a snapshot cannot be trusted.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@apps/testapp/kv/bench/main.go` around lines 173 - 181, The fetchStats
function currently swallows HTTP/JSON errors and returns an all-zero serverStats
which can cause underflow when callers subtract uint64 deltas; change fetchStats
to return (serverStats, error) instead of serverStats, propagate errors from
http.Get and json.Decode (including non-200 responses), and have callers check
the error and skip server-side delta calculations/reporting when a snapshot
cannot be trusted (use the presence of an error to avoid subtracting snapshots
or to re-sample). Ensure you update the signature and all call sites accordingly
(references: fetchStats, serverStats).
```bash
# Terminal 1: start the testapp with KV endpoint
./build/testapp start --kv-endpoint localhost:9090

# Terminal 2: run the stress test
./stress-test --addr localhost:9090 --duration 10s --workers 1000
```
The testapp startup command doesn't match the documented build output.
Line 17 runs ./build/testapp, but the documented build commands produce ./stress-test here and ./testapp in apps/testapp/README.md. Following the steps as written will not create ./build/testapp.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@apps/testapp/kv/bench/README.md` around lines 16 - 20, The README uses an
incorrect startup command: it shows ./build/testapp but the build artifacts are
./stress-test and ./testapp; update the Terminal 1 command to start the built
testapp binary (use ./testapp start --kv-endpoint localhost:9090) so it matches
the documented build output and the Terminal 2 stress-test command; ensure the
symbols shown (./build/testapp, ./testapp, ./stress-test, --kv-endpoint) are
updated consistently in the README entry.
```go
	}

	hs.executor.InjectTx(body)
	hs.injectedTxs.Add(1)
```
Don't count dropped transactions as injected.
KVExecutor.InjectTx in apps/testapp/kv/kvexecutor.go still drops when txChan is full, but Line 112 increments injectedTxs unconditionally and Line 113 still returns 202. Under load, /stats will over-report injections and callers will see success for work that was discarded. Have InjectTx return a bool/error and only increment/respond 202 on a successful enqueue.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@apps/testapp/kv/http_server.go` at line 112, The injectedTxs counter is
incremented unconditionally at hs.injectedTxs.Add(1) even when
KVExecutor.InjectTx drops transactions; change the flow so InjectTx (in
apps/testapp/kv/kvexecutor.go) returns a success boolean or error indicating
whether the tx was enqueued, then in the HTTP handler only call
hs.injectedTxs.Add(1) and return 202 when InjectTx reports success—otherwise
return an error status (e.g., 429/500) and do not increment injectedTxs so
dropped transactions are not counted.
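A non-blocking send that reports the outcome is enough for this. The sketch below is a simplified stand-in for `KVExecutor` (the real struct has more fields), showing the `select`/`default` idiom the fix would use:

```go
package main

import "fmt"

// KVExecutor with a bounded tx queue; simplified stand-in for the real type.
type KVExecutor struct {
	txChan chan []byte
}

func NewKVExecutor(bufferSize int) *KVExecutor {
	return &KVExecutor{txChan: make(chan []byte, bufferSize)}
}

// InjectTx does a non-blocking send: true means the tx was enqueued, false
// means the queue is full. The HTTP handler should only increment
// injectedTxs and return 202 on true, and return 429 on false.
func (k *KVExecutor) InjectTx(tx []byte) bool {
	select {
	case k.txChan <- tx:
		return true
	default:
		return false
	}
}

func main() {
	k := NewKVExecutor(1)
	fmt.Println(k.InjectTx([]byte("a=1"))) // true: fits in the buffer
	fmt.Println(k.InjectTx([]byte("b=2"))) // false: buffer full, dropped
}
```

Returning an `error` instead of a `bool` works just as well and lets the handler distinguish "queue full" from future failure modes.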
```diff
 	finalizedHeightKey = ds.NewKey("/finalizedHeight")
 	// Define a buffer size for the transaction channel
-	txChannelBufferSize = 10000
+	txChannelBufferSize = 100_000_000
```
This mempool size will allocate gigabytes at startup.
Line 25 drives make(chan []byte, txChannelBufferSize) in NewKVExecutor. With 100_000_000 slots, a buffered chan []byte needs roughly 24 bytes per slot just for slice headers on 64-bit Go, so the queue alone is already well over 2 GB before any payload bytes are stored. That is likely to OOM the process or starve everything else of memory. As per coding guidelines, "Consider memory allocation in hot paths of Go code".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@apps/testapp/kv/kvexecutor.go` at line 25, The txChannelBufferSize constant
is far too large and will preallocate gigabytes when used in make(chan []byte,
txChannelBufferSize) inside NewKVExecutor; change txChannelBufferSize to a much
smaller default (e.g., 1_000 or similar) and make it configurable (via
constructor param or config/env) so callers can tune it, or replace the large
buffered channel with a bounded/ring buffer implementation or an unbuffered
channel plus worker backpressure; update NewKVExecutor to accept the buffer size
parameter (or config) and use it instead of the hardcoded 100_000_000, and add a
comment about memory implications.
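The "over 2 GB before any payload" claim comes straight from the slice-header arithmetic, which this small calculation spells out (24 bytes is the pointer + len + cap header size on 64-bit Go; the "new" size is an illustrative bounded default, not a recommendation):

```go
package main

import "fmt"

func main() {
	const sliceHeaderBytes = 24 // ptr + len + cap per []byte slot on 64-bit Go
	const oldBufferSize = 100_000_000
	const newBufferSize = 10_000 // illustrative bounded default

	// Channel backing storage alone, before a single payload byte is stored.
	fmt.Printf("old queue headers alone: %.1f GB\n",
		float64(oldBufferSize*sliceHeaderBytes)/(1<<30))
	fmt.Printf("new queue headers alone: %.1f MB\n",
		float64(newBufferSize*sliceHeaderBytes)/(1<<20))
}
```

With average 24-byte payloads (`key=value` from the bench tool) the real footprint at full queue roughly doubles again.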
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
apps/testapp/cmd/run.go (1)
140-149: ⚠️ Potential issue | 🟠 Major — Avoid hardcoding a 1,000,000 sequencer queue cap.
Line 146 sets an extremely high `maxQueueSize` with no local validation/clamping. Given that `BatchQueue` grows via `append` and `Prepend` bypasses the cap, this can cause severe memory pressure/OOM during stress or recovery paths. Please make this value configurable and enforce a bounded limit before passing it to `single.NewSequencer`.
Suggested direction:

```diff
-	sequencer, err := single.NewSequencer(
+	// TODO: source from config/flag with validation instead of hardcoding.
+	maxQueueSize := 10_000
+	sequencer, err := single.NewSequencer(
 		logger,
 		datastore,
 		daClient,
 		nodeConfig,
 		[]byte(genesis.ChainID),
-		1_000_000,
+		maxQueueSize,
 		genesis,
 		executor,
 	)
```

As per coding guidelines, "Consider memory allocation in hot paths of Go code."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/testapp/cmd/run.go` around lines 140 - 149, The call to single.NewSequencer currently passes a hardcoded maxQueueSize (1_000_000) which risks unbounded memory use; make this value configurable (e.g., add a CLI flag or a field on nodeConfig such as MaxSequencerQueueSize) and validate/clamp it to a safe upper bound (e.g., MAX_ALLOWED_QUEUE_SIZE constant) before calling single.NewSequencer in run.go; ensure the clamping logic is applied where sequencer is constructed so both normal append and any Prepend/BatchQueue paths cannot exceed the enforced limit.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@apps/testapp/cmd/run.go`:
- Around line 140-149: The call to single.NewSequencer currently passes a
hardcoded maxQueueSize (1_000_000) which risks unbounded memory use; make this
value configurable (e.g., add a CLI flag or a field on nodeConfig such as
MaxSequencerQueueSize) and validate/clamp it to a safe upper bound (e.g.,
MAX_ALLOWED_QUEUE_SIZE constant) before calling single.NewSequencer in run.go;
ensure the clamping logic is applied where sequencer is constructed so both
normal append and any Prepend/BatchQueue paths cannot exceed the enforced limit.
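The validate/clamp step described above is a few lines; the cap and default below are illustrative placeholders, and the real bound should come from profiling rather than this sketch:

```go
package main

import "fmt"

// maxAllowedQueueSize is an illustrative hard cap for the sequencer queue.
const maxAllowedQueueSize = 100_000

// clampQueueSize validates a configured queue size before it is handed to
// single.NewSequencer, as the review comment suggests.
func clampQueueSize(configured int) int {
	if configured <= 0 {
		return 10_000 // sane default when unset or invalid
	}
	if configured > maxAllowedQueueSize {
		return maxAllowedQueueSize
	}
	return configured
}

func main() {
	fmt.Println(clampQueueSize(1_000_000)) // 100000: hardcoded value gets clamped
	fmt.Println(clampQueueSize(0))         // 10000: unset falls back to default
	fmt.Println(clampQueueSize(50_000))    // 50000: in-range value passes through
}
```

The clamp alone does not close the `Prepend` bypass; that path would also need to respect the same bound inside `BatchQueue`.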
Improve the command to stress-test evnode via testapp.
Summary by CodeRabbit
New Features
Documentation