feat: initial pass at a single base binary#1786
Conversation
🟡 Heimdall Review Status
|
|
The latest updates on your projects. Learn more about Vercel for GitHub. |
bin/base/src/node.rs
Outdated
| let engine_config = EngineConfig { | ||
| config: Arc::new(rollup_config.clone()), | ||
| l2_url: auth_http_url, | ||
| l2_jwt_secret: auth_jwt.secret().clone(), |
There was a problem hiding this comment.
JwtSecret is Copy (it's a [u8; 32] wrapper), so .clone() is misleading — but more importantly, this line moves auth_jwt.secret() by value (via .clone()) even though JwtSecret is Copy. The real issue is that auth_jwt is consumed here conceptually, but the InMemoryAuthJwt struct (and its _file: NamedTempFile) must remain alive for the entire lifetime of the process to keep the temp file on disk. If auth_jwt were accidentally dropped (e.g. by a future refactor), the OS could reclaim the temp file and the auth handshake would fail silently. Consider adding a comment documenting this invariant, or hold the InMemoryAuthJwt explicitly until shutdown.
5206771 to
8cf0548
Compare
|
|
||
| fn create_in_memory_auth_jwt() -> eyre::Result<InMemoryAuthJwt> { | ||
| let secret = JwtSecret::random(); | ||
| let mut file = NamedTempFile::new().wrap_err("failed to create auth JWT temp file")?; |
There was a problem hiding this comment.
tempfile is a production dependency used to write the JWT secret to a temporary file for the auth IPC handshake. If the OS reclaims the temp file (e.g. via /tmp reaping on long-running nodes), the auth path becomes stale and reconnection will fail silently.
Consider writing the JWT file into the node's datadir instead (which is already under the operator's control) rather than relying on the OS temp directory. This also avoids the file permission concern flagged separately.
8cf0548 to
c8b8808
Compare
| let rollup_config = resolved.rollup_config; | ||
| let engine_config = EngineConfig { | ||
| config: Arc::new(rollup_config.clone()), |
There was a problem hiding this comment.
rollup_config is cloned at line 147 to wrap in Arc for EngineConfig, then the original is moved into RollupNodeBuilder::new() at line 155, which wraps it in another Arc inside build(). Consider wrapping in Arc once and sharing:
let rollup_config = Arc::new(resolved.rollup_config);
let engine_config = EngineConfig {
config: Arc::clone(&rollup_config),
// ...
};This requires RollupNodeBuilder::new to accept Arc<RollupConfig> instead of RollupConfig (or add a builder method for it), but avoids cloning the full config struct.
| match request_execution_shutdown(node_exit_future.as_mut(), execution_shutdown) | ||
| .await | ||
| { | ||
| Ok(el_exit) => { | ||
| if let Err(ref e) = el_exit { | ||
| warn!(target: "base", error = %e, "Execution layer exited with error during shutdown"); | ||
| } | ||
| } | ||
| Err(e) => { | ||
| warn!(target: "base", error = %e, "Execution layer shutdown mechanism failed"); | ||
| } | ||
| } |
There was a problem hiding this comment.
In the ExitReason::Consensus branch, the EL shutdown result is logged but never propagated. If the EL exits with an error during shutdown (line 244) or the shutdown mechanism itself fails (line 248), only a warn! is emitted. The function then proceeds to check consensus_exit_is_unexpected and potentially returns Ok(()) even though the EL failed.
This means operators won't get a non-zero exit code when the EL has a critical failure during CL-triggered shutdown. Consider at least propagating the EL error when the consensus exited cleanly:
Ok(el_exit) => {
el_exit?; // propagate EL errors
}Or store it and return whichever error is most actionable.
| let filter = tracing_subscriber::EnvFilter::builder() | ||
| .with_default_directive(log_config.global_level.into()) | ||
| .from_env_lossy() | ||
| .add_directive("discv5=error".parse().expect("valid directive")); |
There was a problem hiding this comment.
"discv5=error" is added as a directive here AND also hardcoded in LogConfig::init_tracing_subscriber() at crates/utilities/cli/src/tracing.rs:84. Since init_logging calls init_tracing_subscriber_with_layers (which skips the default init_tracing_subscriber), the duplicate here is correct — but it's fragile. If someone later switches to init_tracing_subscriber(), the directive gets doubled. Consider extracting the common noise-suppression directives into a shared helper to avoid drift.
| async move { | ||
| rollup_node | ||
| .start_with_shutdown(consensus_shutdown) | ||
| .await | ||
| .map_err(|e| eyre::eyre!("failed to start rollup node service: {e}")) |
There was a problem hiding this comment.
start_with_shutdown returns Result<(), String>, and the {e} in the format string just re-embeds the string error message into a new eyre::Report. This works, but the resulting error report will say "failed to start rollup node service: <actual error>" as a single flat string with no causal chain.
This is a consequence of the upstream Result<(), String> API — not something this PR can easily fix. But it's worth noting that when the consensus service migrates to a proper error type, this can be improved to use .wrap_err() to preserve the chain.
Review SummarySolid prototype for the unified EL+CL binary. The CLI design is clean, the shutdown orchestration is well-structured, and the IPC-based EL/CL link avoids network overhead. Findings1. rollup_config cloned to create two separate Arcs (bin/base/src/node.rs:145-155) - The config is cloned at line 147 for EngineConfig Arc, then the original is moved into RollupNodeBuilder which wraps it in another Arc. Wrap in Arc once and share. 2. EL shutdown errors swallowed in Consensus exit branch (bin/base/src/node.rs:240-251) - When the consensus layer exits first, the EL shutdown result is only logged but never propagated. Operators will not get a non-zero exit code if EL fails during CL-triggered shutdown. 3. Upstream String error API (bin/base/src/node.rs:174-178) - start_with_shutdown returns a flat String error, so map_err wrapping produces a single-layer error with no causal chain. Worth improving when the consensus service migrates to a proper error type. 4. Duplicated discv5=error noise suppression (bin/base/src/cli.rs:278) - This directive is added manually here and also hardcoded in LogConfig::init_tracing_subscriber(). The duplication is correct but fragile. Consider extracting into a shared helper. Notes (no action needed)
|
Description
Setup a prototype single
basebinary that starts both an EL and CL client in process. Initially this binary just connects them over IPC.This is a very rough first pass to get something going. There are two major things to iterate on: