[CSM-1857] Stabilize Azure detector hash_v2 with deterministic iteration #4846
Draft
dipto-truffle wants to merge 1 commit into main
ProcessData in the Azure Entra v2 detector iterated Go maps non-deterministically, causing the same credential to produce different (clientId, tenantId) pairings across scanner runs. This yielded different RawV2 values, different hash_v2 hashes, and duplicate secret rows in the database (CSM-1857 secondary issue). Iterate map keys in sorted order via slices.Sorted(maps.Keys(...)) and clone caller maps with maps.Clone before verification-driven mutations. Same inputs now always produce the same RawV2/hash_v2. Made-with: Cursor
Cursor Bugbot has reviewed your changes and found 1 potential issue.
    origClientLen := len(clientIDs)
    origTenantLen := len(tenantIDs)

    _ = ProcessData(context.Background(), secrets, clientIDs, tenantIDs, false, nil)
Caller-map-immutability test never exercises mutation code path
Low Severity
TestProcessData_DoesNotMutateCallerMaps passes verify=false, but every delete call that could mutate maps is inside the if verify block. The test always passes regardless of whether maps.Clone is used, providing false confidence that the cloning fix works. If the cloning were removed in a future refactor, this test would not catch the regression.
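The gap described above can be reproduced in isolation. The sketch below is a minimal illustration, not the detector's code: prune and mutated are hypothetical helpers, the map[string]struct{} type is an assumption, and the clone flag exists only to toggle the fix under test. It shows that a check run with verify=false passes whether or not the map is cloned, while verify=true exposes the missing clone.

```go
package main

import (
	"fmt"
	"maps"
)

// prune is a hypothetical stand-in for the pruning behavior described in the
// review comment: deletions happen only when verify is true.
// The clone flag toggles the fix under test.
func prune(ids map[string]struct{}, verify, clone bool) {
	work := ids
	if clone {
		work = maps.Clone(ids) // shallow copy; the caller's map stays untouched
	}
	if verify {
		for id := range work {
			delete(work, id) // simulates dropping an ID during verification
		}
	}
}

// mutated reports whether prune changed the caller's map.
func mutated(verify, clone bool) bool {
	ids := map[string]struct{}{"client-a": {}, "client-b": {}}
	before := len(ids)
	prune(ids, verify, clone)
	return len(ids) != before
}

func main() {
	fmt.Println(mutated(false, false)) // false: verify=false cannot detect a missing clone
	fmt.Println(mutated(true, false))  // true: verify=true exposes the mutation
	fmt.Println(mutated(true, true))   // false: cloning protects the caller's map
}
```

A regression test for the real function would presumably need verify=true plus a stubbed verification client so that the delete calls actually run; how that client is supplied is not shown in this conversation.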


Summary
ProcessData in the Azure Entra v2 detector iterated Go maps (clientSecrets, clientIds, tenantIds) non-deterministically, causing the same credential to produce different (clientId, tenantId) pairings across scanner runs. This yielded different RawV2 values, different hash_v2 hashes, and duplicate secret rows in the database (the secondary issue in CSM-1857).
Iterate map keys in sorted order via slices.Sorted(maps.Keys(...)) and clone caller maps with maps.Clone before verification-driven mutations. Same inputs now always produce the same RawV2/hash_v2.
Corresponds to trufflesecurity/thog#5936.
Test plan
- go test ./pkg/detectors/azure_entra/serviceprincipal/v2/ -v: all 8 tests pass
- TestProcessData_DeterministicRawV2 confirms identical RawV2 across 50 repeated calls with the same inputs
- TestProcessData_DoesNotMutateCallerMaps confirms caller maps are not modified
- TestProcessData_RawV2DependsOnIDCount confirms RawV2 is populated only with unambiguous IDs
- TestProcessData_SameSecretDifferentRawV2 documents the RawV2 divergence when chunk context differs
- go test ./pkg/detectors/azure_entra/...: all pass
Made with Cursor
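The repeated-call determinism check in the plan above might look roughly like the following. This is a sketch under stated assumptions, not the PR's test: buildRaw is a hypothetical stand-in for whatever produces RawV2, the comma-joined format is invented for illustration, and stable is a helper introduced here.

```go
package main

import (
	"fmt"
	"maps"
	"slices"
	"strings"
)

// buildRaw is a hypothetical stand-in for the code that produces RawV2.
// It joins the map keys in sorted order.
func buildRaw(ids map[string]struct{}) string {
	return strings.Join(slices.Sorted(maps.Keys(ids)), ",")
}

// stable reports whether n calls to buildRaw all return the same value.
func stable(ids map[string]struct{}, n int) bool {
	first := buildRaw(ids)
	for i := 1; i < n; i++ {
		if buildRaw(ids) != first {
			return false
		}
	}
	return true
}

func main() {
	ids := map[string]struct{}{"c": {}, "a": {}, "b": {}}
	fmt.Println(buildRaw(ids), stable(ids, 50)) // a,b,c true
}
```

Repeating the call many times matters because a single pair of calls over a small map can agree by chance even when iteration order is random.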
Note
Medium Risk
Changes secret result generation order and verification-side pruning behavior in the Azure Entra v2 detector, which could affect which (tenantId, clientId) pairing is chosen and thus hash_v2/dedupe behavior. Logic is well-scoped and covered by new determinism and non-mutation tests.
Overview
Stabilizes Azure Entra Service Principal v2 detection output by making ProcessData iterate clientSecrets, clientIds, and tenantIds in sorted key order and by cloning the input maps before verification-driven deletions.
Adds unit tests to lock in deterministic RawV2 generation across repeated runs, document when RawV2 is omitted due to ambiguous IDs, and ensure caller-provided maps are not mutated.
Written by Cursor Bugbot for commit 3461a39.