perf: Element-wise comparison only for tolerance-requiring data types by MariusMerkleQC · Pull Request #26 · Quantco/diffly

Marius Merkle (MariusMerkleQC) · 2026-03-27T22:44:19Z

Motivation

Changes

Introduce a function _needs_element_wise_comparison that checks whether element-wise comparison is required. This is the case for:

(1) float vs. numeric columns -> absolute and relative tolerances apply (-> _is_float_numeric_pair())
(2) temporal columns -> absolute temporal tolerance applies (-> _is_temporal_pair())
(3) two different enum types
(4) enum vs. categorical columns

In all other cases a naive comparison suffices, so the shortcut is taken whenever the helper returns False. This avoids the expensive _compare_sequence_columns(). The benchmark test shows the performance improvement.

codecov · 2026-03-27T22:45:42Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 100.00%. Comparing base (ff8439c) to head (8dfe919).
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files

@@            Coverage Diff            @@
##              main       #26   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           10        10           
  Lines          758       776   +18     
=========================================
+ Hits           758       776   +18


Copilot

Pull request overview

This PR optimizes condition_equal_columns for nested list/array columns by avoiding the expensive element-wise comparison path when tolerances/special handling aren’t needed, and updates the performance benchmark accordingly.

Changes:

Add _needs_element_wise_comparison() (plus helpers) to decide when list/array columns require element-wise comparison.
Shortcut list/array comparisons to eq_missing() when element-wise handling is deemed unnecessary.
Update the performance test to assert comparable performance for list<i64> comparisons.
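To illustrate why the shortcut is safe, here is a minimal sketch on plain Python lists. It is not the library's implementation: `naive_equal` only mimics the missing-aware semantics of `eq_missing()` (a missing value equals a missing value), `element_wise_equal` is a hypothetical tolerance-aware comparison, and the default tolerances are assumptions chosen for the example.

```python
import math
from typing import List, Optional

Row = Optional[List[Optional[float]]]


def naive_equal(left: Row, right: Row) -> bool:
    # Missing-aware equality: None == None is True in Python, also inside lists.
    return left == right


def element_wise_equal(
    left: Row, right: Row, abs_tol: float = 1e-8, rel_tol: float = 1e-5
) -> bool:
    # Tolerance-aware comparison; the default tolerances are assumptions.
    if left is None or right is None:
        return left is None and right is None
    if len(left) != len(right):
        return False
    for a, b in zip(left, right):
        if a is None or b is None:
            if not (a is None and b is None):
                return False
        elif not math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol):
            return False
    return True


def lists_equal(left: Row, right: Row, needs_element_wise: bool) -> bool:
    # Take the cheap path unless the dtype gate asked for element-wise handling.
    if needs_element_wise:
        return element_wise_equal(left, right)
    return naive_equal(left, right)
```

For integer lists both paths agree, so skipping the element-wise loop changes nothing but the cost. For floats, `naive_equal([1.0], [1.0 + 1e-9])` is False while the tolerance-aware path accepts the pair, which is why those dtypes still need the slow path.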

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| `diffly/_conditions.py` | Introduces dtype-based gating to skip element-wise list/array comparisons unless tolerances/special handling are needed. |
| `tests/test_performance.py` | Updates benchmark expectations to ensure the optimized path is not significantly slower than direct `eq_missing()`. |


diffly/_conditions.py

Oliver Borchert (borchero)

Nice, thanks

diffly/_conditions.py

Marius Merkle (MariusMerkleQC) added 4 commits March 27, 2026 23:27

test: Benchmark slowdown of element-wise list comparison

ad7237b

readd fixtures

f4286e9

perf: Element-wise comparison only for tolerance-requiring data types

0618ad2

perf: Element-wise comparison only for tolerance-requiring data types

3e0d66d

Marius Merkle (MariusMerkleQC) self-assigned this Mar 27, 2026

github-actions bot added the performance label Mar 27, 2026

Marius Merkle (MariusMerkleQC) changed the base branch from main to benchmark March 27, 2026 22:44

Marius Merkle (MariusMerkleQC) mentioned this pull request Mar 27, 2026

feat: Tolerances for inner lists and arrays #21

Merged

Marius Merkle (MariusMerkleQC) mentioned this pull request Mar 27, 2026

test: Benchmark slowdown of element-wise list comparison #25

Merged

Marius Merkle (MariusMerkleQC) added 4 commits March 27, 2026 23:57

feedback copilot

4ae5323

fix

8e0e64f

Merge branch 'benchmark' into optimize

a4f0225

merge base

77cd4da

Marius Merkle (MariusMerkleQC) requested a review from Copilot March 27, 2026 23:02

Copilot started reviewing on behalf of Marius Merkle (MariusMerkleQC) March 27, 2026 23:03

Copilot AI reviewed Mar 27, 2026

diffly/_conditions.py (resolved)

diffly/_conditions.py (outdated, resolved)

diffly/_conditions.py (resolved)

Marius Merkle (MariusMerkleQC) added 2 commits March 28, 2026 00:30

feedback copilot

9815b75

feedback copilot

83e79b4

Marius Merkle (MariusMerkleQC) marked this pull request as ready for review March 27, 2026 23:45

Marius Merkle (MariusMerkleQC) requested review from EgeKaraismailogluQC and Oliver Borchert (borchero) as code owners March 27, 2026 23:45

Marius Merkle (MariusMerkleQC) commented Mar 28, 2026

diffly/_conditions.py (resolved)

Marius Merkle (MariusMerkleQC) and others added 3 commits March 28, 2026 08:16

add test

55766c4

add test for struct columns

ef9aa25

Merge branch 'benchmark' into optimize

9fd6079

Oliver Borchert (borchero) approved these changes Mar 30, 2026

diffly/_conditions.py (outdated, resolved)

Marius Merkle (MariusMerkleQC) added 2 commits March 30, 2026 13:28

drop struct performance test

c6e108f

Merge branch 'benchmark' into optimize

6e9fed2

Base automatically changed from benchmark to main March 30, 2026 11:30

Marius Merkle (MariusMerkleQC) added 3 commits March 30, 2026 13:31

feedback OB

e30f139

merge base

53567bf

fix merge

8dfe919

Marius Merkle (MariusMerkleQC) merged commit a6992fa into main Mar 30, 2026
16 checks passed

Marius Merkle (MariusMerkleQC) deleted the optimize branch March 30, 2026 11:41

Marius Merkle (MariusMerkleQC) merged 18 commits into main from optimize


3 participants
