Draft
Add arrow_convertor.py with convert_arrow_table_to_dataframe(), which turns the GoodData /binary execution endpoint response into a pandas DataFrame matching the JSON-path output (MultiIndex rows/columns, totals uppercased, transposition handled via x-gdc-view-v1.isTransposed). Wire it into DataFrameFactory.for_exec_def_arrow() replacing the previous .to_pandas() stub.
Add compute_row_totals_indexes() which derives row_totals_indexes from the Arrow table metadata and execution dimension headers, matching the DataFrameMetadata produced by the JSON path. Update for_exec_def_arrow() to return (DataFrame, DataFrameMetadata) for API parity with for_exec_def(), making the two paths interchangeable. primary_labels_from_index/columns are empty dicts as the Arrow path does not support use_primary_labels_in_attributes.
- Compute primary attribute labels from Arrow table metadata and populate DataFrameMetadata.primary_labels_from_index/columns (was returning empty dicts previously)
- Add DataFrameFactory.for_arrow_table() for callers who hold a pa.Table already (raw export REST path, future Flight RPC); accepts optional BareExecutionResponse for accurate row_totals_indexes
- Make DataFrameMetadata.execution_response Optional to support the no-execution-response path
- Export convert_arrow_table_to_dataframe from gooddata_pandas.__init__
Add a thin method that polls the raw export endpoint for an already-submitted export and returns Arrow IPC bytes. Reuses the existing _get_exported_content() polling loop; the caller is responsible for submitting the execution and export request.
- read_result_arrow(): drain response into BytesIO before releasing the connection, eliminating the fragile try/finally ordering
- Remove unused _FULL_TYPES_MAPPER dead code from arrow_convertor.py
- Remove stale docstring claiming primary_labels are always empty dicts (compute_primary_labels() now populates them)
- Add for_arrow_table() to DataFrameFactory class docstring
- convert_arrow_table_to_dataframe stub in __init__.py now raises ImportError with a helpful pyarrow install hint instead of being silently absent
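The "drain into BytesIO before releasing the connection" idea from the first bullet can be sketched as follows. This is a minimal illustration, not the SDK's code: `FakeResponse` and `drain_to_buffer` are hypothetical names, and the `read()` / `release_conn()` methods are assumed to resemble a typical streaming HTTP response object.

```python
import io


class FakeResponse:
    """Hypothetical stand-in for a streaming HTTP response (not the SDK's type)."""

    def __init__(self, payload: bytes) -> None:
        self._stream = io.BytesIO(payload)
        self.released = False

    def read(self, amt: int = -1) -> bytes:
        # Reading after release is exactly the failure mode being avoided.
        if self.released:
            raise RuntimeError("connection already released")
        return self._stream.read(amt)

    def release_conn(self) -> None:
        self.released = True


def drain_to_buffer(response, chunk_size: int = 64 * 1024) -> io.BytesIO:
    """Copy the whole body into memory, then release the connection."""
    buffer = io.BytesIO()
    try:
        while True:
            chunk = response.read(chunk_size)
            if not chunk:
                break
            buffer.write(chunk)
    finally:
        # The only cleanup left is a single release call.
        response.release_conn()
    buffer.seek(0)  # rewind so a reader starts at the first byte
    return buffer
```

Once the body lives in the buffer, whatever parses the Arrow IPC bytes no longer touches the connection, so it does not matter when the connection is released relative to parsing. The trade-off is that the full payload is held in memory.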
[project.optional-dependencies] was incorrectly placed between dependencies and classifiers in both gooddata-pandas and gooddata-sdk pyproject.toml files, causing TOML to parse classifiers as belonging to optional-dependencies instead of [project]. This caused hatchling to fail with "Dependency of option 'classifiers' is invalid". Regenerate uv.lock after fixing the TOML structure.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The keyword form (executionResponse=) breaks the generated API client at runtime — __init__ expects execution_response as a positional arg. Revert to the original call and suppress the ty false-positive on the __new__ signature with type: ignore[invalid-argument-type].
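One way a camelCase keyword can fail at runtime is sketched below. The real generated client's signature is not shown in this conversation, so `GeneratedResponse` is a hypothetical stand-in that assumes a required `execution_response` parameter followed by `*args, **kwargs`.

```python
class GeneratedResponse:
    """Hypothetical stand-in for a generated API model (not the real client class)."""

    def __init__(self, execution_response, *args, **kwargs):
        # Assumed shape: one required snake_case parameter, everything else optional.
        self.execution_response = execution_response
        self.extra = kwargs


def try_build(**kwargs):
    """Return the constructed object, or the TypeError raised while building it."""
    try:
        return GeneratedResponse(**kwargs)
    except TypeError as exc:
        return exc


# Positional call: the required parameter is bound as expected.
ok = GeneratedResponse({"dimensions": []})

# camelCase keyword: it falls into **kwargs, so the required parameter is never bound.
failed = try_build(executionResponse={"dimensions": []})
```

Under this assumed signature, `failed` holds a `TypeError` about the missing `execution_response` argument, while the positional call succeeds.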
Codecov Report
❌ Patch coverage is
@@ Coverage Diff @@
## master #1489 +/- ##
==========================================
+ Coverage 77.32% 77.52% +0.19%
==========================================
Files 227 229 +2
Lines 14768 15032 +264
==========================================
+ Hits 11420 11653 +233
- Misses 3348 3379 +31
gooddata_api_client is installed as a wheel in CI so ty cannot resolve its .pyi stubs, making the suppressions both ineffective and unnecessary there. Remove them to keep the code clean.
… paths
- Parametrize test_compute_primary_labels against all 36 fixtures: verifies compute_primary_labels output against ground truth stored in meta.json, covering _compute_primary_labels_from_inline (identity branch) and _compute_primary_labels_from_fields across all cases.
- Add test_primary_labels_from_inline_separate_column: exercises the branch where primaryLabelId != labelId and a separate column exists.
- Add test_primary_labels_from_inline_fallback_identity: exercises the fallback branch where the primary label column is absent.
- Add test_primary_labels_from_fields_skips_non_string: exercises the continue branch for non-string label_values / primary_label_values.
- Add test_for_arrow_table_without_execution_response: tests DataFrameFactory.for_arrow_table with no execution_response, covering the no-server-needed path and verifying empty metadata fields.
- Add test_arrow_converter_unknown_types_mapper: tests the ValueError raised for unrecognised types_mapper values.
hkad98 reviewed Mar 30, 2026
]

[project.optional-dependencies]
arrow = ["pyarrow>=16.1.0"]
Contributor
Nitpick: this is a new dependency. Consider setting a higher minimum version, e.g. pyarrow>=23.0.1.
No description provided.