-
Notifications
You must be signed in to change notification settings - Fork 4
Small fixes to CSV df and Configuration token #599
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
# most users don't supply their own df, so this is mostly a sanity check | ||
# for when an advanced user has done filtering and have a IntervalIndex. | ||
if not isinstance(self.df.index, pd.RangeIndex): | ||
raise Exception( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
raise Exception( | |
raise TypeError( |
else: | ||
if df is not None: | ||
raise Exception("Dask requires a csv passed as a filename") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
raise Exception("Dask requires a csv passed as a filename") | |
raise ValueError("Dask requires a csv passed as a filename") |
@@ -75,7 +75,8 @@ def get_configurations(file: str): | |||
username=config["username"], | |||
password=config["password"], | |||
use_rest=config["use_rest"], | |||
use_ssl=config["use_ssl"]) | |||
use_ssl=config["use_ssl"], | |||
token=config["token"] if "token" in config else None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
token=config["token"] if "token" in config else None) | |
token=config.get("token"), | |
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds a sanity check for DataFrame indexes in the CSV parser and ensures the token
field is preserved when loading configurations from file.
- Enforce that CSVParser only accepts pandas DataFrames with a RangeIndex.
- Pass the
token
value into the Configuration constructor when reading from the CLI config file. - Improve error handling around Dask input for CSVParser.
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
File | Description |
---|---|
aperturedb/cli/configure.py | Include token argument when constructing Configuration from config. |
aperturedb/CSVParser.py | Validate DataFrame index is a RangeIndex; raise exceptions for invalid CSV/Dask inputs. |
Comments suppressed due to low confidence (2)
aperturedb/CSVParser.py:69
- Add unit tests to verify that providing a non-RangeIndex DataFrame raises the expected exception in CSVParser.
if not isinstance(self.df.index, pd.RangeIndex):
aperturedb/cli/configure.py:79
- Add tests for loading and saving configurations to confirm the
token
is correctly carried through and defaults toNone
when absent.
token=config["token"] if "token" in config else None)
@@ -75,7 +75,8 @@ def get_configurations(file: str): | |||
username=config["username"], | |||
password=config["password"], | |||
use_rest=config["use_rest"], | |||
use_ssl=config["use_ssl"]) | |||
use_ssl=config["use_ssl"], | |||
token=config["token"] if "token" in config else None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] Consider using config.get("token")
to simplify this expression and default to None
when the key is missing.
token=config["token"] if "token" in config else None) | |
token=config.get("token", None)) |
Copilot uses AI. Check for mistakes.
# for when an advanced user has done filtering and have a IntervalIndex. | ||
if not isinstance(self.df.index, pd.RangeIndex): | ||
raise Exception( | ||
f"CSVParser requires a RangeIndex. the supplied DataFrame has a {type(self.df.index)} index.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use a more specific exception type like ValueError
and capitalize the sentence in the message (e.g., "The supplied DataFrame...").
Copilot uses AI. Check for mistakes.
else: | ||
if df is not None: | ||
raise Exception("Dask requires a csv passed as a filename") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replace the generic Exception
with ValueError
and clarify the message, for example: "Dask mode requires a CSV filename; DataFrame inputs are not supported."
raise Exception("Dask requires a csv passed as a filename") | |
raise ValueError("Dask mode requires a CSV filename; DataFrame inputs are not supported.") |
Copilot uses AI. Check for mistakes.
CSVLoader actually wants an RangeIndex, since everything wants to use
start
and only RangeIndex has it. It's better to catch it earlier.I missed passing token to Configuration when loading from the config file, so if you save a aperturedb key, it will lose the token when it loads.