Description
Issue
CSV files may contain data that look like numbers but should not be treated as such. For example, a column of US ZIP codes might contain 02210, which should not be treated as the number 2210.
Data Wrangler currently uses pandas' default behavior for inferring column types from CSV files. So in the above example, pandas will infer the raw value 02210 as an integer and convert it to 2210 in the loaded dataframe.
Workaround
This can be worked around by directly editing the code in the loading step once the data has first been loaded, adding the argument dtype={'<column name here>':'S'} so that the affected column is interpreted as a string.
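A minimal sketch of the effect, assuming the loading step calls pandas' read_csv (the issue does not show the generated code). The column name zip and the inline sample data are hypothetical, and the sketch passes str as the dtype in place of 'S'; both ask pandas to keep the column as text.

```python
import io

import pandas as pd

# Hypothetical sample data; the column name "zip" is only for illustration.
csv_text = "name,zip\nAlice,02210\nBob,10001\n"

# Default inference: the ZIP column is parsed as integers,
# so the leading zero of 02210 is lost.
inferred = pd.read_csv(io.StringIO(csv_text))

# Explicit dtype for the affected column: values are kept as text,
# so the leading zero survives.
as_text = pd.read_csv(io.StringIO(csv_text), dtype={"zip": str})

default_first_zip = inferred.loc[0, "zip"]
text_first_zip = as_text.loc[0, "zip"]

print(default_first_zip)
print(text_first_zip)
```

Only the listed column is affected; other columns still go through the default type inference.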
Feature request
This should not require directly modifying the code. There should be a simpler way to configure column types from the UI, and it should also be accessible from Viewing mode.