String columns for labels are not identified as discrete when dtype is `object`

In the following example, using `bulk_labels` from the `.obs` attribute works fine because the labels are correctly identified as categorical.

```
import scanpy as sc
from concordex.utils._labels import Labels

# Categorical labels
ad = sc.datasets.pbmc68k_reduced()
labels = Labels("bulk_labels")
labels.extract(ad)
print(labels.labeltype)
```


...but if we update the column so that the dtype is `object`, the labels are incorrectly described as continuous:
```
# Object labels
ad.obs['bulk_labels'] = ad.obs['bulk_labels'].astype(object)
labels = Labels("bulk_labels")
labels.extract(ad)
print(labels.labeltype)
```

This will almost certainly be a problem if a pandas reader (e.g. `pd.read_csv`) is used to read in metadata from a file. I'm wondering whether I should do the conversion internally with a warning, or stop with an error. I'm guessing that continuous columns containing string representations of NULL/NaN will also be read in as `object`, so internal conversion would be the wrong thing to do in that case. We could implement some of the R logic here and do a proper "guess" of the column type, but I'd like to avoid checking each item of the column to confirm whether it is an object or a string.
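To make the trade-off concrete, here is a rough sketch of what such a "guess" could look like. It is not part of concordex: `guess_labeltype` and its `min_numeric_frac` threshold are hypothetical names I made up for illustration, and the return values `"discrete"`/`"continuous"` are just placeholders. It leans on `pd.to_numeric(..., errors="coerce")`, so there is no Python-level loop over items, although pandas does still inspect each value internally.

```python
import io

import pandas as pd
from pandas.api.types import is_bool_dtype, is_numeric_dtype


def guess_labeltype(col: pd.Series, min_numeric_frac: float = 0.5) -> str:
    """Hypothetical helper: guess whether a column is discrete or continuous.

    `min_numeric_frac` is an assumed cutoff, not a value taken from concordex.
    """
    # Categorical and boolean columns are treated as discrete.
    if isinstance(col.dtype, pd.CategoricalDtype) or is_bool_dtype(col):
        return "discrete"
    # Genuinely numeric dtypes are treated as continuous.
    if is_numeric_dtype(col):
        return "continuous"

    # Remaining case: object/string dtype. Try a vectorized numeric parse;
    # values that cannot be parsed become NaN instead of raising.
    non_null = col.dropna()
    if non_null.empty:
        return "discrete"
    parsed = pd.to_numeric(non_null, errors="coerce")
    numeric_frac = parsed.notna().mean()
    return "continuous" if numeric_frac >= min_numeric_frac else "discrete"


# Metadata as it might arrive from a file: `cluster` holds string labels,
# `score` is numeric apart from a non-standard missing-value marker.
csv = io.StringIO(
    "cell,cluster,score\n"
    "c1,B cell,0.12\n"
    "c2,T cell,missing\n"
    "c3,B cell,0.53\n"
    "c4,NK cell,0.97\n"
)
meta = pd.read_csv(csv, index_col="cell")

for name in meta.columns:
    print(name, meta[name].dtype, guess_labeltype(meta[name]))
```

With something like this, string labels would be reported as discrete while a mostly numeric column with a stray marker would still be reported as continuous. Whether to then convert silently, warn, or raise is a separate decision.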
