Convert categorical data to labels

LabelEncoder is a scikit-learn transformer with the usual fit() and transform() methods: fit() learns the set of distinct labels, and transform() maps each label to an integer code.

Imports

import pandas as pd
import numpy as np
from sklearn.preprocessing import LabelEncoder

Create data

data = {'label': ['dog', 'cat', 'catdog', 'dog', 'catdog']}
df = pd.DataFrame(data, columns=['label'])
df

    label
0     dog
1     cat
2  catdog
3     dog
4  catdog

Initialise LabelEncoder

le = LabelEncoder()

Fit LabelEncoder

le.fit(df['label'])
LabelEncoder()

View labels

list(le.classes_)
['cat', 'catdog', 'dog']

Transform label into number

le.transform(df['label'])
array([2, 0, 1, 2, 1])
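
The integer codes follow the sorted order of classes_, so 'cat' becomes 0, 'catdog' becomes 1, and 'dog' becomes 2. As a minimal sketch, the fit and transform steps can be combined with fit_transform(), and the label-to-code mapping can be rebuilt from classes_. The names labels, codes, and mapping are illustrative only and do not appear in the original example.

```python
from sklearn.preprocessing import LabelEncoder

# Same labels as the DataFrame column above (illustrative variable name).
labels = ['dog', 'cat', 'catdog', 'dog', 'catdog']

le = LabelEncoder()

# fit_transform() learns the classes and encodes the labels in one call.
codes = le.fit_transform(labels)

# classes_ is sorted; a label's position in it is its integer code.
mapping = {str(label): code for code, label in enumerate(le.classes_)}

print(codes.tolist())  # [2, 0, 1, 2, 1]
print(mapping)         # {'cat': 0, 'catdog': 1, 'dog': 2}
```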

Transform number back into label

list(le.inverse_transform([0, 1, 2]))
['cat', 'catdog', 'dog']
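
One thing to watch for: a fitted encoder only knows the labels it saw during fit(). The sketch below, which uses a made-up label 'bird' that is not in the original data, shows that transform() raises a ValueError for an unseen label. The flag unseen_raised is a hypothetical name used only for illustration.

```python
from sklearn.preprocessing import LabelEncoder

le = LabelEncoder()
le.fit(['dog', 'cat', 'catdog'])

# 'bird' was not present during fit(), so it has no integer code.
try:
    le.transform(['bird'])
    unseen_raised = False
except ValueError as err:
    unseen_raised = True
    print(err)
```

If new categories can appear at prediction time, they have to be handled before calling transform(), for example by refitting on the full set of labels.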
