Friday, February 28, 2020

Assign bin labels to new values during model inference

In model development:

import pandas as pd
import numpy as np
np.random.seed(42)

bins = [0, 10, 15, 20, 25, 30, np.inf]
labels = bins[1:]
ages = list(range(5, 90, 5))
df = pd.DataFrame({"user_age": ages})
df["user_age_bin"] = pd.cut(df["user_age"], bins=bins, labels=False)

# sort by age 
print(df.sort_values('user_age'))


In production, I will need to put individual age values to its corresponding bins. Here's how to do it:

# a new age value
new_age=30

# use this right=True and '-1' trick to make the bins match
print(np.digitize(new_age, bins=bins, right=True) -1)

my-alpine and docker-compose.yml

 ``` version: '1' services:     man:       build: .       image: my-alpine:latest   ```  Dockerfile: ``` FROM alpine:latest ENV PYTH...