Monday, October 9, 2017

Sanitize U.S. State Names


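# load the stringdist package, which provides amatch()
library(stringdist)

# load data: a pipe-delimited file of full state names and abbreviations
StateData = read.csv('65States.pip', sep="|", col.names = c("FullName", "Abbr"))
StateFullName = toupper(StateData$FullName)
StateAbbr = as.vector(StateData$Abbr)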

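# define a function: replace full state names (allowing one typo) with abbreviations
sanitizeState = function(inputcol, StateFullName, StateAbbr){
  inputcol = as.character(inputcol)  # avoid factor-level errors on assignment
  # index of the closest full name within distance 1, NA if none
  idx = amatch(toupper(inputcol), StateFullName, maxDist=1)
  matched = !is.na(idx)
  inputcol[matched] = StateAbbr[idx[matched]]
  return (inputcol)
}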

# use the function on the State column of a data frame df
df$State = sanitizeState(df$State, StateFullName, StateAbbr)
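To try the idea without the data file, here is a minimal self-contained sketch that swaps the pipe-delimited file for a tiny in-memory lookup table. The three-state table, the `raw` vector, and the `cleanState` helper name are illustrative assumptions, not part of the original data, and the sketch assumes the `stringdist` package is installed. Inputs are upper-cased before matching so that the lookup ignores case.

```r
library(stringdist)  # provides amatch() for approximate matching

# Hypothetical lookup table standing in for the pipe-delimited file
lookup = data.frame(
  FullName = c("CALIFORNIA", "TEXAS", "NEW YORK"),
  Abbr     = c("CA", "TX", "NY"),
  stringsAsFactors = FALSE
)

# Hypothetical helper that mirrors the approach of sanitizeState
cleanState = function(x, fullNames, abbrs) {
  x = as.character(x)
  # position of the closest full name within distance 1, NA if none
  idx = amatch(toupper(x), fullNames, maxDist = 1)
  hit = !is.na(idx)
  x[hit] = abbrs[idx[hit]]  # unmatched values are left untouched
  x
}

# Made-up inputs: a clean name, two typos, an abbreviation, a non-state
raw = c("California", "Texsa", "NY", "New Yrok", "Ontario")
cleaned = cleanState(raw, lookup$FullName, lookup$Abbr)
print(cleaned)
```

Values that are already abbreviations, or that are not within distance 1 of any full name, pass through unchanged, so the function can be run repeatedly on the same column.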
