Monday, February 26, 2018

Find the speed limiting part of your Python code

Say you have a Python script, 'simple.py', like this:
import time
def func_A():
    time.sleep(1)
def func_B():
    time.sleep(10)
    
if __name__ == '__main__':
    func_A()
    func_B()
and you want to find out which part takes the longest to run. You can profile it with cProfile:
python -m cProfile simple.py
However, the raw output is hard to read.
A better way is to use 'cprofilev', which you can install from its GitHub page (https://github.com/ymichael/cprofilev).
python -m cprofilev simple.py
Output would look like:
[cProfileV]: cProfile output available at http://127.0.0.1:4000
Just go to http://127.0.0.1:4000 and all the information is right there waiting for you.
## https://github.com/ymichael/cprofilev
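If installing an extra tool is not an option, the standard-library cProfile and pstats modules can already produce a shorter, sorted report. The following is a minimal sketch, not part of the original post: it reuses func_A and func_B from the script above with shorter sleeps, and the main() wrapper and variable names are made up for illustration.

```python
import cProfile
import io
import pstats
import time


def func_A():
    time.sleep(0.01)  # shortened stand-in for the 1-second sleep


def func_B():
    time.sleep(0.1)  # shortened stand-in for the 10-second sleep


def main():
    func_A()
    func_B()


# Profile only the call to main().
profiler = cProfile.Profile()
profiler.enable()
main()
profiler.disable()

# Sort by cumulative time and keep only the five most expensive entries.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```

The slowest function shows up near the top of the report. From the command line, `python -m cProfile -s cumtime simple.py` gives a similar sorted view.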

Tuesday, February 6, 2018

Round to thousands | AWK

function rounding(var1){
    a = var1+500
    b = int(a/1000)
    return b
}

BEGIN{
    FS="|"
    OFS="|"
}
{
    print $1, $2, $3, rounding($4), rounding($5), rounding($6), rounding($7), rounding($8), rounding($9), rounding($10), rounding($11), rounding($12)
}
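To make it clearer what the script returns, here is a rough Python equivalent. It is only a sketch: round_row is a made-up helper name, and it assumes every row has at least 12 pipe-separated fields with fields 4 to 12 numeric (AWK would silently treat non-numeric text as 0, Python will raise instead).

```python
def rounding(value):
    """Mirror the AWK function: add 500, divide by 1000, truncate toward zero."""
    return int((value + 500) / 1000)


def round_row(line, sep="|"):
    """Keep the first three fields and round fields 4 to 12, like the AWK print line."""
    fields = line.rstrip("\n").split(sep)
    rounded = [str(rounding(float(f))) for f in fields[3:12]]
    return sep.join(fields[:3] + rounded)


row = "A|B|C|1499|1500|2499|0|999|12345|500|499|1000000"
print(round_row(row))
```

Note that the result is expressed in units of thousands, not as a multiple of 1000: 1499 becomes 1 and 1500 becomes 2.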

Monday, January 29, 2018

Restart Postgres Service

Go to:
C:\Program Files\PostgreSQL\10\bin
Run:
pg_ctl restart -D "C:\Program Files\PostgreSQL\10\data"

Thursday, January 11, 2018

A self-defined algorithm to group strings | Python

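from itertools import compress

import Levenshtein  # third-party: pip install python-Levenshtein


def group_names(x):
    # Maps every grouped name to the first name of its group.
    d1 = dict()
    for wx in list(x):  # iterate over a snapshot; x shrinks below
        if wx not in x:
            continue  # wx already landed in an earlier group
        # Names within edit distance 4 of wx (wx itself included) form one group.
        dist_list = [Levenshtein.distance(wx, w2) for w2 in x]
        indx = [d <= 4 for d in dist_list]
        sub_lst = list(compress(x, indx))
        list_new = [e for e in x if e not in sub_lst]
        x = list_new
        print(len(x))  # progress: how many names are still ungrouped
        if len(sub_lst) > 1:
            for i in sub_lst[1:]:
                d1[i] = sub_lst[0]
    return d1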


The problem is that when the input list (x) is long, it takes quite a while to finish, because each pass compares one name against every name that is still ungrouped.
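One cheap way to cut down the work, sketched here under assumptions: the edit distance between two strings is never smaller than the difference of their lengths, so pairs whose lengths differ by more than the threshold can be skipped without computing the distance. The names group_names_sketch and edit_distance are made up for this example, and edit_distance is a plain pure-Python stand-in for Levenshtein.distance so the snippet runs without extra packages.

```python
def edit_distance(a, b):
    """Dynamic-programming Levenshtein distance (stand-in for Levenshtein.distance)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def group_names_sketch(names, max_dist=4):
    """Greedy grouping: map each name to the first remaining name within max_dist."""
    remaining = list(names)
    mapping = {}
    while remaining:
        head = remaining[0]
        group = [
            n for n in remaining
            # Length check first: it is cheap and rules out hopeless pairs.
            if abs(len(n) - len(head)) <= max_dist
            and edit_distance(head, n) <= max_dist
        ]
        remaining = [n for n in remaining if n not in group]
        for other in group[1:]:
            mapping[other] = head
    return mapping


names = ["acme corp", "acme corp.", "acme corpp", "globex", "globex inc", "initech"]
print(group_names_sketch(names))
```

The length check does not change the result, it only avoids some distance computations. How much it helps depends on how varied the name lengths are.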

Wednesday, January 10, 2018

Replace multiple characters in a string | Tableau

Just nest REPLACE() calls:

replace(replace(replace(replace(replace(lower([Employer Name]), "inc",""), "-",""), "&",""), " co","")," llc","")
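For comparison, the same chain of replacements can be sketched in Python. This is only an illustration of what the nested formula does, innermost call first; clean_employer_name is a made-up name and the sample strings are invented.

```python
def clean_employer_name(name):
    """Lower-case the name, then strip each substring in order, like the nested REPLACE calls."""
    cleaned = name.lower()
    for fragment in ["inc", "-", "&", " co", " llc"]:
        cleaned = cleaned.replace(fragment, "")
    return cleaned


print(repr(clean_employer_name("Smith-Jones & Co LLC")))
print(repr(clean_employer_name("Princeton Inc")))
```

Each substring is removed wherever it occurs, so the "inc" inside "Princeton" is stripped as well. The same caveat applies to the Tableau formula.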

Monday, October 30, 2017

Concatenate multiple files with same headers (and only keep one header line in the output file)

awk '
FNR==1 && NR!=1 { while (/^<header>/) getline; }
1 {print}
' file*.txt > all.txt
Note: the /^<header>/ pattern needs to be changed to match whatever the actual header is.
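The same idea can be sketched in Python. Unlike the awk version, this sketch does not match a header pattern: it assumes the header is exactly the first line of every file. The function name concat_keep_one_header and the sample file contents are made up for the demonstration.

```python
import glob
import os
import tempfile


def concat_keep_one_header(paths, out_path):
    """Write all files to out_path, keeping the first line only from the first file."""
    with open(out_path, "w") as out:
        for index, path in enumerate(paths):
            with open(path) as src:
                header = src.readline()
                if index == 0:
                    out.write(header)  # keep the header once
                for line in src:
                    out.write(line)


# Small demonstration with two throwaway files.
with tempfile.TemporaryDirectory() as tmp:
    samples = [("file1.txt", "id|value\n1|a\n"), ("file2.txt", "id|value\n2|b\n")]
    for name, content in samples:
        with open(os.path.join(tmp, name), "w") as handle:
            handle.write(content)
    out_path = os.path.join(tmp, "all.txt")
    inputs = sorted(glob.glob(os.path.join(tmp, "file*.txt")))
    concat_keep_one_header(inputs, out_path)
    with open(out_path) as handle:
        combined = handle.read()

print(combined)
```

As with the shell version, the output file name should not match the input pattern, otherwise a second run would read its own output.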

my-alpine and docker-compose.yml

```
version: '1'
services:
    man:
      build: .
      image: my-alpine:latest
```

Dockerfile:
```
FROM alpine:latest
ENV PYTH...
```