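import pandas as pd
# Read the file lazily, 500,000 rows at a time
chunks = pd.read_table('filename', chunksize=500000)
# Concatenate the per-chunk results; (chunk == 1) is an element-wise boolean frame
df = pd.concat((chunk == 1) for chunk in chunks)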
MATLAB applications, tutorials, examples, tricks, resources,...and a little bit of everything I learned ...
Wednesday, September 20, 2017
remove duplicates
To remove duplicated rows:
awk '!seen[$0]++' <filename>
To remove rows with a duplicated field (say $1 is the ID, and every row after the first with a given ID should be dropped):
awk '!seen[$1]++' <filename>
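For readers who prefer Python, the same keep-the-first-occurrence logic can be sketched roughly as below. The function name dedupe_by_field and its parameters are hypothetical, chosen only for illustration, and the default of splitting on whitespace is an assumption meant to mirror awk's default field separator.

```python
def dedupe_by_field(lines, field=None, sep=None):
    """Keep the first line seen for each key, preserving input order.

    field=None uses the whole line as the key (like awk '!seen[$0]++');
    field=0 uses the first field (like awk '!seen[$1]++').
    """
    seen = set()
    kept = []
    for line in lines:
        if field is None:
            key = line
        else:
            parts = line.split(sep)
            # A missing field is treated as an empty key, as awk does
            key = parts[field] if field < len(parts) else ""
        if key not in seen:
            seen.add(key)
            kept.append(line)
    return kept


rows = ["a 1", "b 2", "a 3", "b 2"]
print(dedupe_by_field(rows))           # whole-line key
print(dedupe_by_field(rows, field=0))  # first-field key
```

As in the awk one-liners, the set only grows with the number of distinct keys, so memory use depends on how many unique rows or IDs the file holds.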
Tuesday, September 19, 2017
filter a file based on tokens in another file
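# Keep only the rows whose first field appears in the token list.
BEGIN {
    FS = "|"
    OFS = "|"
    # Load the first field of every token-list row into a lookup array
    while ((getline < ("Token_list_file.csv")) > 0) {
        id[$1] = $1
    }
}
{
    appid = $1
    if (appid in id) { print $0 }
}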
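The same idea (load the tokens into a lookup structure first, then stream the data file and keep the matching rows) might look like this in Python. The function name filter_by_tokens and the sample rows are made up for illustration; only the "|" separator and the first-field match come from the awk script above.

```python
def filter_by_tokens(token_lines, data_lines, sep="|"):
    """Return the data lines whose first field appears in the token list."""
    # Build the lookup set from the first field of each token row
    tokens = {line.rstrip("\n").split(sep)[0] for line in token_lines}
    # Keep data rows whose first field is one of the tokens
    return [line for line in data_lines
            if line.rstrip("\n").split(sep)[0] in tokens]


tokens = ["101|alice", "103|carol"]         # hypothetical token rows
data = ["101|x|y", "102|p|q", "103|m|n"]    # hypothetical data rows
print(filter_by_tokens(tokens, data))
```

Both arguments can be any iterable of lines, so open file objects work as well as lists; only the token list needs to fit in memory.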