Wednesday, June 10, 2009

Dealing with huge data file

Recently I downloaded a huge text file, larger than 1 GB...

Neither Notepad nor Excel could fully open the file, and even Matlab failed; it kept running out of memory. After a lot of googling I learned that Matlab can only address a limited amount of memory, so increasing the RAM or the page file size on your computer won't help.

After struggling with it for a few days, my solution ended up like this:

First, use a file splitter (such as TextMaster) to split the text file into small pieces. I set each piece to 50,000 rows, since Excel can only open about 65,000 rows anyway.
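If you don't have a splitter handy, a few lines of Matlab can do the splitting too. This is only a minimal sketch of the idea: the input name hugedata.txt and the part001.txt naming scheme are made up for illustration.

```
% Split hugedata.txt into pieces of 50,000 lines each.
% (hugedata.txt and the part-file names are hypothetical.)
rowsPerPiece = 50000;
fin = fopen('hugedata.txt', 'r');
k = 0; n = rowsPerPiece; fout = -1;
line = fgetl(fin);                  % fgetl returns -1 at end of file
while ischar(line)
    if n >= rowsPerPiece            % current piece is full: start a new one
        if fout ~= -1, fclose(fout); end
        k = k + 1;
        fout = fopen(sprintf('part%03d.txt', k), 'w');
        n = 0;
    end
    fprintf(fout, '%s\n', line);    % copy the line into the current piece
    n = n + 1;
    line = fgetl(fin);
end
if fout ~= -1, fclose(fout); end
fclose(fin);
```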

Then read the first piece, pull out the info I need and set it aside, then move on to the second piece, and so on. A for loop lets me open each file in turn, as sketched below.
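Here is roughly what that loop looks like. Again a sketch, not my exact code: it assumes the pieces are named part001.txt, part002.txt, ..., that each piece holds one number per line, and that "the info I need" is, say, the values above some threshold.

```
% Loop over the pieces, collecting the needed info from each one.
% The file names, nPieces, and the threshold filter are illustrative
% assumptions, not the real extraction logic.
nPieces = 25;          % however many pieces the splitter produced
results = [];
for k = 1:nPieces
    fid = fopen(sprintf('part%03d.txt', k), 'r');
    data = textscan(fid, '%f');        % read the piece as numbers
    fclose(fid);
    vals = data{1};
    results = [results; vals(vals > 100)];  % keep only what I need
end
```

Because only one 50,000-row piece is in memory at a time, the loop never hits the memory limit that the full 1 GB file did.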

