Instead, you can load it into memory in chunks, processing the data one chunk at a time (or, as we’ll discuss in a future article, multiple chunks in parallel). Let’s say, for example, that you want to find the largest word in a book.

So now you can read those pages, and those pages only, which is much faster. This works because the index is much smaller than the full book, so loading the index into memory to look up the relevant data is much easier. The simplest and most common way to implement indexing is by naming files in a directory: if you want the data for March 2019, you just load 2019-Mar.csv, with no need to load data for February, July, or any other month.

The easiest solution to lack of RAM is spending money to get more RAM.
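As a rough illustration of the chunking and file-name indexing ideas above, here is a minimal Python sketch using only the standard library. The function names (longest_word, load_month), the 64 KiB chunk size, and the assumption that each monthly file is a CSV with a header row are hypothetical choices for this example, not something the original article specifies.

```python
import csv
import os


def longest_word(path, chunk_size=64 * 1024):
    """Return the longest whitespace-separated word in a text file.

    The file is read chunk_size characters at a time, so only one chunk
    (plus one partial word) is held in memory at once.
    """
    longest = ""
    leftover = ""  # partial word carried over from the previous chunk
    with open(path, encoding="utf-8") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            text = leftover + chunk
            words = text.split()
            # If the chunk ends mid-word, hold that word back until the
            # next chunk shows where it really ends.
            if words and not text[-1].isspace():
                leftover = words.pop()
            else:
                leftover = ""
            for word in words:
                if len(word) > len(longest):
                    longest = word
    # The final word of the file may still be waiting in leftover.
    if len(leftover) > len(longest):
        longest = leftover
    return longest


def load_month(directory, year, month_abbr):
    """Load one month of data, using the file name as the index.

    Assumes files are named like 2019-Mar.csv and have a header row
    (an assumption made for this sketch).
    """
    path = os.path.join(directory, f"{year}-{month_abbr}.csv")
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))
```

For example, load_month("data", 2019, "Mar") would open only data/2019-Mar.csv and leave every other month's file untouched on disk. The chunked reader trades a little bookkeeping (the carried-over partial word) for a memory footprint that stays roughly constant regardless of file size.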
As said here by Itamar Turner-Trauring