Data Processing
Most laptops have 8 or 16 GB of RAM, which is enough to open a few million rows. Anything beyond that will exhaust the RAM, start "swapping" (i.e. using storage instead of memory, which is extremely slow) and eventually crash the machine.

To load and process a file of *any* size, read it in smaller chunks using the chunksize parameter in Pandas:

import pandas as pd

df_chunks = pd.read_csv('data.csv', chunksize=1000000)

df_chunks is now an iterator, and each iteration yields a DataFrame with the next million rows. You can process each chunk with a for-loop:

for chunk in df_chunks:
    chunk  # do something with each chunk
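For example, a per-chunk aggregate can be folded into a running total so the full file never has to fit in memory. A minimal sketch, assuming data.csv has a numeric column (here the hypothetical 'price'):

import pandas as pd

total = 0
row_count = 0
for chunk in pd.read_csv('data.csv', chunksize=1000000):
    total += chunk['price'].sum()  # 'price' is a hypothetical column name
    row_count += len(chunk)

print('mean price:', total / row_count)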
     
     
Pandas also offers some interesting styling options that are worth a look:


def color_negative_red(val):
    """
    Takes a scalar and returns a string with
    the CSS property 'color: red' for negative
    values, black otherwise.
    """
    color = 'red' if val < 0 else 'black'
    return 'color: %s' % color

matrix.style.applymap(color_negative_red)
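Note that Styler.applymap was deprecated in pandas 2.1 in favor of Styler.map, so on recent versions the last line becomes matrix.style.map(color_negative_red).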

Confusion Matrix 

import model_evaluation_utils as meu  # custom helper module (not on PyPI)

meu.display_confusion_matrix_pretty(true_labels=sentiment_category,
                                    predicted_labels=sentiment_category_tb,
                                    classes=['negative', 'neutral', 'positive'])
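If that helper module isn't available, a plain confusion matrix can be built with scikit-learn and pandas instead. A minimal sketch, assuming sentiment_category holds the true labels and sentiment_category_tb the predicted ones:

import pandas as pd
from sklearn.metrics import confusion_matrix

labels = ['negative', 'neutral', 'positive']
cm = confusion_matrix(sentiment_category, sentiment_category_tb, labels=labels)
print(pd.DataFrame(cm, index=labels, columns=labels))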


Multiprocessing:

# %%timeit  (uncomment as the first line of a Jupyter cell to time this block)
from multiprocessing import Pool

from langdetect import detect  # assumption: detect() comes from the langdetect package

language = df["Title"].copy()
dete = language.copy()

def langafranca(x):
    """Detect the language of the title at position x; return None on failure."""
    try:
        return detect(dete.iloc[x])
    except Exception:  # langdetect raises on empty or undetectable text
        return None
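Pool is imported above but never actually used. A minimal sketch of running the detector over every row in parallel; it assumes a fork-style process start (the default on Linux), so that workers can see the global dete:

if __name__ == '__main__':
    with Pool() as pool:  # defaults to one worker per CPU core
        results = pool.map(langafranca, range(len(dete)))
    df['language'] = results  # 'language' is a hypothetical new column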