Friday, June 3, 2022

AI/ML: Seaborn correlation map - some useful options

sns.heatmap(dataframe.corr(), annot=True);

When annot is set to True, each cell displays its correlation value.

plt.figure(figsize=(16, 6))

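When figsize is specified, the figure is created at that size (width, height in inches). Call plt.figure() before sns.heatmap() so that the heatmap is drawn into the new figure.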
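The snippets in this post assume that the usual imports and a DataFrame already exist. A minimal self-contained sketch could look like the following; the data is made up purely for illustration, and the column names x, y and z are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for scripts; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Made-up data for illustration only: three numeric columns.
rng = np.random.default_rng(0)
x = rng.normal(size=100)
dataframe = pd.DataFrame({
    "x": x,
    "y": 2 * x + rng.normal(size=100),  # positively related to x
    "z": -x + rng.normal(size=100),     # negatively related to x
})

fig = plt.figure(figsize=(16, 6))       # width, height in inches
ax = sns.heatmap(dataframe.corr(), annot=True)  # draws into the current figure
```

sns.heatmap() returns the matplotlib Axes it drew on, which is why the later snippets can call set_title() on the returned object.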

# Store heatmap object in a variable to easily access it when you want to include more features (such as title).

# Set the range of values to be displayed on the colormap from -1 to 1, and set the annotation to True to display the correlation values on the heatmap.

heatmap = sns.heatmap(dataframe.corr(), vmin=-1, vmax=1, annot=True)

# Give a title to the heatmap. Pad defines the distance of the title from the top of the heatmap.

heatmap.set_title('Correlation Heatmap', fontdict={'fontsize':12}, pad=12);

A diverging color palette, one with markedly different colors at the two ends of the value range and a pale, almost colorless midpoint, works much better with correlation heatmaps than the default colormap.

plt.figure(figsize=(16, 6))

heatmap = sns.heatmap(dataframe.corr(), vmin=-1, vmax=1, annot=True, cmap='BrBG')

heatmap.set_title('Correlation Heatmap', fontdict={'fontsize':18}, pad=12);

# save heatmap as .png file

# dpi - sets the resolution of the saved image in dots per inch

# bbox_inches - when set to 'tight', fits the saved area to the figure contents so that the labels are not cropped

plt.savefig('heatmap.png', dpi=300, bbox_inches='tight')
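
To see what "diverging" means for the 'BrBG' colormap used above, one option is to sample it at its two ends and its midpoint. This is only a rough sketch: with vmin=-1 and vmax=1, the positions 0.0, 0.5 and 1.0 of the colormap correspond to correlations of -1, 0 and +1.

```python
import matplotlib.pyplot as plt

cmap = plt.get_cmap("BrBG")

# Calling a colormap with a float in [0, 1] returns an RGBA tuple.
low, mid, high = cmap(0.0), cmap(0.5), cmap(1.0)

for name, rgba in [("corr = -1", low), ("corr =  0", mid), ("corr = +1", high)]:
    # Print only the red, green and blue components, rounded for readability.
    print(name, [round(c, 2) for c in rgba[:3]])
```

The midpoint comes out close to white, while the two ends are a dark brown and a dark blue-green, so strong negative and strong positive correlations are easy to tell apart and values near 0 fade into the background.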


Triangle Correlation Heatmap

Take a look at any of the correlation heatmaps above. A correlation matrix is symmetric, so if you cut away half of it along the diagonal line of 1s, you would not lose any information. Let’s cut the heatmap in half, then, and keep only the lower triangle.
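
The claim that nothing is lost can be checked directly. The sketch below uses made-up random data (the column names a to d are hypothetical) and confirms that the correlation matrix equals its own transpose and has 1s on the diagonal.

```python
import numpy as np
import pandas as pd

# Made-up data for illustration only.
rng = np.random.default_rng(1)
dataframe = pd.DataFrame(rng.normal(size=(50, 4)), columns=list("abcd"))
corr = dataframe.corr()

# Symmetric: the value for (a, b) equals the value for (b, a).
is_symmetric = bool(np.allclose(corr.values, corr.values.T))

# Every column correlates perfectly with itself.
diagonal_is_one = bool(np.allclose(np.diag(corr.values), 1.0))

print(is_symmetric, diagonal_is_one)
```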



The Seaborn heatmap ‘mask’ argument comes in handy when we want to cover part of the heatmap.


mask: takes a boolean array or a DataFrame as an argument; cells where the mask is True are not drawn.



Let’s build the mask with two numpy functions. np.ones_like() creates an array of 1s with the same shape as the correlation matrix, and np.triu() keeps the upper triangle of that array (including the diagonal) while turning all the values below the diagonal into 0. (The np.tril() function would do the same, only for the lower triangle.) The result has a 1 in every cell that should be hidden.

np.triu(np.ones_like(dataframe.corr()))
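
On a small example the result is easy to inspect. The sketch below uses a 3x3 array of zeros as a stand-in for a correlation matrix, since only the shape matters here. The k=1 variant is an optional extra that is not part of the original recipe: it shifts the mask one step above the diagonal, so the diagonal of 1s stays visible.

```python
import numpy as np

# Stand-in with the shape of a 3x3 correlation matrix (values are irrelevant).
corr_like = np.zeros((3, 3))

ones = np.ones_like(corr_like, dtype=bool)  # all True, same shape as corr_like
mask = np.triu(ones)                        # True on and above the diagonal
print(mask)
# [[ True  True  True]
#  [False  True  True]
#  [False False  True]]

# Optional variant: k=1 starts one step above the diagonal,
# so the diagonal itself is not masked.
mask_keep_diag = np.triu(ones, k=1)
```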


plt.figure(figsize=(16, 6))

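# define the mask to set the values in the upper triangle (including the diagonal) to True

# use the built-in bool: the np.bool alias was deprecated in NumPy 1.20 and removed in 1.24

mask = np.triu(np.ones_like(dataframe.corr(), dtype=bool))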

heatmap = sns.heatmap(dataframe.corr(), mask=mask, vmin=-1, vmax=1, annot=True, cmap='BrBG')

heatmap.set_title('Triangle Correlation Heatmap', fontdict={'fontsize':18}, pad=16);


references:

https://medium.com/@szabo.bibor/how-to-create-a-seaborn-correlation-heatmap-in-python-834c0686b88e



