Certificate/data science-IBM

pie chart, box chart, scatter chart, subplot

Olivia-BlackCherry 2023. 5. 9. 18:41

pie chart

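To build a pie chart, the data first goes through a split, apply, combine process: the rows are split into groups, a function is applied to each group, and the results are combined into one table.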

# group the countries by continent and sum the numeric columns of each group
df_continents = df_can.groupby('Continent').sum(numeric_only=True)
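As a minimal sketch of split, apply, combine, the snippet below runs the same kind of groupby on a tiny made-up table. The country rows and numbers are invented for illustration and are not the real immigration data; only the column names `Continent` and `Total` follow the post.

```python
import pandas as pd

# Hypothetical stand-in for df_can: values are made up for illustration.
df_toy = pd.DataFrame(
    {
        "Continent": ["Asia", "Asia", "Europe", "Africa"],
        "1980": [10, 20, 5, 1],
        "1981": [30, 40, 15, 2],
    },
    index=["China", "India", "France", "Kenya"],
)
df_toy["Total"] = df_toy[["1980", "1981"]].sum(axis=1)

# split:   rows are grouped by continent
# apply:   each numeric column is summed within its group
# combine: the per-group sums are collected into one DataFrame
df_toy_continents = df_toy.groupby("Continent").sum(numeric_only=True)

print(df_toy_continents)
```

The result is indexed by continent, so `df_toy_continents['Total']` is a Series that can be passed straight to `.plot(kind='pie')`.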

 

 

Several options make a pie chart more effective.

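  • autopct - a format string or function used to label each wedge with its numeric value. The label is placed inside the wedge. If it is a format string, the label is fmt % pct. Used here to show percentages.
  • startangle - rotates the start of the pie chart by the given angle in degrees, counterclockwise from the x-axis (the starting angle).
  • shadow - draws a shadow beneath the pie (to give a 3D feel).
  • legend - adds a legend (done with plt.legend() in the second example).
  • pctdistance - the ratio between the center of each wedge and the start of the autopct text; a value above 1 places the percentages outside the pie.
  • colors - a list of custom colors for the wedges.
  • explode - a list of offsets, one per wedge, that pulls selected wedges away from the center.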
import matplotlib.pyplot as plt

# autopct adds the percentage labels; startangle sets where the first wedge begins
df_continents['Total'].plot(kind='pie',
                            figsize=(5, 6),
                            autopct='%1.1f%%', # add in percentages
                            startangle=90,     # start angle 90° (Africa)
                            shadow=True,       # add shadow      
                            )

plt.title('Immigration to Canada by Continent [1980 - 2013]')
plt.axis('equal') # Sets the pie chart to look like a circle.

plt.show()
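To see what the format string does, the sketch below applies `'%1.1f%%'` by hand, following the `fmt % pct` rule from the option list. The three wedge sizes are made-up numbers, not the continent totals from the dataset.

```python
# Hypothetical wedge sizes, for illustration only.
totals = {"A": 50, "B": 30, "C": 20}

fmt = "%1.1f%%"  # one decimal place, followed by a literal percent sign
grand_total = sum(totals.values())

labels = {}
for name, value in totals.items():
    pct = 100 * value / grand_total  # share of the whole pie, in percent
    labels[name] = fmt % pct         # the same fmt % pct operation

print(labels)
```

The doubled `%%` is what produces the single percent sign in each label.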

 

colors_list = ['gold', 'yellowgreen', 'lightcoral', 'lightskyblue', 'lightgreen', 'pink']
explode_list = [0.1, 0, 0, 0, 0.1, 0.1] # ratio for each continent with which to offset each wedge.

df_continents['Total'].plot(kind='pie',
                            figsize=(15, 6),
                            autopct='%1.1f%%', 
                            startangle=90,    
                            shadow=True,       
                            labels=None,         # turn off labels on pie chart
                            pctdistance=1.12,    # the ratio between the center of each pie slice and the start of the text generated by autopct 
                            colors=colors_list,  # add custom colors
                            explode=explode_list # 'explode' lowest 3 continents
                            )

# move the title up (y=1.12) so it clears the percentage labels placed at pctdistance=1.12
plt.title('Immigration to Canada by Continent [1980 - 2013]', y=1.12) 

plt.axis('equal') 

# add legend
plt.legend(labels=df_continents.index, loc='upper left') 

plt.show()
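Since autopct can also be a function, a wedge can show the underlying count next to the percentage. This is only a sketch on made-up values: `make_autopct` is a hypothetical helper name, and the Agg backend is selected just so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, assumed here so no window is needed
import matplotlib.pyplot as plt

values = [10, 30, 60]  # hypothetical wedge sizes
names = ["A", "B", "C"]

def make_autopct(values):
    """Return a function that turns a percentage into 'pct% (count)'."""
    total = sum(values)

    def autopct(pct):
        count = int(round(pct * total / 100.0))  # recover the count from the percentage
        return "{:.1f}%\n({})".format(pct, count)

    return autopct

fig, ax = plt.subplots(figsize=(5, 5))
wedges, texts, autotexts = ax.pie(
    values,
    labels=names,
    autopct=make_autopct(values),  # a function instead of a format string
    startangle=90,
)
ax.axis("equal")  # keep the pie circular

print([t.get_text() for t in autotexts])
```

In a notebook, the `matplotlib.use("Agg")` line can be dropped and `plt.show()` called at the end instead of printing the labels.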

 

 

 

 

box plots

 

# horizontal box plots
df_CI.plot(kind='box', figsize=(10, 7), color='blue', vert=False)

plt.title('Box plots of Immigrants from China and India (1980 - 2013)')
plt.xlabel('Number of Immigrants')

plt.show()

 

 

scatter plots

This is a basic scatter plot.

df_tot.plot(kind='scatter', x='year', y='total', figsize=(10, 6), color='darkblue')

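Now let's apply NumPy's polyfit() function to it.

deg is the degree of the polynomial to fit.

Fitting with deg=1 returns the coefficients a and b of the line y = ax + b, that is, the slope and the intercept.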

import numpy as np

x = df_tot['year']      # year on x-axis
y = df_tot['total']     # total on y-axis
fit = np.polyfit(x, y, deg=1)  # fit[0] is the slope, fit[1] is the intercept
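To make the return value concrete, here is a small sketch on made-up points that lie exactly on the line y = 2x + 1, so the expected coefficients are known in advance. The data and the `_demo` names are hypothetical; `np.polyfit` and `np.polyval` are the NumPy functions being illustrated.

```python
import numpy as np

# Hypothetical data lying exactly on y = 2x + 1.
x_demo = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y_demo = 2.0 * x_demo + 1.0

# deg=1 fits a straight line; coefficients come back highest power first.
slope, intercept = np.polyfit(x_demo, y_demo, deg=1)

# np.polyval evaluates the fitted polynomial, here slope * 5 + intercept.
y_at_5 = np.polyval([slope, intercept], 5.0)

print(round(slope, 6), round(intercept, 6), round(y_at_5, 6))
```

With real data the points do not sit exactly on a line, and polyfit returns the least-squares best fit instead.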

 

Now draw the fitted line and annotate the plot with its equation.

plt.plot(x, fit[0] * x + fit[1], color='red') # recall that x is the Years
plt.annotate('y={0:.0f} x + {1:.0f}'.format(fit[0], fit[1]), xy=(2000, 150000))

plt.show()

 

subplot

To draw several plots in one figure, create the figure with figure() and add subplots to it. This works at matplotlib's artist layer.

The typical syntax is:

fig = plt.figure() # create figure

ax = fig.add_subplot(nrows, ncols, plot_number) # create subplot

 

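nrows and ncols are the total numbers of rows and columns, so nrows * ncols is the total number of subplots.

plot_number identifies which position in that grid the subplot occupies.

subplot(211) and subplot(2, 1, 1) mean the same thing.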
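The numbering rule can be sketched in plain Python. `subplot_position` and `split_shorthand` are hypothetical helper names, not matplotlib functions; they only mirror the convention that plot numbers start at 1 in the upper-left corner and increase to the right, row by row.

```python
def subplot_position(nrows, ncols, plot_number):
    """Return the 0-based (row, column) of a 1-based subplot number."""
    if not 1 <= plot_number <= nrows * ncols:
        raise ValueError("plot_number must be between 1 and nrows * ncols")
    return divmod(plot_number - 1, ncols)

def split_shorthand(code):
    """Split a three-digit shorthand such as 211 into (nrows, ncols, plot_number)."""
    if not 111 <= code <= 999:
        raise ValueError("shorthand needs exactly three digits")
    return code // 100, (code // 10) % 10, code % 10

print(split_shorthand(211))       # (2, 1, 1)
print(subplot_position(2, 3, 5))  # (1, 1)
```

The three-digit shorthand only works when nrows, ncols, and plot_number are each a single digit; larger grids need the comma-separated form.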

 

 

fig = plt.figure() # create figure

ax0 = fig.add_subplot(1, 2, 1) # add subplot 1 (1 row, 2 columns, first plot)
ax1 = fig.add_subplot(1, 2, 2) # add subplot 2 (1 row, 2 columns, second plot); add_subplot(122) is equivalent

# Subplot 1: Box plot
df_CI.plot(kind='box', color='blue', vert=False, figsize=(20, 6), ax=ax0) # add to subplot 1
ax0.set_title('Box Plots of Immigrants from China and India (1980 - 2013)')
ax0.set_xlabel('Number of Immigrants')
ax0.set_ylabel('Countries')

# Subplot 2: Line plot
df_CI.plot(kind='line', figsize=(20, 6), ax=ax1) # add to subplot 2
ax1.set_title('Line Plots of Immigrants from China and India (1980 - 2013)')
ax1.set_ylabel('Number of Immigrants')
ax1.set_xlabel('Years')

plt.show()
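For reference, `plt.subplots` creates the figure and both axes in one call, which is a common alternative to calling `add_subplot` twice. The sketch below uses it with a made-up stand-in for `df_CI`; the numbers are invented, the box plot is left vertical for simplicity, and the Agg backend is chosen only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumed non-interactive backend
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for df_CI: rows are years, columns are countries.
df_demo = pd.DataFrame(
    {"China": [5, 7, 9, 12], "India": [8, 8, 10, 11]},
    index=[1980, 1981, 1982, 1983],
)

# One row, two columns of subplots, unpacked into two Axes objects.
fig, (ax0, ax1) = plt.subplots(nrows=1, ncols=2, figsize=(12, 4))

df_demo.plot(kind="box", ax=ax0)   # left: box plot
ax0.set_title("Box plot (demo data)")

df_demo.plot(kind="line", ax=ax1)  # right: line plot
ax1.set_title("Line plot (demo data)")

fig.tight_layout()
```

Both approaches end with the same kind of Axes objects, so the `ax=ax0` and `ax=ax1` pattern from the post carries over unchanged.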