pie chart, box chart, scatter chart, subplot

Olivia-BlackCherry 2023. 5. 9. 18:41

pie chart
box plots
scatter plots

pie chart

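To build a pie chart, the data first goes through a split, apply, combine process: split the rows into groups by continent, apply a sum to each group, and combine the results into one table.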

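# group by continent and sum the numeric columns (groupby's axis argument is deprecated in recent pandas)
df_continents = df_can.groupby('Continent').sum(numeric_only=True)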
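
The split, apply, combine steps can be sketched on a small table. This is only an illustration: the `toy` DataFrame and its values are made up and stand in for df_can, which is loaded elsewhere.

```python
import pandas as pd

# Hypothetical toy data standing in for df_can (values are invented for illustration).
toy = pd.DataFrame({
    'Country': ['A', 'B', 'C', 'D'],
    'Continent': ['Asia', 'Asia', 'Europe', 'Europe'],
    'Total': [10, 20, 5, 15],
})

# split: group rows by continent; apply: sum each group; combine: one row per continent
toy_continents = toy.groupby('Continent')[['Total']].sum()
print(toy_continents)
```

Each continent ends up as one row of the result, which is the shape the pie chart needs: one value per wedge.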

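Several parameters make a pie chart more effective.

autopct - a string or function used to label the wedges with their numeric value. The label is placed inside the wedge. If it is a format string, the label will be fmt % pct. Use it to show percentages.
startangle - rotates the start of the pie chart by the given angle in degrees, counterclockwise from the x-axis. This sets the starting angle.
shadow - draws a shadow beneath the pie (to give a 3D feel).
legend - adds a legend (with plt.legend()).
pctdistance - sets how far from the center the percentage labels are drawn, as a fraction of the radius; a value above 1 places them outside the pie.
colors - sets custom colors for the wedges.
explode - offsets selected wedges from the center.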
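
Since autopct also accepts a function, here is a minimal sketch of that form. The values and the `pct_label` helper are made up for illustration, and the 15% cutoff is an arbitrary choice; the Agg backend is selected only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

values = [60, 30, 10]  # invented values for illustration

def pct_label(pct):
    # pct is the wedge's share of the whole, as a percentage (float)
    # label only wedges of at least 15%; small wedges get an empty label
    return '{:.1f}%'.format(pct) if pct >= 15 else ''

fig, ax = plt.subplots()
wedges, texts, autotexts = ax.pie(values, autopct=pct_label, startangle=90)

# collect the label text that was drawn for each wedge
labels = [t.get_text() for t in autotexts]
```

A function like this is one way to keep small wedges from getting crowded labels; a plain format string such as '%1.1f%%' is enough when every wedge should be labeled.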

# autopct adds the percentage labels; startangle sets where the first wedge starts
df_continents['Total'].plot(kind='pie',
                            figsize=(5, 6),
                            autopct='%1.1f%%', # add in percentages
                            startangle=90,     # start angle 90° (Africa)
                            shadow=True,       # add shadow      
                            )

plt.title('Immigration to Canada by Continent [1980 - 2013]')
plt.axis('equal') # Sets the pie chart to look like a circle.

plt.show()

colors_list = ['gold', 'yellowgreen', 'lightcoral', 'lightskyblue', 'lightgreen', 'pink']
explode_list = [0.1, 0, 0, 0, 0.1, 0.1] # ratio for each continent with which to offset each wedge.

df_continents['Total'].plot(kind='pie',
                            figsize=(15, 6),
                            autopct='%1.1f%%', 
                            startangle=90,    
                            shadow=True,       
                            labels=None,         # turn off labels on pie chart
                            pctdistance=1.12,    # the ratio between the center of each pie slice and the start of the text generated by autopct 
                            colors=colors_list,  # add custom colors
                            explode=explode_list # 'explode' lowest 3 continents
                            )

# move the title up by 12% so it clears the labels placed with pctdistance
plt.title('Immigration to Canada by Continent [1980 - 2013]', y=1.12) 

plt.axis('equal') 

# add legend
plt.legend(labels=df_continents.index, loc='upper left') 

plt.show()

box plots

# horizontal box plots
df_CI.plot(kind='box', figsize=(10, 7), color='blue', vert=False)

plt.title('Box plots of Immigrants from China and India (1980 - 2013)')
plt.xlabel('Number of Immigrants')

plt.show()

scatter plots

This is a basic scatter plot.

df_tot.plot(kind='scatter', x='year', y='total', figsize=(10, 6), color='darkblue')

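Now let's use numpy's polyfit() function.

deg is the degree of the polynomial to fit.

With deg=1, polyfit fits a straight line y = ax + b and returns its coefficients: the slope a first, then the intercept b.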

x = df_tot['year']      # year on x-axis
y = df_tot['total']     # total on y-axis
fit = np.polyfit(x, y, deg=1)
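
To see what polyfit returns, here is a small sketch on made-up points that lie exactly on a known line. The data is invented for illustration and is not the immigration data.

```python
import numpy as np

# Invented points lying exactly on y = 2x + 1
x_demo = np.array([0.0, 1.0, 2.0, 3.0])
y_demo = 2 * x_demo + 1

# deg=1 returns two coefficients, highest degree first: slope, then intercept
slope, intercept = np.polyfit(x_demo, y_demo, deg=1)

# evaluate the fitted line at a new x value
predicted = np.polyval([slope, intercept], 4.0)
print(slope, intercept, predicted)
```

The same ordering applies to fit above: fit[0] is the slope and fit[1] is the intercept.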

Now draw the fitted line and add an annotation with its equation.

plt.plot(x, fit[0] * x + fit[1], color='red') # recall that x is the Years
plt.annotate('y={0:.0f} x + {1:.0f}'.format(fit[0], fit[1]), xy=(2000, 150000))

plt.show()

subplot

To create several plots in one figure, use the figure() function; this works in the artist layer.

The typical syntax is:

fig = plt.figure() #create figure

ax = fig.add_subplot(nrows, ncols, plot_number) # create subplot

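nrows and ncols are the total numbers of rows and columns, so nrows * ncols is the total number of subplots.

plot_number is the position of the current plot within that grid, counted from 1.

subplot(211) and subplot(2, 1, 1) mean exactly the same thing.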
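
The equivalence of the two forms can be checked with a short sketch. The figure and variable names here are made up for illustration, and the Agg backend is selected only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig_a = plt.figure()
ax_short = fig_a.add_subplot(211)      # three-digit shorthand

fig_b = plt.figure()
ax_long = fig_b.add_subplot(2, 1, 1)   # explicit nrows, ncols, plot_number

# get_geometry() returns (nrows, ncols, start, stop); start and stop are zero-based
geom_short = ax_short.get_subplotspec().get_geometry()
geom_long = ax_long.get_subplotspec().get_geometry()
print(geom_short, geom_long)
```

Both axes occupy the first cell of a grid with 2 rows and 1 column. The shorthand only works while each of the three numbers is a single digit.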

fig = plt.figure() # create figure

ax0 = fig.add_subplot(1, 2, 1) # add subplot 1 (1 row, 2 columns, first plot)
ax1 = fig.add_subplot(1, 2, 2) # add subplot 2 (1 row, 2 columns, second plot)

# Subplot 1: Box plot
df_CI.plot(kind='box', color='blue', vert=False, figsize=(20, 6), ax=ax0) # add to subplot 1
ax0.set_title('Box Plots of Immigrants from China and India (1980 - 2013)')
ax0.set_xlabel('Number of Immigrants')
ax0.set_ylabel('Countries')

# Subplot 2: Line plot
df_CI.plot(kind='line', figsize=(20, 6), ax=ax1) # add to subplot 2
ax1.set_title('Line Plots of Immigrants from China and India (1980 - 2013)')
ax1.set_ylabel('Number of Immigrants')
ax1.set_xlabel('Years')

plt.show()
