[Data Visualization] 04. Bar Charts

도개진 2023. 1. 2. 12:34

conda create -n ipywidgets_problem jupyterlab ipywidgets -y

Visualizing nominal data

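Bar charts show the frequency of each category
Pie charts show each category's share of the total frequency
Frequencies are computed with value_counts or crosstab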
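
As a minimal sketch of the two frequency helpers mentioned above, the following uses a small made-up DataFrame (the `dept` and `gender` columns are hypothetical and not part of the data used in this post). `value_counts` gives a one-way frequency table, `normalize=True` turns it into proportions, and `pd.crosstab` gives a two-way table.

```python
import pandas as pd

# Hypothetical toy data; the column names are illustrative only
df = pd.DataFrame({
    'dept': ['sales', 'sales', 'it', 'it', 'it', 'hr'],
    'gender': ['F', 'M', 'M', 'F', 'M', 'F'],
})

# Absolute frequency of each category, sorted by count in descending order
counts = df['dept'].value_counts()

# Relative frequency (proportions that sum to 1), the input a pie chart needs
ratios = df['dept'].value_counts(normalize=True)

# Two-way frequency table: rows are dept, columns are gender
table = pd.crosstab(df['dept'], df['gender'])

print(counts)
print(ratios)
print(table)
```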

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib as mpl
import seaborn as sns
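# Path to a font that can render Korean text; adjust it for your environment
# fontpath = '/usr/share/fonts/NanumGothic.ttf'

fontpath = '/home/bigdata/py39/lib/python3.9/site-packages/matplotlib/mpl-data/fonts/ttf/NanumGothic.ttf'
# Read the family name stored in the font file (expected to be 'NanumGothic')
fname = mpl.font_manager.FontProperties(fname=fontpath).get_name()

mpl.rcParams['font.family'] = fname
mpl.rcParams['font.size'] = 12
# Draw the minus sign as an ASCII hyphen so it is not rendered as a missing glyph
mpl.rcParams['axes.unicode_minus'] = False

# sns.set(rc = {'font.family':'NanumGothic'})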

Visualizing employees' job titles

emps = pd.read_csv('../data/employees.csv')

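# Count how often each job title occurs
# emps.JOB_ID.value_counts() # the result is returned as a Series
# to_frame(name=...) fixes the column name, which otherwise differs between pandas versions
emps_df = emps.JOB_ID.value_counts().to_frame(name='JOB_ID')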
emps_df

	JOB_ID
SA_REP	30
ST_CLERK	20
SH_CLERK	20
SA_MAN	5
IT_PROG	5
FI_ACCOUNT	5
PU_CLERK	5
ST_MAN	5
AD_VP	2
MK_REP	1
AC_MGR	1
PR_REP	1
HR_REP	1
AD_PRES	1
MK_MAN	1
AD_ASST	1
PU_MAN	1
FI_MGR	1
AC_ACCOUNT	1

# Visualize the frequencies as a bar chart
print(emps_df.index, emps_df.JOB_ID)

Index(['SA_REP', 'ST_CLERK', 'SH_CLERK', 'SA_MAN', 'IT_PROG', 'FI_ACCOUNT',
       'PU_CLERK', 'ST_MAN', 'AD_VP', 'MK_REP', 'AC_MGR', 'PR_REP', 'HR_REP',
       'AD_PRES', 'MK_MAN', 'AD_ASST', 'PU_MAN', 'FI_MGR', 'AC_ACCOUNT'],
      dtype='object') SA_REP        30
ST_CLERK      20
SH_CLERK      20
SA_MAN         5
IT_PROG        5
FI_ACCOUNT     5
PU_CLERK       5
ST_MAN         5
AD_VP          2
MK_REP         1
AC_MGR         1
PR_REP         1
HR_REP         1
AD_PRES        1
MK_MAN         1
AD_ASST        1
PU_MAN         1
FI_MGR         1
AC_ACCOUNT     1
Name: JOB_ID, dtype: int64

# plt.bar(x-axis values, bar heights, options)
colors = ['red', 'orange', 'yellow', 'green', 'blue', 'navy', 'purple']
plt.bar(emps_df.index, emps_df.JOB_ID, color=colors)
plt.xticks(rotation='vertical')
plt.show()

Plotting directly from pandas

object_name.plot.bar

emps_df.plot.bar()
plt.show()
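
The same accessor is also available on a Series, so the result of value_counts can be plotted without wrapping it in a DataFrame first. The sketch below uses a few made-up job IDs rather than the employees.csv file; `rot` and `color` are ordinary options of pandas' `plot.bar`.

```python
import pandas as pd

# Hypothetical sample of job IDs standing in for emps.JOB_ID
jobs = pd.Series(['SA_REP', 'SA_REP', 'IT_PROG', 'SA_REP', 'IT_PROG', 'AD_VP'])

# value_counts returns a Series indexed by category, sorted by count
counts = jobs.value_counts()

# plot.bar draws one bar per index entry and returns the matplotlib Axes
ax = counts.plot.bar(rot=90, color='steelblue')
ax.set_ylabel('count')
```

Because the call returns the Axes, labels and titles can be adjusted afterwards; in a script, finish with plt.show() to display the figure.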

Visualizing frequencies with seaborn

barplot
countplot

seaborn ships six variations of its default palette

deep, bright, dark, muted, pastel, colorblind (built-in)
palette = sns.color_palette({palette_type})
If you need custom (template) colors, see the matplotlib reference
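
A minimal sketch of how these palette names can be used with sns.color_palette. The variable names are illustrative; the only assumption is that seaborn is installed.

```python
import seaborn as sns

# The six built-in palette names listed above
names = ['deep', 'bright', 'dark', 'muted', 'pastel', 'colorblind']
palettes = {name: sns.color_palette(name) for name in names}

# Each palette behaves like a list of (r, g, b) tuples with components in [0, 1]
first = palettes['deep'][0]

# The second argument requests a specific number of colors
pastel3 = sns.color_palette('pastel', 3)

# as_hex() converts the colors to hex strings usable in matplotlib color= arguments
hex_colors = pastel3.as_hex()
print(hex_colors)
```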

palette = sns.color_palette('Paired', 12)

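# barplot draws the precomputed counts; x and y are passed by keyword
sns.barplot(x=emps_df.index, y=emps_df.JOB_ID, palette=palette)
plt.xticks(rotation='vertical')
plt.show()

# countplot counts the rows of the raw data itself
sns.countplot(data=emps, x='JOB_ID', palette=palette)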
plt.xticks(rotation='vertical')
plt.show()

Visualizing traffic accident data

Number of accidents by province of occurrence (발생지시도) and by day of the week (요일)

car = pd.read_csv('../data/car_accient2016.csv')

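Selecting specific columns in pandas

To select specific columns from a pandas object, index it with a list of column names or use iloc/loc
object_name.iloc[:, [column_positions...]]
object_name.loc[:, [column_names...]]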
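
The three selection styles can be compared on a small made-up DataFrame (the columns `a`, `b`, `c` are hypothetical). This is only a sketch showing that they return the same columns in the requested order.

```python
import pandas as pd

# Hypothetical toy frame with three columns
df = pd.DataFrame({
    'a': [1, 2, 3],
    'b': [4, 5, 6],
    'c': [7, 8, 9],
})

by_list = df[['c', 'a']]        # index with a list of column names
by_loc = df.loc[:, ['c', 'a']]  # label-based: all rows, named columns
by_iloc = df.iloc[:, [2, 0]]    # position-based: all rows, columns 2 and 0

print(by_iloc)
```

loc is usually the safer choice for data files, because column positions change when columns are added or reordered while names do not.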

car = car.loc[:, ['발생지시도', '요일']]
car.head()

	발생지시도	요일
0	경기	금
1	서울	일
2	충북	일
3	경북	월
4	경남	수

# car2 = car.iloc[:, [10, 4]]

# 1. Number of accidents by province
df = pd.DataFrame({'sido': car['발생지시도'].value_counts()})
plt.bar(df.index, df.sido)
plt.xticks(rotation='vertical')
plt.title('Accidents by province')
plt.show()

sns.countplot(data=car, x='발생지시도')

<AxesSubplot:xlabel='발생지시도', ylabel='count'>

# 2. Number of accidents by day of the week
df = pd.DataFrame({'week': car['요일'].value_counts()})
plt.bar(df.index, df.week)
plt.xticks(rotation='vertical')
plt.title('Accidents by day of the week')
plt.show()

sns.countplot(data=car, x='요일')

<AxesSubplot:xlabel='요일', ylabel='count'>
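
value_counts sorts by frequency and countplot follows the order in which values appear, so neither puts the days in calendar order. One possible fix, sketched here on made-up data with English day labels (the real file uses labels such as 월 and 화), is to reindex the counts and to pass countplot's `order` parameter.

```python
import pandas as pd
import seaborn as sns

# Hypothetical sample; the real data stores Korean day labels in the 요일 column
days = pd.DataFrame({'day': ['Fri', 'Sun', 'Sun', 'Mon', 'Wed', 'Fri', 'Fri']})
weekday_order = ['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun']

# reindex arranges the counts in calendar order; fill_value=0 keeps days with no rows
counts = days['day'].value_counts().reindex(weekday_order, fill_value=0)
print(counts)

# order= fixes the position of each category on the x axis
ax = sns.countplot(data=days, x='day', order=weekday_order)
```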
