1. 2024 Weather Data Preprocessing

Author

이상민

Published

March 24, 2025

import pandas as pd

- Load the data

wt = pd.read_csv("OBS_ASOS_TIM_20250322224121.csv", encoding="cp949")
wt
지점 지점명 일시 기온(°C) 강수량(mm) 풍속(m/s) 습도(%) 일사(MJ/m2)
0 146 전주 2024-01-01 01:00 3.8 NaN 1.5 80 NaN
1 146 전주 2024-01-01 02:00 3.9 NaN 0.2 79 NaN
2 146 전주 2024-01-01 03:00 3.5 NaN 0.4 84 NaN
3 146 전주 2024-01-01 04:00 1.9 NaN 1.1 92 NaN
4 146 전주 2024-01-01 05:00 1.4 NaN 1.5 94 NaN
... ... ... ... ... ... ... ... ...
8755 146 전주 2024-12-30 20:00 7.6 NaN 1.4 71 NaN
8756 146 전주 2024-12-30 21:00 7.5 NaN 1.7 69 NaN
8757 146 전주 2024-12-30 22:00 7.2 NaN 1.2 70 NaN
8758 146 전주 2024-12-30 23:00 7.2 NaN 1.7 71 NaN
8759 146 전주 2024-12-31 00:00 7.4 NaN 2.8 70 NaN

8760 rows × 8 columns

- Check for null values

wt.isnull().sum()
지점              0
지점명             0
일시              0
기온(°C)          0
강수량(mm)      7822
풍속(m/s)         0
습도(%)           0
일사(MJ/m2)    3967
dtype: int64
wt.columns
Index(['지점', '지점명', '일시', '기온(°C)', '강수량(mm)', '풍속(m/s)', '습도(%)',
       '일사(MJ/m2)'],
      dtype='object')

- Drop the variables that are not needed

wt = wt.drop(columns=['지점명', '지점'])
wt.columns
Index(['일시', '기온(°C)', '강수량(mm)', '풍속(m/s)', '습도(%)', '일사(MJ/m2)'], dtype='object')

- Rename the columns to simpler names

wt.columns = ['일시', '기온', '강수량', '풍속', '습도', '일사']
wt
일시 기온 강수량 풍속 습도 일사
0 2024-01-01 01:00 3.8 NaN 1.5 80 NaN
1 2024-01-01 02:00 3.9 NaN 0.2 79 NaN
2 2024-01-01 03:00 3.5 NaN 0.4 84 NaN
3 2024-01-01 04:00 1.9 NaN 1.1 92 NaN
4 2024-01-01 05:00 1.4 NaN 1.5 94 NaN
... ... ... ... ... ... ...
8755 2024-12-30 20:00 7.6 NaN 1.4 71 NaN
8756 2024-12-30 21:00 7.5 NaN 1.7 69 NaN
8757 2024-12-30 22:00 7.2 NaN 1.2 70 NaN
8758 2024-12-30 23:00 7.2 NaN 1.7 71 NaN
8759 2024-12-31 00:00 7.4 NaN 2.8 70 NaN

8760 rows × 6 columns
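Assigning a list to wt.columns depends on the columns being in exactly the expected order. As a minimal sketch of the same renaming with an explicit mapping, the example below uses a small made-up frame; the sample values and the name `rename_map` are illustrative assumptions, not part of the original notebook.

```python
import pandas as pd

# Small made-up frame that mimics the original ASOS column names
# (the values are illustrative only).
sample = pd.DataFrame({
    "일시": ["2024-01-01 01:00", "2024-01-01 02:00"],
    "기온(°C)": [3.8, 3.9],
    "강수량(mm)": [None, 0.5],
    "풍속(m/s)": [1.5, 0.2],
    "습도(%)": [80, 79],
    "일사(MJ/m2)": [None, None],
})

# Hypothetical mapping from the original names to the short names.
rename_map = {
    "기온(°C)": "기온",
    "강수량(mm)": "강수량",
    "풍속(m/s)": "풍속",
    "습도(%)": "습도",
    "일사(MJ/m2)": "일사",
}

# rename() matches by name, so the result does not depend on column order.
renamed = sample.rename(columns=rename_map)
print(list(renamed.columns))
```

With a mapping, a column that is missing or reordered in the source file is left untouched instead of silently receiving the wrong name.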

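- Handle null values

- Only the 강수량 (precipitation) and 일사 (solar radiation) variables contain nulls, so we assume the gaps are hours without rain and nighttime hours, where a value of 0 was simply not recorded

- Fill them with 0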

wt[['강수량', '일사']] = wt[['강수량', '일사']].fillna(0)
wt
일시 기온 강수량 풍속 습도 일사
0 2024-01-01 01:00 3.8 0.0 1.5 80 0.0
1 2024-01-01 02:00 3.9 0.0 0.2 79 0.0
2 2024-01-01 03:00 3.5 0.0 0.4 84 0.0
3 2024-01-01 04:00 1.9 0.0 1.1 92 0.0
4 2024-01-01 05:00 1.4 0.0 1.5 94 0.0
... ... ... ... ... ... ...
8755 2024-12-30 20:00 7.6 0.0 1.4 71 0.0
8756 2024-12-30 21:00 7.5 0.0 1.7 69 0.0
8757 2024-12-30 22:00 7.2 0.0 1.2 70 0.0
8758 2024-12-30 23:00 7.2 0.0 1.7 71 0.0
8759 2024-12-31 00:00 7.4 0.0 2.8 70 0.0

8760 rows × 6 columns
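The zero-fill rests on the assumption that missing 일사 values fall at night. One way to sanity-check that kind of assumption is to count the missing values per hour of day before filling. The sketch below runs on made-up data for a single day; the daylight window of 08:00 to 17:00 and the names `sample`, `missing_by_hour`, and `hours_with_gaps` are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# Made-up hourly data for one day: NaN outside an assumed 08:00-17:00 daylight window.
hours = range(24)
sample = pd.DataFrame({
    "일시": [f"2024-01-01 {h:02d}:00" for h in hours],
    "일사": [np.nan if (h < 8 or h > 17) else 1.0 for h in hours],
})

# Parse the timestamp so the hour of day can be extracted.
sample["일시"] = pd.to_datetime(sample["일시"])

# Count missing values for each hour of the day.
missing_by_hour = sample["일사"].isna().groupby(sample["일시"].dt.hour).sum()

# Hours that have at least one missing value.
hours_with_gaps = missing_by_hour[missing_by_hour > 0].index.tolist()
print(hours_with_gaps)

# If the gaps fall only in night hours, filling with 0 is a reasonable choice.
filled = sample["일사"].fillna(0)
```

Applied to the real data, the same count would show whether the gaps are concentrated at night (for 일사) or need a different treatment.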

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Create a 3x2 grid of subplots and flatten it for easy indexing
fig, axes = plt.subplots(3, 2, figsize=(12, 12))
axes = axes.flatten()

# Plot each variable (every column except the timestamp column 일시)
for i, col in enumerate(wt.columns[1:]):
    sns.lineplot(data=wt[col], marker='o', ax=axes[i])
    axes[i].set_title(col)
    axes[i].set_ylabel(col)

# Remove the unused sixth subplot
fig.delaxes(axes[-1])

plt.tight_layout()
plt.show()
/root/anaconda3/envs/pypy/lib/python3.10/site-packages/seaborn/_oldcore.py:1119: FutureWarning: use_inf_as_na option is deprecated and will be removed in a future version. Convert inf values to NaN before operating instead.
  with pd.option_context('mode.use_inf_as_na', True):

- Write the CSV file

wt.to_csv('weather2024.csv', index=False, encoding='utf-8-sig')
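A quick way to confirm that a file written with encoding='utf-8-sig' keeps its Korean column names is to read it back. The sketch below does this for a small made-up frame in a temporary directory; the file name `weather_sample.csv` and the sample values are assumptions for illustration.

```python
import os
import tempfile

import pandas as pd

# Small made-up frame with Korean column names (values are illustrative).
sample = pd.DataFrame({
    "일시": ["2024-01-01 01:00"],
    "기온": [3.8],
    "강수량": [0.0],
})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "weather_sample.csv")  # hypothetical file name
    # utf-8-sig writes a byte-order mark, which helps some spreadsheet tools
    # detect the encoding of Korean text.
    sample.to_csv(path, index=False, encoding="utf-8-sig")
    # Reading with the same encoding strips the byte-order mark again.
    restored = pd.read_csv(path, encoding="utf-8-sig")

print(list(restored.columns))
```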