Basic Python

Status
Python
Assignee
  • Jaenoo
Created by
  • Jaenoo
💬 A one-month back-to-basics TIL project to get the fundamentals right for the new year

Preparation

• Setup
  ◦ Python 3.xx
  ◦ Jupyter Notebook (or JupyterLab)
  ◦ Libraries: numpy pandas requests beautifulsoup4
• Folder structure
  ◦ notebooks/
  ◦ src/
  ◦ data/raw, data/processed
  ◦ reports/

road map

1. Basic syntax & types & strings & Jupyter
2. Data structures (list/tuple/dict/set)
3. Conditionals/loops/functions/exceptions/OOP
4. File I/O + NumPy/Pandas
5. REST API + web scraping + putting it all together

practice

1 week

20260120 - Python basics + data structures
• Goal: get comfortable running code on the laptop and fluent with types/variables/operators!
Type conversion between int/float/str/bool
code
def show(title: str) -> None:
    print("\n" + "=" * 10, title, "=" * 10)


def main() -> None:
    show("type check")
    a = 10
    b = 3.14
    c = "123"
    d = True
    print(a, type(a))
    print(b, type(b))
    print(c, type(c))
    print(d, type(d))

    show("type casting")
    print(int("123"), type(int("123")))
    print(float("3.14"), type(float("3.14")))
    print(str(10), type(str(10)))

    show("bool rules")
    print(bool(""), bool("0"), bool("hello"))
    print(bool(0), bool(1), bool(-1))
    print(bool([]), bool([1]))
    print(bool({}), bool({"a": 1}))

    show("mission 1: ' 003 ' -> 3")
    s = " 003 "
    x = int(s.strip())
    print(x, type(x))

    show("mission 2: '3.0' -> 3")
    y = int(float("3.0"))
    print(y, type(y))


if __name__ == "__main__":
    main()
Operator precedence practice
code
def show(title: str) -> None:
    print("\n" + "=" * 10, title, "=" * 10)


def main() -> None:
    show("precedence basics")
    print("10 + 2 * 3 =", 10 + 2 * 3)
    print("(10 + 2) * 3 =", (10 + 2) * 3)
    print("2 ** 3 ** 2 =", 2 ** 3 ** 2)  # right-associative: 2 ** (3 ** 2)

    show("division")
    print("7 / 2 =", 7 / 2)
    print("7 // 2 =", 7 // 2)
    print("7 % 2 =", 7 % 2)

    show("comparisons + logic")
    a, b = 12, 3
    print("a > 10 and b < 5 =", a > 10 and b < 5)
    print("not (a > 10 and b < 5) =", not (a > 10 and b < 5))
    print("a == 12 or b == 5 =", a == 12 or b == 5)

    show("mission: classify number")
    n = 0
    if n > 0:
        print("positive")
    elif n == 0:
        print("zero")
    else:
        print("negative")


if __name__ == "__main__":
    main()
print formatting
code
def main() -> None:
    name = "thsqh"
    price = 12900
    qty = 3
    total = price * qty

    print("\n1) f-string basics")
    print(f"name={name}")
    print(f"price={price}, qty={qty}, total={total}")

    print("\n2) float formatting")
    latency_ms = 123
    print(f"latency={latency_ms}ms, latency_sec={latency_ms/1000:.3f}s")

    print("\n3) alignment")
    for x in [3, 30, 300]:
        print(f"x right aligned: |{x:>6}|")
        print(f"x left aligned: |{x:<6}|")

    print("\n4) repr (!r) - for checking whitespace/escapes")
    s = " hello\tworld\n"
    print(f"raw: {s!r}")


if __name__ == "__main__":
    main()
20260121 - Strings focus
• Goal: internalize indexing/slicing/escapes/formatting
Extracting the domain from an email with slicing
code
def show(title: str) -> None:
    print("\n" + "=" * 10, title, "=" * 10)


def extract_user_domain(email: str) -> tuple[str, str]:
    email = email.strip()
    at = email.find("@")
    if at == -1:
        raise ValueError(f"Invalid email: {email!r}")
    user = email[:at]
    domain = email[at + 1 :]
    return user, domain


def clean_csv_like(s: str) -> list[str]:
    # " a, b ,c " -> ["a", "b", "c"]
    s = s.strip()
    parts = [p.strip() for p in s.split(",")]
    return parts


def parse_log(line: str) -> dict:
    # Example: '2026-01-21 09:12:33 INFO user_id=42 status=200 path="/api/v1/auth"'
    line = line.strip()
    date, time, level, rest = line.split(" ", maxsplit=3)
    out = {"date": date, "time": time, "level": level}

    # path="...": extract the quoted value first
    key = 'path="'
    if key in rest:
        start = rest.find(key) + len(key)
        end = rest.find('"', start)
        out["path"] = rest[start:end]
        rest = (rest[: rest.find(key)] + rest[end + 1 :]).strip()

    # Remaining tokens are plain key=value pairs separated by whitespace
    for kv in rest.split():
        if "=" not in kv:
            continue
        k, v = kv.split("=", 1)
        v = v.strip()
        out[k] = int(v) if v.isdigit() else v
    return out


def main() -> None:
    show("indexing / slicing")
    e = " admin@company.co.kr "
    user, domain = extract_user_domain(e)
    print(f"email={e!r} -> user={user!r}, domain={domain!r}")
    print("domain first char:", domain[0])
    print("domain last char:", domain[-1])

    show("split / join / strip / replace")
    s = " a, b ,c "
    parts = clean_csv_like(s)
    print("parts:", parts)
    joined = "|".join(parts)
    print("joined:", joined)
    print("replaced:", joined.replace("b", "B"))

    show("escape + formatting")
    msg = "He said: \"hello\"\nNext\tTabbed"
    print(msg)
    print(f"repr: {msg!r}")

    show("log parsing")
    log = '2026-01-21 09:12:33 INFO user_id=42 action=login latency_ms=123 status=200 path="/api/v1/auth"'
    print(parse_log(log))


if __name__ == "__main__":
    main()
split/join/strip/replace
code
def parse_log(line: str) -> dict:
    line = line.strip()
    date, time, level, rest = line.split(" ", maxsplit=3)
    out = {"date": date, "time": time, "level": level}
    i = 0
    n = len(rest)
    while i < n:
        # skip spaces
        while i < n and rest[i] == " ":
            i += 1
        if i >= n:
            break

        # read the key
        eq = rest.find("=", i)
        if eq == -1:
            break
        key = rest[i:eq]
        i = eq + 1

        # read the value (if quoted, up to the closing quote; otherwise up to the next space)
        if i < n and rest[i] == '"':
            i += 1
            endq = rest.find('"', i)
            if endq == -1:
                raise ValueError("Unclosed quote in log value")
            val = rest[i:endq]
            i = endq + 1
        else:
            # up to the next space
            j = i
            while j < n and rest[j] != " ":
                j += 1
            val = rest[i:j]
            i = j

        out[key] = int(val) if val.isdigit() else val
    return out
Parsing a simple single-line log
code
# src/day2_strings.py
import json
from pathlib import Path


def show(title: str) -> None:
    print("\n" + "=" * 10, title, "=" * 10)


def extract_user_domain(email: str) -> tuple[str, str]:
    email = email.strip()
    at = email.find("@")
    if at == -1:
        raise ValueError(f"Invalid email: {email!r}")
    user = email[:at]
    domain = email[at + 1:]
    return user, domain


def clean_csv_like(s: str) -> list[str]:
    # " a, b ,c " -> ["a", "b", "c"]
    s = s.strip()
    parts = [p.strip() for p in s.split(",")]
    return parts


def parse_log(line: str) -> dict:
    line = line.strip()
    date, time, level, rest = line.split(" ", maxsplit=3)
    out = {"date": date, "time": time, "level": level}
    i = 0
    n = len(rest)
    while i < n:
        # skip spaces
        while i < n and rest[i] == " ":
            i += 1
        if i >= n:
            break

        # read the key
        eq = rest.find("=", i)
        if eq == -1:
            break
        key = rest[i:eq]
        i = eq + 1

        # read the value (if quoted, up to the closing quote; otherwise up to the next space)
        if i < n and rest[i] == '"':
            i += 1
            endq = rest.find('"', i)
            if endq == -1:
                raise ValueError("Unclosed quote in log value")
            val = rest[i:endq]
            i = endq + 1
        else:
            # up to the next space
            j = i
            while j < n and rest[j] != " ":
                j += 1
            val = rest[i:j]
            i = j

        out[key] = int(val) if val.isdigit() else val
    return out


def validate(d: dict) -> list[str]:
    errors = []
    status = d.get("status")
    user_id = d.get("user_id")
    path = d.get("path")
    if not (isinstance(status, int) and 100 <= status <= 599):
        errors.append(f"bad status={status}")
    if not (isinstance(user_id, int) and user_id > 0):
        errors.append(f"bad user_id={user_id}")
    if not (isinstance(path, str) and path.startswith("/")):
        errors.append(f"bad path={path!r}")
    return errors


def main() -> None:
    raw_path = Path("data/raw/sample.log")
    ok_rows = []
    bad_rows = []
    for idx, line in enumerate(raw_path.read_text(encoding="utf-8").splitlines(), start=1):
        if not line.strip():
            continue
        d = parse_log(line)
        errs = validate(d)
        if errs:
            bad_rows.append({"line_no": idx, "errors": errs, "row": d})
        else:
            ok_rows.append(d)

    # Make sure the output folder exists before writing.
    Path("data/processed").mkdir(parents=True, exist_ok=True)
    Path("data/processed/parsed_logs.json").write_text(
        json.dumps(ok_rows, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    Path("data/processed/parsed_logs_errors.json").write_text(
        json.dumps(bad_rows, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    print(f"OK={len(ok_rows)}, BAD={len(bad_rows)}")

    show("indexing / slicing")
    e = " admin@company.co.kr "
    user, domain = extract_user_domain(e)
    print(f"email={e!r} -> user={user!r}, domain={domain!r}")
    print("domain first char:", domain[0])
    print("domain last char:", domain[-1])

    show("split / join / strip / replace")
    s = " a, b ,c "
    parts = clean_csv_like(s)
    print("parts:", parts)
    joined = "|".join(parts)
    print("joined:", joined)
    print("replaced:", joined.replace("b", "B"))

    show("escape + formatting")
    msg = "He said: \"hello\"\nNext\tTabbed"
    print(msg)
    print(f"repr: {msg!r}")

    show("log parsing")
    log = '2026-01-21 09:12:33 INFO user_id=42 status=200 path="/api v1/auth" msg="login failed: bad password"'
    print(parse_log(log))
    d = parse_log(log)

    # Same bounds as validate(): HTTP status codes run from 100 to 599 inclusive.
    assert 100 <= d.get("status", 0) <= 599, f"bad status: {d.get('status')}"
    assert d.get("user_id", 0) > 0, f"bad user_id: {d.get('user_id')}"
    assert isinstance(d.get("path"), str) and d["path"].startswith("/"), f"bad path: {d.get('path')}"
    print("quality checks: OK")
    print(d)

    out_path = Path("data/processed/parsed_log.json")
    out_path.write_text(json.dumps(d, ensure_ascii=False, indent=2), encoding="utf-8")
    print(f"saved -> {out_path}")


if __name__ == "__main__":
    main()
20260122 - list/tuple
• Goal: get a hands-on feel for storing and processing data
append/extend/insert/pop/remove
Slicing, sorting with key
10 self-made list comprehension problems (ask and answer)
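A minimal sketch of the list drills above. The sample numbers and the (name, score) tuples are made up purely to exercise each method.

```python
# Sample data is hypothetical, chosen only to exercise each method.
nums = [3, 1, 2]
nums.append(4)          # add one item at the end
nums.extend([5, 6])     # add every item of an iterable
nums.insert(0, 0)       # insert the value 0 at index 0
last = nums.pop()       # remove and return the last item
nums.remove(3)          # remove the first occurrence of the value 3

# Slicing returns a new list and leaves the original untouched.
first_three = nums[:3]
reversed_copy = nums[::-1]

# Sorting tuples with a key function: score descending, then name ascending.
records = [("kim", 82), ("lee", 95), ("park", 82)]
by_score = sorted(records, key=lambda r: (-r[1], r[0]))

# List comprehension: squares of the even numbers only.
even_squares = [n * n for n in nums if n % 2 == 0]

print(nums, last, first_three, reversed_copy)
print(by_score)
print(even_squares)
```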
20260123 - dict/set
• Goal: key-value based processing + deduplication/set operations
Building a counter: dict
get, items, defaultdict
Deduplication with set + intersection/difference
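The dict and set items above could be practiced along these lines. The word list is sample data, and grouping by first letter is just one illustrative use of defaultdict.

```python
from collections import defaultdict

words = ["apple", "banana", "apple", "cherry", "banana", "apple"]  # sample data

# Counting with a plain dict: get() supplies 0 for keys not seen yet.
counts = {}
for w in words:
    counts[w] = counts.get(w, 0) + 1

# Grouping with defaultdict: a missing key starts out as an empty list.
by_initial = defaultdict(list)
for w in words:
    by_initial[w[0]].append(w)

# items() yields (key, value) pairs.
frequent = [w for w, c in counts.items() if c >= 2]

# Sets drop duplicates and support set algebra.
a = set(words)
b = {"banana", "durian"}
common = a & b      # intersection
only_a = a - b      # difference

print(counts, dict(by_initial), frequent, common, only_a)
```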

✅1 week mini project - Text analyzer

• Goal:
  ◦ Input: a text file
  ◦ Output: top 20 word frequencies, sentence count, average word length
  ◦ Save: write the results to result.json
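One possible shape for the analyzer, as a sketch only. The function names `analyze` and `run`, the word rule (`\w+`, lowercased) and the sentence rule (split on `.`, `!`, `?`) are assumptions for illustration, not requirements of the project.

```python
import json
import re
from collections import Counter
from pathlib import Path


def analyze(text: str, top_n: int = 20) -> dict:
    # Words: runs of letters/digits/underscore, lowercased (a simplifying assumption).
    words = re.findall(r"\w+", text.lower())
    # Sentences: non-empty chunks between ., ! or ? (a rough heuristic).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    avg_len = sum(len(w) for w in words) / len(words) if words else 0.0
    return {
        "top_words": Counter(words).most_common(top_n),
        "sentence_count": len(sentences),
        "avg_word_length": avg_len,
    }


def run(in_path: Path, out_path: Path) -> dict:
    # Read the input file, analyze it, and store the result as JSON.
    result = analyze(in_path.read_text(encoding="utf-8"))
    out_path.write_text(json.dumps(result, ensure_ascii=False, indent=2), encoding="utf-8")
    return result


sample = analyze("I like Python. I like data! Do you?")
print(sample)
```

With real files this would be called as `run(Path("data/raw/input.txt"), Path("result.json"))`, where the input path is a placeholder.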

2 weeks

20260126 - Conditions/branching (if)
• Goal: learn to control program flow with comparison/logical operations
□ Practice comparison operators (==, !=, >, >=, <, <=)
□ Logical operators (and/or/not) + check their precedence
□ Handle range-based conditions with if/elif/else (fee/grade calculator)
□ Conditional expressions (ternary operator): 3 problems
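A small sketch for the branching items above. The grade bands (90/80/70) are hypothetical cut-offs picked for the example.

```python
def grade(score: int) -> str:
    # Check the highest band first so each elif only needs a lower bound.
    if score >= 90:
        return "A"
    elif score >= 80:
        return "B"
    elif score >= 70:
        return "C"
    else:
        return "F"


def parity(n: int) -> str:
    # Conditional expression (ternary): value_if_true if condition else value_if_false
    return "even" if n % 2 == 0 else "odd"


# Precedence: not binds tighter than and, which binds tighter than or.
precedence_demo = True or False and False     # parsed as True or (False and False)
grouped_demo = (True or False) and False      # parentheses change the result

print(grade(85), parity(7), precedence_demo, grouped_demo)
```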
20260127 - Loops (for/while)
• Goal: build muscle memory for iteration patterns (accumulate/search/filter)
□ for basics (iterating over lists/strings/dict items)
□ while basics + using break/continue
□ Implement running sum/max/min by hand (once without built-ins)
□ One nested loop problem (two levels at most)
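The accumulate, search and nested-loop patterns above might look like this. Function names are illustrative.

```python
def running_stats(values: list[int]) -> tuple[int, int, int]:
    """Return (total, largest, smallest) without sum(), max() or min()."""
    if not values:
        raise ValueError("values must not be empty")
    total = 0
    largest = smallest = values[0]
    for v in values:
        total += v
        if v > largest:
            largest = v
        if v < smallest:
            smallest = v
    return total, largest, smallest


def first_negative_index(values: list[int]) -> int:
    """Search pattern with while + break; returns -1 when nothing matches."""
    i = 0
    found = -1
    while i < len(values):
        if values[i] < 0:
            found = i
            break
        i += 1
    return found


def multiplication_pairs(limit: int) -> list[tuple[int, int, int]]:
    """Nested loop (two levels); continue skips the a == b case."""
    out = []
    for a in range(1, limit + 1):
        for b in range(1, limit + 1):
            if a == b:
                continue
            out.append((a, b, a * b))
    return out


print(running_stats([4, -2, 9, 0]), first_negative_index([3, 5, -1, -7]), multiplication_pairs(2))
```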
20260128 - Functions
• Goal: design inputs/outputs + build the habit of reusing code
□ Defining/calling functions, understanding return
□ 5 examples of default/keyword arguments
□ Write 2 "validation functions" (e.g. email / numeric range)
□ Refactor 3 small pieces of logic by splitting them into functions
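A sketch of default/keyword arguments and two validators. The email rule here is deliberately loose and is an assumption for practice, not a full address validator.

```python
def greet(name: str, greeting: str = "Hello", *, punctuation: str = "!") -> str:
    # greeting has a default; punctuation is keyword-only because of the bare *.
    return f"{greeting}, {name}{punctuation}"


def is_valid_email(email: str) -> bool:
    # Loose check: exactly one "@", a non-empty user part, a dot in the domain.
    user, sep, domain = email.strip().partition("@")
    return bool(user) and sep == "@" and "." in domain and "@" not in domain


def in_range(value: float, low: float, high: float) -> bool:
    # Chained comparison: low <= value and value <= high.
    return low <= value <= high


print(greet("Jaenoo"))
print(greet("Jaenoo", greeting="Hi", punctuation="."))
print(is_valid_email(" admin@company.co.kr "), in_range(200, 100, 599))
```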
20260129 - Exception handling (try/except)
• Goal: write "code that doesn't die" (handle the failure cases)
□ Summarize the try/except/else/finally flow
□ Practice handling FileNotFoundError / ValueError
□ Retry logic for user input (up to 3 attempts)
□ Create 2 cases that raise exceptions with raise
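A sketch covering the four items above. `read_number` and `ask_positive_int` are hypothetical helpers; the `ask` callable stands in for `input()` so the retry loop can be exercised without a keyboard.

```python
from pathlib import Path
from typing import Callable, Optional


def read_number(path: Path) -> Optional[int]:
    try:
        text = path.read_text(encoding="utf-8")
        value = int(text.strip())
    except FileNotFoundError:
        print(f"missing file: {path}")
        return None
    except ValueError as exc:
        print(f"not an integer in {path}: {exc}")
        return None
    else:
        # Runs only when the try block raised nothing.
        return value
    finally:
        # Runs in every case, even after a return above.
        print(f"finished reading {path}")


def ask_positive_int(ask: Callable[[], str], max_tries: int = 3) -> int:
    """Retry up to max_tries times, then give up with RuntimeError."""
    for attempt in range(1, max_tries + 1):
        raw = ask()
        try:
            value = int(raw)
            if value <= 0:
                raise ValueError(f"must be positive, got {value}")
            return value
        except ValueError as exc:
            print(f"attempt {attempt}/{max_tries} failed: {exc}")
    raise RuntimeError(f"no valid input after {max_tries} tries")
```

Interactive use would be `ask_positive_int(lambda: input("number: "))`.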
20260130 - OOP basics (class)
• Goal: get a feel for modeling real-world problems with objects/classes
□ Summarize the concepts: class / __init__ / methods / attributes
□ Build a Dataset class (load/save/summary)
□ Create 2 objects and observe how their state changes
□ (Optional) Implement __repr__ for readable output
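One way the Dataset class could be laid out. Storing rows as a list of dicts and persisting them as JSON are assumptions made for this sketch.

```python
import json
from pathlib import Path
from typing import Optional


class Dataset:
    """Hypothetical container for a list of row dicts, stored as JSON."""

    def __init__(self, name: str, rows: Optional[list] = None) -> None:
        self.name = name
        # A fresh list per instance, so two objects never share state.
        self.rows = list(rows) if rows else []

    def load(self, path: Path) -> None:
        self.rows = json.loads(path.read_text(encoding="utf-8"))

    def save(self, path: Path) -> None:
        path.write_text(json.dumps(self.rows, ensure_ascii=False, indent=2), encoding="utf-8")

    def summary(self) -> dict:
        columns = sorted({key for row in self.rows for key in row})
        return {"name": self.name, "rows": len(self.rows), "columns": columns}

    def __repr__(self) -> str:
        return f"Dataset(name={self.name!r}, rows={len(self.rows)})"


a = Dataset("a")
b = Dataset("b", [{"id": 1, "score": 90}])
a.rows.append({"id": 2})   # changes a only; b keeps its own rows
print(a, b, b.summary())
```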

✅2 weeks mini project - CSV cleanup tool

• Goal: finish a "small tool" using file I/O + exception handling + functions (or classes)
□ Read the input CSV (optional: do it once with the csv module instead of read_csv)
□ Decide the rules for missing values (drop/replace)
□ Convert number/date types + handle the failure cases with exceptions
□ Save the result as cleaned.csv + generate a summary report (report.txt)
□ Briefly document "usage/input/output/rules" in the README
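A sketch of the cleanup flow using only the csv module. The column names (`id`, `amount`, `date`) and the rules (drop rows without an id, treat an empty amount as 0, reject unparseable values) are hypothetical; in-memory `io.StringIO` buffers stand in for the input file and cleaned.csv.

```python
import csv
import io
from datetime import datetime


def clean_rows(reader: csv.DictReader) -> tuple[list[dict], list[str]]:
    """Return (cleaned rows, problem messages) under the hypothetical rules."""
    cleaned, problems = [], []
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        if not (row.get("id") or "").strip():
            problems.append(f"line {line_no}: missing id, dropped")
            continue
        try:
            amount = float(row["amount"]) if row["amount"].strip() else 0.0
            date = datetime.strptime(row["date"].strip(), "%Y-%m-%d").date()
        except ValueError as exc:
            problems.append(f"line {line_no}: {exc}")
            continue
        cleaned.append({"id": row["id"].strip(), "amount": amount, "date": date.isoformat()})
    return cleaned, problems


raw = "id,amount,date\n1,10.5,2026-01-26\n,3,2026-01-27\n2,,2026-01-28\n3,abc,2026-01-29\n"
rows, problems = clean_rows(csv.DictReader(io.StringIO(raw)))

out = io.StringIO()  # stands in for cleaned.csv
writer = csv.DictWriter(out, fieldnames=["id", "amount", "date"])
writer.writeheader()
writer.writerows(rows)

# Text that would go into report.txt.
report = f"kept={len(rows)} dropped={len(problems)}\n" + "\n".join(problems)
print(report)
```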

3 weeks

20260202 - NumPy essentials
• Goal: array-oriented thinking + a first taste of vectorization
□ Work with ndarray creation/shape/dtype
□ Indexing/slicing/boolean mask: 5 problems
□ Basic statistics (mean/std/min/max) + conditional filtering
□ (Optional) 2 broadcasting examples
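A compact sketch of the NumPy items above, using a made-up 2x3 score table.

```python
import numpy as np

scores = np.array([[70, 85, 90], [55, 60, 95]])   # sample data, shape (2, 3)
print(scores.shape, scores.dtype)

first_row = scores[0]            # indexing
last_two_cols = scores[:, 1:]    # slicing: every row, columns 1 onward
high = scores[scores >= 85]      # boolean mask returns a 1-D array

col_mean = scores.mean(axis=0)   # per-column mean
overall = (scores.mean(), scores.std(), scores.min(), scores.max())

# Broadcasting: a length-3 row is stretched across both rows.
bonus = np.array([0, 5, 10])
adjusted = scores + bonus
# Broadcasting with a column: subtract each row's own mean.
centered = scores - scores.mean(axis=1, keepdims=True)

print(high, col_mean, adjusted, centered, sep="\n")
```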
20260203 - Pandas I (I/O / lookup / cleaning)
• Goal: build a routine for basic tabular data manipulation
□ read_csv/read_json + saving with to_csv
□ Summarize the loc/iloc difference with examples
□ Filtering (multiple conditions): 5 problems
□ Missing values: compare and summarize isna/fillna/dropna
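A sketch of the basic Pandas routine. The table is sample data, and `io.StringIO` stands in for real file paths so the example runs without any files.

```python
import io

import pandas as pd

csv_text = "name,team,score\nkim,A,82\nlee,B,\npark,A,95\nchoi,B,60\n"  # sample data
df = pd.read_csv(io.StringIO(csv_text))

by_label = df.loc[0, "name"]       # loc: row label + column name
by_position = df.iloc[0, 0]        # iloc: integer positions only

# Multiple conditions need & or | plus parentheses around each comparison.
team_a_high = df[(df["team"] == "A") & (df["score"] >= 90)]

missing_mask = df["score"].isna()                    # True where the value is missing
filled = df.fillna({"score": df["score"].mean()})    # replace missing values
dropped = df.dropna(subset=["score"])                # or remove those rows

out = io.StringIO()                # stands in for a file path
dropped.to_csv(out, index=False)
print(out.getvalue())
```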
20260204 - Pandas II (groupby/merge)
• Goal: learn the core analysis patterns (aggregation/join)
□ groupby aggregation: 5 problems (count/sum/mean)
□ Build one pivot_table (optional)
□ Join twice with merge (primary-key style / composite-key style)
□ Complete the "preprocess → aggregate → save results" flow once
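A sketch of the join and aggregation patterns. The three small tables (`orders`, `users`, `targets`) are invented for illustration.

```python
import pandas as pd

orders = pd.DataFrame({
    "user_id": [1, 1, 2, 3],
    "month": ["01", "02", "01", "01"],
    "amount": [100, 50, 70, 30],
})
users = pd.DataFrame({"user_id": [1, 2, 3], "team": ["A", "A", "B"]})
targets = pd.DataFrame({
    "user_id": [1, 1, 2],
    "month": ["01", "02", "01"],
    "target": [80, 80, 60],
})

# Join on a single key (primary-key style).
joined = orders.merge(users, on="user_id", how="left")
# Join on a composite key; rows without a matching target get NaN.
with_target = joined.merge(targets, on=["user_id", "month"], how="left")

# Aggregate: count/sum/mean of amount per team.
summary = joined.groupby("team")["amount"].agg(["count", "sum", "mean"]).reset_index()

# Optional: team x month totals.
pivot = joined.pivot_table(index="team", columns="month", values="amount",
                           aggfunc="sum", fill_value=0)

print(summary)
print(pivot)
```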
20260205 - REST API (requests)
• Goal: fetch JSON from an API and take it through to saving/analysis
□ requests.get + handling status_code/timeout
□ Call with params (query string)
□ Parse JSON → convert to DataFrame → save as csv
□ Add exception handling for failure cases (429/500, etc.)
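A hedged sketch of a fetch helper for the items above. `fetch_json`, the set of retryable status codes, and the 1s/2s wait schedule are all assumptions made for illustration; only `requests.get`, its `params`/`timeout` arguments, `status_code`, `json()` and `requests.exceptions.RequestException` come from the library.

```python
import time
from typing import Optional

import requests

RETRY_STATUSES = {429, 500, 502, 503, 504}  # assumed "worth retrying" set


def fetch_json(url: str, params: Optional[dict] = None,
               retries: int = 3, timeout: float = 5.0):
    """Return the parsed JSON body, or None if every attempt fails."""
    for attempt in range(1, retries + 1):
        try:
            resp = requests.get(url, params=params, timeout=timeout)
        except requests.exceptions.RequestException as exc:
            # Covers timeouts, refused connections, malformed URLs, and so on.
            print(f"attempt {attempt}: request failed: {exc}")
        else:
            if resp.status_code == 200:
                return resp.json()
            if resp.status_code not in RETRY_STATUSES:
                print(f"giving up: HTTP {resp.status_code}")
                return None
            print(f"attempt {attempt}: HTTP {resp.status_code}, will retry")
        if attempt < retries:
            time.sleep(2 ** (attempt - 1))  # wait 1s, 2s, ... between attempts
    return None
```

A call would look like `fetch_json("https://api.example.com/items", params={"page": 1})`, where the URL is a placeholder. If the response is a list of flat dicts, `pd.DataFrame(data).to_csv("items.csv", index=False)` covers the save step.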
20260206 - Web scraping (BeautifulSoup)
• Goal: extract the data you want from HTML
□ Create a soup + summarize the find/select difference
□ Extract data from lists/tables
□ Convert the extracted results to a DataFrame and save
□ Leave a note on checking robots.txt / request intervals (etiquette)
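A sketch of find versus select and of table extraction. The HTML is an inline sample written for this example, so nothing is fetched over the network.

```python
from bs4 import BeautifulSoup

html = """
<ul id="langs"><li class="lang">Python</li><li class="lang">Go</li></ul>
<table id="prices">
  <tr><th>item</th><th>price</th></tr>
  <tr><td>apple</td><td>1200</td></tr>
  <tr><td>pear</td><td>900</td></tr>
</table>
"""  # inline sample so the sketch runs without a request
soup = BeautifulSoup(html, "html.parser")

first_li = soup.find("li", class_="lang")       # find: first match, or None
all_li = soup.select("ul#langs > li.lang")      # select: CSS selector, returns a list
langs = [li.get_text(strip=True) for li in all_li]

rows = []
for tr in soup.select("table#prices tr")[1:]:   # skip the header row
    item, price = [td.get_text(strip=True) for td in tr.find_all("td")]
    rows.append({"item": item, "price": int(price)})

print(first_li.get_text(), langs, rows)
```

From here `pd.DataFrame(rows)` gives the table for saving. For the etiquette note, the standard library's `urllib.robotparser.RobotFileParser` can check robots.txt, and `time.sleep` between requests keeps the request rate polite.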

✅3 weeks capstone: web data collection → cleaning → analysis

• Goal: complete one "API + scraping + Pandas" pipeline
□ Collect API data (JSON) → save to raw
□ Collect scraped data (HTML) → save to raw
□ After cleaning, build a summary table with merge + groupby
□ Save processed.csv + write a notebook report (TIL)
□ Minimum requirements: at least 1 merge, 1 groupby, and 1 exception handler

4 weeks

20260209 - Refactoring (functions/modules) + project structure
• Goal: turn a "notebook-only project" → "reusable code"
□ Organize into the notebooks/ src/ data/ reports/ structure
□ Move the core logic into src (separate collection/cleaning/analysis functions)
□ Split out config (constants/paths/URLs)
□ Make re-runs produce the same result (idempotent)
20260210 - Logging/error handling/a taste of CLI
• Goal: have at least a minimal production feel (running/logs/errors)
□ Use logging instead of print (default level INFO)
□ Tidy up failure messages (where it failed and why)
□ Build a simple CLI with argparse (e.g. --start-date, --out)
□ (Optional) Apply a retry (backoff) once
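A sketch that combines logging and argparse. The logger name, the default output path and the "start date must not be in the future" rule are assumptions; `--start-date` and `--out` are the flags named in the checklist.

```python
import argparse
import logging
from datetime import date

logger = logging.getLogger("pipeline")  # hypothetical logger name


def parse_args(argv=None) -> argparse.Namespace:
    parser = argparse.ArgumentParser(description="Run the data pipeline (sketch).")
    parser.add_argument("--start-date", type=date.fromisoformat, required=True,
                        help="first day to process, YYYY-MM-DD")
    parser.add_argument("--out", default="data/processed/result.csv",
                        help="output path (this default is an assumption)")
    return parser.parse_args(argv)


def main(argv=None) -> int:
    logging.basicConfig(level=logging.INFO,
                        format="%(asctime)s %(levelname)s %(name)s: %(message)s")
    args = parse_args(argv)
    logger.info("start_date=%s out=%s", args.start_date, args.out)
    try:
        if args.start_date > date.today():
            raise ValueError(f"start date {args.start_date} is in the future")
        # ... collection / cleaning / analysis would go here ...
    except ValueError:
        # logger.exception records the message plus the traceback.
        logger.exception("pipeline failed during argument validation")
        return 1
    logger.info("done")
    return 0
```

In a script this would end with `if __name__ == "__main__": raise SystemExit(main())` and be run as `python src/pipeline.py --start-date 2026-02-10 --out out.csv`, where the script path is a placeholder.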
20260211 - Tests/documentation/README: portfolio wrap-up
• Goal: finish in a state that is "immediately understandable" once pushed to GitHub
□ Test 2-3 core functions with pytest
□ Write requirements.txt
□ Document install/run/example results/data sources in the README
□ Summarize "lessons learned/troubleshooting/improvement points" in the TIL
□ (Optional) Add 1 result summary graph (matplotlib)
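A sketch of what a pytest file for two of the string helpers could look like. The two functions are copied inline from the day 2 code so the example is self-contained; in the real layout they would presumably be imported from src instead. pytest collects any function named `test_*` and reports failed `assert` statements.

```python
def extract_user_domain(email: str) -> tuple[str, str]:
    # Copied from the day 2 string exercises.
    email = email.strip()
    at = email.find("@")
    if at == -1:
        raise ValueError(f"Invalid email: {email!r}")
    return email[:at], email[at + 1:]


def clean_csv_like(s: str) -> list[str]:
    # Copied from the day 2 string exercises.
    return [p.strip() for p in s.strip().split(",")]


def test_extract_user_domain_strips_whitespace():
    assert extract_user_domain(" admin@company.co.kr ") == ("admin", "company.co.kr")


def test_extract_user_domain_rejects_missing_at():
    # With pytest imported, pytest.raises(ValueError) is the more idiomatic form.
    try:
        extract_user_domain("no-at-sign")
    except ValueError:
        return
    raise AssertionError("expected ValueError")


def test_clean_csv_like_trims_each_part():
    assert clean_csv_like(" a, b ,c ") == ["a", "b", "c"]
```

Saved as a file whose name starts with `test_`, this runs with the `pytest` command.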