
Agents Learn Their Runtime: Interpreter Persistence as Training-Time Semantics

Created by
  • Haebom

Authors

Victor May, Aaditya Salgarkar, Yishan Wang, Diganta Misra, Huu Nguyen

💡 Overview

This study highlights a gap between how tool-using LLM agents are trained and the environments in which they actually run. When an agent interleaves reasoning with Python execution, the execution environment preserves state from earlier steps, yet common training approaches overlook this and do not teach the agent to rely on state persistence. Using a procedurally generated optimization task called 'Opaque Knapsack', the authors analyze how much state persistence matters and show that agents explicitly trained with state persistence are more efficient and more stable in the real execution environment.
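To make the distinction concrete, here is a minimal sketch of the two runtime semantics the overview contrasts. It is not the authors' harness: the helper `run_steps`, the `persistent` flag, and the two-step trajectory are all hypothetical and chosen only for illustration. A step that reuses a name defined earlier succeeds when the interpreter keeps its namespace and fails when each step starts from a fresh one.

```python
def run_steps(steps, persistent):
    """Execute code snippets in order; return one outcome string per step.

    persistent=True  -> all steps share one namespace (state survives).
    persistent=False -> each step gets a fresh namespace (state is lost).
    """
    shared = {}
    outcomes = []
    for code in steps:
        namespace = shared if persistent else {}
        try:
            exec(code, namespace)
            outcomes.append("ok")
        except Exception as err:  # record the failure instead of aborting
            outcomes.append(type(err).__name__)
    return outcomes


# Hypothetical two-step trajectory: step 2 relies on a name defined in step 1.
steps = [
    "items = {'a': 3, 'b': 5}",
    "total = sum(items.values())",
]

persistent_result = run_steps(steps, persistent=True)
stateless_result = run_steps(steps, persistent=False)
print(persistent_result)
print(stateless_result)
```

Under these assumptions, an agent that learned to write the second step in a persistent setting would hit an undefined-name error if it were deployed against a stateless runtime, which is the kind of train/runtime mismatch the overview describes.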

🔑 Implications and Limitations

• The execution semantics of the training data must match the runtime's state persistence at inference time for the agent to gain efficiency and stability.
• Teaching the state-persistence mechanism at training time reduces unnecessary state recomputation and errors at runtime, which improves token usage and execution stability.
• Because the study is limited to one specific type of optimization problem, the effect of state-persistence training still needs further validation across diverse real-world scenarios.
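The recomputation cost mentioned in the second point can be sketched roughly as follows. This is an illustration under stated assumptions, not a measurement from the paper: the helper `emitted_code` and the three step bodies are hypothetical, and character count stands in as a crude proxy for token usage.

```python
def emitted_code(step_bodies, persistent):
    """Return the code an agent would have to send at each step.

    Under persistent semantics each step sends only its new code.
    Under stateless semantics each step must resend every earlier
    step as well, because nothing survives between executions.
    """
    sent = []
    for i, body in enumerate(step_bodies):
        if persistent:
            sent.append(body)
        else:
            sent.append("\n".join(step_bodies[: i + 1]))
    return sent


# Hypothetical step bodies; any sequence with dependencies would do.
step_bodies = [
    "weights = [2, 3, 4]",
    "values = [3, 4, 5]",
    "best = max(zip(values, weights))",
]

# Character count is only a rough stand-in for tokens (an assumption).
persistent_chars = sum(len(s) for s in emitted_code(step_bodies, True))
stateless_chars = sum(len(s) for s in emitted_code(step_bodies, False))
print(persistent_chars, stateless_chars)
```

In this toy setting the stateless variant grows roughly quadratically with the number of steps, since every step repeats all earlier code, while the persistent variant grows linearly.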