읽은 목적

다가오는 AIMO phase3에서, 자연어 Math Word Problem을 Lean4 코드로 맵핑하기 위해 모델 훈련 레시피가 필요
어떤 데이터셋을 썻고, 어떤 훈련 방법을 썻는지?

Notes

방법론

자연어 수학 문제를 읽고 formal(Lean4로 된) statement를 생성
model scoring과 hypothesis rejection을 통해 생성한 statement 중high-quality만 남김
statement를 DeepSeek-prover가 Lean4로 된 proof을 생성. 생성한 proof은 Lean4로 돌려서 검증. 이렇게 하면 좋은 formalized statements, proof 쌍을 얻는다.
이렇게 얻은 자연어 수학문제 - Lean4 statements, proof 으로 DeepSeek-prover를 FT하고 더이상의 성능 상승이 없을때까지 1단계부터 다시 반복

이러면 초기 모델부터 proof을 어느정도 잘 생성한다는 소리가 되는데, 알고보니 DeepSeek-Math Base를 쓴다
3단계를 할 때, original statement과 그 반증을 동시에 트라이. 만일 반증이 성공하면 즉시 original statement를 버림으로서 효율적인 프로세스
MMA 데이터셋은 Lean4 mathlib에 있는 statements를 GPT-4를 이용해 자연어 문제로 역번역 한 데이터셋
MMA 데이터셋을 이용해 모델이 MWP를를 Lean4 formal statement로 번역하게 instruct

Prompt

Mathematical Problem in Natural Language:
{informal statement with answers}
Translate the problem to Lean4 (only the core declaration):
```Lean4
 
**Response**
{formal statement}

암튼 이런 방식으로 정제된 좋은 712073개의 데이터를 포함한 corpus 만듦
statement P에 대해, 를 동시에 증명 시도. 둘 중 하나 검증되면 증명 종료
miniF2F 데이터셋을 Lean4로 변환한 LeanDojo project 데이터셋이 있음

실험 세팅

Base model: DeepSeekMath-Base 7B (tranformer decoder 기반)
Corpus: 120B math-related tokens
batch size: 512
learning rate: 1e-4(constant)
warmup steps: 6000

Takeaways

위 훈련 방식을 grokking으로 적용하려면 MWP를 formal statement로 바꾸는 태스크, formal statement의 formal proof을 작성하는 task로 단계적으로 분리해야 할 듯
근데 loss는 뭘 썻는지 등 자세한 훈련 레시피는 안알려주네

RicePunchb's Garden

탐색기

DeepSeek-Prover 1

읽은 목적

Notes

방법론

실험 세팅

Takeaways

그래프 뷰

목차