feat: implement jsoup crawler API #2
Merged
Paley-Z
approved these changes
May 27, 2026
Collaborator
Author
For now we will keep managing this within the same project directory until the scope grows larger. Please keep that in mind.
hwangbohye03
approved these changes
May 27, 2026
2mhh
approved these changes
May 27, 2026
Pull Request
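Overview
Crawls the 훈련대상자요건 (training target requirements) and 훈련목표 (training goal) items in the training course overview section using jsoup.
Changes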
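- Added the org.jsoup:jsoup:1.22.2 dependency.
- Added the POST /api/work24/training-course-overview/crawl API.
- Placed the new code under app.work24.crawler.
- Implemented text extraction from the #traCrseinfo table, keyed by item name, with line-break normalization.
Related issues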
Tests
.\gradlew.bat test
Checklist
API changes
POST /api/work24/training-course-overview/crawl
{
  "url": "https://www.work24.go.kr/hr/a/a/3100/selectTracseDetl.do?tracseId=AIG20250000501645&tracseTme=4&crseTracseSe=C0061&trainstCstmrId=500020021537",
  "outputPath": "build/crawled/work24-training-course-overview.json"
}
url and outputPath are optional; when omitted, the configured defaults are used.
{
  "sourceUrl": "https://www.work24.go.kr/hr/a/a/3100/selectTracseDetl.do?...",
  "savedPath": "C:\\dev\\AIBE5_FinalProject_Team5_BE\\build\\crawled\\work24-training-course-overview.json",
  "trainingTargetRequirements": "...",
  "trainingGoal": "...",
  "crawledAt": "2026-05-27T07:27:38.648283900Z"
}
An IllegalStateException is thrown. Once a common exception response scheme is in place, the mapping will need to be extended.
Review points
The selector (#traCrseinfo table) may be affected.
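
For reviewers, here is a rough plain-Java sketch of two behaviours this PR describes: normalizing line breaks in the extracted text, and falling back to a configured default when url or outputPath is omitted. It is not the code added in this PR. The class name Work24CrawlSketch, both method names, and the exact normalization rules (unify line endings, trim each line, drop empty lines) are assumptions made for illustration, and the default values are passed in as arguments because the actual configured defaults are not shown here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not the class added in this PR.
final class Work24CrawlSketch {
    private Work24CrawlSketch() {}

    /** Returns the requested value, or the configured default when it is null or blank. */
    static String orDefault(String requested, String configuredDefault) {
        return (requested == null || requested.isBlank()) ? configuredDefault : requested;
    }

    /**
     * Normalizes line breaks in text extracted from a table cell.
     * Assumed rules: unify \r\n and \r to \n, turn non-breaking spaces into
     * regular spaces, collapse runs of spaces/tabs, trim each line, drop empty lines.
     */
    static String normalizeLineBreaks(String raw) {
        if (raw == null) {
            return "";
        }
        String unified = raw.replace("\r\n", "\n").replace('\r', '\n');
        List<String> kept = new ArrayList<>();
        for (String line : unified.split("\n")) {
            String cleaned = line.replace('\u00A0', ' ')
                    .replaceAll("[ \\t]+", " ")
                    .strip();
            if (!cleaned.isEmpty()) {
                kept.add(cleaned);
            }
        }
        return String.join("\n", kept);
    }

    public static void main(String[] args) {
        // Sample input with mixed line endings and stray whitespace.
        System.out.println(normalizeLineBreaks("  Job seekers \r\n\r\n  Employed workers\t "));
        // A missing request value falls back to the supplied default.
        System.out.println(orDefault(null, "build/crawled/work24-training-course-overview.json"));
    }
}
```

Keeping both helpers free of jsoup and Spring types means they can be unit-tested without fetching the live page, which may help if the page structure changes later.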