Taemin Lee

Researcher @ HIAI Research.

Currently

Senior Researcher at Human-Inspired AI Research

Registered Director at Next Generation Artificial Intelligence Convergence Association

Specialized in

Programming, Artificial Intelligence, Database

Research interests

Neural Networks, Natural Language Processing

Education

2004-2008 BS Computer Science Engineer (3.46/4.50)

Korea University, Seoul, Korea

2008-2020 MS Computer Science (4.44/4.50)

Thesis: The Method for Predicting Real Estate Price Using Recurrent Neural Network

Advisor: Prof. Soonyoung Jung

Korea University, Seoul, Korea

Awards

2015 Best Paper, “Design and Implementation of Simulator for educating Parallel K-Means”, Korean Association of Computer Education Conference

2013 3rd place in the plagiarism detection competition, “CopyCaptor: Plagiarized Source Retrieval System using Global word frequency and Local feedback”, Notebook for PAN at CLEF 2013

Experience

2022~ Teaching Assistant (Korea University, Korea)

2019~ Part-time Lecturer (Korea National Open University, Korea)

2019~ Teaching Assistant (Korea National Open University, Korea)

2011-2014 Technical Research Personnel, alternative military service (Korea University, Seoul, Korea)

Open Source Contributions

2024 huggingface chat-ui

2024 vllm

2024 Massive Text Embedding Benchmark

2024 GLiNER

2020 Korean Sentence Splitter

Projects

2024 GLiNER-ko (HIAI Research)

2024 Ko-Gemma (HIAI Research)

2024- Development of core technology for vector embedding construction and similarity search, and prompting know-how (온더아이티, HIAI Research)

2023- Gaon-Large Language Model (가온플랫폼, HIAI Research)

2023 AI-based prediction model for mineral deposit locations (KIGAM, HIAI Research)

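2023- Research on technology development for building and using a construction standards library and ontology for the digital transformation of construction standards (Korea Institute of Civil Engineering and Building Technology, HIAI Research)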

2023 KULLM (HIAI Research)

2021- Human-Inspired AI Research (NRF, HIAI Research)

2022-2023 Development of Interactive Book Recommendation Device Through AI-based Voice Analysis (IITP, CHAEUM CNI, HIAI Research)

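2022-2023 Research on data standardization for the digitalization of construction standards (Korea Institute of Civil Engineering and Building Technology, HIAI Research)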

2021-2022 Development of AI customer center using MRC based automatic question answering system, (IITP, O2O, HIAI Research)

2021-2022 Prediction system for real-time online marketing performance based on AI, (IITP, BizSpring, HIAI Research)

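2021 Research on model validation and improvement measures for the EBS AI learning diagnosis system (EBS, HIAI Research)

2020-2022 AI structured interview and scoring system reflecting the characteristics of face-to-face interviews (IITP, Withmind, HIAI Research)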

2019-2020 REPAN, real estate price prediction, (REPAN, WITHCAT)

2016-2020 Genetree Realtime PCR Viewer, (Genetree, WITHCAT)

2012-2020 Biosewoom Realtime PCR viewer, (Biosewoom, KnP Solution, WITHCAT)

2015-2020 The development of chemical accident behavior analysis/damage prediction models and environmental risk map technology (Korea Environmental Industry and Technology Institute, Korea University)

2018-2019 Educhain: Research on a blockchain-based decentralized online judge coding DApp (NRF, Korea University)

2018 Roboreport, (Samsung startup incubating program, Connectdot)

2016-2018 Multidimensional Semantics-based Predictive Mobility Aggregation Analysis (National Research Foundation of Korea, Principal Investigator: Prof. Soonyoung Jung, Korea University)

2016-2018 A research of personalized coding learning system using Lexical-Syntactic-Semantic feature extraction from code for developing algorithmic thinking, (NRF, Korea University)

2017 Language rehabilitation training, (MOHW, MAAUM)

2016-2017 Unitek Lion6, (Unitek, KnP Solution)

2015 Language rehabilitation program, (MOHW, Korea University)

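2013-2015 Research on cluster-based distributed management and query processing technology for very large-scale trajectory data to support next-generation LBS (NRF, Korea University)

2012-2013 Structural analysis of convergent content and development of a cooperation support/R&D framework (NRF, Korea University)

2011-2013 Development of 3D spatial query technology to support real-time location-based mobile augmented reality services (NRF, Korea University)

2008-2012 Environmental toxicity and the development of endocrine/nervous system diseases, and construction of an integrated environmental toxicity database (NRF, Korea University)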

Publications

2024 “Intelligent Predictive Maintenance RAG framework for Power Plants: Enhancing QA with StyleDFS and Domain Specific Instruction Tuning”, SeongTae Hong, Shin Joong Min, Jaehyung Seo, Taemin Lee, Jeongbae Park, Cho Man Young, Byeongho Choi, Heuiseok Lim, The 2024 Conference on Empirical Methods in Natural Language Processing (Industry Track)

2024 “KoE5: A New Dataset and Model for Improving Korean Embedding Performance”, Youngjoon Jang, Junyoung Son, Chanjun Park, Soonwoo Choi, Byeonggoo Lee, Taemin Lee, Heuiseok Lim, Proceedings of the 36th Annual Conference on Human and Cognitive Language Technology, pp. 239-244

2024 “Building Korean Embedding Benchmarks with Large Language Models”, Junyoung Son, Youngjoon Jang, Soonwoo Choi, Byeonggoo Lee, Taemin Lee, Heuiseok Lim, Proceedings of the 36th Annual Conference on Human and Cognitive Language Technology, pp. 182-186

2024 “X-Optimization for Language Transfer”, Jungseob Lee, Hyeonseok Moon, Taemin Lee, Kinam Park, Heuiseok Lim, Proceedings of the 36th Annual Conference on Human and Cognitive Language Technology, pp. 89-93

2024 “Topic Analysis of Population Aging Using Natural Language Processing Techniques”, Hyunjung Park, Taemin Lee, Heuiseok Lim, Journal of Information Technology Services (한국IT서비스학회지), 23(1), 55-79. 10.9716/KITS.2024.23.1.055

2023 “Assessing the Retrieval-based Generation Capabilities of Large Language Models: A Call for a New Benchmark”, Jungseob Lee, Junyoung Son, Taemin Lee, Chanjun Park, Myunghoon Kang, Yuna Hur, ICICPE 2023, 9-11

2023 “Evaluating ChatGPT Proficiency in Table QA: Exploring Generative Models in Structured Data Understanding”, JoongMin Shin, Yuna Hur, Taemin Lee, Sungmin Ahn, JeongBae Park, ICICPE 2023

2023 “KULLM: Learning to Construct Korean Instruction-following Large Language Models”, Seungjun Lee, Taemin Lee, Jeongwoo Lee, Yoonna Jang, Heuiseok Lim, Annual Conference on Human and Language Technology, 196-202

2023 “KFREB: Korean Fictional Retrieval-based Evaluation Benchmark for Generative Large Language Models”, Jungseob Lee, Junyoung Son, Taemin Lee, Chanjun Park, Myunghoon Kang, Jeongbae Park, Heuiseok Lim, Annual Conference on Human and Language Technology, 9-13

2023 “QA Pair Passage RAG-based LLM Korean chatbot service”, Joongmin Shin, Jaewwook Lee, Kyungmin Kim, Taemin Lee, Sungmin Ahn, JeongBae Park, Heuiseok Lim, Annual Conference on Human and Language Technology, 683-689

2023 “Comparative analysis of large language model Korean quality based on zero-shot learning”, Yuna Hur, Aram So, Taemin Lee, Joongmin Shin, JeongBae Park, Kinam Park, Sungmin Ahn, Heuiseok Lim, Annual Conference on Human and Language Technology, 722-725

2022 “Return on Advertising Spend Prediction with Task Decomposition-Based LSTM Model”, Hyeonseok Moon, Taemin Lee, Jaehyung Seo, Chanjun Park, Sugyeong Eo, Imatitikua D. Aiyanyo, Jeongbae Park, Aram So, Kyoungwha Ok and Kinam Park, Mathematics, 1637

2022 “Dense-to-Question and Sparse-to-Answer: Hybrid Retriever System for Industrial Frequently Asked Questions”, Jaehyung Seo, Taemin Lee, Hyeonseok Moon, Chanjun Park, Sugyeong Eo, Imatitikua D. Aiyanyo, Kinam Park, Aram So, Sungmin Ahn and Jeongbae Park, Mathematics, 1335

2021 “Semantic Similarity Retriever for Industrial Question Answering System”, Taemin Lee, Jeongbae Park, Jaehyung Seo, Heuiseok Lim, International conference on interdisciplinary research on computer science, psychology, and education

2021 “Prediction of Return on Advertising Spend Leveraging Keyword Information”, Hyeonseok Moon, Sugyeong Eo, Taemin Lee, Kinam Park, Jaehwa Chung, Heuiseok Lim, International conference on interdisciplinary research on computer science, psychology, and education

2020 “Korean Q&A Chatbot for COVID-19 News Domains Using Machine Reading Comprehension”, Lee, Taemin and Park, Kinam and Park, Jeongbae and Jeong, Younghee and Chae, Jeongmin and Lim, Heuiseok, Annual Conference on Human and Language Technology, 540-542

2018 “Information retrieval service for chemical accident response using messenger Chat bot”, Jung, Hyun-Do and Lee, Tae-Min and Choi, Woo-Sung and Jung, Soon-Young, Proceedings of the Korea Information Processing Society Conference, 467-469

2018 “The Retrieval of Regions with Similar Tendency in Geo-Tagged Dataset”, Taehyung Lim, Woosung Choi, Minseok Kim, Taemin Lee and Soonyoung Jung, CUTE 2019/CSA 2018 LNEE 536, 42-47

2018 “Spatial Clustering based Meteorological Fields Construction for Regional Vulnerability Assessment”, Lee, Taemin and Choi, Woosung and Sohn, Jongryuel and Moon, Kyongwhan and Byeon, Sanghoon and Lee, Wookyun and Jung, Soonyoung, INSIGHT-Indonesian Society for Knowledge and Human Development

2018 “Plagiarism Detection on Source Code using Deep Learning Sentence Embedding”, Taemin Lee, Daeun Cha, Soonyoung Jung, Jeongmin Chae, International conference on interdisciplinary research on computer science, psychology, and education

2018 “PyTicker: Relative Normalized Price Ratio Monitor between Cryptocurrency Exchanges”, Taemin Lee, International conference on interdisciplinary research on computer science, psychology, and education

2017 “Deep Representation of Raw Traffic Data: An Embed-and-Aggregate Framework for High-Level Traffic Analysis”, Woosung Choi, Jonghyeon Min, Taemin Lee, Kyeongseok Hyun, Taehyung Lim & Soonyoung Jung, CUTE 2017, CSA 2017: Advances in Computer Science and Ubiquitous Computing

2017 “A corpus-based approach to classifying emotions using Korean linguistic features”, Younghee Jung, Kinam Park, Taemin Lee, Jeongmin Chae & Soonyoung Jung, Cluster Computing, 583-595

2016 “Improvement of position accuracy of geocoded coordination based on Ensemble method”, Taemin Lee, Woosung Choi, Soonyoung Jung, Proceedings of the Korea Information Processing Society 2016 Fall Conference, 818-819

2015 “Online coding skill learning system for Teaching and learning C language”, Taemin Lee, Jeongmin Chae, Younghee Jung, Kinam Park, and Soonyoung Jung, Proceedings of the Korea Information Processing Society 2015 Fall Conference, 1659-1661

2015 “Design and Implementation of Simulator for educating Parallel K-Means”, Taemin Lee, WooSung Choi, SoonYoung Jung, Proceedings of the Korean Association of Computer Education Conference, 101-105

2015 “Design and Implementation of a DBSCAN Learning Simulator”, Woosung Choi, Taemin Lee, Soonyoung Jung, Proceedings of the Korean Association of Computer Education Conference, Vol. 19, No. 1 (2015)

2014 “Identifying non-elliptical entity mentions in a coordinated NP with ellipses”, Chae, Jeongmin and Jung, Younghee and Lee, Taemin and Jung, Soonyoung and Huh, Chan and Kim, Gilhan and Kim, Hyeoncheol and Oh, Heungbum, Journal of biomedical informatics, 139-152

2013 “CopyCaptor: Plagiarized source retrieval system using global word frequency and local feedback”, Lee, Taemin and Chae, Jeongmin and Park, Kinam and Jung, Soon Young, Working Notes for CLEF Conference

2013 “An unordered n-gram model for Plagiarism Detection”, Lee, Taemin and Chae, Jeong-min and Park, Ki-nam and Jung, Soon-Young, International Conference on Convergence Technology

2012 “A Study on the Key Competency of Converged Contents: Focused on the ICT and Eco-system Viewpoints”, Lee, Tae-Min and Jang, Hong-Jun and Jung, Young-hee and Kim, Ja-Mee and Jung, Soon-Young, Proceedings of the Korea Information Processing Society Conference, 1281-1283

2012 “Automatic extraction of user’s search intention from web search logs”, Park, Kinam and Jee, Hyesung and Lee, Taemin and Jung, Soonyoung and Lim, Heuiseok, Multimedia tools and applications

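2012 “A System for Integrated Linking and Use of Environmental Toxic Substances and Genomic Information”, Jung, Younghee and Lee, Taemin and Chae, Jeongmin and Jung, Soonyoung, Proceedings of the Korean Association of Computer Education Conference, 255-258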

2011 “The partial matching method for effective recognizing HLA entities”, Chae, Jeong-Min and Jung, Young-Hee and Lee, Tae-Min and Chae, Ji-Eun and Oh, Heung-Bum and Jung, Soon-Young, The Journal of Korean Association of Computer Education, 83-94

2010 “Extracting Search Intentions from Web Search Logs”, Kinam Park, Taemin Lee, Soonyoung Jung, Sangyep Nam and Heuiseok Lim

2010 “Development of the QSTR model generation software using machine learning methods”, Chan Huh, Gilhan Kim, Jeongmin Chae, TaeMin Lee, YoungHee Jung, SoonYoung Jung and Hyeoncheol Kim, ICONIC 2010

2009 “Automatic Extract User Intention from Web Search Log”, Kinam Park, Soonyoung Jung, Taewon Suh, Hyseung Ji, Taemin Lee, Heuiseok Lim, 2010 2nd International Conference on Information Technology Convergence and Services, 21-32

2009 “Automatic extraction of HLA-disease interaction information from biomedical literature”, JeongMin Chae, JiEun Chae, Taemin Lee, YoungHee Jung, HeungBum Oh, SoonYoung Jung, Advances in Computational Science and Engineering (SCOPUS), pp. 219-230 (2009)

Patents

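2022 “Task decomposition method based on prediction of return on advertising spend, and apparatus for performing the same”, Hyeonseok Moon, Heuiseok Lim, Kinam Park, Taemin Lee

2021 “Apparatus and method for obtaining responses based on semantic similarity”, Jaehyung Seo, Heuiseok Lim, Taemin Lee, Jeongbae Park

2018 “Chemical accident response information retrieval service using a messenger chatbot”, Soonyoung Jung, Hyun-Do Jung, Taemin Lee, Woosung Choi, Minseok Kim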

Demos

2022 AI Sentiment Analysis (with XAI)

2022 AI Customer Response Evaluation (with XAI)

2021 IPO opening/offering price prediction

2021 COVID-19 MRC + KakaoTalk chatbot demo

2020 COVID-19 MRC demo

2020 Everyday conversation generator demo (based on DialoGPT)

2020 Cosmetics review generator demo (based on CTRL)