Modern columnar data format for ML and LLMs implemented in Rust. Convert from parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, Du…

Rust 4,172 254 Updated Feb 12, 2025

Large Language Model Text Generation Inference

Python 9,734 1,139 Updated Feb 12, 2025

Secrets of RLHF in Large Language Models Part I: PPO

Python 1,314 95 Updated Mar 3, 2024

Speech Act and its Analysis for the (spoken) Korean Language: An Omnibus Description

7 Updated Apr 3, 2020

Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the most common NLP tasks.

Python 22,784 3,620 Updated Jul 28, 2024

Azure Cognitive Search + Azure OpenAI Accelerator

Jupyter Notebook 391 956 Updated Jan 28, 2025

ELECTRA-based language model for conversational Korean

Python 54 7 Updated Aug 4, 2021

KoLLaVA: Korean Large Language-and-Vision Assistant (feat.LLaVA)

Jupyter Notebook 285 30 Updated Sep 20, 2024

An Open-Ended Embodied Agent with Large Language Models

JavaScript 5,879 563 Updated Apr 3, 2024

A library that allows you to easily mock out tests based on AWS infrastructure.

Python 7,763 2,079 Updated Feb 11, 2025

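[2021 Hunminjeongeum Korean Speech and Natural Language AI Competition] Repository sharing the dialogue summarization training and inference code from team 알라꿍달라꿍 in the dialogue summarization track.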

Python 128 22 Updated Jul 11, 2022

CodeGen is a family of open-source models for program synthesis. Trained on TPU-v4. Competitive with OpenAI Codex.

Python 4,996 387 Updated Jan 31, 2025

Home of CodeT5: Open Code LLMs for Code Understanding and Generation

Python 2,888 438 Updated Jan 20, 2024

Google Research

Jupyter Notebook 34,873 8,005 Updated Feb 11, 2025

KoAlpaca: An open-source language model that understands Korean instructions

Jupyter Notebook 1,555 236 Updated Oct 25, 2024

Open-source Python library for querying public data

Jupyter Notebook 503 96 Updated Nov 30, 2024

The implementation of "Prismer: A Vision-Language Model with Multi-Task Experts".

Python 1,306 73 Updated Jan 17, 2024

Physical Symbolic Optimization

Python 1,859 258 Updated Dec 6, 2024

An open collection of implementation tips, tricks and resources for training large language models

Python 468 23 Updated Mar 8, 2023

Running large language models on a single GPU for throughput-oriented scenarios.

Python 9,258 558 Updated Oct 28, 2024

Efficient Retrieval Augmentation and Generation Framework

Python 1,453 132 Updated Jan 9, 2025

LlamaIndex is the leading framework for building LLM-powered agents over your data.

Python 38,813 5,542 Updated Feb 12, 2025

Named Entity Recognition Model for Naver NLP Challenge 2018 : BiLSTM-CRF model based Korean named entity tagger

Python 14 3 Updated Mar 24, 2023

Date, location, person, organization, time

23 1 Updated Jan 10, 2023

Library for crawling the DART system operated by Korea's Financial Supervisory Service

Python 330 112 Updated Feb 10, 2025

NER Task with KoBERT (with Naver NLP Challenge dataset)

Python 98 34 Updated Jun 12, 2023

Korean named entity recognizer built with KoBERT and CRF (BERT+CRF based Named Entity Recognition model for Korean)

Jupyter Notebook 495 109 Updated Feb 11, 2024

🦅 Pretrained BigBird Model for Korean (up to 4096 tokens)

Python 202 20 Updated Dec 28, 2023

김웅곤 - NLP basics implemented with TensorFlow and Keras (2020 edition)

Jupyter Notebook 180 94 Updated Apr 8, 2021