Translate notebook

ruke79 · Jul 31, 2021 · ca3e5f3
1 parent aec01fe
commit ca3e5f3
Showing 1 changed file with 18 additions and 18 deletions.
diff --git a/notebooks/top_k.ipynb b/notebooks/top_k.ipynb
@@ -4,11 +4,11 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "# The top k approach\n",
+    "# 탑-k 방법\n",
     "\n",
-    "The top-k method is a useful method to inspect a model's results. It simply consists of looking at the **most and least successful examples** to identify patterns within them. These patterns can then be used to engineer new features, or iterate on existing ones.\n",
+    "탑-k 방법은 모델의 결과를 조사하는 유용한 방법입니다. 단순하게 **가장 성공적인 샘플과 가장 성공적이지 않은 샘플**을 살펴 보고 그 안의 패턴을 찾는 것입니다. 이런 패턴을 사용해 새로운 특성을 고안하거나 기존 특성을 반복할 수 있습니다.\n",
     "\n",
-    "First, we load the data."
+    "먼저 데이터를 로드합니다."
    ]
   },
   {
@@ -45,7 +45,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "Then, we add features and split the dataset."
+    "그다음 특성을 추가하고 데이터셋을 분할합니다."
    ]
   },
   {
@@ -62,7 +62,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "We load the trained model, and vectorize the features."
+    "훈련된 모델을 로드하고 특성을 벡터화합니다."
    ]
   },
   {
@@ -122,13 +122,13 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "Now, we'll use the top k method to look at:\n",
+    "이제 탑-k 방법을 사용해 다음을 조사합니다:\n",
     "\n",
-    "- The k best performing examples for each class (high and low scores)\n",
-    "- The k worst performing examples for each class\n",
-    "- The k most unsure examples, where our models prediction probability is close to .5\n",
+    "- 각 클래스에서 (높은 점수와 낮은 점수를 내는) k 개의 최상의 샘플\n",
+    "- 각 클래스에서 k 개의 최악의 샘플\n",
+    "- 모델 예측 확률이 0.5에 가까운 가장 불확실한 k 개 샘플\n",
     "\n",
-    "To read more about how plotting these particular examples can help with model iteration, please refer to Chapter 5 of the book."
+    "이런 특정 샘플을 출력하는 것이 모델 반복에 어떻게 도움이 되는지 알려면 이 책의 5장을 참고하세요."
    ]
   },
   {
@@ -163,7 +163,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "Most confident correct positive predictions"
+    "가장 확신 있게 맞힌 양성 예측"
    ]
   },
   {
@@ -276,7 +276,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "Most confident correct negative predictions"
+    "가장 확신 있게 맞힌 음성 예측"
    ]
   },
   {
@@ -389,9 +389,9 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "It seems most of the correct negative predictions have **short length**. This result reinforces the feature importance analysis which showed question length as one of the most important features.\n",
+    "올바르게 예측한 음성 예측의 대부분은 **길이가 짧습니다**. 이 결과는 가장 중요한 특성 중의 하나가 질문의 길이라는 특성 중요도 분석의 결과를 뒷받침합니다.\n",
     "\n",
-    "Let's look at the most confident incorrect negative predictions"
+    "가장 확실하게 틀린 양성 예측을 살펴 보죠."
    ]
   },
   {
@@ -504,9 +504,9 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "On the flipside, we find an overrepresentation of short questions with high scores in the examples our model got wrong.\n",
+    "반대로 모델이 틀린 샘플에는 높은 점수를 받은 짧은 질문이 지나치게 많이 나타납니다.\n",
     "\n",
-    "Next, let's look at the most confident incorrect positive predictions"
+    "그다음 가장 확실하게 틀린 음성 예측을 살펴 보겠습니다."
    ]
   },
   {
@@ -619,7 +619,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "And finally, the most \"unsure\" questions, the ones where a model's probability is closest to equal for all classes (`.5` in our case since we have two classes)."
+    "마지막으로 모델의 확률이 모든 클래스에 동일한 가장 불확실한 질문입니다(두 개의 클래스라면 확률이 `0.5`에 가까운 샘플)."
    ]
   },
   {
@@ -731,7 +731,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "To find new candidate features, I recommend combining the top-k method with feature importance and vectorization."
+    "새로운 후보 특성을 찾기 위해 탑-k 방법과 특성 중요도, 벡터화 방법을 함께 사용하는 것을 추천합니다."
    ]
   }
  ],
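For readers of this diff who want to see what the translated cells describe, here is a minimal sketch of a top-k selection. It is not the notebook's own implementation: the function name `get_top_k_sketch`, the column names `proba` and `true_label`, and the dictionary keys are all hypothetical, and it assumes a binary classifier whose positive-class probability is stored in a pandas DataFrame column.

```python
import pandas as pd


def get_top_k_sketch(df, proba_col="proba", label_col="true_label", k=2, threshold=0.5):
    """Pick the k most confident correct and incorrect rows per class, plus the k most unsure rows.

    Hypothetical helper: column names and return keys are assumptions for illustration.
    """
    predicted = df[proba_col] > threshold   # model says "positive"
    actual = df[label_col].astype(bool)     # ground truth is "positive"

    correct_pos = df[predicted & actual]
    correct_neg = df[~predicted & ~actual]
    false_pos = df[predicted & ~actual]     # predicted positive, actually negative
    false_neg = df[~predicted & actual]     # predicted negative, actually positive

    # Distance from the decision threshold: the smallest values are the most unsure rows.
    distance = (df[proba_col] - threshold).abs().sort_values()

    return {
        "confident_correct_positive": correct_pos.nlargest(k, proba_col),
        "confident_correct_negative": correct_neg.nsmallest(k, proba_col),
        "confident_false_positive": false_pos.nlargest(k, proba_col),
        "confident_false_negative": false_neg.nsmallest(k, proba_col),
        "unsure": df.loc[distance.index[:k]],
    }
```

Each returned value is a DataFrame slice, so it can be displayed in a notebook cell and scanned for shared patterns such as question length.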