Fix hyperlinks
rickiepark committed Jul 31, 2021
1 parent 30881c3 commit 6b557e8
Showing 1 changed file with 23 additions and 24 deletions.
47 changes: 23 additions & 24 deletions notebooks/second_model.ipynb
@@ -4,9 +4,9 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Training a second model\n",
"# 두 번째 모델 훈련하기\n",
"\n",
"In this notebook, I train a second model using features in order to address the first model's shortcomings."
"이 노트북에서 첫 번째 모델의 단점을 극복하기 위한 특성을 사용해 두 번째 모델을 훈련합니다."
]
},
{
@@ -81,7 +81,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's add new features we've identified as potential candidates in our new model."
"새로운 모델에 도움이 될만한 후보 특성을 추가합니다."
]
},
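The feature-engineering cell that follows is collapsed in this diff. As a rough, hedged illustration of the kind of hand-crafted text features being added, a helper might look like the sketch below; the column names (`body_text`, `text_len`, `num_questions`, `contains_code`) are assumptions made for this example, not the actual `ml_editor` API.

```python
import pandas as pd

def add_simple_text_features(df):
    # Illustrative sketch only: derive a few simple features from the raw question text.
    # Assumes the text lives in a `body_text` column (hypothetical name).
    df = df.copy()
    df["text_len"] = df["body_text"].str.len()                               # question length
    df["num_questions"] = df["body_text"].str.count(r"\?")                   # number of question marks
    df["contains_code"] = df["body_text"].str.contains("```", regex=False)   # code block present?
    return df

# Toy usage example
toy_df = pd.DataFrame({"body_text": ["How do I ask a good question?", "My code ```print(1)``` fails"]})
print(add_simple_text_features(toy_df))
```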
{
@@ -107,7 +107,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Check out the ml_editor source code to see more about what these functions are doing!"
"`ml_editor` 소스 코드를 확인하여 이 함수들의 기능을 확인해 보세요!"
]
},
{
@@ -157,9 +157,9 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Model\n",
"# 모델\n",
"\n",
"Now that we've added new features, let's train a new model. We'll use the same model as before, only the features are different."
"이제 새로운 특성을 추가했으니 새 모델을 훈련해 보죠. 특성만 다르고 이전과 동일한 모델을 사용합니다."
]
},
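The `get_split_by_author` helper used in the next cell keeps every question from a given author on a single side of the split, so performance isn't inflated by author-specific patterns leaking across sets. Below is a minimal sketch of such a split with scikit-learn's `GroupShuffleSplit`, assuming the author id is stored in an `OwnerUserId` column; this is an illustration, not necessarily the exact `ml_editor` implementation.

```python
from sklearn.model_selection import GroupShuffleSplit

def split_by_author(posts, author_col="OwnerUserId", test_size=0.2, random_state=40):
    # One shuffled split in which each author appears in only one of the two sets
    splitter = GroupShuffleSplit(n_splits=1, test_size=test_size, random_state=random_state)
    train_idx, test_idx = next(splitter.split(posts, groups=posts[author_col]))
    return posts.iloc[train_idx], posts.iloc[test_idx]
```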
{
@@ -168,7 +168,7 @@
"metadata": {},
"outputs": [],
"source": [
"# We split again since we have now added all features. \n",
"# 특성을 새로 추가했으므로 데이터셋을 다시 나눕니다.\n",
"train_df, test_df = get_split_by_author(df, test_size=0.2, random_state=40)"
]
},
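The training cell itself sits in the collapsed hunk below, but for the out-of-bag (OOB) training accuracy computed later to work, the classifier must be fit with `oob_score=True`. Here is a self-contained, hedged sketch of that step; the toy data and hyperparameters are illustrative stand-ins, not the repository's actual values.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Stand-in feature matrix and labels, purely for illustration
rng = np.random.RandomState(40)
X_train = rng.rand(200, 10)
y_train = rng.rand(200) > 0.5

# oob_score=True is what makes clf.oob_decision_function_ available after fitting,
# which the metrics cell below uses to estimate training performance without a refit
clf = RandomForestClassifier(n_estimators=100, oob_score=True, random_state=40)
clf.fit(X_train, y_train)
print("OOB accuracy: %.3f" % clf.oob_score_)
```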
@@ -259,28 +259,27 @@
],
"source": [
"def get_metrics(y_test, y_predicted): \n",
" # true positives / (true positives+false positives)\n",
" # 진짜 양성 / (진짜 양성 + 가짜 양성)\n",
" precision = precision_score(y_test, y_predicted, pos_label=True,\n",
" average='binary') \n",
" # true positives / (true positives + false negatives)\n",
" # 진짜 양성 / (진짜 양성 + 가짜 음성)\n",
" recall = recall_score(y_test, y_predicted, pos_label=True,\n",
" average='binary')\n",
" \n",
" # harmonic mean of precision and recall\n",
" # 정밀도와 재현율의 조화 평균\n",
" f1 = f1_score(y_test, y_predicted, pos_label=True, average='binary')\n",
" \n",
" # true positives + true negatives/ total\n",
" # 진짜 양성 + 진짜 음성 / 전체\n",
" accuracy = accuracy_score(y_test, y_predicted)\n",
" return accuracy, precision, recall, f1\n",
"\n",
"\n",
"\n",
"# Training accuracy\n",
"# Thanks to https://datascience.stackexchange.com/questions/13151/randomforestclassifier-oob-scoring-method\n",
"# 훈련 정확도\n",
"# https://datascience.stackexchange.com/questions/13151/randomforestclassifier-oob-scoring-method 참고\n",
"y_train_pred = np.argmax(clf.oob_decision_function_,axis=1)\n",
"\n",
"accuracy, precision, recall, f1 = get_metrics(y_train, y_train_pred)\n",
"print(\"Training accuracy = %.3f, precision = %.3f, recall = %.3f, f1 = %.3f\" % (accuracy, precision, recall, f1))"
"print(\"훈련 정확도 = %.3f, 정밀도 = %.3f, recall = %.3f, f1 = %.3f\" % (accuracy, precision, recall, f1))"
]
},
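The cell that produces `y_predicted` is collapsed in this diff; it is presumably just the standard predict call on the held-out features, roughly along the lines of the sketch below. `X_test` and `y_test` are assumed to have been built from `test_df` in the same way as the training matrices.

```python
# Hedged sketch of the elided prediction step (variable names are assumptions)
y_predicted = clf.predict(X_test)
accuracy, precision, recall, f1 = get_metrics(y_test, y_predicted)
```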
{
@@ -298,14 +297,14 @@
],
"source": [
"accuracy, precision, recall, f1 = get_metrics(y_test, y_predicted)\n",
"print(\"Validation accuracy = %.3f, precision = %.3f, recall = %.3f, f1 = %.3f\" % (accuracy, precision, recall, f1))"
"print(\"검증 정확도 = %.3f, 정밀도 = %.3f, recall = %.3f, f1 = %.3f\" % (accuracy, precision, recall, f1))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Fortunately, this model shows stronger aggregate performance than our previous model! Let's save our new model and vectorizer to disk so we can use them later."
"다행히 이 모델은 이전 모델보다 성능이 더 높습니다! 새로운 모델과 벡터화 객체를 나중에 사용하기 위해 디스크에 저장하겠습니다."
]
},
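The serialization cell is collapsed in this diff; `joblib` is the usual tool for persisting scikit-learn estimators. A minimal sketch follows, assuming the artifacts go to a `models/` folder next to the notebooks and that the vectorizer variable is named `vectorizer` (both assumptions for this example).

```python
from pathlib import Path
import joblib

# Illustrative paths; adjust to wherever the project stores serialized artifacts
model_path = Path("../models/model_2.pkl")
vectorizer_path = Path("../models/vectorizer_2.pkl")

joblib.dump(clf, model_path)
joblib.dump(vectorizer, vectorizer_path)

# They can later be restored with joblib.load, e.g. in the inference code
clf_restored = joblib.load(model_path)
```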
{
@@ -335,9 +334,9 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Validating that features are useful\n",
"## 특성의 유용성 검증하기\n",
"\n",
"Next, we'll use the method described in the feature importance [notebook](https://github.com/hundredblocks/ml-powered-applications/blob/master/notebooks/feature_importance.ipynb) to validate that our new features are being used by the new model."
"그다음 특성 중요도 [노트북](https://github.com/hundredblocks/ml-powered-applications/blob/master/notebooks/feature_importance.ipynb)에서 설명한 방법을 사용해 새로운 특성을 새로운 모델이 사용하는지 확인해 보겠습니다."
]
},
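The `get_feature_importance` helper called two cells below pairs each feature name with the trained forest's importance score and sorts the pairs from most to least important. A plausible sketch of such a helper is shown here; it is an illustration, not necessarily the exact `ml_editor` implementation.

```python
def get_feature_importance(clf, feature_names):
    # Pair each feature name with the random forest's impurity-based importance,
    # sorted in descending order of importance
    importances = clf.feature_importances_
    order = importances.argsort()[::-1]
    return [(feature_names[i], importances[i]) for i in order]
```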
{
@@ -412,27 +411,27 @@
],
"source": [
"k = 20\n",
"print(\"Top %s importances:\\n\" % k)\n",
"print(\"상위 %s개 중요도:\\n\" % k)\n",
"print('\\n'.join([\"%s: %.2g\" % (tup[0], tup[1]) for tup in get_feature_importance(clf, all_feature_names)[:k]]))\n",
"\n",
"print(\"\\nBottom %s importances:\\n\" % k)\n",
"print(\"\\n하위 %s개 중요도:\\n\" % k)\n",
"print('\\n'.join([\"%s: %.2g\" % (tup[0], tup[1]) for tup in get_feature_importance(clf, all_feature_names)[-k:]]))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Our new features are amongst the most predictive! On the flip side, we can see that the word vectors from the TF-IDF vectorization approach don't seem to be particularly helpful. In a following [notebook](https://github.com/hundredblocks/ml-powered-applications/blob/master/notebooks/third_model.ipynb), we will train a third model without these features and see how well it performs."
"새로운 특성이 가장 예측 성능이 좋은 편이군요! 반대로 TF-IDF 벡터화로 얻은 단어 벡터는 특별히 도움이 되는 것 같지 않습니다. 이어지는 [노트북](https://github.com/hundredblocks/ml-powered-applications/blob/master/notebooks/third_model.ipynb)에서 이런 특성을 제외하고 세 번째 모델을 훈련하여 어떤 성능을 내는지 확인해 보습니다."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Comparing predictions to data\n",
"## 예측과 데이터 비교하기\n",
"\n",
"This section uses the evaluation methods described in the Comparing Data To Predictions [notebook](https://github.com/hundredblocks/ml-powered-applications/blob/master/notebooks/comparing_data_to_predictions.ipynb), but on our new model."
"이 섹션은 새로운 모델로 데이터와 예측 비교하기 [노트북](comparing_data_to_predictions.ipynb)에서 설명한 평가 방법을 사용합니다."
]
},
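The evaluation cells that follow are collapsed in this diff. The referenced notebook leans on standard scikit-learn tooling, so the gist is roughly the sketch below; variable names such as `X_test` and `y_predicted` are assumptions carried over from earlier cells.

```python
from sklearn.metrics import confusion_matrix
from sklearn.calibration import calibration_curve

# Confusion matrix of the new model's test-set predictions
print(confusion_matrix(y_test, y_predicted))

# Calibration curve: predicted probability of the positive class vs. observed frequency
y_proba = clf.predict_proba(X_test)[:, 1]
prob_true, prob_pred = calibration_curve(y_test, y_proba, n_bins=10)
print(prob_true, prob_pred)
```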
{
