Commit

Fix references
SamLau95 committed Jul 19, 2023
1 parent f6c22b5 commit 31c31a2
Showing 5 changed files with 7 additions and 6 deletions.
2 changes: 1 addition & 1 deletion content/ch/15/linear_simple_fit.ipynb
@@ -65,7 +65,7 @@
"With the simple linear model, the mean squared error is a function of two model parameters, the intercept and slope. This means that if we use calculus to find the minimizing parameter values, we need to find the partial derivatives of the MSE with respect to $\\theta_0$ and $\\theta_1$. We can also find these minimizing values through other techniques:\n",
"\n",
"*Gradient descent*\n",
": We can use numerical optimization techniques, such as gradient descent, when the loss function is more complex and it's faster to find an approximate solution that's pretty accurate (see {numref}`Chapter %s <ch:optimization>`).\n",
": We can use numerical optimization techniques, such as gradient descent, when the loss function is more complex and it's faster to find an approximate solution that's pretty accurate (see {numref}`Chapter %s <ch:gd>`).\n",
"\n",
"*Quadratic formula*\n",
": Since the average loss is a quadratic function of $ \\theta_0$ and $ \\theta_1 $, we can use the quadratic formula (along with some algebra) to solve for the minimizing parameter values. \n",
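For reference, the calculus route mentioned in the changed cell works out as follows. This is the standard derivation for simple linear regression, sketched here for convenience rather than quoted from the notebook, so the notation may differ from the book's. With $\mathrm{MSE}(\theta_0, \theta_1) = \frac{1}{n}\sum_{i=1}^{n}\bigl(y_i - \theta_0 - \theta_1 x_i\bigr)^2$, the partial derivatives are

```latex
\frac{\partial\,\mathrm{MSE}}{\partial \theta_0}
  = -\frac{2}{n}\sum_{i=1}^{n}\bigl(y_i - \theta_0 - \theta_1 x_i\bigr),
\qquad
\frac{\partial\,\mathrm{MSE}}{\partial \theta_1}
  = -\frac{2}{n}\sum_{i=1}^{n}\bigl(y_i - \theta_0 - \theta_1 x_i\bigr)\,x_i,
```

and setting both to zero gives the familiar minimizers $\hat{\theta}_1 = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2}$ and $\hat{\theta}_0 = \bar{y} - \hat{\theta}_1 \bar{x}$.
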
4 changes: 2 additions & 2 deletions content/ch/16/ms_regularization.ipynb
@@ -17526,7 +17526,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Notice that we have specified a maximum number of iterations because the minimization uses numerical optimization (see {numref}`Chapter %s <ch:optimization>`) to solve for the coefficients, and we have placed a cap on the number of iterations to run to reach the specified tolerance for the optimal parameter. We're ready to fit the model on the train set:"
"Notice that we have specified a maximum number of iterations because the minimization uses numerical optimization (see {numref}`Chapter %s <ch:gd>`) to solve for the coefficients, and we have placed a cap on the number of iterations to run to reach the specified tolerance for the optimal parameter. We're ready to fit the model on the train set:"
]
},
{
@@ -19849,7 +19849,7 @@
-0.005702760106252486,
-0.0035912395388374746,
-0.0016805224625823202,
-5.589089566798653e-05,
+0.00005589089566798653,
0.001640220534741039,
0.0030911753339077875,
0.004424584226366144,
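A minimal sketch of the kind of iteration cap described in the cell above, assuming scikit-learn's `Lasso`; the estimator, data, and settings here are hypothetical stand-ins for the book's actual regularized model and train set:

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.model_selection import train_test_split

# Hypothetical data standing in for the book's feature matrix and outcome.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 10))
y = X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.1, size=200)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# max_iter caps the number of iterations the numerical optimizer may run;
# tol is the convergence tolerance it tries to reach within that cap.
model = Lasso(alpha=0.1, max_iter=10_000, tol=1e-4)
model.fit(X_train, y_train)
```
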
2 changes: 1 addition & 1 deletion content/ch/18/donkey_model.ipynb
@@ -2200,7 +2200,7 @@
"metadata": {},
"source": [
"Now we want to find the $\\theta_0$ and $\\theta_1$ that minimize the average anesthetic loss over the data. To do this, we could use calculus as we did in\n",
"{numref}`Chapter %s <ch:linear>`, but here we'll instead use the `minimize` method from the `scipy` package, which performs a numerical optimization (see {numref}`Chapter %s <ch:optimization>`):"
"{numref}`Chapter %s <ch:linear>`, but here we'll instead use the `minimize` method from the `scipy` package, which performs a numerical optimization (see {numref}`Chapter %s <ch:gd>`):"
]
},
{
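A minimal sketch of the `scipy` call described in the cell above, using a toy squared loss and made-up numbers in place of the book's anesthetic loss and donkey data:

```python
import numpy as np
from scipy.optimize import minimize

# Toy data; the book's analysis uses the donkey dataset and an asymmetric loss.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.2, 3.9, 6.1, 8.0, 9.8])

def avg_loss(theta):
    """Average loss over the data for intercept theta[0] and slope theta[1]."""
    theta0, theta1 = theta
    return np.mean((y - (theta0 + theta1 * x)) ** 2)

# Numerical optimization starting from theta = [0, 0].
result = minimize(avg_loss, x0=np.zeros(2))
theta0_hat, theta1_hat = result.x
```
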
2 changes: 1 addition & 1 deletion content/ch/19/class_loss.ipynb
@@ -1473,7 +1473,7 @@
"user_expressions": []
},
"source": [
"Unlike with squared loss, there is no closed form solution to this loss function. Instead, we use iterative methods like gradient descent (see {numref}`Chapter %s <ch:optimization>`) to minimize the average loss. This is also one of the reasons we don't use squared error loss for logistic models---the average squared error is nonconvex, which makes it hard to optimize. The notion of convexity is covered in greater detail in {numref}`Chapter %s <ch:gd>`, and {numref}`Figure %s <gd-convex>` gives a picture for intuition. "
"Unlike with squared loss, there is no closed form solution to this loss function. Instead, we use iterative methods like gradient descent (see {numref}`Chapter %s <ch:gd>`) to minimize the average loss. This is also one of the reasons we don't use squared error loss for logistic models---the average squared error is nonconvex, which makes it hard to optimize. The notion of convexity is covered in greater detail in {numref}`Chapter %s <ch:gd>`, and {numref}`Figure %s <gd-convex>` gives a picture for intuition. "
]
},
{
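A minimal sketch of the iterative approach the cell above refers to: plain gradient descent on the average cross-entropy loss for a one-parameter logistic model, with made-up data in place of the book's dataset:

```python
import numpy as np

# Toy one-feature classification data (hypothetical).
x = np.array([-2.0, -1.0, -0.5, 1.0, 2.0])
y = np.array([0, 0, 1, 0, 1])

def sigmoid(t):
    return 1 / (1 + np.exp(-t))

def avg_log_loss(theta):
    p = sigmoid(theta * x)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def gradient(theta):
    # Derivative of the average cross-entropy loss for this one-parameter model.
    return np.mean((sigmoid(theta * x) - y) * x)

# No closed-form solution exists, so take repeated gradient steps instead.
theta = 0.0
for _ in range(500):
    theta -= 0.5 * gradient(theta)
```
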
3 changes: 2 additions & 1 deletion content/data_sources.md
@@ -1,4 +1,5 @@
+(ax:data_source)=

# Data Sources

All of the data analyzed in this book are available on the book's website, [LearningDS.org](https://learningds.org/) and on the [GitHub repository](https://github.com/DS-100/textbook/) for the book. These datasets are from open repositories and from individuals. We acknowledge them all here, and include, as appropriate, the file name for the data stored in our repository, a link to the original source, a related publication, and the author(s)/owner(s).
@@ -15,7 +16,7 @@ To begin, we provide the sources for the four case studies in the book. Our anal
Our analysis is based on [How to Weigh a Donkey in the Kenyan Countryside](https://doi.org/10.1111/j.1740-9713.2014.00768.x) by Milner and Rougier.

- `fake_news.csv`: The hand-classified fake news data are from
-[Fakenewsnet: A Data Repository with News Content, Social Context, and Spatiotemporal Information for Studying Fake News on Social Media]() by Shu, Mahudeswaran, Wang, Lee, and Liu.
+[Fakenewsnet: A Data Repository with News Content, Social Context, and Spatiotemporal Information for Studying Fake News on Social Media](https://arxiv.org/abs/1809.01286) by Shu, Mahudeswaran, Wang, Lee, and Liu.

In addition to these case studies, another 20-plus datasets were used as examples throughout the book. We acknowledge the people and organizations that made these datasets available in the order in which they appeared in the book.

