Finished 1 chapter

vamosraghava · Mar 4, 2023 · a61d763
1 parent 2932260
commit a61d763

Showing 5 changed files with 24 additions and 25 deletions.
diff --git a/content/_index.md b/content/_index.md
@@ -1,6 +1,7 @@
 ---
 title: Technical books summary
 type: docs
+bookToc: false
 ---
 
 # Introduction

diff --git a/content/docs/about me.md b/content/docs/about me.md
@@ -1,6 +1,7 @@
 ---
 weight: 2
 title: "About me"
+bookToc: false
 ---
 
 # About me

diff --git a/...t/docs/books/fundamentals of data engineering/101_data_engineering_described.md b/...t/docs/books/fundamentals of data engineering/101_data_engineering_described.md
@@ -5,13 +5,14 @@ title: "01: Data Engineering Described"
 
 # Data Engineering Described
 
-Data Engineer's goals:
+## Data Engineer's goals
+---
+
 - **`produce optimum ROI`** and reduce costs (financial and opportunity)
 - reduce risk (security, data quality)
 - maximize data value and utility
 - must constantly optimize along the axes of cost, agility, **`scalability`**, simplicity, reuse, and **`interoperability`**.
 
-<br>
 
 ## History of data engineering
 ---
@@ -29,22 +30,22 @@ Data Engineer's goals:
 - **`Apache Spark`** rose because the proliferation of tools on the market drove the creation of one unified tool. It became very popular from 2015 onward.
 - Simplification. Despite the power and sophistication of open source big data tools, managing them was a lot of work and required constant attention. Whereas data engineers historically tended to the low-level details of monolithic frameworks such as Hadoop, Spark, or Informatica, the trend is moving toward **`decentralized, modularized, managed, and highly abstracted tools`**.
 
-<br>
-
 ## Data team
 ---
 
-Upstream stakeholders:
+{{< columns >}}
+**Upstream stakeholders**
 - Data architects
 - Software engineers
 - **`DevOps engineers`**
 
-Downstream stakeholders:
+<--->
+
+**Downstream stakeholders**
 - Data scientists
 - Data analysts
 - Machine learning engineers and AI researchers
-
-<br>
+{{< /columns >}}
 
 ## Data maturity
 ---

diff --git a/...t/docs/books/fundamentals of data engineering/102_data_engineering_lifecycle.md b/...t/docs/books/fundamentals of data engineering/102_data_engineering_lifecycle.md
@@ -5,41 +5,43 @@ title: "02: The Data Engineering Lifecycle"
 
 # The Data Engineering Lifecycle
 
-Lifecycle:
+## Components of the lifecycle
+---
+
+{{< columns >}}
+**Lifecycle**
 - Generation
 - Storage
 - Ingestion
 - Transformation
 - Serving data
 
-Undercurrents of the data engineering lifecycle:
+<--->
+
+**Undercurrents of the lifecycle**
 - Security
 - Data management
 - DataOps
 - Data architecture
 - Orchestration
 - Software engineering
-
-<br>
+{{< /columns >}}
 
 ## Generation
 ---
 
-Key engineering considerations for generation:
+### Key engineering considerations for generation
 - Is it application/IoT/database?
 - At what rate is data generated?
 - Quality of the data.
 - Schema of ingested data.
 - How frequently should data be pulled from the source system?
 - Will reading from a data source impact its performance?
 
-
-<br>
-
 ## Storage
 ---
 
-Key engineering considerations for storage:
+### Key engineering considerations for storage
 - Data volumes, frequency of ingestion, files format.
 - Scaling (total available storage, read operation rate, write volume, etc.).
 - Capturing metadata (schema evolution, data flows, data lineage)
@@ -48,14 +50,11 @@ Key engineering considerations for storage:
 - How are you tracking master data, golden records data quality, and data lineage for data governance?
 - How are you handling regulatory compliance and data sovereignty?
 
-Temperatures of data:
+### Temperatures of data
 - hot data
 - lukewarm data
 - cold data
 
-
-<br>
-
 ## Ingestion
 ---
 
@@ -77,7 +76,6 @@ Push model: a source system writes data out to a target, whether a database, obj
 
 Pull model: data is retrieved from the source system. Example is CDC with logs.
 
-<br>
 
 ## Transformation
 ---
@@ -89,14 +87,12 @@ Examples of transformations:
 - featurizing data for ML processes,
 - enriching the data.
 
-<br>
 
 ## Other terms
 ---
 
 Reverse ETL: takes processed data from the output side of the data engineering lifecycle and feeds it back into source systems. It allows us to take analytics, scored models, etc., and feed these back into production systems or SaaS platforms. Some engineers view it as an anti-pattern.
 
-<br>
 
 ## Security
 ---
@@ -112,7 +108,6 @@ Security good practices:
 
 - Knowledge of user and identity access management (IAM) roles, policies, groups, network security, password policies, and encryption are good places to start.
 
-<br>
 
 ## Data Management
 ---

diff --git a/content/docs/books/fundamentals of data engineering/_index.md b/content/docs/books/fundamentals of data engineering/_index.md
@@ -1,6 +1,7 @@
 ---
 bookCollapseSection: true
 weight: 2
+bookToc: false
 ---
 
 # Fundamentals of Data Engineering