Data Engineering GCP


Key job skills: designing, building, and running data processing systems, and operationalizing machine learning models.

Big Data & ML Fundamentals

Big Data Challenges

  1. Migrating existing data workloads
  2. Analyzing large datasets at scale
  3. Building streaming data pipelines (so that the business can make data-driven decisions more quickly)
  4. Applying machine learning to your data (not just reacting to data, but also predicting)

Examples

  1. Compute Power: Creating a VM on Compute Engine
  2. Storage: Elastic Storage with Google Cloud Storage
  3. Networking: Google’s data center network speed enables the separation of compute and storage
  4. Security: Cloud IAM
  5. Big data and ML

Choosing the right approach

Compute:

  • Compute Engine: individual machines running native code, Infrastructure as a Service (IaaS)
  • Google Kubernetes Engine: clusters of machines running containers
  • App Engine: Platform as a Service (PaaS)
  • Cloud Functions: a completely serverless execution environment, Function as a Service (FaaS)

Storage:

  • Cloud Bigtable
  • Cloud Storage
  • Cloud SQL
  • Cloud Spanner
  • Cloud Datastore

Modernizing Data Lakes and Data Warehouses

Building Batch Data Pipelines

Batch data pipelines typically follow one of the Extract-Load (EL), Extract-Load-Transform (ELT), or Extract-Transform-Load (ETL) paradigms.
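
The difference between the three paradigms is mainly where the transform step runs relative to the load. A minimal Python sketch, assuming a hypothetical in-memory source (`raw_rows`) and a plain dictionary standing in for the warehouse; all names are illustrative and not part of any GCP API:

```python
# Hypothetical source data; amounts arrive as strings, as raw exports often do.
raw_rows = [
    {"user": "a", "amount": "10.5"},
    {"user": "b", "amount": "3"},
    {"user": "a", "amount": "1.5"},
]


def extract():
    """Read rows from the (hypothetical) source system."""
    return [dict(row) for row in raw_rows]


def transform(rows):
    """Parse amounts and total them per user."""
    totals = {}
    for row in rows:
        totals[row["user"]] = totals.get(row["user"], 0.0) + float(row["amount"])
    return totals


def el():
    """Extract-Load: store the data as-is, no transformation."""
    return {"raw": extract()}


def etl():
    """Extract-Transform-Load: transform before loading; only the result is stored."""
    return {"totals": transform(extract())}


def elt():
    """Extract-Load-Transform: load raw data first, then transform inside the warehouse."""
    warehouse = {"raw": extract()}
    warehouse["totals"] = transform(warehouse["raw"])
    return warehouse
```

In this sketch ETL and ELT produce the same totals; ELT additionally keeps the raw rows in the warehouse, so they can be re-transformed later without going back to the source.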

Building Resilient Streaming Analytics Systems

Processing streaming data is becoming increasingly popular as streaming enables businesses to get real-time metrics on business operations.
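
One common way to turn an event stream into near-real-time metrics is to count events per fixed (tumbling) time window. The sketch below illustrates the idea on a small finite list; the function name, event shape, and window size are assumptions made for illustration, and a real streaming engine would also need to handle unbounded input and late-arriving data:

```python
from collections import defaultdict


def tumbling_window_counts(events, window_seconds):
    """Count events per (window_start, key) for fixed, non-overlapping windows.

    events: iterable of (timestamp_seconds, key) pairs (hypothetical shape).
    """
    counts = defaultdict(int)
    for timestamp, key in events:
        # Align each timestamp to the start of its window.
        window_start = (timestamp // window_seconds) * window_seconds
        counts[(window_start, key)] += 1
    return dict(counts)


# Illustrative events: (seconds since start, event type)
events = [(0, "order"), (12, "order"), (61, "order"), (75, "refund"), (119, "order")]
metrics = tumbling_window_counts(events, 60)
```

Here `metrics` maps each one-minute window and event type to a count, which is the kind of per-interval business metric a streaming pipeline would emit continuously.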

Smart Analytics, Machine Learning, and AI

Incorporating machine learning into data pipelines increases the ability of businesses to extract insights from their data.
