Pass the Google Cloud Professional Data Engineer exam

The Prerequisite

Google Cloud certifications require the right balance of theory and practical skills. For the theory, the Google Cloud documentation is a great resource; to build practical skills, you will have to practice frequently and extensively.

The Exam Topics

This certification tests your ability to design big data solutions on GCP.

The exam expects you to be familiar with the big data products (storage, processing, display) and with their open source alternatives. This matters because some questions ask you to name the GCP equivalent of an open source big data product.

Key topic

BigQuery is the product you must understand most clearly. If you understand BigQuery, you can answer 40% of the exam. Focus on:

  • Writing, saving, and sharing queries, and moving data into and out of BigQuery.

  • The BigQuery Data Transfer Service and the use of BigQuery for geospatial data.

  • BigQuery ML
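To make the BigQuery ML item concrete, here is a minimal sketch of what training a model with plain SQL looks like. The dataset, table, and column names (`mydataset.trips`, `fare`, and so on) are hypothetical, and the Python helper is only an illustration that builds the statement text; it does not connect to BigQuery.

```python
def build_create_model_sql(model, source_table, label, features):
    """Return a BigQuery ML CREATE MODEL statement as a string.

    Every identifier passed in is an illustrative placeholder;
    nothing is sent to BigQuery.
    """
    columns = ",\n  ".join([label] + list(features))
    return (
        f"CREATE OR REPLACE MODEL `{model}`\n"
        f"OPTIONS (model_type = 'linear_reg', input_label_cols = ['{label}']) AS\n"
        f"SELECT\n  {columns}\n"
        f"FROM `{source_table}`"
    )


# Hypothetical names, chosen only for this example.
sql = build_create_model_sql(
    model="mydataset.fare_model",
    source_table="mydataset.trips",
    label="fare",
    features=["distance_km", "duration_min"],
)
print(sql)
```

The point to remember for the exam is that the model is created and trained by a query, so the training data never has to leave BigQuery.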

Overview of other topics

  • A clear understanding of the Hadoop ecosystem: most of its tools have a GCP counterpart, such as Dataproc for Hadoop and Spark clusters, Dataflow for Apache Beam, and Cloud Composer for Airflow.

  • Understand how the other GCP components fit together with the GCP big data products.

  • Data Studio: viewing BigQuery data in Data Studio, along with how its caching works.

  • Dataflow: the input sources it can read from and the sinks for Dataflow-processed data.

  • GCS: you must understand the different storage classes, moving data from one class to another, the object lifecycle, the ways to get data into GCS, which components can be used with GCS, and how GCS is used with a Dataproc cluster.

  • The use of Dataprep and Datalab in GCP, and the target users of these products.

  • Machine learning: you are not expected to be a master of machine learning, but you must understand concepts such as the bias-variance trade-off, overfitting and underfitting, training, linear regression, classification, and gradient descent.

  • Machine learning models, and what GCP MLE (Machine Learning Engine) is used for.
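Two of the machine learning concepts listed above, linear regression and gradient descent, can be seen working together in a few lines. This is a minimal sketch in plain Python with made-up toy data; the function name, learning rate, and epoch count are assumptions chosen for illustration, not anything the exam prescribes.

```python
def fit_linear_regression(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit y = w * x + b by batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Prediction errors for the current parameters.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Gradients of MSE = (1/n) * sum(error^2) with respect to w and b.
        grad_w = (2.0 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_b = (2.0 / n) * sum(errors)
        # Step against the gradient to reduce the error.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Toy data generated from y = 2x + 1 (made up for illustration).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_linear_regression(xs, ys)
print(round(w, 3), round(b, 3))
```

On this data the fit should recover values close to w = 2 and b = 1. A learning rate that is too large makes the updates diverge, and one that is too small makes training slow, which is the kind of trade-off the exam questions tend to probe.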

Tips

  • Make sure you understand the compute resources on GCP

  • You must have a great understanding of Google Cloud Storage

  • Stackdriver monitoring is a must. You can expect 3–4 easy-to-answer questions on Stackdriver

  • You should clearly understand the IAM roles

  • Make sure you understand GCP architecture and can design one given a problem statement.

  • You should understand how resources are organized in projects and how the project, folder, and organization structure works in GCP
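The last two tips, IAM roles and the resource hierarchy, are linked: access granted on an organization or folder is inherited by everything beneath it. The sketch below models that idea in plain Python. It is a simplified illustration, not the IAM API; the `Node` class, the resource names, and the members are hypothetical, and it ignores features such as deny policies and IAM conditions.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class Node:
    """One node in the resource hierarchy: organization, folder, or project."""
    name: str
    parent: Optional["Node"] = None
    # Role -> members granted that role directly on this node.
    bindings: Dict[str, Set[str]] = field(default_factory=dict)


def effective_bindings(node):
    """Union the bindings set on a node with those inherited from its ancestors."""
    merged: Dict[str, Set[str]] = {}
    current = node
    while current is not None:
        for role, members in current.bindings.items():
            merged.setdefault(role, set()).update(members)
        current = current.parent
    return merged


# Hypothetical hierarchy: organization -> folder -> project.
org = Node("organizations/example",
           bindings={"roles/viewer": {"group:auditors@example.com"}})
folder = Node("folders/analytics", parent=org)
project = Node("projects/bq-sandbox", parent=folder,
               bindings={"roles/bigquery.dataEditor": {"user:dana@example.com"}})

policy = effective_bindings(project)
print(sorted(policy))
```

Here the auditors group can view the project even though the role was granted only at the organization level, while the editor role granted on the project does not flow upward to the folder.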

Final words

If you thoroughly understand the ‘How-to’ and ‘Concepts’ sections of the Google Cloud documentation, you have 70% of what it takes to clear the GCP certification; the remaining 30% is your practice, experience, and state of mind during the exam.