Big Data Mini Course: AMP Camp 4 hands-on exercises

UC Berkeley AMPLab

Hands-on Big Data Mini Course

Check out our online Big Data Mini Course! The full course will take 2-4 hours to complete, and in the process you will:

  • Start a ~5 node cluster on EC2 running Hadoop and the Berkeley Data Analytics Stack (BDAS).
  • Interactively explore a real Wikipedia dataset at the Spark and Shark shells.
  • Use Spark Streaming and the Twitter API to generate a real-time list of trending Twitter topics.
  • Write a data clustering algorithm, run it on a real Wikipedia dataset, and observe interesting correlations.
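
To give a feel for the clustering exercise in the last item, here is a minimal sketch of k-means (Lloyd's algorithm) in plain Python. It is an illustration only, not the course's own Spark code: the function names (`squared_distance`, `mean`, `assign`, `kmeans`) are hypothetical, and the centers are seeded from the first k points as a simplifying assumption so that the result is deterministic.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Coordinate-wise mean of a non-empty list of tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def assign(points, centers):
    """Group each point with its nearest center (ties go to the lower index)."""
    clusters = [[] for _ in centers]
    for p in points:
        nearest = min(range(len(centers)),
                      key=lambda i: squared_distance(p, centers[i]))
        clusters[nearest].append(p)
    return clusters


def kmeans(points, k, iterations=20):
    """Return (centers, clusters) after at most `iterations` Lloyd steps."""
    # Assumption for this sketch: seed the centers with the first k points.
    centers = list(points[:k])
    for _ in range(iterations):
        clusters = assign(points, centers)
        # Move each center to the mean of its cluster; keep it if the cluster is empty.
        new_centers = [mean(c) if c else centers[i]
                       for i, c in enumerate(clusters)]
        if new_centers == centers:  # assignments have stopped changing
            break
        centers = new_centers
    return centers, assign(points, centers)
```

Calling `kmeans(points, 2)` on a list of 2-D tuples returns the two centers and the points grouped around each. In the course itself the same assign-then-update loop would run over a distributed dataset rather than a local list.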


Welcome to the AMP Camp 4 hands-on exercises! These exercises are extended and enhanced from those given at previous AMP Camp Big Data Bootcamps. They were written by volunteer graduate students and postdocs in the UC Berkeley AMPLab. Many of those same graduate students are present today as teaching assistants. The exercises we cover today will have you working directly with the Spark-specific components of the AMPLab’s open-source software stack, called the Berkeley Data Analytics Stack (BDAS).

Course Prerequisites

A few of the components support multiple languages. In some sections of this training material, you can choose which language you want to use as you follow along and gain experience with the tools. The following table shows which languages this mini course supports for each section. You are welcome to mix and match languages depending on your preferences and interests.

  • Free schedule
Course properties:
  • Language: English


