# Vessel Count Statistics by Continent

When working with data, it's often helpful to **zoom out** and look at patterns from a higher level. Instead of looking at every single port individually, we can group them into broader regions — like continents — to get a **big-picture view**.

By grouping ports by their continent, we can **summarize the data** in a way that's easier to understand and compare. Grouping turns hundreds of per-port numbers into a handful of per-continent statistics.

### **🔧 Your next tasks:**

* [ ] **Group** the data by `continent` and calculate the **sum, mean, median, standard deviation, min, and max** of `vessel_count_total`.
* [ ] Present the results in a table with one row per continent and one column per statistic.
* [ ] Discuss the implications of **high or low average vessel counts**.
* [ ] **Interpretation Task**:
  * 🤔 Which parts of the world see the most vessel traffic overall?
  * 🤔 Are some continents more active on average than others?
  * 🤔 Are certain regions dominated by just a few very large ports?
  * 🤔 What could explain the differences between mean and median values?
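
To get a feel for the mean-versus-median question, here is a minimal sketch using made-up numbers (these five counts are hypothetical and do not come from the dataset). It shows how one very large port can pull the mean far above the median:

```python
import pandas as pd

# Hypothetical vessel counts for five ports on one continent:
# four small ports and one very large one.
counts = pd.Series([40, 50, 60, 70, 2000])

mean_count = counts.mean()      # pulled upward by the single large port
median_count = counts.median()  # the "middle" port, unaffected by the outlier

# Share of all traffic handled by the largest port
top_share = counts.max() / counts.sum()

print(f"mean:   {mean_count}")    # mean:   444.0
print(f"median: {median_count}")  # median: 60.0
print(f"largest port share: {top_share:.0%}")
```

When the mean sits well above the median like this, it is usually a sign that a few large values dominate the group, which is worth keeping in mind when you compare continents.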

> <img src="https://2669499530-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FnYNN3nXNuXMJpHACcH73%2Fuploads%2Ft1yAGmUambZeYVQvPSeu%2Fp.png?alt=media&#x26;token=01872756-9ca8-44f9-9ec1-1ff5f70ce561" alt="" data-size="line">\
> To conquer this task, try using:
>
> * `groupby()` to group ports based on a specific variable (like continent)
> * `.agg()` to calculate multiple statistics — such as sum, mean, or median — for each group
>
> Think of `groupby` as a way to **split your data into mini-tables**, one for each continent. Then `agg()` helps you **summarize** each one with the statistics you care about.
>
> 📌 Try chaining them together for a clean, one-liner summary!
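
As a rough illustration of that chaining pattern, here is a small sketch on toy data. The DataFrame name `ports` and all the values in it are assumptions made up for this example; only the column names `continent` and `vessel_count_total` come from the task. Your real dataset will need to be loaded into its own DataFrame first.

```python
import pandas as pd

# Toy data for illustration only (values are invented, not from the dataset)
ports = pd.DataFrame({
    "continent": ["Asia", "Asia", "Asia", "Europe", "Europe", "Africa"],
    "vessel_count_total": [1200, 300, 150, 800, 400, 90],
})

# Split by continent, pick the column to summarize,
# then compute several statistics per group in one chained expression.
summary = (
    ports.groupby("continent")["vessel_count_total"]
    .agg(["sum", "mean", "median", "std", "min", "max"])
)

print(summary)
```

The result has one row per continent and one column per statistic. Note that pandas uses the name `std` for standard deviation, and that a group with a single port (like Africa in this toy data) gets `NaN` for `std`, because the sample standard deviation needs at least two values.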