Skip to content

[Folder 2 - Lesson 7 - notebook-covidspread] fix incorrect indexing#724

Open
wangamber8-collab wants to merge 9 commits intomicrosoft:mainfrom
wangamber8-collab:fix/incorrect-indexing-handling
Open

[Folder 2 - Lesson 7 - notebook-covidspread] fix incorrect indexing#724
wangamber8-collab wants to merge 9 commits intomicrosoft:mainfrom
wangamber8-collab:fix/incorrect-indexing-handling

Conversation

@wangamber8-collab
Copy link
Contributor

@wangamber8-collab wangamber8-collab commented Dec 31, 2025

… entries

Update the slicing of the series from [2:] to [3:] because indexes 0-2 contains non-date metadata (province, latitude, longitude). The label "longitude" is included in the dataframe and appears in the plot. Additionally, drop the column "province/state" column before the function is called. This is because "province/state" contains non-date data that will cause an error when the function is called.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request fixes incorrect data slicing in a COVID-19 spread analysis notebook that was causing non-date metadata to appear in plots. The fix adjusts the slice indexing from [2:] to [3:] to properly exclude Province/State, Latitude, and Longitude columns, and adds Province/State to the list of columns to be dropped.

Changes:

  • Updated data slicing from [2:] to [3:] to exclude three metadata columns instead of two
  • Added 'Province/State' column to the drop list alongside 'Lat' and 'Long' for all three dataframes
  • Removed trailing whitespace in the mkframe function

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 1 out of 2 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@wangamber8-collab
Copy link
Contributor Author

@leestott Hi! I’m a freshman and fairly new to contributing to open-source projects, so I really appreciate the feedback you left earlier. I’ve gone through and addressed all the comments and made the requested changes.

When you have a chance, could you take another look and let me know if this is ready to be merged, or if there’s anything else I should update?

Thanks so much for your time!

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 1 out of 2 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (2)

2-Working-With-Data/07-python/notebook-covidspread.ipynb:1611

  • This note claims the first three elements are non-date metadata including Province/State, but after the earlier groupby('Country/Region').sum() the example output shows only Lat/Long as non-date columns. Update the note to match the actual post-groupby dataframe shape (or adjust preprocessing so Province/State is still present if that’s intended).
    "> **Note** how we use `[3:]` to remove the first three elements of a sequence that contain non-date metadata (the Province/State and geolocation columns). We can also drop those three columns altogether:"

2-Working-With-Data/07-python/notebook-covidspread.ipynb:1622

  • infected.groupby('Country/Region').sum() earlier produces a dataframe that (per the shown infected.head() output) does not include a Province/State column. Dropping ['Lat','Long','Province/State'] will therefore raise a KeyError on Province/State in the current notebook. Either drop Province/State before the groupby step, or add errors='ignore' / conditional logic so this cell is robust across dataset versions.
    "infected.drop(columns=['Lat','Long','Province/State'],inplace=True)\n",
    "recovered.drop(columns=['Lat','Long','Province/State'],inplace=True)\n",
    "deaths.drop(columns=['Lat','Long','Province/State'],inplace=True)"

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +1602 to 1604
"infected.loc['US'][3:].plot()\n",
"recovered.loc['US'][3:].plot()\n",
"plt.show()"
Copy link

Copilot AI Feb 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

infected/recovered/deaths are grouped by Country/Region earlier; the displayed infected.head() after that step shows columns starting with Lat, Long, then date columns (no Province/State). With that shape, slicing with [3:] will skip the first date column (e.g., 1/22/20). Consider keeping [2:] (to drop only Lat/Long) or explicitly dropping the non-date columns before plotting and then plotting the remaining date-indexed series.

This issue also appears in the following locations of the same file:

  • line 1611
  • line 1620

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Screenshot 2026-02-17 123336 Even after grouping by Country/Region, the Province/State column is still in the dataframe (see attachment), so I think we should keep [3:] in order to remove this column. Screenshot 2026-02-17 123350 Additionally, keeping [2:] results in an error in the plot, causing "long" to appear on the x-axis (see attachment).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant

Comments