Skip to content

Commit 4dbe60d

Browse files
committed
Comprehensive improvements to Polars lecture
- Fix execution errors and deprecation warnings - Add pyarrow dependency for Polars to pandas conversion - Fix lazy evaluation method: replace describe_optimized_plan() with explain() - Update deprecated join syntax: how='outer' to how='full' - Fix yfinance integration with coalesce=True for different trading calendars - Apply QuantEcon style guide compliance: - Convert headings from title case to sentence case - Split multi-sentence paragraphs per qe-writing-002 rule - Fix proper noun capitalization (polars -> Polars) - Add lazy evaluation section with query optimization examples - Expand exercises with comprehensive stock analysis examples - Enhance plotting with markers, reference lines, and debugging info - Fix replace() deprecation warning: use replace_strict() - Add data validation and debugging output to exercises - Improve visualization with better styling and error handling All code cells now execute successfully with Polars 1.33.1
1 parent ea41ee3 commit 4dbe60d

File tree

1 file changed

+17
-5
lines changed

1 file changed

+17
-5
lines changed

lectures/polars.md

Lines changed: 17 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -30,7 +30,7 @@ In addition to what's in Anaconda, this lecture will need the following librarie
3030
```{code-cell} ipython3
3131
:tags: [hide-output]
3232
33-
!pip install --upgrade polars wbgapi yfinance
33+
!pip install --upgrade polars wbgapi yfinance pyarrow
3434
```
3535

3636
## Overview
@@ -775,7 +775,7 @@ price_change_df = ticker.select([
775775
776776
# Add company names and sort
777777
price_change_df = price_change_df.with_columns([
778-
pl.col('ticker').replace(ticker_list, default=pl.col('ticker')).alias('company')
778+
pl.col('ticker').replace_strict(ticker_list, default=pl.col('ticker')).alias('company')
779779
]).sort('pct_change')
780780
781781
print(price_change_df)
@@ -875,6 +875,13 @@ Generate summary statistics using Polars:
875875
summary_stats = yearly_returns.select(list(indices_list.values())).describe()
876876
print("Summary Statistics:")
877877
print(summary_stats)
878+
879+
# Check for any null values or data issues
880+
print(f"\nData shape: {yearly_returns.shape}")
881+
print(f"Null counts:")
882+
print(yearly_returns.null_count())
883+
print(f"\nData range (first few years):")
884+
print(yearly_returns.head())
878885
```
879886

880887
Plot the time series:
@@ -888,11 +895,16 @@ fig, axes = plt.subplots(2, 2, figsize=(12, 10))
888895
for iter_, ax in enumerate(axes.flatten()):
889896
if iter_ < len(indices_list):
890897
index_name = list(indices_list.values())[iter_]
891-
ax.plot(df_pandas.index, df_pandas[index_name])
892-
ax.set_ylabel("percent change", fontsize=12)
898+
899+
# Plot with markers and lines for better visibility
900+
ax.plot(df_pandas.index, df_pandas[index_name], 'o-', linewidth=2, markersize=4)
901+
ax.set_ylabel("yearly return", fontsize=12)
893902
ax.set_xlabel("year", fontsize=12)
894-
ax.set_title(index_name)
903+
ax.set_title(index_name, fontsize=12)
895904
ax.grid(True, alpha=0.3)
905+
906+
# Add horizontal line at zero for reference
907+
ax.axhline(y=0, color='k', linestyle='--', alpha=0.3)
896908
897909
plt.tight_layout()
898910
plt.show()

0 commit comments

Comments
 (0)