Closed
Description
What
The units of the multinomial regression in our analysis need to be switched from households to daily observations.
Why
I was originally assigning each household to a single cluster using a dominance score based on its modal cluster membership across a month of daily profiles. This throws away a lot of information about usage patterns, so we are eliminating this step and running the multinomial regression directly on the clustered usage profiles (households × days). This allows for a more detailed analysis and should give better results for the client.
How
- Load cluster assignments without aggregation, treating each row as a household-day assignment.
- Attach block groups by joining ZIP+4→BG crosswalk.
- Aggregate to BG × cluster.
- Compute cluster share within each BG = number of assignments to the cluster / total assignments in the BG.
- Fit MNLogit with:
  - Outcome constructed from cluster share
  - Frequency weights
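
The steps above can be sketched roughly as follows. This is a minimal sketch, not the project's actual code: the column names (`zip4`, `bg`, `cluster`), the `bg_features` table of block-group covariates, and both function names are hypothetical, and it assumes each ZIP+4 maps to exactly one block group. As far as I can tell, statsmodels' `MNLogit` does not accept a frequency-weights argument, so the sketch emulates frequency weights by repeating each BG × cluster row once per assignment; this is a swapped-in workaround, and the outcome is then the cluster label, whose repeated counts reproduce the cluster shares.

```python
import pandas as pd


def cluster_shares(assignments: pd.DataFrame, crosswalk: pd.DataFrame) -> pd.DataFrame:
    """Return one row per (block group, cluster) with a count and a share.

    `assignments` has one row per household-day (hypothetical columns:
    zip4, cluster). `crosswalk` maps zip4 -> bg (hypothetical columns).
    """
    # No per-household aggregation: every household-day row is kept.
    # Assumption: each ZIP+4 belongs to exactly one block group.
    merged = assignments.merge(
        crosswalk, on="zip4", how="inner", validate="many_to_one"
    )
    # Aggregate to BG x cluster: n = number of household-day assignments.
    counts = merged.groupby(["bg", "cluster"]).size().rename("n").reset_index()
    # Cluster share = assignments to the cluster / all assignments in the BG.
    counts["share"] = counts["n"] / counts.groupby("bg")["n"].transform("sum")
    return counts


def fit_mnlogit(counts: pd.DataFrame, bg_features: pd.DataFrame, feature_cols: list):
    """Fit a multinomial logit of cluster on block-group covariates.

    `bg_features` (one row per bg) and `feature_cols` are hypothetical;
    the issue does not name the predictors.
    """
    # Imported here so the aggregation step works without statsmodels.
    import statsmodels.api as sm

    data = counts.merge(bg_features, on="bg", how="inner").reset_index(drop=True)
    # Emulate frequency weights: repeat each BG x cluster row n times.
    expanded = data.loc[data.index.repeat(data["n"])]
    exog = sm.add_constant(expanded[feature_cols])
    return sm.MNLogit(expanded["cluster"], exog).fit(disp=False)
```

Row expansion can get large for a month of household-days, so a weighted likelihood may be preferable if the data does not fit comfortably in memory.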
Deliverables
- PR with a functioning update to the code that passes the clustering results through the multinomial regression and delivers initial results.