Commit b2dcd75

feat: Add biomedical statistical analysis module with Wilcoxon and Mann-Whitney tests
- Implement Wilcoxon Signed-Rank Test for paired data analysis
- Implement Mann-Whitney U Test for independent group comparisons
- Add comprehensive text-based visualization utilities
- Include 5 detailed biomedical examples with real-world scenarios
- Provide extensive documentation and theory explanations
- Support effect size calculations and statistical interpretations
- Pure Python implementation with no external dependencies

Addresses common non-parametric statistical needs in biomedical research including clinical trials, drug studies, and diagnostic biomarker analysis.
1 parent c79034c commit b2dcd75

6 files changed, +1771 -0 lines changed

Biomedical/README.md

Lines changed: 281 additions & 0 deletions
@@ -0,0 +1,281 @@
# Biomedical Statistical Analysis Module

A comprehensive Python module providing implementations of non-parametric statistical tests commonly used in biomedical research, along with visualization utilities and educational examples.

## 📊 Overview

This module implements two fundamental non-parametric statistical tests:

- **Wilcoxon Signed-Rank Test**: For analyzing paired/dependent samples
- **Mann-Whitney U Test**: For comparing two independent groups

Both tests are essential when data doesn't meet the assumptions required for parametric tests (normality, equal variances, etc.).

## 🔬 Features

### Core Implementations
- **Pure Python**: No external dependencies required for core functionality
- **Educational Focus**: Clear, well-documented algorithms for learning
- **Biomedical Context**: Examples and documentation tailored for biomedical research
- **Statistical Rigor**: Proper handling of ties, effect sizes, and p-value calculations

### Visualization Tools
- Text-based visualizations (no external plotting libraries required)
- Box plot representations
- Paired data change visualization
- Group comparison histograms
- Statistical summary displays

### Quality Assurance
- Comprehensive error handling and input validation
- Type hints for better code maintainability
- Extensive documentation and examples
- Educational comments explaining algorithms

## 📚 Theory and Applications

### Wilcoxon Signed-Rank Test

**When to use:**
- Paired or dependent samples (before/after, matched pairs)
- Data is ordinal or continuous but not normally distributed
- Dependent variable is measured on at least an ordinal scale, so the paired differences can be ranked
- Differences between pairs are not normally distributed

**Examples in biomedical research:**
- Blood pressure before and after treatment
- Pain scores pre and post medication
- Biomarker levels before and after intervention
- Patient quality of life scores over time

**Algorithm:**
1. Calculate differences between paired observations
2. Remove zero differences (n is the number of pairs that remain)
3. Rank absolute differences (handle ties by averaging)
4. Sum the ranks of the positive differences (W⁺) and of the negative differences (W⁻) separately
5. Test statistic W = min(W⁺, W⁻)
6. Calculate p-value using exact tables (small n) or normal approximation (large n)
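
The six steps above can be sketched in a few lines of pure Python. This is an illustration only, not the module's `wilcoxon_signed_rank_test`: the names `average_ranks` and `signed_rank_sketch` are hypothetical, and step 6 is reduced to the z-score of the normal approximation with no tie correction, no continuity correction, and no exact tables.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j hold ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def signed_rank_sketch(sample1, sample2):
    """Return (W, W_plus, W_minus, z) for two paired samples."""
    if len(sample1) != len(sample2):
        raise ValueError("paired samples must have the same length")
    # Steps 1-2: differences, with zero differences removed.
    diffs = [a - b for a, b in zip(sample1, sample2) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all differences are zero")
    # Step 3: rank the absolute differences, averaging ties.
    ranks = average_ranks([abs(d) for d in diffs])
    # Step 4: rank sums for positive and negative differences.
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    # Step 5: the smaller rank sum is the test statistic.
    w = min(w_plus, w_minus)
    # Step 6 (large-n route only): z-score under the null hypothesis.
    mean_w = n * (n + 1) / 4
    sd_w = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w - mean_w) / sd_w
    return w, w_plus, w_minus, z
```

When every difference has the same sign, one rank sum is zero and W = 0, the most extreme value the statistic can take.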

### Mann-Whitney U Test

**When to use:**
- Two independent groups
- Data is ordinal or continuous but not normally distributed
- Independent observations
- No assumption of equal variances required

**Examples in biomedical research:**
- Treatment vs control group outcomes
- Disease vs healthy population comparisons
- Different drug dosage group comparisons
- Gender differences in biomarker levels

**Algorithm:**
1. Combine both samples and rank all observations
2. Sum ranks for each group (R₁, R₂)
3. Calculate U statistics: U₁ = R₁ - n₁(n₁+1)/2 and U₂ = R₂ - n₂(n₂+1)/2 (equivalently, U₂ = n₁n₂ - U₁)
4. Test statistic = min(U₁, U₂)
5. Calculate p-value using exact tables (small n) or normal approximation (large n)
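
A minimal sketch of these five steps follows. It is not the module's `mann_whitney_u_test`: the name `mann_whitney_sketch` is hypothetical, ties get average ranks, and step 5 is reduced to the z-score of the normal approximation without a tie correction.

```python
import math


def mann_whitney_sketch(group1, group2):
    """Return (U, U1, U2, z) for two independent samples."""
    n1, n2 = len(group1), len(group2)
    if n1 == 0 or n2 == 0:
        raise ValueError("both groups must be non-empty")
    # Step 1: pool the samples and rank them, averaging tied ranks.
    combined = list(group1) + list(group2)
    order = sorted(range(n1 + n2), key=lambda i: combined[i])
    ranks = [0.0] * (n1 + n2)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and combined[order[j + 1]] == combined[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    # Step 2: rank sum of the first group (the first n1 pooled entries).
    r1 = sum(ranks[:n1])
    # Step 3: U statistics; U1 + U2 always equals n1 * n2.
    u1 = r1 - n1 * (n1 + 1) / 2
    u2 = n1 * n2 - u1
    # Step 4: the smaller U is the test statistic.
    u = min(u1, u2)
    # Step 5 (large-n route only): z-score under the null hypothesis.
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    return u, u1, u2, z
```

U₁ can be read as the number of (group1, group2) pairs in which the group1 value is larger, counting ties as one half, which is why completely separated groups give min(U₁, U₂) = 0.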

## 🚀 Quick Start

### Basic Usage

```python
from Biomedical import wilcoxon_signed_rank_test, mann_whitney_u_test
from Biomedical import plot_wilcoxon_results, plot_mann_whitney_results

# Wilcoxon test example: Blood pressure study
before_treatment = [145, 142, 138, 150, 155, 148, 152, 160]
after_treatment = [140, 138, 135, 145, 148, 142, 147, 152]

w_stat, p_value, stats = wilcoxon_signed_rank_test(
    before_treatment,
    after_treatment,
    alternative="greater"  # one-sided: treatment reduces BP
)

print(f"W statistic: {w_stat}")
print(f"p-value: {p_value:.4f}")
print(f"Effect size: {stats['effect_size']:.3f}")

# Visualize results
plot_wilcoxon_results(
    before_treatment,
    after_treatment,
    ("Before Treatment", "After Treatment"),
    "Blood Pressure Reduction Study"
)

# Mann-Whitney test example: Drug efficacy study
treatment_group = [85, 88, 90, 92, 95, 98, 100]
control_group = [78, 80, 82, 85, 87, 89, 91]

u_stat, p_value, stats = mann_whitney_u_test(
    treatment_group,
    control_group,
    alternative="greater"  # treatment > control
)

print(f"U statistic: {u_stat}")
print(f"p-value: {p_value:.4f}")
print(f"Median difference: {stats['median_difference']:.1f}")

# Visualize results
plot_mann_whitney_results(
    treatment_group,
    control_group,
    ("Treatment", "Control"),
    "Drug Efficacy Comparison"
)
```

## 📖 Detailed Examples

### Example 1: Clinical Trial - Pain Medication

```python
from Biomedical import wilcoxon_signed_rank_test

# Pre and post medication pain scores (1-10 scale)
pain_before = [8, 7, 9, 6, 8, 7, 9, 8, 7, 6]
pain_after = [4, 3, 5, 3, 4, 3, 5, 4, 3, 2]

w_stat, p_val, stats = wilcoxon_signed_rank_test(
    pain_before, pain_after, alternative="greater"
)

print("Pain Medication Study Results:")
print(f"Median pain reduction: {stats['median_difference']:.1f} points")
print(f"Statistical significance: p = {p_val:.4f}")

if p_val < 0.05:
    print("✓ Medication significantly reduces pain")
else:
    print("✗ No significant pain reduction detected")
```

### Example 2: Biomarker Comparison Study

```python
from Biomedical import mann_whitney_u_test

# Biomarker levels: patients vs healthy controls
patients = [120, 125, 130, 135, 140, 145, 150, 155]
healthy = [100, 105, 110, 115, 118, 120, 125, 128]

u_stat, p_val, stats = mann_whitney_u_test(
    patients, healthy, alternative="greater"
)

print("Biomarker Level Comparison:")
print(f"Patient median: {stats['median1']:.1f}")
print(f"Healthy median: {stats['median2']:.1f}")
print(f"Difference: {stats['median_difference']:.1f}")
print(f"Effect size: {stats['effect_size']:.3f}")

if p_val < 0.05:
    print("✓ Significant difference between groups")
else:
    print("✗ No significant difference detected")
```

## 📊 Interpretation Guidelines

### P-value Interpretation
- **p < 0.001**: Very strong evidence against null hypothesis
- **p < 0.01**: Strong evidence against null hypothesis
- **p < 0.05**: Moderate evidence against null hypothesis
- **p ≥ 0.05**: Insufficient evidence to reject null hypothesis
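
The thresholds above can be turned into a small lookup for reports. The helper name `evidence_label` and the exact label strings are assumptions made for this sketch; the module itself is not known to provide such a function.

```python
def evidence_label(p_value):
    """Map a p-value to the evidence wording used in the guideline list."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p-value must lie in [0, 1]")
    # Check the strictest threshold first so each p-value gets one label.
    if p_value < 0.001:
        return "very strong evidence against the null hypothesis"
    if p_value < 0.01:
        return "strong evidence against the null hypothesis"
    if p_value < 0.05:
        return "moderate evidence against the null hypothesis"
    return "insufficient evidence to reject the null hypothesis"
```

The label describes the strength of evidence only; it says nothing about the size or clinical relevance of the effect.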

### Effect Size Interpretation (for large samples)
- **Small effect**: r ≈ 0.1 (r² ≈ 0.01, about 1% of variance)
- **Medium effect**: r ≈ 0.3 (r² ≈ 0.09, about 9% of variance)
- **Large effect**: r ≈ 0.5 (r² ≈ 0.25, about 25% of variance)
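
A common way to obtain r from a rank test is r = |Z| / √N, where Z is the normal-approximation z-score and N is the number of observations. Whether the module's `stats['effect_size']` uses this formula is an assumption; the source does not say, and conventions for N differ for paired data. The function names and the "negligible" label below are illustrative.

```python
import math


def effect_size_r(z, n_obs):
    """Effect size r = |z| / sqrt(N) for a rank-based test."""
    if n_obs <= 0:
        raise ValueError("n_obs must be positive")
    return abs(z) / math.sqrt(n_obs)


def effect_size_label(r):
    """Classify r using the 0.1 / 0.3 / 0.5 benchmarks listed above."""
    if r >= 0.5:
        return "large"
    if r >= 0.3:
        return "medium"
    if r >= 0.1:
        return "small"
    return "negligible"  # below the smallest benchmark (label assumed here)


# Example: z = -2.3 from a comparison of two groups of 7 observations each.
r = effect_size_r(-2.3, 14)
print(f"r = {r:.3f}, r^2 = {r * r:.3f}, {effect_size_label(r)} effect")
```

Because r divides by √N, the same z-score corresponds to a smaller effect in a larger study, which is one reason to report r alongside the p-value.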

### Assumptions and Limitations

**Wilcoxon Signed-Rank Test:**
- ✓ Pairs are independent
- ✓ Data is at least ordinal
- ✓ Distribution of differences is approximately symmetric
- ✗ Tied absolute differences receive average ranks and zero differences are dropped, so p-values are less reliable when many ties are present

**Mann-Whitney U Test:**
- ✓ Observations are independent
- ✓ Data is at least ordinal
- ✓ No assumption of equal variances
- ✗ Interpreting the result as a shift in location (e.g. a difference in medians) assumes the two distributions have similar shapes

## 🔧 API Reference

### `wilcoxon_signed_rank_test(sample1, sample2, alternative='two-sided')`

**Parameters:**
- `sample1`: First sample (list of numbers)
- `sample2`: Second sample (list of numbers, same length as sample1)
- `alternative`: 'two-sided', 'greater', or 'less'

**Returns:**
- `w_statistic`: Test statistic (float)
- `p_value`: P-value (float)
- `stats`: Dictionary with additional statistics

### `mann_whitney_u_test(group1, group2, alternative='two-sided')`

**Parameters:**
- `group1`: First group (list of numbers)
- `group2`: Second group (list of numbers)
- `alternative`: 'two-sided', 'greater', or 'less'

**Returns:**
- `u_statistic`: Test statistic (float)
- `p_value`: P-value (float)
- `stats`: Dictionary with additional statistics
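
The parameter constraints listed above (equal-length paired samples, three allowed values for `alternative`) suggest the kind of pre-check a caller might run before invoking either test. The helper below is hypothetical: `validate_paired_inputs` is not part of the module, and the exceptions the real functions raise are not documented in this README.

```python
VALID_ALTERNATIVES = ("two-sided", "greater", "less")


def validate_paired_inputs(sample1, sample2, alternative="two-sided"):
    """Check paired-test arguments and return the samples as lists."""
    if alternative not in VALID_ALTERNATIVES:
        raise ValueError(
            f"alternative must be one of {VALID_ALTERNATIVES}, got {alternative!r}"
        )
    if len(sample1) != len(sample2):
        raise ValueError("paired samples must have the same length")
    if len(sample1) == 0:
        raise ValueError("samples must not be empty")
    for value in list(sample1) + list(sample2):
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise TypeError(f"non-numeric observation: {value!r}")
    return list(sample1), list(sample2)
```

For the Mann-Whitney test the equal-length check would be dropped, since independent groups may differ in size.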

## 🎯 Best Practices

### Study Design Considerations
1. **Sample Size**: Consider power analysis for adequate sample size
2. **Data Collection**: Ensure independence of observations
3. **Multiple Comparisons**: Apply Bonferroni correction when appropriate
4. **Effect Size**: Always report effect sizes alongside p-values
5. **Visualization**: Use plots to understand data distribution
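
The Bonferroni correction mentioned in item 3 is not implemented by this module (it appears under planned enhancements), so the following is a standalone sketch with a hypothetical function name. It multiplies each p-value by the number of tests m, capped at 1, and rejects a hypothesis when the raw p-value is at most α / m.

```python
def bonferroni(p_values, alpha=0.05):
    """Return (adjusted_p_values, reject_flags) for a family of tests."""
    m = len(p_values)
    if m == 0:
        raise ValueError("at least one p-value is required")
    # Adjusted p-value: p * m, capped at 1.
    adjusted = [min(1.0, p * m) for p in p_values]
    # Reject when the raw p-value does not exceed the per-test threshold.
    reject = [p <= alpha / m for p in p_values]
    return adjusted, reject


# Example: three comparisons from the same study.
adjusted, reject = bonferroni([0.01, 0.04, 0.03])
for p_adj, flag in zip(adjusted, reject):
    print(f"adjusted p = {p_adj:.2f}, reject H0: {flag}")
```

The correction controls the family-wise error rate but becomes conservative as the number of comparisons grows.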

### Code Usage Tips
1. Always validate your data before analysis
2. Choose appropriate alternative hypothesis
3. Check sample size recommendations for test validity
4. Document your analysis assumptions
5. Provide context for statistical significance

## 🤝 Contributing

This module was created as part of Hacktoberfest 2024. Contributions welcome!

### Areas for Enhancement
- [ ] Additional non-parametric tests (Kruskal-Wallis, Friedman)
- [ ] Integration with popular plotting libraries (matplotlib, seaborn)
- [ ] Power analysis functions
- [ ] Bootstrap confidence intervals
- [ ] Multiple comparison corrections

### Development Guidelines
- Follow existing code style and documentation patterns
- Include comprehensive tests for new features
- Maintain educational focus with clear explanations
- Ensure compatibility with existing API

## 📝 License

MIT License - See LICENSE file for details.

## 🔗 References

1. Wilcoxon, F. (1945). Individual comparisons by ranking methods. *Biometrics Bulletin*, 1(6), 80-83.
2. Mann, H. B., & Whitney, D. R. (1947). On a test of whether one of two random variables is stochastically larger than the other. *Annals of Mathematical Statistics*, 18(1), 50-60.
3. Hollander, M., Wolfe, D. A., & Chicken, E. (2013). *Nonparametric Statistical Methods* (3rd ed.). John Wiley & Sons.

## 📧 Contact

Created for Hacktoberfest 2024 - Educational implementation of biomedical statistical methods.

---

*Happy analyzing! 🧬📈*

Biomedical/__init__.py

Lines changed: 42 additions & 0 deletions
@@ -0,0 +1,42 @@
"""
Biomedical Statistical Analysis Module

This module provides implementations of statistical tests commonly used in
biomedical research, including non-parametric tests for comparing groups
and analyzing paired data.

Available Tests:
- Wilcoxon Signed-Rank Test: For paired data when normality assumptions
  are violated
- Mann-Whitney U Test: For comparing two independent groups
  (non-parametric alternative to t-test)

Features:
- Pure Python implementations following standard algorithms
- Comprehensive visualizations for result interpretation
- Educational examples with biomedical contexts
- Detailed documentation and theory explanations

Author: Contributed for Hacktoberfest
License: MIT
"""

from .mann_whitney_test import mann_whitney_u_test
from .statistical_visualizations import (
    plot_independent_groups,
    plot_mann_whitney_results,
    plot_paired_data,
    plot_wilcoxon_results,
)
from .wilcoxon_test import wilcoxon_signed_rank_test

__all__ = [
    "mann_whitney_u_test",
    "plot_independent_groups",
    "plot_mann_whitney_results",
    "plot_paired_data",
    "plot_wilcoxon_results",
    "wilcoxon_signed_rank_test",
]

__version__ = "1.0.0"
