A modular ecosystem under this. namespace.
Updated Feb 5, 2026 - Rust
🩺 Machine learning diabetes prediction model using a Support Vector Machine (SVM) classifier. Analyzes 8 medical features (glucose, BMI, age, etc.) from the Pima Indian dataset to predict diabetes risk with 75-80% accuracy. Built with Python, scikit-learn, and pandas. Includes data preprocessing, model training, and a prediction system for diabetes.
Example code accompanying the Sternberg concept cell data release for Kyzar et al. (2024)
A digital transformation of cyber assessment and authorization data with a relational schema
Unifying Biotic Interactions Data: Terminology, Data Analysis, Standardization, and Proposal of a Data Schema for Plant-Pollinator Interactions
Prepare and check data to comply with Darwin Core Standard in R
Feature Engineering with Python
A Python-based data cleaning project to streamline QuickBooks invoice data for analysis, paving the way for improved insights into sales, pricing, and inventory management.
A new package processes textual descriptions of drone designs to extract structured summaries of their operational capabilities. It focuses on identifying and categorizing key features such as locomot
Building a modern data warehouse with SQL Server, including ETL Processes, Data Modeling and Analytics
This project is about cleaning and preparing a global layoffs dataset for analysis, focusing on handling null values, correcting data types, and ensuring data integrity for more accurate insights.
Highlighting expertise in data migration, data normalization and standardization, this project demonstrates successful data transfer from Snowflake to Databricks. It emphasizes optimized data flow and enhanced accessibility through standardization, showcasing a commitment to ethical data practices.
Hi folks, during my internship at KultureHire, I completed an end-to-end data analytics project. I created an executive and functional dashboard using pivot tables, conducted a thorough analysis, and provided actionable recommendations. I'm excited to share my work and the insights I discovered.
🌟 Data Cleaning and Processing 🌟 Handled missing values, removed duplicates, standardized salary formats, and treated outliers for consistency. Revealed trends in company performance, job roles, and salary distributions after refining the dataset. This project highlights the power of data preprocessing as the backbone of reliable analytics.
vuln-structure is a package that extracts vulnerability details from raw text and outputs standardized, structured data for security teams.
csv-managed is a Rust command-line utility for high‑performance exploration and transformation of CSV data at scale, emphasizing streaming, typed operations, and reproducible workflows via schema and index files.
This repository contains a SQL-based data cleaning project where raw layoffs data was transformed into a clean and structured dataset. The project showcases practical SQL techniques such as duplicate removal, data standardization, null handling, and schema optimization, following real-world data preparation best practices.
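A Python-based ETL pipeline for standardizing heterogeneous IoT configuration data across 12 manufacturing sites. It features automatic schema mapping, multi-source merging, and daily changelog generation for configuration lifecycle management. It automates the aggregation of 500,000+ IoT assets and generates daily audit trails, ensuring consistency between platform logic and edge-side implementation.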
CDISC data standardization with SAS and R