CSV Analyzer
A data-oriented skill that takes a CSV file (path or inline content), parses it, and produces a comprehensive analysis including column types, descriptive statistics, missing value counts, correlation highlights, and detected outliers.
Trust Score: 79 (Medium)
by hermeshub-core, data, intermediate, v1.7.0, updated Feb 10, 2026
Total Runs: 8.7k
Success Rate: 86.3%
Installs: 2.5k
Tags
#csv #data-analysis #statistics #outliers #patterns
Required Tools
file_read, json_parse
Inputs
| Name | Type | Description | Req |
|---|---|---|---|
| csv_file | file | Path to the CSV file to analyze. | -- |
| csv_content | text | Inline CSV content as a string. Used if csv_file is not provided. | -- |
| delimiter | text | Column delimiter. Defaults to comma (","). | -- |
| include_correlations | boolean | Whether to compute pairwise correlations for numeric columns. Defaults to true. | -- |
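The input rules in this table can be illustrated with a short sketch. It assumes the standard-library csv module rather than whatever parser the skill uses internally, and the function name `load_rows` is hypothetical. It shows `csv_file` taking precedence, `csv_content` acting as the fallback, and the delimiter defaulting to a comma.

```python
import csv
import io


def load_rows(csv_file=None, csv_content=None, delimiter=","):
    """Return (header, rows) from a file path or, failing that, inline text."""
    if csv_file is not None:
        # csv_file takes precedence; csv_content is only the fallback.
        with open(csv_file, newline="", encoding="utf-8") as handle:
            records = list(csv.reader(handle, delimiter=delimiter))
    elif csv_content is not None:
        records = list(csv.reader(io.StringIO(csv_content), delimiter=delimiter))
    else:
        raise ValueError("Provide either csv_file or csv_content")
    if not records:
        raise ValueError("CSV input is empty")
    return records[0], records[1:]


# Inline content with a non-default delimiter.
header, rows = load_rows(csv_content="name;price\nwidget;10.5\n", delimiter=";")
print(header, rows)
```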
Outputs
| Name | Type | Description | Req |
|---|---|---|---|
| analysis | json | JSON report with fields: row_count, column_count, columns (array of column analyses), outliers, correlations, data_quality_score. | yes |
SKILL.md
---
name: csv-analyzer
description: Analyze CSV files to generate descriptive statistics, detect data types, find outliers, and identify patterns.
---
# CSV Analyzer
Comprehensive CSV analysis with statistics and pattern detection.
## Quick Start
### Basic Analysis
```bash
# Using csvstat (csvkit)
csvstat data.csv
# Using Python
python3 -c "
import pandas as pd
df = pd.read_csv('data.csv')
print(df.describe())
print(df.dtypes)
print(f'Missing values: {df.isnull().sum().sum()}')
"
```
### Advanced Statistics
```bash
# Detect outliers (Z-score method)
python3 -c "
import pandas as pd
import numpy as np
df = pd.read_csv('data.csv')
numeric_cols = df.select_dtypes(include=[np.number]).columns
z_scores = np.abs((df[numeric_cols] - df[numeric_cols].mean()) / df[numeric_cols].std())
outliers = df[(z_scores > 3).any(axis=1)]
print(f'Outliers: {len(outliers)}')
"
```
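The Z-score rule above can miss extreme values in small samples, because a single large value also inflates the standard deviation. An interquartile-range (IQR) rule is a common alternative. The sketch below is one possible version, assuming the conventional multiplier of 1.5; the helper name `iqr_outliers` and the sample data are illustrative only.

```python
import pandas as pd


def iqr_outliers(df, k=1.5):
    """Return rows where any numeric column lies outside [Q1 - k*IQR, Q3 + k*IQR]."""
    numeric = df.select_dtypes(include="number")
    q1 = numeric.quantile(0.25)
    q3 = numeric.quantile(0.75)
    iqr = q3 - q1
    # Comparisons align the per-column fences with the DataFrame's columns.
    outside = (numeric < q1 - k * iqr) | (numeric > q3 + k * iqr)
    return df[outside.any(axis=1)]


# Illustrative data: one obviously extreme price among six rows.
sample = pd.DataFrame({"price": [10, 12, 11, 13, 12, 500], "label": list("abcdef")})
print(iqr_outliers(sample))
```

In this six-row sample the Z-score of 500 is only about 2.04, so the Z-score > 3 rule would not flag it, while the IQR rule does.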
## Tools Available
- csvstat (csvkit) - comprehensive stats
- csvlook - preview tables
- csvgrep - filter rows
- Python pandas - full analysis
- awk - quick field extraction
## Analysis Output
```json
{
"row_count": 1000,
"column_count": 5,
"columns": [
{
"name": "price",
"type": "numeric",
"stats": {
"min": 10.0,
"max": 1000.0,
"mean": 250.5,
"median": 200.0,
"std": 150.2
},
"missing": 5
}
],
"outliers": [...],
"correlations": {...},
"data_quality_score": 92
}
```
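As a rough illustration of how part of this report could be assembled with pandas, the sketch below fills in `row_count`, `column_count`, and the per-column entries. The function name `build_report` and the "text" type label are assumptions, not part of the skill. The `outliers`, `correlations`, and `data_quality_score` fields are left out because their exact shape and formula are not specified here.

```python
import json

import pandas as pd


def build_report(df):
    """Assemble row/column counts and per-column stats in the report layout."""
    columns = []
    for name in df.columns:
        series = df[name]
        is_numeric = pd.api.types.is_numeric_dtype(series)
        entry = {
            "name": str(name),
            # "text" is an assumed label for every non-numeric column.
            "type": "numeric" if is_numeric else "text",
            "missing": int(series.isnull().sum()),
        }
        if is_numeric:
            # Cast to plain floats so the result is JSON-serializable.
            entry["stats"] = {
                "min": float(series.min()),
                "max": float(series.max()),
                "mean": float(series.mean()),
                "median": float(series.median()),
                "std": float(series.std()),
            }
        columns.append(entry)
    return {"row_count": len(df), "column_count": df.shape[1], "columns": columns}


sample = pd.DataFrame({"price": [10.0, 20.0, None, 30.0], "sku": ["a", "b", "c", "d"]})
report = build_report(sample)
print(json.dumps(report, indent=2))
```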
## Common Patterns
| Pattern | Detection Method |
|---------|-----------------|
| Missing values | .isnull().sum() |
| Outliers | Absolute Z-score > 3, or values more than 1.5 × IQR beyond the quartiles |
| Correlations | .corr() |
| Duplicates | .duplicated() |
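
The pandas calls in this table can be combined into a short check. The sketch below is illustrative only: the helper name `strong_correlations` and the 0.8 threshold for a "highlight" are assumptions, since no cutoff is defined here.

```python
import pandas as pd


def strong_correlations(df, threshold=0.8):
    """List numeric column pairs whose absolute Pearson correlation meets threshold."""
    corr = df.select_dtypes(include="number").corr()
    cols = list(corr.columns)
    pairs = []
    for i, left in enumerate(cols):
        # Only look at the upper triangle so each pair is reported once.
        for right in cols[i + 1:]:
            value = corr.loc[left, right]
            if abs(value) >= threshold:
                pairs.append((left, right, round(float(value), 3)))
    return pairs


sample = pd.DataFrame({
    "units": [1, 2, 3, 4, 4],
    "revenue": [10, 20, 30, 40, 40],
    "rating": [5, 1, 4, 2, 2],
})
print("Missing per column:", sample.isnull().sum().to_dict())
print("Duplicate rows:", int(sample.duplicated().sum()))
print("Strong pairs:", strong_correlations(sample))
```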
## Error Handling
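- Empty files: Return an error with a suggested fix (pandas raises EmptyDataError when there is nothing to parse)
- Malformed CSV: Use on_bad_lines='skip' (pandas 1.3+; the older error_bad_lines=False was removed in pandas 2.0)
- Encoding issues: Retry with encoding='utf-8-sig' or encoding='latin-1'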
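
One way these three cases might fit together is sketched below. The function name `read_csv_safely` is hypothetical, and the sketch assumes pandas 1.3 or later, where `on_bad_lines="skip"` is the current replacement for the older `error_bad_lines` flag. Decoding is done before parsing so the encoding fallback is explicit.

```python
import io

import pandas as pd


def read_csv_safely(raw_bytes, delimiter=","):
    """Decode CSV bytes with an encoding fallback, then parse leniently."""
    try:
        # utf-8-sig also reads plain UTF-8 and strips a leading byte-order mark.
        text = raw_bytes.decode("utf-8-sig")
    except UnicodeDecodeError:
        # latin-1 maps every byte to a character, so this cannot fail,
        # though it may mis-render text that was in another encoding.
        text = raw_bytes.decode("latin-1")
    try:
        # on_bad_lines="skip" drops rows that have too many fields.
        return pd.read_csv(io.StringIO(text), sep=delimiter, on_bad_lines="skip")
    except pd.errors.EmptyDataError as err:
        raise ValueError("CSV is empty: provide a file with a header row") from err


# Latin-1 bytes plus one malformed row (three fields instead of two).
frame = read_csv_safely(b"name,price\ncaf\xe9,3\nbad,1,2\ntea,4\n")
print(frame)
```

Skipping bad rows silently can hide data loss, so a caller may want to compare the parsed row count against the raw line count before trusting the result.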