Open
Description
I was processing some CSV data, and at the end of my pipeline the graphs looked plausible but the results were wrong. I tracked it down to this correctness bug in CSV.jl when reading with transpose=true:
julia> using CSV; CSV.File(IOBuffer("""
Alpha,,3982,16603,,,,"40*",95,4027,,,
Beta,,,2664,2716,,,"0*",15,833,,,
Gamma,,,,1641,1707,1762,1814,1861,1913,,,
"""), transpose=true)
12-element CSV.File:
(Alpha = String7("40*"), Beta = String3("0*"), Gamma = missing)
(Alpha = missing, Beta = missing, Gamma = missing)
(Alpha = missing, Beta = missing, Gamma = missing)
(Alpha = missing, Beta = missing, Gamma = 1641)
(Alpha = missing, Beta = missing, Gamma = 1707)
(Alpha = missing, Beta = missing, Gamma = 1762)
(Alpha = missing, Beta = missing, Gamma = 1814)
(Alpha = String7("95"), Beta = String3("15"), Gamma = 1861)
(Alpha = String7("4027"), Beta = String3("833"), Gamma = 1913)
(Alpha = missing, Beta = missing, Gamma = missing)
(Alpha = missing, Beta = missing, Gamma = missing)
(Alpha = missing, Beta = missing, Gamma = missing)

Or, in a more readable form but with more dependencies:
julia> using CSV, DataFrames; CSV.read(IOBuffer("""
Alpha,,3982,16603,,,,"40*",95,4027,,,
Beta,,,2664,2716,,,"0*",15,833,,,
Gamma,,,,1641,1707,1762,1814,1861,1913,,,
"""), DataFrame, transpose=true)
12×3 DataFrame
 Row │ Alpha     Beta      Gamma
     │ String7?  String7?  Int64?
─────┼─────────────────────────────
   1 │ 40*       0*        missing
   2 │ 95        15        missing
   3 │ 4027      833       missing
   4 │ missing   missing   1641
  ⋮  │ ⋮         ⋮         ⋮
  10 │ missing   missing   missing
  11 │ missing   missing   missing
  12 │ missing   missing   missing
                     5 rows omitted
julia> using CSV, DataFrames; CSV.read(IOBuffer("""
Alpha,,3982,16603,,,,"40*",95,4027,,,
Beta,,,2664,2716,,,"0*",15,833,,,
Gamma,,,,1641,1707,1762,1814,1861,1913,,,
"""), DataFrame, transpose=false)
2×13 DataFrame
 Row │ Alpha    Column2  3982     16603    Column5  Column6  Column7  40*      95     4027   Column11  Column12  Column13
     │ String7  Missing  Missing  Int64?   Int64    Int64?   Int64?   String7  Int64  Int64  Missing   Missing   Missing
─────┼────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
   1 │ Beta     missing  missing  2664     2716     missing  missing  0*       15     833    missing   missing   missing
   2 │ Gamma    missing  missing  missing  1641     1707     1762     1814     1861   1913   missing   missing   missing

This also segfaults from time to time.
It didn't happen before I introduced the * characters.
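For reference, here is a minimal sketch of what I would expect the transposed rows to be, assuming transposition is a plain row/column swap. It uses only Base Julia and a naive split on commas, which is enough for this input because no field contains an embedded comma or newline. The helper `parsefield` and the names `colnames`, `columns`, and `expected` are my own for illustration and are not part of CSV.jl. It also does no type inference, so every value stays a string.

```julia
# Naive reference transpose of the example input, using only Base Julia.
# Assumption: no field contains an embedded comma or newline, so a plain
# split on ',' is sufficient here. This is not a general CSV parser.
data = """
Alpha,,3982,16603,,,,"40*",95,4027,,,
Beta,,,2664,2716,,,"0*",15,833,,,
Gamma,,,,1641,1707,1762,1814,1861,1913,,,
"""

# Hypothetical helper: empty fields become `missing`, surrounding quotes are stripped.
parsefield(s) = isempty(s) ? missing : String(strip(s, '"'))

# One vector of fields per input line; split keeps trailing empty fields.
fields = [split(line, ',') for line in split(chomp(data), '\n')]

# With transpose, the first field of each line is the column name.
colnames = Symbol.(first.(fields))
columns = [parsefield.(f[2:end]) for f in fields]

# Build one NamedTuple per transposed row.
expected = [NamedTuple{Tuple(colnames)}(Tuple(col[i] for col in columns))
            for i in 1:length(columns[1])]

foreach(println, expected)
```

Under that assumption, row 2 should carry Alpha = "3982" and row 7 should be (Alpha = "40*", Beta = "0*", Gamma = "1814"). In the CSV.File output above, "40*" and "0*" show up in row 1 instead, and 3982, 16603, 2664, and 2716 are missing entirely.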