replace does not respect target type #352

@nalimilan

Currently replace calls recode and always returns a CategoricalArray, even if the target values are not CategoricalValues. This is OK for recode, as it's specific to CategoricalArrays, but ideally replace should respect the target type. Unfortunately, the behavior of replace on arrays in Base is to call promote_type on the source element type and on the types of the target values. Here that would give weird arrays such as Array{Union{CategoricalValue{String,UInt32}, Int}}. We could use a different approach which would choose the element type based on the actual values in the result, like broadcast does. But that would trade one inconsistency with Base for another.

For example, the following should ideally return a Vector{Int} (see this thread):

julia> a = CategoricalArray(["X", "X", "Y", "Z", "Y", "Y", "Z"])
7-element CategoricalArray{String,1,UInt32}:
 "X"
 "X"
 "Y"
 "Z"
 "Y"
 "Y"
 "Z"

julia> replace(a, "X"=>1, "Y"=>2, "Z"=>3)
7-element CategoricalArray{Union{Int64, String},1,UInt32}:
 1
 1
 2
 3
 2
 2
 3
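
For comparison, here is a minimal sketch of the two element-type strategies discussed above, run on a plain Vector{String} so that CategoricalArrays is not involved. The names v, lookup, r and m are made up for illustration, and the map call is only a stand-in for a value-based approach, not a proposed implementation.

```julia
# Plain Vector{String}; no CategoricalArrays involved.
v = ["X", "X", "Y", "Z", "Y", "Y", "Z"]

# Strategy 1: Base.replace picks the element type by promoting the source
# element type (String) with the types of the new values (Int).
r = replace(v, "X" => 1, "Y" => 2, "Z" => 3)
println(eltype(r))

# Strategy 2: choose the element type from the values actually produced,
# here by mapping each element through a lookup table (hypothetical helper).
lookup = Dict("X" => 1, "Y" => 2, "Z" => 3)
m = map(x -> lookup[x], v)
println(eltype(m))
```

Since String and Int have no common concrete supertype, the first call should produce a Vector{Any}, while the second narrows to Vector{Int} because every result is an Int. Both hold the same values; they differ only in the container's element type, which is the trade-off described above.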

Cc: @bkamins
