
feat(DRAFT): Add get_supertype #3396

Draft
dangotbanned wants to merge 127 commits into main from dtypes/supertyping

Conversation

@dangotbanned
Member

@dangotbanned dangotbanned commented Jan 10, 2026

Description

Important

@FBruzzesi and I have been + are still iterating on this
Core functionality is there, focusing on readability, performance + shrinking the test suite

This PR implements polars' concept of supertyping - which more generally defines which types can be safely promoted/demoted/cast to other types.

I really like the DuckDB visualization of their version¹ of these rules, so here's that as an example:

Show Casting Operations Matrix

[image: typecasting-matrix]

This is a preliminary step for implementing relaxed concat (#3386).
The aim is that we own a consistent set of rules that all/most backends can participate in.
We've already dropped some supertypes that are valid in polars, but may prove challenging in other backends such as #121.
Some others are directly mentioned in comments (e.g. (Struct, DType) -> Struct)

Additional use-cases

Supertyping in polars is used for much more than just a subset of concat.
In (#2572), it is one of the larger concepts missing from the intermediate representation (see #3386 (comment)).

polars-plan::plans::conversion::type_coercion is full of examples of how deeply related the concept is with expressions.
My aim is not to reproduce all of that 😅 - but to be able to reason about DTypes between LazyFrame operations without querying the backend for a Schema between every step 🤞

Related issues

Tasks

Footnotes

  1. DuckDB also mentions another set of rules called Combination Casting - that is entirely implicit.
    The matrix doesn't reflect these and only one cast example is given, but it would apply to nw.concat:
    "This combination casting occurs for ..., set operations (UNION / EXCEPT / INTERSECT), and ..."

dangotbanned added a commit that referenced this pull request Feb 7, 2026
dangotbanned added a commit that referenced this pull request Feb 14, 2026
As much as is possible without #3396
dangotbanned added a commit that referenced this pull request Feb 16, 2026
Need to decide how many of the others to leave as todos
Main theme is needing `get_supertype` (#3396)
dangotbanned added a commit that referenced this pull request Feb 17, 2026
Everything left requires `get_supertype` (#3396)
* refactor: Replace `_same_supertype` with a custom `@singledispatch`

This is more generally useful and a LOT easier to read from the outside

* refactor: Just use a real class

* fix(typing): Satisfy `mypy`

* fix: Oops forgot the first element

* refactor(typing): Use slightly better names

* chore: Rename `default` -> `upper_bound`

* docs: Replace debugging doc

* docs: More cleanup

* refactor: Use `__slots__`, remove a field

* docs: More, more cleanup

* docs: lil bit of `.register` progress

* cov

* test: Get full coverage for `@just_dispatch`

* chore: Give it a simple repr

* test: Oops, forgot that was an override

* revert: Keep only what is required

See #3396 (comment)

* refactor: Simplify `@just_dispatch` signature

* fix(typing): Satisfy mypy

* test: Gotta get that coverage

Resolves #3410 (comment)

* docs: Restore a minimal version of `@just_dispatch` doc

Resolves #3410 (comment)

* revert: Remove `Impl` alias

#3410 (comment)

* refactor: Rename `Passthrough` -> `PassthroughFn`

Suggested in #3410 (review)

* docs: Add note to use only on internal

Suggested in #3410 (review)
@dangotbanned dangotbanned mentioned this pull request Mar 30, 2026
25 tasks
Member

@MarcoGorelli MarcoGorelli left a comment


Are we sure we should be doing this?

I don't think different libraries follow the same supertyping rules, and i'm not sure it's something we should impose

e.g. Datetime('us') vs Datetime('ns'): Polars goes to the former, pandas to the latter

In [16]: df = pl.DataFrame({'a': [datetime(2020,1,1)]})

In [17]: pl.concat([df.with_columns(pl.col('a').cast(pl.Datetime('ns'))), df], how='vertical_relaxed')
Out[17]:
shape: (2, 1)
┌─────────────────────┐
│ a                   │
│ ---                 │
│ datetime[μs]        │
╞═════════════════════╡
│ 2020-01-01 00:00:00 │
│ 2020-01-01 00:00:00 │
└─────────────────────┘

In [18]: pd.concat([df.with_columns(pl.col('a').cast(pl.Datetime('ns'))).to_pandas(), df.to_pandas()], axis=0).dtypes
Out[18]:
a    datetime64[ns]
dtype: object

@dangotbanned
Member Author

dangotbanned commented Mar 30, 2026

I don't think different libraries follow the same supertyping rules, and i'm not sure it's something we should impose

But doesn't that inconsistency show an example of how - if we don't address it - there's a knock-on effect to things like selectors?

IMO, (#3396 (review)) is the kind of thing that won't be an issue to most use cases - but when it is, it could be a slog to debug.


I wanna stress that my goal is a set of rules.
Those should be:

  • what we can realistically support across all/most backends
  • something downstream can depend on for correctness
    • @FBruzzesi did a really good job on the explainability of them btw 🥳

I like the rules we have here, but I'm still open to more fiddling 🙂

@MarcoGorelli
Member

i think it could also be a slog to debug when someone switches from pandas (or some other library) to narwhals

for selectors, i think it's fairly common to select from kind (like all temporal columns, or all datetime ones) rather than some exact dtype (like Datetime('us'))

@FBruzzesi
Member

FBruzzesi commented Mar 30, 2026

@MarcoGorelli I find the polars behavior a bit odd. I didn't check what other backends do, nor could I find a polars issue on the topic.

What would you propose to do here? I guess one option is that we start by not allowing supertyping for datetime and duration dtypes unless they have the same time_unit. That's probably the safest approach to begin with, as a user can always decide how to do it externally if needed
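The "raise unless the time_unit matches" option can be sketched like this (a hypothetical illustration with a stand-in `Datetime` class, not narwhals code):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Datetime:
    """Stand-in for a dtype carrying a resolution; not the narwhals class."""

    time_unit: str  # "ms" | "us" | "ns"


def datetime_supertype(left: Datetime, right: Datetime) -> "Datetime | None":
    # Reject mixed resolutions outright rather than silently picking one
    # (polars would widen to "us", pandas to "ns" -- see the thread above).
    return left if left.time_unit == right.time_unit else None
```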

@MarcoGorelli
Member

Yup, happy for supertyping datetimes of different resolutions to raise

Is this mostly about int32 vs int64 -> int64 kind of operations? If so, I think those at least should be fairly standardised, ok with doing those. Is there any other kind of supertyping that this PR does?

sorry i haven't clicked through everything

@FBruzzesi
Member

FBruzzesi commented Mar 31, 2026

Thanks @MarcoGorelli

Yup, happy for supertyping datetimes of different resolutions to raise

Alright, I think that can be a starting point. @dangotbanned WDYT?


sorry i haven't clicked through everything

Yes I feel you, this is definitely a large one with a lot of commits.

I guess the quickest way to get a grasp of it is the documentation page we have written, or even quicker a chart.

The TL;DR is that:

  • Numeric casting is allowed (including Boolean)
  • Categorical to String, Enum to String, String to Binary
  • Nested types are resolved recursively, and with some special casing:
    • When combining a List with an Array, the supertype is a List if both have the same depth
      (nesting level).
    • The non-trivial Struct case cannot be expressed in one line 🤣
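A toy sketch of those bullet points (illustrative only, with plain-string dtype names; the real narwhals implementation uses dtype classes and more rules):

```python
# Symmetric lookup capturing a few of the rules above; frozenset keys
# make (left, right) and (right, left) hit the same entry.
RULES: "dict[frozenset[str], str]" = {
    frozenset({"Boolean", "Int8"}): "Int8",          # numeric casting includes Boolean
    frozenset({"Int32", "Int64"}): "Int64",
    frozenset({"Int64", "Float64"}): "Float64",
    frozenset({"Categorical", "String"}): "String",  # Categorical -> String
    frozenset({"String", "Binary"}): "Binary",       # String -> Binary
}


def get_supertype(left: str, right: str) -> "str | None":
    if left == right:
        return left  # trivial case: identical dtypes
    return RULES.get(frozenset({left, right}))  # None => no safe supertype
```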

@MarcoGorelli
Member

thanks! 🙏

i've read through, and my initial reaction is that this is too complicated

why are we dealing with String vs Int64 for example? do we have a use-case for that?

@FBruzzesi
Member

To me the buggy part is that in the current implementation of concat every backend has a different behavior (#3191 (comment)).

This PR is a prerequisite for having a consistent behavior.

Coming to:

why are we dealing with String vs Int64 for example? do we have a use-case for that?

casting numeric to string is standardized across backends, I don't see why that would be problematic to support.


If you are up for it, let's have a call to better understand what we could land with this PR

@dangotbanned
Member Author

dangotbanned commented Mar 31, 2026

concat(..., how="*_relaxed") to be as loose as polars is via supertyping

Yep the idea is we can allow more things if we can have consistent semantics by casting first.
That's what polars is doing after all 😉

I made a list of other polars APIs that use these rules, will try to find it later

Where we can apply the rules

I think that in a lot of these cases, we wouldn't have coverage (understandably) for what each backend does on its own.
Sometimes, these things surface in bug reports (#3394), (#2835), (#2082) - but it would be nice for consistency across all of them:

Top-level functions

  • concat(how="diagonal_relaxed")
  • concat(how="vertical_relaxed")
  • coalesce
  • max_horizontal
  • mean_horizontal
  • min_horizontal
  • sum_horizontal
  • when

Expr methods

  • fill_null
  • replace_strict
  • +
  • -
  • *
  • //
  • %

LazyFrame methods

  • join
  • join_asof
  • unpivot

@MarcoGorelli
Member

are you suggesting that + should result in a consistent output dtype across libraries? still not sure tbh, there's differences due to different design decisions:

In [55]: df = duckdb.sql("""select * from values (1.5), (2.5) df(a)""")

In [56]: duckdb.sql("""
    ...: from df
    ...: select a + a
    ...: """)
Out[56]:
┌──────────────┐
│   (a + a)    │
│ decimal(3,1) │
├──────────────┤
│          3.0 │
│          5.0 │
└──────────────┘

In [57]: df.pl().select(pl.col('a')+pl.col('a'))
Out[57]:
shape: (2, 1)
┌───────────────┐
│ a             │
│ ---           │
│ decimal[38,1] │
╞═══════════════╡
│ 3.0           │
│ 5.0           │
└───────────────┘

In [58]: df.pl()
Out[58]:
shape: (2, 1)
┌──────────────┐
│ a            │
│ ---          │
│ decimal[2,1] │
╞══════════════╡
│ 1.5          │
│ 2.5          │
└──────────────┘

and i'm not sure we should be standardising it

@dangotbanned
Member Author

are you suggesting that + should result in a consistent output dtype across libraries?

All I'm suggesting is that the list in (#3396 (comment)) is where those rules could be applied.

I know that Decimal is likely to be too different in the new polars version (#3377 (comment))

@dangotbanned
Member Author

@MarcoGorelli you could pick any combination of types/backends and find examples of incompatibilities.

I don't think it is helpful, considering where I started (#3396 (comment)):

I wanna stress that my goal is a set of rules.
Those should be:

  • what we can realistically support across all/most backends

I'm gonna try a different angle ...

(Provided we can cast our way there)

What do you think about using <insert-a-set-of-rules-here> in places where a backend would otherwise error?

I think that you're okay with that, and these overlap a lot with (#3396 (comment)):

The main motivator for this PR is supporting another of these cases (#3398), but wanting to do it in a standardised way.

I would like it if you could write this and it is reliable:

nw.concat(items, how="vertical_relaxed")

I know that we can do this.
It'll only ever be a subset of polars - but I know there is a version of this that we can do (even for ibis)
Help us find that version ❤️

@MarcoGorelli
Member

Just so i understand

What do you think about using <insert-a-set-of-rules-here> in places where a backend would otherwise error?

  • outside of concat, which of the other cases listed error for some backends in a way that dtype supertyping would solve?

@dangotbanned
Member Author

General

(#3396 (comment))

What do you think about using <insert-a-set-of-rules-here> in places where a backend would otherwise error?

outside of concat, which of the other cases listed error for some backends in a way that dtype supertyping would solve?

The interesting part is that supertyping reduces them all to the same problem.
We have 2 or more DTypes and we need to either:

  1. Pick one
  2. Find one that all can safely cast to
  3. Reject it

How often we'd benefit depends on two things:

  1. How closely do the rules a backend uses map to what polars does?
  2. How consistent is the backend (natively) in applying their own rules?
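The "reduce them all to the same problem" framing can be sketched by folding pairwise supertyping over N dtypes (a hedged illustration with toy string dtypes and a hypothetical rule table, not the narwhals code):

```python
from functools import reduce

# Toy pairwise rule table (hypothetical; the real rules live in narwhals).
_PAIRS = {
    frozenset({"Int32", "Int64"}): "Int64",
    frozenset({"Int64", "Float64"}): "Float64",
}


def get_supertype(left: str, right: str) -> "str | None":
    return left if left == right else _PAIRS.get(frozenset({left, right}))


def common_supertype(dtypes: "list[str]") -> "str | None":
    # 1. identical -> pick it; 2. otherwise find a common cast target;
    # 3. None propagates through the fold, i.e. "reject".
    return reduce(lambda acc, dt: get_supertype(acc, dt) if acc else None, dtypes)
```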

Examples

I've picked the first example at random.
The other two look at how consistently the same problem is solved internally.

Note

There are lots of code blocks hidden here!

1 - fill_null

Show test_fill_null_series_expression

def test_fill_null_series_expression(constructor: Constructor) -> None:
    data = {
        "a": [0.0, None, 2.0, 3.0, 4.0],
        "b": [1.0, None, None, 5.0, 3.0],
        "c": [5.0, 2.0, None, 2.0, 1.0],
    }
    df = nw.from_native(constructor(data))
    result = df.with_columns(nw.col("a", "b").fill_null(nw.col("c")))
    expected = {
        "a": [0.0, 2, 2, 3, 4],
        "b": [1.0, 2, None, 5, 3],
        "c": [5.0, 2, None, 2, 1],
    }
    assert_equal_data(result, expected)

Making this change works for all backends besides pyarrow, which raises here

-    df = nw.from_native(constructor(data))
+    df = nw.from_native(constructor(data)).with_columns(nw.col("a", "b").cast(nw.Float32))

But in most contexts (e.g. arithmetic, coalesce) pyarrow will apply the rule (Float32, Float64) -> Float64.

Show me the goods

import narwhals as nw

data = {
    "a": [0.0, None, 2.0, 3.0, 4.0],
    "b": [1.0, None, None, 5.0, 3.0],
    "c": [5.0, 2.0, None, 2.0, 1.0],
}

df = nw.from_dict(
    data,
    schema={"a": nw.Float32(), "b": nw.Float32(), "c": nw.Float64()},
    backend="pyarrow",
)

>>> df.select(native_promotion=nw.coalesce("a", "b", "c")).schema
Schema([('native_promotion', Float64)])

2 - coalesce

pyarrow can handle mixing Int* and Float* in coalesce, like polars too!

Show me more, show me more ...

import polars as pl
import narwhals as nw


native = pl.DataFrame({"a": [None]}).with_columns(
    b=pl.col("a").cast(pl.Int64), c=pl.col("a").cast(pl.Float64)
)
df = nw.from_native(native)

>>> df.with_columns(d=nw.coalesce("b", "c")).schema
Schema([('a', Unknown), ('b', Int64), ('c', Float64), ('d', Float64)])

>>> nw.from_native(df.to_arrow()).with_columns(d=nw.coalesce("b", "c")).schema
Schema([('a', Unknown), ('b', Int64), ('c', Float64), ('d', Float64)])

But it would be tripped up by (#2835):

Show polars nulls

>>> df.with_columns(d=nw.coalesce("a", "b", "c")).schema
Schema([('a', Unknown), ('b', Int64), ('c', Float64), ('d', Float64)])

Show pyarrow nulls

>>> nw.from_native(df.to_arrow()).with_columns(d=nw.coalesce("a", "b", "c")).schema
ArrowNotImplementedError: Function 'coalesce' has no kernel matching input types (null, int64, double)

Even though we can make that work with a cast:

Show cast coming to the rescue

>>> nw.from_native(df.to_arrow()).with_columns(
    d=nw.coalesce(nw.col("a").cast(nw.Int64), "b", "c")
).schema
Schema([('a', Unknown), ('b', Int64), ('c', Float64), ('d', Float64)])

3 - join

However, pyarrow won't use these same rules when we use join:

Show polars join

df_pl = nw.from_native(df.to_polars())

>>> df_pl.join(df_pl, left_on="b", right_on="c").schema
Schema([('a', Float32),
        ('b', Float32),
        ('c', Float64),
        ('a_right', Float32),
        ('b_right', Float32)])

Show pyarrow join

>>> df.join(df, left_on="b", right_on="c")
ArrowInvalid: Incompatible data types for corresponding join field keys: FieldRef.Name(b) of type float and FieldRef.Name(c) of type double

Summary

Defining the rules used by #3398 does not directly fix any of these issues - nor change the behavior of those APIs.

But I think these are real issues we could solve, with this PR giving us the tools to explore that in the future.

One nice thing about introducing supertyping for concat(..., how="*_relaxed"), is that you must opt-in.
That gives us a chance to leave breadcrumbs¹ so everyone is on the same page about what we do here 😄

If I were to suggest how we'd integrate the concept into existing APIs - I think allowing opting-in to supertyping/a subset of it would be the least surprising.
Just a thought for the future 🙂

Footnotes

  1. like DuckDB does with UNION linking to their typecasting rules

@camriddell
Member

This is a big push towards "Narwhals to ensure consistent backend behavior" rather than "Narwhals to let backends do whatever they wish"

I am onboard with the reasoning for this PR. However I have 2 concerns:

  1. backwards compatibility for the cases where the rules expressed here are different from a backend's native "rule" (or lack thereof).
    • There will be users who want the backend to do whatever the backend is going to do (e.g. parity testing as one translates a codebase from backend X to Narwhals). I may not have spotted this in the code, but what escape hatches can users rely on to completely subvert the promotion logic implemented here?
  2. maintainability: how might you onboard a new contributor/maintainer to this part of the code? We already have a fairly large codebase, and this new system adds 2k lines.
    • Do we want a "how it works" section added to the docs? The current docs do a great job explaining the promotions available, but a "how it works" could greatly help future onboarding.

@dangotbanned

This comment was marked as outdated.

@dangotbanned
Member Author

dangotbanned commented Apr 9, 2026

Sorry for the delay @camriddell!

I really do appreciate the time you put into (#3396 (comment)) ❤️

I wanted to circle-up with @FBruzzesi first, so we could avoid adding too much to the thread (~~93~~ 96 messages 😳)

This is a big push towards "Narwhals to ensure consistent backend behavior" rather than "Narwhals to let backends do whatever they wish"

Consistency is definitely something I'd like to see more of¹, but understand that everyone has unique expectations on how far that should go.
I think the way we serve the most people is by offering the choice for more consistency where it is reasonable for us to do it

I am onboard with the reasoning for this PR

Thank you 😍

Backwards compatibility

there will be users who want the backend to do whatever the backend is going to do

General

I want this feature to preserve backwards compatibility in existing APIs.
@FBruzzesi suggested using stable.v*, which I think could work.

(excluding (#3398)), would you like to see something more concrete than the end of my summary here? I have a few ideas.

polars exposes some level of config for supertypes.

The main get_supertype function we have could be adapted to be more like supertype::get_supertype_with_options for flexibility.

but what escape hatches can users rely on to completely subvert the promotion logic implemented here
I may not have spotted this in the code

Yeah this isn't implemented yet (see usage in (#3398) schema.py diff), but I agree it would be useful to have.

So all together - if we want configuration - I'd be thinking of this as a jumping off point:

Show get_supertype_with_options

Patent-pending on these names

from __future__ import annotations

import enum
from typing import Literal

from narwhals.dtypes import DType

class SupertypeFlags(enum.Flag):
    DEFAULT = 0
    SKIP = enum.auto()
    SOME_RULE = enum.auto()
    ANOTHER_RULE = enum.auto()
    # <insert-more-things-here>
    # e.g. https://github.com/narwhals-dev/narwhals/pull/3430
    RELATED_RULES = SOME_RULE | ANOTHER_RULE


class SupertypeResult(enum.Enum):
    FAILED = enum.auto()
    SKIPPED = enum.auto()

    def __bool__(self) -> Literal[False]:
        return False


def _get_supertype_with_options(
    left: DType, right: DType, options: SupertypeFlags
) -> DType | None: ...


def get_supertype_with_options(
    left: DType, right: DType, options: SupertypeFlags
) -> DType | Literal[SupertypeResult.FAILED, SupertypeResult.SKIPPED]:
    if SupertypeFlags.SKIP in options:
        return SupertypeResult.SKIPPED
    if dtype := _get_supertype_with_options(left, right, options):
        return dtype
    return SupertypeResult.FAILED

If you've gotten this far (hey, thanks!), narwhals/_plan/_flags.py has examples using enum.Flag

concat(..., how="*_relaxed") (#3398)

I think this is fine in terms of backwards-compatibility.
It is a new option and for PandasLike, we don't validate the "schemas" in the non-relaxed case anyway 😂

We (prematurely, perhaps?) have documented what these rules are and where they're used:

When combining columns of different data types (e.g., in `concat(..., how="vertical_relaxed")`),

I would link to this directly in the updated docstring for (#3398)
We could call out PandasLike specifically though, since IIRC we do validate first for other backends.

Footnotes

  1. from a "compatibility layer between dataframe libraries"

@dangotbanned
Member Author

Maintainability

The current docs do a great job explaining the promotions available, but a "how it works" could greatly help future onboarding.

100% on board with the motivation!
But could I ask - are there any specific parts of the code that you found hard to follow?
Ideally if we can solve that in the code - then we only need to keep one pretty docs page in sync 😅
E.g. some things we've done for performance are not always intuitive, but we can get the best of both worlds with changes in the style of (55a7de3)

If there is an appetite for making this configurable (#3396 (comment)), then explaining the tricky bits inline would be my preference (for now).

The current impl is under the assumption that valid supertypes are fixed and leans into that pretty heavily.
Things like caching, globals and the order we check things would need to adjust to the new world order

@dangotbanned
Member Author

LOC aside

I feel the need to clear this up, but don't want this to distract from (#3396 (comment)) 🙂

@camriddell

We already have a fairly large codebase, and this new system adds 2k lines.

If we go purely by the full diff, okay yes there is a big +1700.

However, this covers most of the source LOC changes:

  • dtypes/_supertyping.py +384
  • narwhals/_dispatch.py +83
  • A few lines in dtypes/_classes.py

We can reduce the diff by splitting out dtypes.py -> dtypes/ changes if needed?
That would mainly just shrink the number of files touched though, since IIRC it was what gave us the -142.

IMO, those are pretty good figures for a feature that every backend could use in concat + others like (#3396 (comment)) 😏

Member

@camriddell camriddell left a comment


A few minor points & questions on specific code pieces. Nothing high-level.

left_fields, right_fields = left.fields, right.fields
if len(left_fields) != len(right_fields):
    return _struct_fields_union(left_fields, right_fields)
new_fields = deque["Field"]()
Member


Why use a deque here? It seems that we're only .appending to the object, so a list should be okay?

Also, is there a reason for the typing syntax on the right-side of the assignment?

    left: Collection[Field], right: Collection[Field], /
) -> Struct | None:
    """Adapted from [`union_struct_fields`](https://github.com/pola-rs/polars/blob/c2412600210a21143835c9dfcb0a9182f462b619/crates/polars-core/src/utils/supertype.rs#L559-L586)."""
    longest, shortest = (left, right) if len(left) >= len(right) else (right, left)
Member


Minor (feel free to ignore/reject), but perhaps this is a bit more intentful:

shortest, longest = sorted([left, right], key=len)

for left_f, right_f in zip(left_fields, right_fields):
    if left_f.name != right_f.name:
        return _struct_fields_union(left_fields, right_fields)
    if supertype := get_supertype(left_f.dtype(), right_f.dtype()):
Member


does this path do any less work than just always calling _struct_fields_union? It feels like the bulk of this entire function could just return _struct_fields_union

Member Author

@dangotbanned dangotbanned Apr 15, 2026


I had to stare at this for a while again to see it 😂

I'm gonna definitely add some comments, thanks @camriddell

Note

We did steal both from polars 😉

So the other path is optimized for merging both dtype and name differences.

This one bails on the first name mismatch.

does this path do any less work

If we can avoid the name stuff, simply calling get_supertype a bunch can be quite cheap:

  • lots of it is frozenset ops and dict lookups
  • complex cases are aggressively cached 😅

Oh, and the other path requires creating and incrementally building up a dict:

longest_map = {f.name: f.dtype() for f in longest}
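To illustrate why the cheap path stays cheap (a hypothetical sketch, not the actual narwhals code): the happy path is a frozenset-keyed dict lookup, and memoization means repeated nested comparisons cost one cache hit after the first call:

```python
from functools import lru_cache

# Toy rule table; frozenset keys make the lookup symmetric in (left, right).
_SIMPLE = {frozenset({"Float32", "Float64"}): "Float64"}


@lru_cache(maxsize=None)  # "aggressively cached": repeat calls become dict lookups
def supertype(left: str, right: str) -> "str | None":
    if left == right:
        return left
    if left.startswith("List[") and right.startswith("List["):
        inner = supertype(left[5:-1], right[5:-1])  # recurse into element types
        return f"List[{inner}]" if inner else None
    return _SIMPLE.get(frozenset({left, right}))
```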

@camriddell
Member

camriddell commented Apr 15, 2026

If we avoid overriding the behavior of a backend and only use this feature in the cases where the backend provides no alternative, then the current plan for limited application (e.g. concat) and the system put in-place as codified is +1 from me.

My primary concern (this extends beyond this PR, so take with a grain of salt) is that a native backend comes out with a feature that we've already bolted on top in the Narwhals API. What is the future of this feature? Do we continue to route users through our own implementation? Do we adopt the upstream feature with a version check? Will users be surprised if their results differ depending on an interaction between the version of narwhals and the version of their backend?

The above questions don't need to be answered for this PR, but just highlights where my perspective originates :)

Some follow ups from previous discussion

But could I ask - are there any specific parts of the code that you found hard to follow?

Had some time to take a closer look at the code and I think it's pretty readable (to me)! The main entrypoint is get_supertype, which dispatches to specific paths depending on whether the passed types share an ancestor. However, I have a hard time relating the code back to the rules specified in the great docs you already have. That said, I don't think this is a problem that blocks merging.

If we go purely by the full diff, okay yes there is a big +1700.

However, this covers most of the source LOC changes:

dtypes/_supertyping.py +384
narwhals/_dispatch.py +83
A few lines in dtypes/_classes.py
We can reduce the diff by #3204 (comment) changes if needed?
That would mainly just shrink the number of files touched though, since IIRC it was what gave us the -142.

IMO, those are pretty good figures for a feature that every backend could use in concat + others like (#3396 (comment))

Point well-taken. The implementation is only a few hundred lines and we do stand to get future re-use out of this.

but what escape hatches can users rely on to completely subvert the promotion logic implemented here
I may not have spotted this in the code

Yeah this isn't implemented yet (see usage in (#3398) schema.py diff) but I agree would be useful to have.
So all together - if we want configuration - I'd be thinking of this as a jumping off point:

I think hatches at the call-site would be a bit better than passing messages into/out-of get_supertype. But let's cross this bridge when we get to it in the first application of this PR.


Labels

dtypes, enhancement (New feature or request), internal


4 participants