Revival of GH-7371: Added vector union support for Python #8680

Open
jtdavis777 wants to merge 6 commits into google:master from jtdavis777:feature/python_vector_union

Conversation

@jtdavis777
Collaborator

@jtdavis777 jtdavis777 commented Aug 21, 2025

I wanted to revisit GH-7371 (fixes #4530) to add Advanced Unions support to Python.

All credit for the base work goes to @surculus12 (I cherry-picked their commits over to preserve authorship). I'm only coming in to bring it up to date with the current repo and hopefully get it across the finish line.

I will address the remaining comments from the original PR where they still apply.

@github-actions github-actions bot added the c++, codegen, and python labels Aug 21, 2025
@jtdavis777 jtdavis777 changed the title from "Revival of !7371: Added vector union support for Python" to "Revival of GH-7371: Added vector union support for Python" Aug 21, 2025
@jtdavis777
Collaborator Author

jtdavis777 commented Aug 25, 2025

One thing I either don't understand or want to fix is how to get a member of a vector of unions back to its first-class type. Right now the accessor returns a Table, with no obvious way to get back to the union member's type. This may just be my inexperience with the generated Python code. Looking at the Python examples in the documentation, though, this appears to be normal.
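For reference, the pattern shown in the Python documentation for a union accessor is to re-initialize the concrete class on the same buffer and position as the generic table. Below is a minimal sketch of that idea. `Table`, `Attacker`, and `as_concrete` are stand-ins written for this example (they are not the flatbuffers runtime or generated code), so that it runs on its own:

```python
from dataclasses import dataclass


@dataclass
class Table:
    """Stand-in for the generic table a union accessor returns."""
    Bytes: bytes
    Pos: int


class Attacker:
    """Stand-in for a generated table class (hypothetical)."""

    def __init__(self):
        self._tab = None

    def Init(self, buf, pos):
        # Generated classes bind themselves to a buffer and an offset.
        self._tab = Table(buf, pos)

    def SwordAttackDamage(self):
        # Toy accessor: read one byte at the table position.
        return self._tab.Bytes[self._tab.Pos]


def as_concrete(cls, table):
    """Re-wrap a generic table as the given concrete type."""
    obj = cls()
    obj.Init(table.Bytes, table.Pos)
    return obj


generic = Table(Bytes=bytes([0, 0, 5]), Pos=2)
attacker = as_concrete(Attacker, generic)
print(attacker.SwordAttackDamage())
```

With real generated code, the class to pass would be picked from the matching entry of the type vector.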

@jtdavis777 jtdavis777 force-pushed the feature/python_vector_union branch from 9506d5c to 4168b59 Compare August 29, 2025 12:49
@jtdavis777 jtdavis777 force-pushed the feature/python_vector_union branch from 4168b59 to c5751ec Compare November 5, 2025 14:12
@jtdavis777
Collaborator Author

@fliiiix would you be up for helping me polish this PR? I just rebased the original commits onto a more recent master, but enough has changed that I would appreciate a second set of eyes to point out what still needs work.

@hdiode-florian

What's missing before this can go into master? I'm very interested in this feature.

@JamesBasham

I am excited to see this PR! We have been using a fork with @surculus12's old diff for a couple years now. It is kind of wild that vector of unions was not supported before. Looking forward to seeing this get merged!!


@eric-quera eric-quera left a comment

Agree. Would be great to get this merged in.

auto vectortype = field.value.type.VectorType();
if (vectortype.base_type == BASE_TYPE_STRUCT) {
GenUnPackForStructVector(struct_def, field, &code);
} else {

I'd expect GenUnPack to also check for BASE_TYPE_UNION here, a la

if base type is struct: GenUnPackForStructVector
elif base type is union: GenUnPackForUnionVector
else: GenUnPackForScalarVector

The Java and C# generators do this, at least. Why not the Python one?

case BASE_TYPE_ARRAY:
case BASE_TYPE_VECTOR: {
auto vectortype = field.value.type.VectorType();
if (vectortype.base_type == BASE_TYPE_STRUCT) {

Similarly here, the other language generators check for BASE_TYPE_UNION in the vector's element type, so something like

if base type is struct: GenPackForStructVectorField
elif base type is union: GenPackForUnionVectorField
else: GenPackForScalarVectorField

field_method + "())";
}

void GenUnPackForStructVector(const StructDef& struct_def,

Heavily borrowing from the other two GenUnPacks,

  void GenUnPackForUnionVector(const StructDef& struct_def,
                               const FieldDef& field,
                               std::string* code_ptr) const {
    auto& code = *code_ptr;
    const auto field_field = namer_.Field(field);
    const auto field_method = namer_.Method(field);
    const auto struct_var = namer_.Variable(struct_def);
    const EnumDef& enum_def = *field.value.type.VectorType().enum_def;
    
    auto union_type = namer_.Type(enum_def);
    if (parser_.opts.include_dependence_headers) {
      union_type = namer_.NamespacedType(enum_def) + "." + union_type;
    }

    code += GenIndents(2) + "if not " + struct_var + "." + field_method +
            "IsNone():";
    code += GenIndents(3) + "self." + field_field + " = []";
    code += GenIndents(3) + "for i in range(" + struct_var + "." +
            field_method + "Length()):";
    
    // Call the union creator for each element, using the type vector.
    // Expected output pattern (I think):
    //   Character.CharacterCreator(movie.CharactersType(i), movie.Characters(i))
    code += GenIndents(4) + "self." + field_field + ".append(" +
            union_type + "Creator(" +
            struct_var + "." + field_method + "Type(i), " +
            struct_var + "." + field_method + "(i)))";
  }
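For illustration, the Python that this generator would emit for a `characters` union vector might look roughly like the sketch below. Only the loop inside `_UnPack` mirrors the strings built above. `FakeMovie` and the body of `CharacterCreator` are assumptions added so the sketch runs without the flatbuffers runtime:

```python
class FakeMovie:
    """Stand-in for a generated reader with a vector of unions."""

    def __init__(self, types, values):
        self._types, self._values = types, values

    def CharactersIsNone(self):
        return self._values is None

    def CharactersLength(self):
        return len(self._values)

    def CharactersType(self, i):
        return self._types[i]

    def Characters(self, i):
        return self._values[i]


def CharacterCreator(union_type, table):
    # Hypothetical creator: a real one would build the matching *T object.
    if union_type == 0:  # NONE
        return None
    return (union_type, table)


class MovieT:
    def __init__(self):
        self.characters = None

    def _UnPack(self, movie):
        # Shape of the code emitted by GenUnPackForUnionVector.
        if not movie.CharactersIsNone():
            self.characters = []
            for i in range(movie.CharactersLength()):
                self.characters.append(
                    CharacterCreator(movie.CharactersType(i), movie.Characters(i)))


movie_t = MovieT()
movie_t._UnPack(FakeMovie([1, 0, 2], ["mulan", None, "belle"]))
print(movie_t.characters)
```

The type vector and the value vector are walked in lockstep, which is why the creator needs both `CharactersType(i)` and `Characters(i)`.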

code += ")\n";
}

void GenPackForStructVectorField(const StructDef& struct_def,

Heavily borrowing from StructVector and Union GenPackFor*Field methods,

  void GenPackForUnionVectorField(const StructDef& struct_def,
                                  const FieldDef& field,
                                  std::string* code_prefix_ptr,
                                  std::string* code_ptr) const {
    auto& code_prefix = *code_prefix_ptr;
    auto& code = *code_ptr;
    const auto field_field = namer_.Field(field);
    const auto struct_type = namer_.Type(struct_def);
    const auto field_method = namer_.Method(field);

    // Creates the field - union vectors work like non-fixed struct vectors maybe?
    code_prefix += GenIndents(2) + "if self." + field_field + " is not None:";
    code_prefix += GenIndents(3) + field_field + "list = []";
    code_prefix += GenIndents(3) + "for i in range(len(self." + field_field + ")):";
    
    // Note: I have put no thought into how this should work for strings in unions
    code_prefix += GenIndents(4) + "if self." + field_field + "[i] is not None:";
    code_prefix += GenIndents(5) + field_field + "list.append(self." + 
                   field_field + "[i].Pack(builder))";
    code_prefix += GenIndents(4) + "else:";
    code_prefix += GenIndents(5) + field_field + "list.append(0)";

    code_prefix += GenIndents(3) + struct_type + "Start" + field_method +
                   "Vector(builder, len(self." + field_field + "))";
    code_prefix += GenIndents(3) + "for i in reversed(range(len(self." +
                   field_field + "))):";
    code_prefix += GenIndents(4) + "builder.PrependUOffsetTRelative(" +
                   field_field + "list[i])";
    code_prefix += GenIndents(3) + field_field + " = builder.EndVector()";

    code += GenIndents(2) + "if self." + field_field + " is not None:";
    code += GenIndents(3) + struct_type + "Add" + field_method + "(builder, " +
            field_field + ")";
  }
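The `reversed(range(...))` loop in the prefix above follows from FlatBuffers builders writing data back to front: each prepended element lands before the ones already written, so prepending in reverse yields the original order. A toy model of that behaviour, where `FakeBuilder` is a stand-in written for this example (the real `EndVector` returns an offset, not a list):

```python
class FakeBuilder:
    """Stand-in that mimics back-to-front vector construction."""

    def __init__(self):
        self._items = []

    def StartVector(self, elem_size, num_elems, alignment):
        self._items = []

    def PrependUOffsetTRelative(self, off):
        # Each new element lands in front of the ones already written.
        self._items.insert(0, off)

    def EndVector(self):
        return list(self._items)


def pack_offsets(builder, offsets):
    # Same loop shape as the generated prefix code.
    builder.StartVector(4, len(offsets), 4)
    for i in reversed(range(len(offsets))):
        builder.PrependUOffsetTRelative(offsets[i])
    return builder.EndVector()


print(pack_offsets(FakeBuilder(), [10, 20, 30]))
```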

I have now put thought into how this should work for strings in unions
s/

code_prefix += GenIndents(5) + field_field + "list.append(self." + 
                   field_field + "[i].Pack(builder))";

/

const EnumDef& enum_def = *field.value.type.VectorType().enum_def;
auto union_type = namer_.Type(enum_def);
if (parser_.opts.include_dependence_headers) {
  union_type = namer_.NamespacedType(enum_def) + "." + union_type;
}

code_prefix += GenIndents(5) + field_field + "list.append(" +
               union_type + "Pack(builder, self." + field_field + "[i], " +
               "self." + field_field + "Type[i]))";

/g

to get something like the following: characterslist.append(Character.CharacterPack(builder, self.characters[i], self.charactersType[i]))
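The point of routing through a union-level pack helper is that a string member has no `Pack` method, so calling `.Pack(builder)` on it directly would fail. A rough sketch of what such a helper could do follows. `FakeBuilder`, `RapunzelT`, and the body of `CharacterPack` are all assumptions for illustration, not generated output:

```python
class FakeBuilder:
    """Stand-in builder that hands out increasing fake offsets."""

    def __init__(self):
        self.written = []

    def CreateString(self, s):
        self.written.append(("string", s))
        return len(self.written)


class RapunzelT:
    """Stand-in for a generated object-API class with Pack()."""

    def __init__(self, hair_length):
        self.hair_length = hair_length

    def Pack(self, builder):
        builder.written.append(("table", self.hair_length))
        return len(builder.written)


def CharacterPack(builder, value, union_type):
    # Hypothetical helper: returns the offset of the packed member.
    if union_type == 0 or value is None:  # NONE
        return 0
    if isinstance(value, str):
        return builder.CreateString(value)
    return value.Pack(builder)


builder = FakeBuilder()
characters = [RapunzelT(6), "Other", None]
characters_type = [1, 2, 0]
characterslist = [
    CharacterPack(builder, characters[i], characters_type[i])
    for i in range(len(characters))
]
print(characterslist)
```

A real helper would more likely dispatch on the type tag than on `isinstance`, since several union members can share the same Python type.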

}
code += GenIndents(1) + "return None";
code += "\n";
}

See idl_gen_csharp.cpp:L1788-L1810 - probably want a GenUnionCreator

std::string enumcode;
GenEnum(enum_def, &enumcode);
if (parser_.opts.generate_object_based_api & enum_def.is_union) {
GenUnionCreator(enum_def, &enumcode);

GenUnionCreator(...)
GenUnionPacker(...)


// TODO(luwa): TypeT should be moved under the None check as well.
code_prefix += GenIndents(2) + "if self." + field_field + " is not None:";
code_prefix += GenIndents(3) + field_field + " = self." + field_field +

Replace with
code_prefix += GenIndents(3) + field_field + " = " + union_type + "Pack(builder, self." + field_field + ", self." + field_field + "Type)";

Not that important, but I think that's safer and more defensive for single unions with string members.

@jtdavis777
Collaborator Author

@ZachMarcus thank you for the thorough review! I can probably work on incorporating these changes over the next couple of days as I have time, or feel free to PR them into my fork if you feel so inspired.

@pfd-quera

It would be great to get this merged! Any update?

@knmueller

I am also looking for these changes. Hope to get this merged soon!

@jtdavis777
Collaborator Author

Hey all, I see your comments! Give me a week or two and I'll do my best to polish this up and see about getting it merged.

@jtdavis777 jtdavis777 requested a review from dbaileychess as a code owner March 7, 2026 03:06
@jtdavis777
Collaborator Author

I promise I still hope to get back to this PR and get it merged. Thank you to everyone for your patience :)


Labels

c++, codegen (Involving generating code from schema), python

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Vector union not supported Python [Python, OS X, master]

8 participants