Commit 11ff8ee

[Enhancement](test) Add tests for Python UDF (apache#60499)
Some versions of pyarrow do not support converting a Python `dict` directly to the Arrow `Map` type, so we need to convert the `dict` to a list of tuples manually before converting it to an Arrow `Map`. This is consistent with the type mapping in `convert_arrow_field_to_python`:

```python
def convert_arrow_field_to_python(field, column_metadata=None):
    if pa.types.is_map(field.type):
        # pyarrow.lib.MapScalar's as_py() returns a list of tuples, convert to dict
        list_of_tuples = field.as_py()
        return dict(list_of_tuples) if list_of_tuples is not None else None
```
1 parent 34a827d commit 11ff8ee

25 files changed

Lines changed: 7130 additions & 4 deletions

be/src/runtime/user_function_cache.h

Lines changed: 2 additions & 0 deletions
@@ -62,7 +62,9 @@ class UserFunctionCache {
     Status get_pypath(int64_t fid, const std::string& url, const std::string& checksum,
                       std::string* libpath);
 
+#ifndef BE_TEST
 private:
+#endif
     Status _load_cached_lib();
     Status _load_entry_from_lib(const std::string& dir, const std::string& file);
     Status _get_cache_entry(int64_t fid, const std::string& url, const std::string& checksum,

be/src/udf/python/python_server.py

Lines changed: 18 additions & 0 deletions
@@ -373,6 +373,24 @@ def convert_python_to_arrow_value(value, output_type=None):
         else:
             # Not a struct type, treat as regular tuple and recurse without type
             return tuple(convert_python_to_arrow_value(v, None) for v in value)
+
+    if isinstance(value, dict):
+        # For map types, convert keys and values recursively
+        if output_type and pa.types.is_map(output_type):
+            key_type = output_type.key_type
+            item_type = output_type.item_type
+            # Convert dict to list of tuples (PyArrow Map format)
+            converted_items = [
+                (convert_python_to_arrow_value(k, key_type),
+                 convert_python_to_arrow_value(v, item_type))
+                for k, v in value.items()
+            ]
+            return converted_items
+        else:
+            # No type info, just recurse without type
+            return [(convert_python_to_arrow_value(k, None),
+                     convert_python_to_arrow_value(v, None))
+                    for k, v in value.items()]
 
     if isinstance(value, pd.Series):
         return value.apply(lambda v: convert_python_to_arrow_value(v, output_type))
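
The `dict` branch in the hunk above can be illustrated without pyarrow. The following is a minimal sketch, not the code from this commit: `to_map_entries` and `from_map_entries` are hypothetical names, and the sketch ignores Arrow type information entirely (it has no `output_type` dispatch) and handles only `dict`, `list`, and `tuple` containers.

```python
from typing import Any


def to_map_entries(value: Any) -> Any:
    """Recursively rewrite dicts as lists of (key, value) tuples.

    Hypothetical, simplified stand-in for the dict branch in the diff:
    no Arrow types are consulted, so every nested dict is converted.
    """
    if isinstance(value, dict):
        return [(to_map_entries(k), to_map_entries(v)) for k, v in value.items()]
    if isinstance(value, list):
        return [to_map_entries(v) for v in value]
    if isinstance(value, tuple):
        return tuple(to_map_entries(v) for v in value)
    return value


def from_map_entries(entries: Any) -> Any:
    """Read-side step from the commit message: list of tuples back to dict.

    Only the top level is converted, mirroring dict(list_of_tuples).
    """
    return dict(entries) if entries is not None else None


nested = {"a": {"x": 1}, "b": {}}
entries = to_map_entries(nested)
print(entries)  # [('a', [('x', 1)]), ('b', [])]
```

Because `dict` preserves insertion order, the resulting list keeps the entries in the order in which the keys were inserted. An empty `dict` becomes an empty list, and a `None` map stays `None` on the read side.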
