Apache Pig and Piggybank: Avro union types
I have a record with a field of union type:

    union {TypeA, TypeB, TypeC, TypeD, TypeE} mydata;

I have serialized the data in Avro format. When I try to load it with the AvroStorage function from piggybank.jar, I get the following error:
    Caused by: java.io.IOException: We don't accept schema containing generic unions.
        at org.apache.pig.piggybank.storage.avro.AvroSchema2Pig.convert(AvroSchema2Pig.java:54)
        at org.apache.pig.piggybank.storage.avro.AvroStorage.getSchema(AvroStorage.java:384)
        at org.apache.pig.newplan.logical.relational.LOLoad.getSchemaFromMetaData(LOLoad.java:174)
        ... 23 more

So, after reading the Piggybank source code here: https://github.com/triplel/pig/blob/branch-0.12/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/avrostorageutils.java
    /** determine whether a union is a nullable union;
     * note that this function doesn't check containing
     * types of the input union recursively. */
    public static boolean isAcceptableUnion(Schema in) {
        if (! in.getType().equals(Schema.Type.UNION))
            return false;

        List<Schema> types = in.getTypes();
        if (types.size() <= 1) {
            return true;
        } else if (types.size() > 2) {
            return false; /* contains more than 2 types */
        } else {
            /* one of the two types is NULL */
            return types.get(0).getType().equals(Schema.Type.NULL)
                || types.get(1).getType().equals(Schema.Type.NULL);
        }
    }

Basically, Piggybank's AvroStorage does not support unions of more than two types, and a two-type union is accepted only if one of the types is null. I am wondering what the idea behind this decision is. Why not make it compatible with Avro?
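To make the effect of that check concrete, here is a minimal sketch that applies the same branch-count rule to plain lists instead of real Avro schemas. The class `UnionCheckSketch` and the enum `Kind` are hypothetical stand-ins for Avro's `Schema` and `Schema.Type`, so the example runs without the Avro or Piggybank jars. It also assumes the input is already known to be a union, so the "not a union" branch of the original is omitted.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the Piggybank check; not part of Pig or Avro.
class UnionCheckSketch {
    // Simplified replacement for Avro's Schema.Type (assumption for this sketch).
    enum Kind { NULL, STRING, INT, RECORD }

    // Same rule as the quoted isAcceptableUnion, applied to a union's branch kinds:
    // zero or one branch is accepted, three or more are rejected,
    // and exactly two are accepted only if one of them is NULL.
    static boolean isAcceptableUnion(List<Kind> branches) {
        if (branches.size() <= 1) {
            return true;
        } else if (branches.size() > 2) {
            return false;
        } else {
            return branches.get(0) == Kind.NULL || branches.get(1) == Kind.NULL;
        }
    }

    public static void main(String[] args) {
        // A nullable union such as ["null", "string"] passes the check.
        List<Kind> nullable = Arrays.asList(Kind.NULL, Kind.STRING);
        // A five-branch union of records, like the one in the question, does not.
        List<Kind> fiveRecords = Arrays.asList(
            Kind.RECORD, Kind.RECORD, Kind.RECORD, Kind.RECORD, Kind.RECORD);
        // A two-branch union without null is also rejected.
        List<Kind> twoNonNull = Arrays.asList(Kind.STRING, Kind.INT);

        System.out.println("nullable union accepted: " + isAcceptableUnion(nullable));
        System.out.println("five-record union accepted: " + isAcceptableUnion(fiveRecords));
        System.out.println("string/int union accepted: " + isAcceptableUnion(twoNonNull));
    }
}
```

Under this rule the five-type union from the question is rejected before any data is read, which matches the exception being raised from the schema conversion step in the stack trace.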