xml-avro-converter

xml-avro-converter provides a framework for translating XML schemas and data into an equivalent, yet more efficient, Avro format. This enables transmission and storage of the same data while using less bandwidth and disk space. The Avro-formatted data can also be translated back into the equivalent XML data if desired. Conversion in either direction is mediated by Java objects created from a single set of Java classes generated from the XML schema, so no additional translation software or XSLT stylesheets are needed to map data values between the two formats.

xml-avro-converter uses Avro's ReflectData class to generate a schema from a class on the classpath. ReflectData does not natively support adding inherited types to an Avro schema. xml-avro-converter resolves this by providing an interface to automatically modify the schema to accommodate inherited types. All that is required is a one-line declaration for each inherited type, and xml-avro-converter will replace all instances of the base type in the schema with a union for that type and all the subtypes which have been declared. This enables developers to quickly create an Avro schema from an existing Java class hierarchy, even when the Java class hierarchy uses polymorphic types.
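The substitution described above can be illustrated with a small, library-free sketch. It does not use xml-avro-converter's or Avro's actual API: schemas are modeled as plain maps and lists, and the class name UnionSubstitutionSketch, the methods declareSubtype and rewrite, and the Shape/Circle/Square types are all hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Conceptual sketch only: schemas are modeled as maps and lists rather than
// Avro Schema objects, and none of these names come from xml-avro-converter.
class UnionSubstitutionSketch {

    // Hypothetical registry: base type name -> declared subtype names.
    private final Map<String, List<String>> subtypes = new LinkedHashMap<>();

    // Stands in for the one-line declaration made for each inherited type.
    void declareSubtype(String baseType, String subtype) {
        subtypes.computeIfAbsent(baseType, k -> new ArrayList<>()).add(subtype);
    }

    // Walks a schema node and replaces each reference to a declared base type
    // with a union (modeled as a List) of the base type and its subtypes.
    Object rewrite(Object node) {
        if (node instanceof String) {
            List<String> declared = subtypes.get(node);
            if (declared == null) {
                return node; // not a declared base type, leave unchanged
            }
            List<Object> union = new ArrayList<>();
            union.add(node);
            union.addAll(declared);
            return union;
        }
        if (node instanceof Map) {
            Map<String, Object> out = new LinkedHashMap<>();
            for (Map.Entry<?, ?> e : ((Map<?, ?>) node).entrySet()) {
                String key = (String) e.getKey();
                // "name" holds an identifier, not a type reference, so skip it.
                out.put(key, key.equals("name") ? e.getValue() : rewrite(e.getValue()));
            }
            return out;
        }
        if (node instanceof List) {
            List<Object> out = new ArrayList<>();
            for (Object child : (List<?>) node) {
                Object rewritten = rewrite(child);
                if (child instanceof String && rewritten instanceof List) {
                    // Flatten into the existing union: Avro unions may not nest.
                    out.addAll((List<?>) rewritten);
                } else {
                    out.add(rewritten);
                }
            }
            return out;
        }
        return node;
    }

    public static void main(String[] args) {
        UnionSubstitutionSketch sketch = new UnionSubstitutionSketch();
        sketch.declareSubtype("Shape", "Circle");
        sketch.declareSubtype("Shape", "Square");

        Map<String, Object> field = new LinkedHashMap<>();
        field.put("name", "outline");
        field.put("type", "Shape");

        Map<String, Object> record = new LinkedHashMap<>();
        record.put("type", "record");
        record.put("name", "Drawing");
        record.put("fields", Arrays.asList(field));

        // The "outline" field's type becomes a union of Shape, Circle and Square.
        System.out.println(sketch.rewrite(record));
    }
}
```

In this model, a field typed as the base class ends up accepting any of the declared subtypes, which is the effect the real schema modification is described as achieving.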

This enables xml-avro-converter to generate a full Avro schema from a Java class hierarchy created using JAXB. JAXB can generate a Java class hierarchy from a set of XML schema definitions, and that class hierarchy can in turn be used to create an Avro schema, which makes it possible to derive an Avro schema from an XML schema. To convert data, JAXB deserializes XML documents into Java objects, and these Java objects are then serialized using the Avro library and the generated Avro schema. The reverse process works as well: Avro data is converted into Java objects and then into XML.
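The shape of this round trip can be sketched with the Java standard library alone. This is a stand-in, not the actual pipeline: DOM parsing plays the role of JAXB, DataOutputStream plays the role of Avro's binary encoding, and the Reading class, its fields, and all method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;

// Stand-in sketch: DOM parsing substitutes for JAXB and DataOutputStream for
// Avro's binary encoding. The Reading class and its fields are hypothetical.
class RoundTripSketch {

    // The single Java class that both formats are mapped through.
    static class Reading {
        String sensor;
        int value;
    }

    // XML -> Java object (the step JAXB unmarshalling would perform).
    static Reading fromXml(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
            Reading r = new Reading();
            r.sensor = root.getElementsByTagName("sensor").item(0).getTextContent();
            r.value = Integer.parseInt(
                    root.getElementsByTagName("value").item(0).getTextContent());
            return r;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    // Java object -> compact binary (the step Avro serialization would perform).
    static byte[] toBinary(Reading r) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(r.sensor);
            out.writeInt(r.value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Compact binary -> Java object (the reverse direction).
    static Reading fromBinary(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            Reading r = new Reading();
            r.sensor = in.readUTF();
            r.value = in.readInt();
            return r;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Java object -> XML (the step JAXB marshalling would perform).
    // No escaping is done here; real marshalling would handle special characters.
    static String toXml(Reading r) {
        return "<reading><sensor>" + r.sensor + "</sensor><value>" + r.value
                + "</value></reading>";
    }

    public static void main(String[] args) {
        String xml = "<reading><sensor>t1</sensor><value>42</value></reading>";
        byte[] binary = toBinary(fromXml(xml));
        System.out.println("XML bytes:    " + xml.getBytes(StandardCharsets.UTF_8).length);
        System.out.println("binary bytes: " + binary.length);
        System.out.println("round trip:   " + toXml(fromBinary(binary)));
    }
}
```

Because both directions pass through the same Reading class, no format-to-format mapping code is needed; each format only needs to know how to read and write the Java object.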

Since this process uses the same Java class hierarchy, schema and data conversion can often be achieved without writing any translation logic at all. However, there are certain situations where users must guide the conversion process. Where JAXB takes advantage of Java's inheritance model, users must identify and declare the inherited types so that the schema can accommodate them. In some cases, JAXB will also generate generic JAXBElement<> types to wrap certain member variables when Java's object model cannot capture the full expressiveness of XML's element model. In these cases, users must manually define the portions of the schema and the translation logic for those specific elements. Several examples follow which demonstrate this functionality.

For more information on how to use this package, consult the documentation.