Golang custom encoder/decoder for XML to Golang struct
I’m consuming a REST service that returns various lists in both JSON and XML, similar to these:
    [
      {
        "id": 803267,
        "name": "Paris, Ile-de-France, France",
        "region": "Ile-de-France",
        "country": "France",
        "is_day": 1,
        "localtime": "2018-05-12 12:53"
      },
      {
        "id": 760995,
        "name": "Batignolles, Ile-de-France, France",
        "region": "Ile-de-France",
        "country": "France",
        "is_day": 0,
        "localtime": "2018-05-12"
      }
    ]
    <?xml version="1.0" encoding="UTF-8"?>
    <root>
      <geo>
        <id>803267</id>
        <name>Paris, Ile-de-France, France</name>
        <region>Ile-de-France</region>
        <country>France</country>
        <is_day>1</is_day>
        <localtime>2018-05-12 12:53</localtime>
      </geo>
      <geo>
        <id>760995</id>
        <name>Batignolles, Ile-de-France, France</name>
        <region>Ile-de-France</region>
        <country>France</country>
        <is_day>0</is_day>
        <localtime>2018-05-12</localtime>
      </geo>
    </root>
I wanted to decode them into a slice of structs like this:
    type Locations []Location

    type Location struct {
        ID        int    `json:"id" xml:"id"`
        Name      string `json:"name" xml:"name"`
        Region    string `json:"region" xml:"region"`
        Country   string `json:"country" xml:"country"`
        IsDay     int    `json:"is_day" xml:"is_day"`
        LocalTime string `json:"localtime" xml:"localtime"`
    }
It’s straightforward for JSON:
    jsonRes := &Locations{}
    err := json.Unmarshal([]byte(jsonInput), jsonRes)
    if err != nil {
        log.Println(err)
    }
    // %#v prints the Go-syntax representation, including type names.
    fmt.Printf("%#v\n", jsonRes)

    &main.Locations{
        main.Location{
            ID:803267,
            Name:"Paris, Ile-de-France, France",
            Region:"Ile-de-France",
            Country:"France",
            IsDay:1,
            LocalTime:"2018-05-12 12:53"
        },
        main.Location{
            ID:760995,
            Name:"Batignolles, Ile-de-France, France",
            Region:"Ile-de-France",
            Country:"France",
            IsDay:0,
            LocalTime:"2018-05-12"
        }
    }
For XML, things did not work as I expected:
    xmlRes := &Locations{}
    err := xml.Unmarshal([]byte(xmlInput), xmlRes)
    if err != nil {
        log.Println(err)
    }
    fmt.Printf("%#v\n", xmlRes)

    &main.Locations{
        main.Location{
            ID:0,
            Name:"",
            Region:"",
            Country:"",
            IsDay:0,
            LocalTime:""
        }
    }
The reason is that xml.Unmarshal maps an element onto a slice by appending one new item and decoding the element into it. Here the element is <root>, which has no id, name, or other Location fields as direct children, so I end up with a single empty Location. I thought about replacing the slice with nested structures and tagging the inner structure the same way I tagged the fields, but I really wanted a slice, because the data is a simple list of items. After a few hours of struggling, the idea came to light: I can do anything I want if I implement the xml.Unmarshaler interface and write the data into the slice myself.
    func (s *Locations) UnmarshalXML(d *xml.Decoder, start xml.StartElement) error {
        // Collect the items into a fresh slice.
        locations := make(Locations, 0)

        // Decode each child element until the decoder reports the end of the
        // enclosing element (io.EOF) or an error.
        for {
            // Use a new element per iteration so that fields missing from one
            // item do not inherit values from the previous one.
            var el Location
            err := d.Decode(&el)
            if err == io.EOF {
                break
            }
            if err != nil {
                return err
            }
            // Append each decoded item to the slice.
            locations = append(locations, el)
        }

        // Finally, overwrite the value behind the pointer receiver (the
        // Locations slice) with the collected items.
        *s = locations
        return nil
    }
And I got the same result as for JSON.
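For comparison, here is a minimal sketch of the nested-structure alternative mentioned earlier. The wrapper type `geoList`, the helper `decodeWrapped`, and the shortened `sample` document are hypothetical names and data introduced only for this illustration; the sketch assumes the list always arrives as `<geo>` children of a single root element.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// Location mirrors the struct from the article (XML tags only, for brevity).
type Location struct {
	ID        int    `xml:"id"`
	Name      string `xml:"name"`
	Region    string `xml:"region"`
	Country   string `xml:"country"`
	IsDay     int    `xml:"is_day"`
	LocalTime string `xml:"localtime"`
}

// geoList is a hypothetical wrapper that stands for the <root> element;
// the xml:"geo" tag collects every <geo> child into the slice.
type geoList struct {
	Items []Location `xml:"geo"`
}

// decodeWrapped unmarshals the document through the wrapper and
// returns only the inner slice.
func decodeWrapped(input string) ([]Location, error) {
	var list geoList
	if err := xml.Unmarshal([]byte(input), &list); err != nil {
		return nil, err
	}
	return list.Items, nil
}

// sample is a shortened version of the XML shown at the top of the article.
const sample = `<?xml version="1.0" encoding="UTF-8"?>
<root>
  <geo><id>803267</id><name>Paris, Ile-de-France, France</name><is_day>1</is_day><localtime>2018-05-12 12:53</localtime></geo>
  <geo><id>760995</id><name>Batignolles, Ile-de-France, France</name><is_day>0</is_day><localtime>2018-05-12</localtime></geo>
</root>`

func main() {
	locs, err := decodeWrapped(sample)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	for _, l := range locs {
		fmt.Printf("%d %s\n", l.ID, l.Name)
	}
}
```

The wrapper needs no custom code, but every caller has to unwrap `Items`, whereas the custom UnmarshalXML keeps the same Locations type for both JSON and XML.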
Another thing that I wanted was to have proper data types for all fields:
- IsDay is 0 or 1, which should be a boolean
- LocalTime is a time format string, so time.Time would suit
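You can’t simply declare the fields with those types, because the data doesn’t decode directly into them: encoding/json refuses to store the number 1 in a bool, and time.Time only accepts RFC 3339 timestamps, which neither of the localtime formats is: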
    IsDay     bool      `json:"is_day" xml:"is_day"`
    LocalTime time.Time `json:"localtime" xml:"localtime"`
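To see this for yourself, the following sketch decodes one item into a struct that uses the built-in types and prints whatever errors come back. The `naive` struct and the two helper functions are hypothetical names used only here. As far as I can tell, encoding/xml does accept 1 and 0 for a bool because it parses the text with strconv.ParseBool, so on the XML side it is the time format that fails.

```go
package main

import (
	"encoding/json"
	"encoding/xml"
	"fmt"
	"time"
)

// naive is a hypothetical struct that uses the built-in types directly.
type naive struct {
	IsDay     bool      `json:"is_day" xml:"is_day"`
	LocalTime time.Time `json:"localtime" xml:"localtime"`
}

// decodeNaiveJSON returns the error (if any) from decoding JSON into naive.
func decodeNaiveJSON(input string) error {
	var n naive
	return json.Unmarshal([]byte(input), &n)
}

// decodeNaiveXML returns the error (if any) from decoding XML into naive.
func decodeNaiveXML(input string) error {
	var n naive
	return xml.Unmarshal([]byte(input), &n)
}

func main() {
	// Both calls are expected to report an error for this input.
	fmt.Println("json:", decodeNaiveJSON(`{"is_day":1,"localtime":"2018-05-12 12:53"}`))
	fmt.Println("xml: ", decodeNaiveXML(`<geo><is_day>1</is_day><localtime>2018-05-12 12:53</localtime></geo>`))
}
```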
I could have taken the same approach as above and implemented the Unmarshaler interface for both XML and JSON (encoding/json has one, too) on the slice itself, rewriting all the decoding by hand, but that would be too much work. Instead, I defined a new type for each field whose type I wanted to change, and implemented the Unmarshaler interfaces on each of those new types.
    IsDay     Bool     `json:"is_day" xml:"is_day"`
    LocalTime DateTime `json:"localtime" xml:"localtime"`

    // Bool is used to convert int 1/0 to bool true/false
    type Bool bool

    // DateTime is used to convert string represented time to time.Time format
    type DateTime time.Time
Bool arrives as 0 or 1 and decodes to false or true. When I marshal it back, I’m fine with emitting false/true, so it needs no custom marshaler.
    // UnmarshalJSON converts int to bool from JSON
    func (b *Bool) UnmarshalJSON(data []byte) error {
        // By convention, a JSON null leaves the value untouched.
        if string(data) == "null" {
            return nil
        }
        num, err := strconv.Atoi(string(data))
        if err != nil {
            return err
        }
        value, err := parseBool(num)
        if err != nil {
            return err
        }
        *b = Bool(value)
        return nil
    }

    // UnmarshalXML converts int to bool from XML
    func (b *Bool) UnmarshalXML(d *xml.Decoder, start xml.StartElement) error {
        // Let the decoder read the element content as a plain int first.
        var el int
        if err := d.DecodeElement(&el, &start); err != nil {
            return err
        }
        value, err := parseBool(el)
        if err != nil {
            return err
        }
        *b = Bool(value)
        return nil
    }

    // parseBool maps 1 to true and 0 to false; anything else is an error.
    func parseBool(value int) (b bool, err error) {
        switch value {
        case 1:
            b = true
        case 0:
            b = false
        default:
            err = fmt.Errorf("invalid value for bool: %d", value)
        }
        return
    }
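As a quick check of the round trip described above, this self-contained sketch decodes 0/1 from both formats and marshals the value back. It is a compact variant rather than the exact code from the article: the `fromInt` helper, the `flags` struct, and the two `roundTrip` functions are hypothetical names introduced for the example.

```go
package main

import (
	"encoding/json"
	"encoding/xml"
	"fmt"
)

// Bool decodes the integers 0 and 1 as false and true.
type Bool bool

// fromInt is a hypothetical helper shared by both decoders.
func (b *Bool) fromInt(n int) error {
	switch n {
	case 0:
		*b = false
	case 1:
		*b = true
	default:
		return fmt.Errorf("invalid value for bool: %d", n)
	}
	return nil
}

// UnmarshalJSON reads a JSON number and converts it to a Bool.
func (b *Bool) UnmarshalJSON(data []byte) error {
	var n int
	if err := json.Unmarshal(data, &n); err != nil {
		return err
	}
	return b.fromInt(n)
}

// UnmarshalXML reads the element content as an int and converts it.
func (b *Bool) UnmarshalXML(d *xml.Decoder, start xml.StartElement) error {
	var n int
	if err := d.DecodeElement(&n, &start); err != nil {
		return err
	}
	return b.fromInt(n)
}

// flags is a hypothetical one-field struct used for the demonstration.
type flags struct {
	IsDay Bool `json:"is_day" xml:"is_day"`
}

// roundTripJSON decodes the input and marshals the result back to JSON.
func roundTripJSON(input string) (string, error) {
	var f flags
	if err := json.Unmarshal([]byte(input), &f); err != nil {
		return "", err
	}
	out, err := json.Marshal(f)
	return string(out), err
}

// roundTripXML decodes the input and marshals the result back to XML.
func roundTripXML(input string) (string, error) {
	var f flags
	if err := xml.Unmarshal([]byte(input), &f); err != nil {
		return "", err
	}
	out, err := xml.Marshal(f)
	return string(out), err
}

func main() {
	fmt.Println(roundTripJSON(`{"is_day":1}`))
	fmt.Println(roundTripXML(`<flags><is_day>0</is_day></flags>`))
}
```

Because Bool has no marshal methods, the standard encoders fall back to the underlying bool, which is why the output carries true/false instead of 1/0.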
For DateTime, the input string can come in different formats. It’s up to you whether an unknown format returns an error or just yields a zero time. And when you marshal the value back, you must convert it to a string again.
    const dateMarshalFormat = "2006-01-02 15:04"

    // dateLayouts of supported time formats
    var dateLayouts = []string{
        "2006-01-02 15:04",
        "2006-01-02",
    }

    // MarshalJSON converts time to string representation.
    // It uses a value receiver so it is also called for non-addressable values.
    func (t DateTime) MarshalJSON() ([]byte, error) {
        dt := formatDate(t)
        res := "null"
        if dt != nil {
            res = fmt.Sprintf(`"%s"`, *dt)
        }
        return []byte(res), nil
    }

    // UnmarshalJSON converts string represented time to time.Time from JSON
    func (t *DateTime) UnmarshalJSON(b []byte) error {
        str := string(b)
        if str == "null" {
            return nil
        }
        dt, err := parseTime(str)
        if err != nil {
            return err
        }
        *t = DateTime(dt)
        return nil
    }

    // MarshalXML converts time to string representation.
    // A zero time writes nothing, so the element is omitted.
    func (t DateTime) MarshalXML(e *xml.Encoder, start xml.StartElement) error {
        dt := formatDate(t)
        if dt == nil {
            return nil
        }
        return e.EncodeElement(*dt, start)
    }

    // UnmarshalXML converts string represented time to time.Time from XML
    func (t *DateTime) UnmarshalXML(d *xml.Decoder, start xml.StartElement) error {
        var el string
        if err := d.DecodeElement(&el, &start); err != nil {
            return err
        }
        if el == "null" {
            return nil
        }
        dt, err := parseTime(el)
        if err != nil {
            return err
        }
        *t = DateTime(dt)
        return nil
    }

    // parseTime tries every supported layout and returns the first match,
    // or the error from the last attempt if none matches.
    func parseTime(value string) (dt time.Time, err error) {
        // The JSON input still carries its surrounding quotes.
        value = strings.Trim(value, `"`)
        for _, l := range dateLayouts {
            if dt, err = time.Parse(l, value); err == nil {
                return
            }
        }
        return
    }

    // formatDate returns nil for a zero time, otherwise the formatted string.
    func formatDate(value DateTime) *string {
        dt := time.Time(value)
        if dt.IsZero() {
            return nil
        }
        formatted := dt.Format(dateMarshalFormat)
        return &formatted
    }
When you want to be precise about your JSON and XML data, custom marshaling and unmarshaling is the way to go.