Thursday, October 11, 2012

Grassroots Groovy: Parse XML with XmlSlurper from Java

We can introduce Groovy into our Java projects at a grassroots level. Even if we aren't allowed to run the Groovy compiler, there are other ways to run Groovy code. As long as we can include the Groovy libraries as a compile dependency, we can use Groovy from Java. In this post we see how to use the power of XmlSlurper to parse XML from our Java code.

To execute a Groovy script from Java we can create a GroovyShell object and invoke the evaluate() method. The evaluate() method accepts a Groovy script as a File or Reader object, or as a String value. The value of the last statement in the evaluated script is returned, so we can assign it to a Java variable. To pass variables to the script we use a Binding object, which is a map of variable names and their values. We assign the values in the Java code and use the variables in the Groovy script.

In the example we need to be able to parse an XML document with the following structure and find users with a given age:

<?xml version="1.0"?>
<users>
    <user age="39">mrhaki</user>
    <user age="39">hubert</user>
    <user age="23">chris</user>
</users>

We first define a simple Java interface with a method getUsersWithAge(int):

package com.mrhaki.groovy.grassroots.xml;

import com.mrhaki.groovy.grassroots.model.User;

import java.util.List;

public interface DataExtractor {
    /**
     * Get list of User objects parsed from
     * an input source like XML.
     *
     * @param age Age of users to look for.
     * @return List of found users.
     */
    List<User> getUsersWithAge(final int age);
}

We have the following User class:

package com.mrhaki.groovy.grassroots.model;

public class User {

    private final String username;
    private final int age;

    public User(final String username, final int age) {
        this.username = username;
        this.age = age;
    }

    public String getUsername() {
        return username;
    }

    public int getAge() {
        return age;
    }
}

The following class is an implementation of the DataExtractor interface and uses Groovy code to find all users from an XML source with a given age:

package com.mrhaki.groovy.grassroots.xml;

import com.mrhaki.groovy.grassroots.model.User;
import groovy.lang.Binding;
import groovy.lang.GroovyShell;

import java.util.List;

/**
 * Use XML input source to extract data from.
 */
public class XMLDataExtractor implements DataExtractor {
    /**
     * XML input source String.
     */
    private final String xml;

    public XMLDataExtractor(final String xml) {
        this.xml = xml;
    }

    @Override
    public List<User> getUsersWithAge(final int age) {
        // First we create the binding with variables
        // to be used in the Groovy script.
        final Binding binding = new Binding();
        binding.setVariable("xml", xml);
        binding.setVariable("age", age);

        // Create Groovy shell to run script and
        // set binding with variables for the script.
        final GroovyShell shell = new GroovyShell(getClass().getClassLoader(), binding);

        // Create Groovy script as String.
        // We use the XmlSlurper to parse the XML String and
        // return a list of found objects.
        final ScriptBuilder parseScript = new ScriptBuilder();
        parseScript.addLine("import com.mrhaki.groovy.grassroots.model.User");
        parseScript.addLine("def slurper = new XmlSlurper().parseText(xml)");
        parseScript.addLine("slurper.user.findAll { it.@age == age }.collect { new User(it.text(), it.@age.toInteger()) } ");

        // Evaluate script. The last line of the script is
        // automatically the return statement, so the value
        // is assigned to result.
        // We could also assign it to a variable in the script and
        // use binding.getVariable() to get the value.
        final Object result = shell.evaluate(parseScript.build());

        return (List<User>) result;
    }

    /**
     * Utility builder class to create a Groovy script
     * as String with the correct line endings.
     */
    private final class ScriptBuilder {
        private StringBuilder script = new StringBuilder();

        public ScriptBuilder addLine(final String scriptLine) {
            script.append(scriptLine);
            script.append(newLine());
            return this;
        }

        private String newLine() {
            return System.getProperty("line.separator");
        }

        public String build() {
            return script.toString();
        }
    }

}
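
The import line in the script above is written as a hard-coded String, so a renamed or moved class only shows up as an error when the script runs. A minimal sketch of one way around that, assuming we add an addImport method to the builder (this method is hypothetical and not part of the class above), derives the import statement from the Class object itself. The class SimpleDateFormat is used purely as a stand-in for User to keep the sketch self-contained:

```java
import java.text.SimpleDateFormat;

class ScriptBuilderSketch {

    /** Builds a Groovy script as a String, one line at a time. */
    static final class ScriptBuilder {
        private final StringBuilder script = new StringBuilder();

        ScriptBuilder addLine(final String scriptLine) {
            script.append(scriptLine).append(System.lineSeparator());
            return this;
        }

        // Hypothetical addition: derive the import from the class,
        // so the Java compiler checks that the class exists.
        ScriptBuilder addImport(final Class<?> type) {
            return addLine("import " + type.getCanonicalName());
        }

        String build() {
            return script.toString();
        }
    }

    // SimpleDateFormat stands in for the User class of the post.
    static String sampleScript() {
        return new ScriptBuilder()
                .addImport(SimpleDateFormat.class)
                .addLine("new SimpleDateFormat('yyyy').format(new Date())")
                .build();
    }

    public static void main(String[] args) {
        // Prints the generated two-line Groovy script.
        System.out.print(sampleScript());
    }
}
```

In XMLDataExtractor the call would then read parseScript.addImport(User.class) instead of the addLine call with the fully qualified name.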

Ronan gave some good comments on this version of the XMLDataExtractor class. We can reduce the Groovy code to be executed to a single line: the import for the User class can be added with an ImportCustomizer, and in the binding we can directly bind the result of the new XmlSlurper().parseText(xml) method to the slurper variable. Here is another version of the XMLDataExtractor class:

package com.mrhaki.groovy.grassroots.xml;

import com.mrhaki.groovy.grassroots.model.User;
import groovy.lang.Binding;
import groovy.lang.GroovyShell;
import groovy.util.XmlSlurper;
import org.codehaus.groovy.control.CompilerConfiguration;
import org.codehaus.groovy.control.customizers.ImportCustomizer;
import org.xml.sax.SAXException;

import javax.xml.parsers.ParserConfigurationException;
import java.io.IOException;
import java.util.List;

/**
 * Use XML input source to extract data from.
 */
public class XMLDataExtractor implements DataExtractor {
    /**
     * XML input source String.
     */
    private final String xml;

    public XMLDataExtractor(final String xml) {
        this.xml = xml;
    }

    @Override
    public List<User> getUsersWithAge(final int age) {
        // First we create the binding with variables
        // to be used in the Groovy script.
        final Binding binding = createBinding(age);

        // Create Groovy shell to run script and
        // set binding with variables for the script.
        final GroovyShell shell = createGroovyShell(binding);

        // Evaluate script. The single line of the script is
        // automatically the return statement, so the value
        // is assigned to result.
        final Object result = shell.evaluate("slurper.user.findAll { it.@age == age }.collect { new User(it.text(), it.@age.toInteger()) } ");

        return (List<User>) result;
    }

    private GroovyShell createGroovyShell(final Binding binding) {
        // Add import for User class to script compiler configuration.
        final ImportCustomizer imports = new ImportCustomizer();
        imports.addImports("com.mrhaki.groovy.grassroots.model.User");
        final CompilerConfiguration config = new CompilerConfiguration();
        config.addCompilationCustomizers(imports);

        return new GroovyShell(getClass().getClassLoader(), binding, config);
    }

    private Binding createBinding(final int age) {
        final Binding binding = new Binding();
        try {
            binding.setVariable("slurper", new XmlSlurper().parseText(xml));
        } catch (IOException | SAXException | ParserConfigurationException e) {
            // Multi-catch requires Java 7 or later.
            throw new RuntimeException(e);
        }
        binding.setVariable("age", age);
        return binding;
    }

}
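
Both versions of the class end with the unchecked cast (List<User>) result, which the compiler cannot verify: a script that returns something else only fails later, when an element is read. A minimal sketch of a stricter conversion follows, assuming a helper class named TypedResults (a hypothetical name, not part of the original code). A list of Strings stands in for the script result so the sketch runs without Groovy on the classpath:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class TypedResults {

    /**
     * Copies an untyped script result into a typed list, failing fast
     * when the result is not a List or holds an unexpected element.
     */
    static <T> List<T> toList(final Object result, final Class<T> elementType) {
        if (!(result instanceof List)) {
            throw new IllegalStateException("Script did not return a List: " + result);
        }
        final List<T> typed = new ArrayList<T>();
        for (final Object element : (List<?>) result) {
            // Class.cast throws ClassCastException for a wrong element type.
            typed.add(elementType.cast(element));
        }
        return typed;
    }

    public static void main(String[] args) {
        // Stand-in for the Object returned by shell.evaluate(...).
        final Object result = Arrays.asList("mrhaki", "hubert");
        final List<String> names = toList(result, String.class);
        System.out.println(names); // prints [mrhaki, hubert]
    }
}
```

With such a helper the last line of getUsersWithAge could become return TypedResults.toList(result, User.class), at the cost of copying the list once.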

To finish, here are some simple tests that check the results of our XMLDataExtractor object:

package com.mrhaki.groovy.grassroots.xml;

import static org.junit.Assert.assertEquals;

import com.mrhaki.groovy.grassroots.model.User;
import org.junit.Test;

import java.util.List;

public class XMLDataExtractorTest {

    @Test
    public void numberOfUsersWithGivenAgeMustReflectNumberOfUsersInXML() {
        final String sample = createSampleXml();
        final DataExtractor extractor = getExtractor(sample);

        final List<User> users = extractor.getUsersWithAge(39);

        assertEquals(2, users.size());
    }

    @Test
    public void foundUserMustHaveCorrectPropertyValues() {
        final String sample = createSampleXml();
        final DataExtractor extractor = getExtractor(sample);

        final List<User> users = extractor.getUsersWithAge(23);

        final User user = users.get(0);
        assertEquals("chris", user.getUsername());
        assertEquals(23, user.getAge());
    }

    private String createSampleXml() {
        final StringBuilder xml = new StringBuilder("<?xml version=\"1.0\"?>");
        xml.append("<users>");
        xml.append(createUserSampleXml("mrhaki", 39));
        xml.append(createUserSampleXml("hubert", 39));
        xml.append(createUserSampleXml("chris", 23));
        xml.append("</users>");
        return xml.toString();
    }

    private String createUserSampleXml(final String username, final int age) {
        return String.format("<user age=\"%2$s\">%1$s</user>", username, age);
    }

    protected DataExtractor getExtractor(final String sample) {
        return new XMLDataExtractor(sample);
    }

}
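
The createUserSampleXml helper in the test relies on positional format specifiers: %1$s always refers to the first argument and %2$s to the second, whatever order they appear in within the pattern. The following sketch isolates that one line to show what it produces; the class name SampleXmlSketch is only illustrative:

```java
class SampleXmlSketch {

    // Same format string as in the test class: the age (argument 2)
    // appears before the username (argument 1) in the output.
    static String userElement(final String username, final int age) {
        return String.format("<user age=\"%2$s\">%1$s</user>", username, age);
    }

    public static void main(String[] args) {
        System.out.println(userElement("chris", 23));
        // prints <user age="23">chris</user>
    }
}
```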

We have to add Groovy as a compile dependency to our Java project. The following Gradle build script defines a dependency on the Groovy XML module. Since Groovy 2 we don't have to include all of Groovy; we can pick the modules we need and add them to our project:

// File: build.gradle
apply {
    plugin 'java'
    plugin 'idea'
}

version = 1.0

repositories {
    mavenLocal()
    mavenCentral()
}

ext {
    groovyVersion = '2.0.+'
    junitVersion = '4.+'
}

dependencies {
    compile "org.codehaus.groovy:groovy-xml:$groovyVersion"
    testCompile "junit:junit:$junitVersion"
}

3 comments:

Ronan said...

1 -------
parseScript.addLine("import com.mrhaki.groovy.grassroots.model.User");

could be replaced by something like:

parseScript.addImport(User.class);

public ScriptBuilder addImport(Class c) {
    addLine("import " + c.getName() + ";");
    return this;
}

or something with your technique :

http://mrhaki.blogspot.fr/2011/06/groovy-goodness-add-imports.html

2 -------
And why bind the xml string rather than the slurper directly?

binding.setVariable("slurper", new XmlSlurper().parseText(xml));

to have just one groovy line

parseScript.addLine("slurper.user.findAll { it.@age == age }.collect { new User(it.text(), it.@age.toInteger()) } ");

Hubert Klein Ikkink said...

@Ronan: Thank you very much for your comments. I have added a new version of the XMLDataExtractor class based on your remarks. I think it is even cleaner with this single line evaluation!
When binding the slurper directly we must handle the exceptions in our Java code, but this is not really a problem.

Cadu Fernandes said...

I'm using this script but the error message is:

unable to resolve class br.com.qreservas.engine.wrappers.hostel
at line: 1, column: 1

unable to resolve class HostelObjectWrapper
at line: 31, column: 15
