Groovy: read CSV file
Groovy does not seem to provide a built-in CSV reader, but there are a number of third-party open source implementations.
In this case I am using GroovyCSV, whose source is on GitHub.
Read CSV file and add number in the 3rd column
This is the solution to the exercise in which we need to read the file
Tudor;Vidor;10;Hapci
Szundi;Morgo;7;Szende
Kuka;Hofeherke;100;Kiralyno
Boszorkany;Herceg;9;Meselo
and add the numbers in the 3rd column.
In another case we need to read this file:
Budapest,Bukarest,1200,km
Tel Aviv,Beirut,500,km
London,Dublin,300,km
New York,"Moscow, East",6000,km
Local,"Remote
Location",10,km
and again add up the numbers in the 3rd column.
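To see why a real CSV parser is worth the extra dependency, here is a minimal sketch that uses only plain Groovy string splitting. It is not part of the solution; it just shows that a naive split is enough for the first file but falls apart on the quoted field in the second one. The variable names are made up for this illustration.

```groovy
// Naive approach: split every line on the separator, with no quoting support.
def simple = '''Tudor;Vidor;10;Hapci
Szundi;Morgo;7;Szende
Kuka;Hofeherke;100;Kiralyno
Boszorkany;Herceg;9;Meselo'''

def sum = 0
simple.eachLine { line ->
    sum += line.split(';')[2] as Integer
}
println sum

// The same idea breaks once a quoted field contains the separator:
def quoted = 'New York,"Moscow, East",6000,km'
def fields = quoted.split(',')
println fields.size()   // 5 pieces instead of the expected 4
println fields[2]       // the second half of the quoted field, not the number
```

The multi-line field in the last row of the second file would confuse a line-by-line split even further, which is why the solutions below rely on a CSV library.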
The solution for the first file looks like this:
@Grab('com.xlson.groovycsv:groovycsv:1.3')
import static com.xlson.groovycsv.CsvParser.parseCsv
fh = new File('examples/data/process_csv_file.csv')
def csv_content = fh.getText('utf-8')
def data_iterator = parseCsv(csv_content, separator: ';', readFirstLine: true)
// println data_iterator.getClass() // class com.xlson.groovycsv.CsvIterator
def sum = 0
for (line in data_iterator) {
    sum += line[2] as Integer
}
println sum
The first line tells Groovy to use the Grape dependency management tool and install the package. This will happen the first time we run our code. In subsequent runs it will use the locally installed version of the module.
@Grab('com.xlson.groovycsv:groovycsv:1.3')
The second line tells Groovy to import the static parseCsv method of the CsvParser class from the already installed library.
import static com.xlson.groovycsv.CsvParser.parseCsv
Then we open the file using the File class and read in the whole content using the getText method.
fh = new File('examples/data/process_csv_file.csv')
def csv_content = fh.getText('utf-8')
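If you would like to try the File and getText step on its own, the following sketch writes a small temporary file and reads it back. The temporary file and its name are assumptions made for this illustration so that the snippet does not depend on the example data files.

```groovy
// Create a throwaway file so the sketch does not depend on the article's data files.
def tmp = File.createTempFile('sample', '.csv')   // hypothetical file name
tmp.deleteOnExit()
tmp.setText('Tudor;Vidor;10;Hapci\nSzundi;Morgo;7;Szende\n', 'utf-8')

// getText reads the entire file into a single String using the given encoding.
def csv_content = tmp.getText('utf-8')
println csv_content.readLines().size()
```

Reading the whole file into memory is fine for small files like the ones in this exercise.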
The parseCsv function expects the content of the CSV file as its first parameter, and it can also accept configuration options. In our case we set the separator to be ; as our file uses that instead of a plain comma (,). We also set readFirstLine to true as we want the first row to be treated as data and not as a header. The defaults are , and false respectively. There are some other parameters one can set.
The call to parseCsv returns an instance of com.xlson.groovycsv.CsvIterator. We can use that to iterate over the rows. In each iteration line holds the fields of the current row. We can use regular index notation to access index 2, which is the 3rd column as indexing is 0-based.
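For comparison, here is a sketch of what happens when readFirstLine is left at its default of false. The first row is then treated as a header, and GroovyCSV lets us access the fields by column name. The header row and the column names (from, to, distance, unit) are invented for this illustration; the files in the exercise have no header.

```groovy
@Grab('com.xlson.groovycsv:groovycsv:1.3')
import static com.xlson.groovycsv.CsvParser.parseCsv

// Hypothetical data: the header row is made up for this sketch.
def csv_content = '''from,to,distance,unit
Budapest,Bukarest,1200,km
Tel Aviv,Beirut,500,km
London,Dublin,300,km'''

// readFirstLine keeps its default (false), so the first row supplies the column names.
def sum = 0
for (line in parseCsv(csv_content)) {
    sum += line.distance as Integer
}
println sum
```

Accessing a column by name tends to be more readable than a numeric index, but it only works when the data has a header row.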
The solution for the second file looks like this:
@Grab('com.xlson.groovycsv:groovycsv:1.3')
import static com.xlson.groovycsv.CsvParser.parseCsv
fh = new File('examples/data/distance.csv')
def csv_content = fh.getText('utf-8')
def data_iterator = parseCsv(csv_content, readFirstLine: true)
def sum = 0
for (line in data_iterator) {
    sum += line[2] as Integer
}
println sum
The second file uses , as the separator, which is the default, so we don't need to set it explicitly. Other than that the two solutions are identical.
timestamp: 2018-06-02T11:30:01
tags:
- @Grab
- import
- File
- getText
- parseCSV
- separators
- readFirstLine