0. Version history

What's new in versions 3.0 of Spatialytics ETL and 2.5 of GeoKettle?

Spatialytics is proud to announce the immediate release of Spatialytics ETL 3.0 and GeoKettle 2.5, respectively the Enterprise and Community Editions of its spatial ETL (Extract, Transform and Load) tool. Since the release of Spatialytics ETL 2.0 and GeoKettle 2.0 in July 2011, Spatialytics has put a tremendous amount of work into making its tool for operational and analytical geospatial data integration even more powerful, scalable, fast and aligned with the standards of the domain.

Here is the list of the main new features brought by this release:

  • Versions 2.0 of GeoKettle and Spatialytics ETL introduced read support for the OGC Sensor Observation Service. This new release adds a WFS Input step, which retrieves geospatial data directly from an OGC Web Feature Service.
  • The CSW (OGC Catalogue Service for the Web) Input and Output steps introduced in version 2.0 have also been enhanced. Support for reading metadata (Dublin Core and ISO) from Deegree and MDWeb CSW has been added. The CSW Output step now also supports transactions (metadata insertion, update and deletion) and has been successfully tested with ISO metadata in Deegree and GeoNetwork. SDI administrators can now easily and automatically maintain their infrastructure catalogue service.
  • A WPS (OGC Web Processing Service) Client step has also been added. It enables the invocation of remote geoprocesses exposed as OGC Web Processing Services and hence allows the ETL to benefit from new geoprocessing capabilities. It has been successfully tested with Deegree, GeoServer, PyWPS, Zoo WPS and 52North WPS implementations.
  • Spatialytics ETL users can also expose a transformation or a job as a true WPS service. Spatialytics ETL can then be seen as a geoprocessing service factory that, thanks to its graphical workbench (Spoon), makes the design and deployment of such services fast and easy. This unique feature will greatly help SDI administrators with the day-to-day maintenance of such complex services.
  • The OGR Input and Output steps have evolved significantly and now provide advanced and powerful support for the OGR capabilities:
    • They are now both based on the GDAL/OGR 1.9.1 codebase.
    • They now support reading and writing of datasources based on a connection string (Google Fusion Tables, OGC WFS, CouchDB, …). MS SQL Server 2008 is only supported in Spatialytics ETL.
    • Spatialytics ETL adds support for reading and writing ArcGIS 10 File Geodatabase directories.
    • Support for spatial (BBOX) and attribute filtering (OGR restricted where) is now available in the OGR Input step.
    • The OGR Output step supports the creation, override, update, append and delete modes when writing data.
    • A new “skip failures” option has been added to the OGR Input step. It prevents the reading of a data source from failing when errors are encountered.
    • Both the OGR Input and Output steps now support reading from or writing to multiple data sources or destinations. A batch of files or data sources can thus be processed in a single step, which enables advanced batch loading and writing.
    • Layer creation options are now fully taken into account in the OGR Output step.
    • A layer name can now be specified when reading an OGR data source, which makes it possible to extract data from a given layer in data sources that contain multiple layers.
    • Width and precision of data types are now retrieved from data sources that support this feature. Definition and use of partial date/time data types are also now supported.
  • The Table Output step has been enhanced. SQL reserved keywords and database schemas are now handled correctly. Strings containing single quotes are also properly escaped and can be committed to the DBMS without errors.
  • The SRS Transformation step now allows the sorting of the supported SRS list. It is also now possible to search for a specific SRS.
  • Some very convenient conversion functions have been added to the Calculator step:
    1. Convert a geometry feature into its WKT representation
    2. Build a geometry feature from a WKT string
    3. Convert a Point geometry into an X Y string
    4. Build a geometry from X,Y columns
    5. Convert a geometry feature into its GeoJSON representation
    6. Build a geometry feature from a GeoJSON string
    7. Convert a geometry feature into its WKB representation
    8. Build a geometry feature from a WKB value
    9. Convert a geometry feature into GML
    10. Build a geometry feature from a GML fragment
    11. Convert a geometry feature into KML
    12. Build a geometry feature from a KML fragment
  • New geometry predicates to test geometry types (GIS_IS_POINT, GIS_IS_LINESTRING, …) have been added and allow the filtering of spatial features in different steps.
  • A “split geometry collection” algorithm is now available in the Spatial Analysis step.
  • The User Defined Java Expression step now handles the geometry data type appropriately.
  • The JTS library has been upgraded to its latest stable version (1.13) in order to benefit from the performance enhancements and bug fixes of that release.
  • A more up-to-date version of the EPSG database is also provided in this release. The definition of custom projections, introduced in version 2.0, remains possible.
  • Although installing GeoKettle or Spatialytics ETL is not complicated, dedicated installers are now available and make the installation even easier. Windows users now have an executable (.exe) file that starts a graphical installation wizard guiding them through the whole installation process. Linux users can install from a Debian or RPM package. Finally, a disk image (.dmg file) is available for Macintosh users. The classic zip archive is also still available in order to install GeoKettle or Spatialytics ETL on other operating systems or on a headless remote server.
  • Spatialytics ETL on Mac OS X platforms now runs in 64-bit mode, whereas GeoKettle runs in 32-bit mode.
  • Modules add new features that are not provided by default in the tool. There are currently three modules that can complement the capabilities of Spatialytics ETL or GeoKettle. These modules can be found on Spatialytics.org and are included in Spatialytics ETL:
    • Cookbook (only available for Spatialytics ETL): Enables the automatic generation of documentation from the metadata and comments included in transformations and jobs.
    • Meta Geocoding: Offers some geocoding/reverse geocoding capabilities based on different geocoding engines.
  • Finally, a documentation center has been set up. It gathers in one place advanced documentation and resources on all the Spatialytics software, and it replaces the wiki that was introduced with version 2.0. At present, a complete version of the Spatialytics ETL documentation is available.
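
To make the Calculator conversions listed above more concrete, here is a minimal Python sketch of the WKT, X Y and GeoJSON round trips for a simple 2D point. It is an illustration only: the function names are hypothetical, the code uses neither GeoKettle nor JTS, and it handles points only.

```python
import json
import re

# Matches a 2D WKT point such as "POINT (2.35 48.85)".
_POINT_RE = re.compile(r"^\s*POINT\s*\(\s*(\S+)\s+(\S+)\s*\)\s*$", re.IGNORECASE)


def point_to_wkt(x, y):
    """Return the WKT representation of a 2D point."""
    return f"POINT ({x!r} {y!r})"


def wkt_to_point(wkt):
    """Parse a 2D WKT point and return its (x, y) coordinates."""
    match = _POINT_RE.match(wkt)
    if match is None:
        raise ValueError(f"not a 2D WKT point: {wkt!r}")
    return float(match.group(1)), float(match.group(2))


def point_to_xy_string(x, y):
    """Return the point as an 'X Y' string."""
    return f"{x!r} {y!r}"


def point_to_geojson(x, y):
    """Return the GeoJSON representation of a 2D point."""
    return json.dumps({"type": "Point", "coordinates": [x, y]})


def geojson_to_point(text):
    """Parse a GeoJSON Point and return its (x, y) coordinates."""
    obj = json.loads(text)
    if obj.get("type") != "Point":
        raise ValueError("only GeoJSON Point geometries are handled here")
    x, y = obj["coordinates"][:2]
    return float(x), float(y)
```

Each pair of functions is the inverse of the other, so a point converted to WKT or GeoJSON and parsed back yields the original coordinates.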

In addition to all these new features and capabilities, various bug fixes and performance enhancements have also been made. See the revision log for further details.

What is new in version 2.0 of Spatialytics ETL and GeoKettle?

Introduction

It has been a while since the last release of GeoKettle. The last version was 3.2.0-20090609, a number indicating that it was released on June 9, 2009. Since then, GeoKettle has evolved from a research prototype into a full-fledged and professionally supported spatial ETL tool. The year 2010 saw the creation of Spatialytics, the company that is now behind GeoKettle. The development activities that led to version 2.0 began in October 2010 and lasted until now, i.e. July 2011. Version 2.0 of GeoKettle is thus the result of a significant amount of work: many new and powerful features have been added, some bugs have been fixed, and performance and robustness have been enhanced.

About GeoKettle version numbering

Even though the previous version of GeoKettle is numbered 3.2.0-20090609, the newest release is 2.0. This is because 3.2.0 referred to the version of Kettle on which GeoKettle was based. The current version is an important milestone for the project, as it provides a significant number of new features together with better performance and robustness. The previous numbering system could not reflect this, which is why it has been decided to change the version numbering and to name the new version 2.0. The new number better emphasizes the amount of work that went into this version.

However, it is important to note that versions 2.x will be the last versions of GeoKettle based on the Kettle 3.2 code base. Thanks to the tremendous work of the Kettle developers, future versions of GeoKettle will plug into Kettle more easily and will no longer be a friendly, spatially enabled fork of Kettle. Hence, it will be possible to add the spatial extensions provided by GeoKettle to any Kettle/PDI 4.x installation.

What is new in version 2.0?

Creation of a GUI-based installer for GeoKettle

In addition to the classic zip archive containing the binary distribution of GeoKettle, which you can download from the download page of the project, a multiplatform installer with a graphical user interface is now available. It is the easiest way to install GeoKettle on your computer.

Support for 64-bit Windows platforms

While it ran perfectly on 32-bit and 64-bit Linux, the previous release of GeoKettle required a 32-bit JRE to be installed in order to work on Windows. With version 2.0, you can now run GeoKettle on Windows with a 64-bit JRE. GeoKettle has been extensively tested and is known to work very well on the Sun/Oracle JRE. There is no guarantee that it will work on other JREs.

Addition of a spatial analysis step

It was already possible to access the JTS objects contained in Geometry fields in the “Modified Java Script Value” step. This made it possible to use spatial analysis functions such as buffer calculations, overlays, metric operators, etc. With version 2.0, a dedicated step named “Spatial Analysis” has been contributed. It makes access to spatial analysis functions easier. Users can now perform buffer, intersection, union, centroid, … computations through a GUI without typing a single line of script.

Addition of a geospatial data preview

It was already possible to preview rows processed during a transformation as a table in order to check if the transformation does the right job. With version 2.0 comes a geospatial preview. It is thus possible to access a cartographic view of geospatial data processed by a transformation. Select a step in the transformation, right click and select Preview in the popup menu. Select the geographic view tab.

You can display several layers (each layer corresponds to a geometry column in the data flow), modify colors, opacity and width of the symbols. You can also pan, zoom in/out and access attributes of a specific object.

Support of OGR

One of the main new features of GeoKettle is the support of OGR. GeoKettle is thus now able to read and write all the geospatial (vector) file formats supported by OGR, such as MapInfo TAB and MIF/MID, DGN, GML 2, GPX, DXF, KML 2, GeoConcept, SpatiaLite/SQLite, GeoRSS, … For a complete list of formats, please visit the OGR supported formats page. Also note that, as GeoKettle relies on OGR for reading and writing these data formats, the same limitations apply.

When writing data with the OGR Output step, you can pass some OGR-specific options. Please read the OGR documentation for further details on each format.

OGR support is available not only in Spoon (the GUI-based job and transformation designer) but also in Kitchen and Pan, the two command line tools that come with GeoKettle and make it possible to run jobs and transformations in batch mode (i.e. by calling the GeoKettle engine directly, without the graphical interface).

Read/write support for KML 2.2 file format

In addition to the support of OGR data formats, GeoKettle 2.0 adds read/write support for the KML 2.2 file format. This enables the export of geospatial data from GeoKettle and their display in applications like Google Earth.

Read/write support for GML 3.1.1 file format

GeoKettle 2.0 also provides read/write support for the GML 3.1.1 file format. Together with the support of GML 2 provided by OGR, it enables, for instance, the processing of data returned by an OGC WFS request. This addition opens the door to the automation of numerous tasks in the management of a Spatial Data Infrastructure (SDI).

Support for character encoding when writing/reading GIS file formats

When reading or writing data in the Shapefile, KML 2.2 and GML 3.1.1 file formats, you are now able to choose the character encoding in which the data are read or written. This is particularly useful when you have to integrate data coming from different countries that use different encodings.
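
As a small illustration of why the encoding setting matters, the following Python sketch decodes the same attribute value with the right and the wrong encoding. It is independent of GeoKettle; the helper names are hypothetical and the sample value is made up for the example.

```python
def read_attribute(raw: bytes, encoding: str) -> str:
    """Decode a raw attribute value using the encoding declared for the file."""
    return raw.decode(encoding)


def reencode(raw: bytes, source_encoding: str, target_encoding: str = "utf-8") -> bytes:
    """Convert raw bytes from the source encoding to the target encoding."""
    return raw.decode(source_encoding).encode(target_encoding)


# A place name as it would be stored in a file written in ISO-8859-1.
latin1_bytes = "Québec".encode("iso-8859-1")

# Declaring the correct encoding recovers the original text.
correct = read_attribute(latin1_bytes, "iso-8859-1")

# UTF-8 bytes read as ISO-8859-1 decode without error but yield garbled text.
garbled = read_attribute("Québec".encode("utf-8"), "iso-8859-1")
```

Here `correct` holds the original name, whereas `garbled` holds mojibake, which is the kind of corruption that an explicit encoding setting on the input and output steps is meant to avoid.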

Support for batch loading/writing with GIS file formats

In the Shapefile, GML 3.1.1 and KML 2.2 input and output steps, it is now possible to specify the name of the file to read from or write to in a column of the input flow. You can also use variables to define this path. These capabilities open the door to the automatic processing of a batch of files in a single step of the transformation. This will be of particular interest to SDI administrators who need, for instance, to load a large number of files on a frequent basis.

Extraction of data stemming from an OGC Sensor Observation Service (SOS)

With GeoKettle 2.0, you can now easily access sensor data exposed via an OGC SOS (Sensor Observation Service) web service, such as the open source implementation provided by 52 North. You can then mix these data with other heterogeneous data sources (geospatial or not) and enrich your analyses with live or historic data captured in the field.

You can access an OGC SOS via GET or POST method and specify different request parameters such as the time period bounds for which you want to retrieve measures captured by a specific sensor.
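
For readers unfamiliar with SOS requests, here is a hedged Python sketch of what a GET request with time period bounds can look like. The parameter names follow the key-value-pair binding of SOS 2.0, which is an assumption: the request actually built by the step, and the parameters accepted by a given server, may differ. The function name, endpoint, offering and observed property are hypothetical.

```python
from urllib.parse import urlencode


def build_get_observation_url(endpoint, offering, observed_property, begin, end):
    """Build a GetObservation URL restricted to the period [begin, end].

    begin and end are ISO 8601 timestamps, e.g. "2011-07-01T00:00:00Z".
    """
    params = {
        "service": "SOS",
        "version": "2.0.0",
        "request": "GetObservation",
        "offering": offering,
        "observedProperty": observed_property,
        # Time period bounds, expressed as "<value reference>,<begin>/<end>".
        "temporalFilter": f"om:phenomenonTime,{begin}/{end}",
    }
    return f"{endpoint}?{urlencode(params)}"


# Hypothetical endpoint and identifiers, for illustration only.
url = build_get_observation_url(
    "http://example.org/sos",
    "temperature_offering",
    "air_temperature",
    "2011-07-01T00:00:00Z",
    "2011-07-02T00:00:00Z",
)
```

With the POST method, the same parameters would instead be carried in an XML request body sent to the service endpoint.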

Read/write support for OGC Catalog Service (CSW)

GeoKettle 2.0 makes it possible to automate both the harvesting and the feeding of an OGC Catalog Service (CSW). Metadata collected from a CSW can thus be processed like any other data in a transformation. They can also be used to drive the actions performed in a transformation.

The CSW Output step automatically retrieves some metadata about the data produced or processed by the ETL tool and publishes them in a catalog service, such as the open source implementations provided by the GeoNetwork and MDWeb projects.

The dedicated steps in GeoKettle support different metadata formats and encodings, such as the ones specified by ISO 19115/19139. This makes these steps of particular interest to people involved in the setup and/or management of a Spatial Data Infrastructure (SDI), especially in the context of the European INSPIRE directive.

Axes inversion bug when performing changes in SRS fixed

In previous versions of GeoKettle, the SRS Transformation step sometimes inverted the coordinate axes when changing the Spatial Reference System (SRS). This was due to a change made by the OGC in the definition of some SRS. This bug has been fixed in version 2.0 of your favorite spatial ETL tool.

Better support of SQL output for PostGIS

SQL output generated through the SQL File Output step was buggy with PostGIS in the previous release: geometry columns were not properly created. This has been fixed in version 2.0, and you can now automate the production of SQL dumps that fully support the Geometry data type provided by PostGIS.

SRIDs with PostGIS are now correctly supported

In the same way, because recent versions of PostGIS enforce the SRID constraint, inserting geospatial data into this spatial DBMS resulted in an error in previous versions of GeoKettle. SRID handling for PostGIS is now fully supported in GeoKettle 2.0. Both row insertion and SQL output convey the SRID of the geospatial data.
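
To illustrate what conveying the SRID means in practice, the Python sketch below attaches an SRID to a WKT geometry using the EWKT notation understood by PostGIS and prepares a parameterized insert that relies on the PostGIS function ST_GeomFromEWKT. This is not the SQL that GeoKettle generates; the helper names are hypothetical, and the `%s` placeholder style is an assumption that depends on the database driver used.

```python
def to_ewkt(wkt: str, srid: int) -> str:
    """Prefix a WKT geometry with its SRID, producing PostGIS EWKT."""
    if not isinstance(srid, int) or srid <= 0:
        raise ValueError(f"invalid SRID: {srid!r}")
    return f"SRID={srid};{wkt}"


def insert_statement(table: str, geom_column: str) -> str:
    """Return a parameterized INSERT whose single parameter is an EWKT string.

    The table and column names are inserted as-is and must come from a
    trusted source; only the geometry value is passed as a parameter.
    """
    return f"INSERT INTO {table} ({geom_column}) VALUES (ST_GeomFromEWKT(%s))"


# Hypothetical table and column names, for illustration only.
sql = insert_statement("places", "geom")
value = to_ewkt("POINT(2.35 48.85)", 4326)
```

A geometry column constrained to SRID 4326 accepts this value, whereas the same WKT inserted without an SRID would violate the constraint.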

Addition of geospatial data aggregation capabilities

In addition to the usual aggregation operators (sum, average, minimum, maximum, …) available in the Group by step, some spatial aggregation capabilities have been added. It is now possible to group data by different field values and to produce a geometry that is the union, the geometry collection or the envelope of all the grouped geometric features.
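
The idea of spatial aggregation can be sketched in a few lines of Python. The example below groups point features by a key and computes the envelope (bounding box) of each group. It is a simplified illustration, not the Group by step itself: the function name is hypothetical, only points are handled, and the sample rows are made up.

```python
from collections import defaultdict


def envelope_by_group(rows):
    """Group (key, (x, y)) rows by key and return the envelope of each group.

    The envelope is returned as (min_x, min_y, max_x, max_y).
    """
    groups = defaultdict(list)
    for key, point in rows:
        groups[key].append(point)

    envelopes = {}
    for key, points in groups.items():
        xs = [x for x, _ in points]
        ys = [y for _, y in points]
        envelopes[key] = (min(xs), min(ys), max(xs), max(ys))
    return envelopes


# Made-up sample rows: a grouping field and a point geometry.
rows = [
    ("north", (1.0, 5.0)),
    ("north", (3.0, 7.0)),
    ("south", (2.0, -4.0)),
    ("north", (2.0, 6.0)),
]
envelopes = envelope_by_group(rows)
```

The union and geometry collection operators follow the same grouping logic and differ only in how the geometries of a group are combined.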

Geometric computation capabilities added in the Calculator step

In version 2.0, the Calculator step has been extended with spatial computation capabilities. You can now perform the usual spatial operations, such as computing:

  • Buffers
  • Centroid
  • Random point on surface
  • Area
  • Length
  • Distance
  • Intersection
  • Union
  • Envelope
  • Boundary
  • Convex hull
  • Difference
  • Symmetric difference
  • Inverse geometry
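
Three of the operations listed above (area, length and centroid) can be illustrated for a simple polygon with the shoelace formula. The Python sketch below is an illustration only and not the implementation used by the Calculator step; the function names are hypothetical, and the ring is assumed to be closed (first vertex repeated at the end), planar and without holes.

```python
import math


def _signed_area(ring):
    """Signed area of a closed ring (positive if counter-clockwise)."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        total += x1 * y2 - x2 * y1
    return total / 2.0


def ring_area(ring):
    """Area enclosed by a closed ring."""
    return abs(_signed_area(ring))


def ring_length(ring):
    """Length of the ring, i.e. the perimeter of the polygon."""
    return sum(math.hypot(x2 - x1, y2 - y1)
               for (x1, y1), (x2, y2) in zip(ring, ring[1:]))


def ring_centroid(ring):
    """Centroid of the polygon delimited by a closed ring."""
    area = _signed_area(ring)
    if area == 0.0:
        raise ValueError("degenerate ring: the area is zero")
    cx = cy = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        cross = x1 * y2 - x2 * y1
        cx += (x1 + x2) * cross
        cy += (y1 + y2) * cross
    return cx / (6.0 * area), cy / (6.0 * area)


# A 4 x 3 rectangle with its first vertex repeated to close the ring.
rectangle = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0), (0.0, 0.0)]
```

For this rectangle the functions return an area of 12, a length of 14 and a centroid at (2, 1.5).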

Support of JTS 1.12

GeoKettle 2.0 supports JTS 1.12 and hence benefits from the latest enhancements brought by this powerful library. It is, for instance, possible to compute single-sided buffers with different caps and joins. Whenever possible, computation-intensive tasks involving geometric features rely on the PreparedGeometry provided by JTS in order to drastically shorten the computation time.

Internationalisation

Thanks to the internationalization mechanism provided by the Kettle developers, it is easy to provide the GeoKettle interfaces in different languages. By default, GeoKettle is fully available in English and French. Only parts of the tool are available in other languages.

Obviously, we welcome any translations in different languages.

Various bug fixes and performance enhancements

See the change log for further details about bugs that have been fixed and performance enhancements performed in this release.

en/spatialytics_etl/000_versionhistory.txt · Last modified: 2015/04/20 08:11 by lvaillancourt
