Getting Started Using the MongoDB connector for BI

The new features of MongoDB 3.2 include the MongoDB connector for BI, a component allowing the use of industry-standard SQL-based BI and analytics platforms to represent JSON-formated information stored by MongoDB in a visual way alongside other data, structured or not.

This connector uses PostgreSQL to provide a relational view to MongoDB documents, acting as a bridge between the BI tool and the MongoDB server: it translates the SQL sentences into equivalent MongoBD queries and sends them to the server for processing, transforming the returned results into the tabular format that the BI tool expects.

Installing the connector and using it is very easy, as the following example illustrates:

On the one hand we have installed the MongoDB server in a machine called mongobi, and we will be using the bi database in this server. On the other hand, we will need to have Tableu and the PostreSQL driver for Tableu installed.

We will be consuming information about service stations in Spain, which we have loaded into a collection named estaciones in the bi database, including data such as its geographic location, opening hours, and the price for each of the various types of fuel sold in each gas station. Documents in this collection look like this:

documento

This data is being published on a daily basis  by the Spanish Ministry of Industry, Energy and Tourism on its online site, and can be downloaded from this URL. You may want to edit the data before importing it into MongoDB, in order to prevent any formatting problems, like accent marks, dots, etc.

Our objective is to build a Tableau dashboard based on this information, showing the average, minimum or maximum price of each kind of gasoline by province, something more or less like this:

tableau - dashboard

Installation

The first step is installing the MongoDB connector for BI. The connector is available as part of the MongoDB Enterprise Advanced subscription, but it can be downloaded for evaluation purposes from the MongoDB Download Center. It is currently available for RHEL 6 and RHEL 7 only, but packages for Ubuntu and Debian are coming soon. Once the download has been completed, we just need to unzip the tarball and install the RPM packages contained in it. In this example, I’ve used the Centos 7 release:

mongobi_tar

Next we need to create a user for the connector using mongobiuser. You must specify the user name and the connection to the mongo server, including the hostname, the port, and the database where the desired data is located. In our case, the user is called usuariobi, and we will use the bi database in the server running in mongobi:

mongobi_create

Mapping document data to a relational schema

The connector needs a file that maps the MongoDB document structure to a relational schema using the Document Relational Definition Language (DRDL). In order to automatically generate the file we use the mongodrdl command. This command needs to know the database name and, optionally, the collection name and the output file, which in this case is bi_estaciones.drdl.
mongodrdl

The drdl file generated can be downloaded here. In the following, an excerpt with the most relevant information is shown:

estaciones_schema

estaciones_precios_schema

For each field in the document a column will be created in the appropiate PostgreSQL table. Fields with type string will become columns with type varchar. A varchar is also used to represent the _id field in the relational model and fields with date type will be treated as the timestamp type.

In order to represent the array of prices, the connector will use 2 tables: one without the array contents and the other where for each element in the array a row is generated, duplicating the rest of the fields in the document.

In addition, each element in the array is a subdocument that the connector represents as simple fields that have a dot as separator character, making them appear similar to the way you would reference them using dot notation in a MongoDB query.

The file can be edited to make any necessary changes, such as removing or renaming fields, etc.

The last step is to import the file with the scheme in the connector, using mongobischema. The import command needs the connector user name and the drdl file where the scheme is stored.

mongobi_import

Creating our dashboard

Our data is now ready to use in Tableau. The database connection is established using the PostgreSQL connector, and the data inside the tool can be used just as if it were stored in relational tables. The name of the database to which we will connect is the same as the connector user, usuariobi.

postgres_connection

tableau_datasource

In order to generate the dashboard shown before, we transformed the data in several ways from Tableu. For example, we changed the data type for the Price field to numeric and we converted it to continuous to use it as a measure. We gave a geographic role to the Province field to locate it in the map. We used a quick filter that allows us to select the type of product that we want to show in the map. Finally, we defined a calculated field to show the average, the minimum or the maximum at our choice.

The MongoDB connector makes it easy to visualize JSON-formatted data stored in MongoDB using relational Business Intelligence tools. You can find more information about the connector in the official MongoDB Documentation.

Beatriz Alonso Lopez

MongoDB Consultant & Oracle DBA

Leave a Reply