Connecting Apache Superset

This article walks through connecting Apache Superset to the analytical database. As a worked example, we also build a Superset dashboard on that database that monitors the real-time status of vehicles.

This guide is part of the DataHub documentation suite and specifically covers connecting Apache Superset to your data warehouse. If you're still deciding which BI tool to use, refer to the Selecting BI tools overview.

Dashboard features

  • Display total number of objects

  • Visualize vehicle movement statuses (moving/stopped/parked)

  • Visualize connection statuses (active/idle/offline)

  • Detailed table with current status of all vehicles

  • Filtering by vehicle type, group, movement status, and connection status

  • Data and report export capabilities

  • Customizable notifications and alerts

Technical requirements

  • Docker and Docker Compose

  • Minimum 4 GB RAM (8 GB recommended)

  • 20 GB of free disk space

  • Linux/Windows with WSL2/macOS

  • Python 3.8+

  • Internet access for database connection

Installation and setup

1. Installation with Docker Compose

  1. Install Docker and Docker Compose by following the official documentation:

  2. Download the official docker-compose file:

  3. Start Superset:

  4. Create an administrator:

  5. Initialize the database:

  6. Load examples and initialize roles:

2. Installation with pip (for development)

  1. Create a virtual environment:

  2. Install Superset:

  3. Initialize the database:

  4. Create an administrator:

  5. Load examples and initialize roles:

  6. Start Superset:
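The pip steps above map onto a short sequence of Superset CLI calls. The following is a minimal sketch, not the exact commands from this guide: it only prints the sequence and runs nothing. The `venv` directory name, the `venv/bin/` paths (Linux/macOS layout), and port 8088 are assumptions chosen for illustration. Depending on the Superset version, you may also need to set `FLASK_APP=superset` and configure a `SECRET_KEY` before the `superset` commands will run.

```python
import shlex

# Assumed layout: a virtual environment in ./venv on Linux or macOS.
# Each entry pairs a step from the list above with its usual command.
SETUP_STEPS = [
    ("Create a virtual environment", ["python3", "-m", "venv", "venv"]),
    ("Install Superset", ["venv/bin/pip", "install", "apache-superset"]),
    ("Initialize the database", ["venv/bin/superset", "db", "upgrade"]),
    ("Create an administrator", ["venv/bin/superset", "fab", "create-admin"]),
    ("Load examples", ["venv/bin/superset", "load_examples"]),
    ("Initialize roles", ["venv/bin/superset", "init"]),
    ("Start Superset", ["venv/bin/superset", "run", "-p", "8088", "--with-threads"]),
]


def render_steps(steps=SETUP_STEPS):
    """Return one shell-quoted command per step; nothing is executed."""
    return [f"# {title}\n{shlex.join(argv)}" for title, argv in steps]


if __name__ == "__main__":
    print("\n\n".join(render_steps()))
```

Printing the commands first makes it easy to review them against the Superset documentation for your version before running them by hand.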

Database connection

  1. Log in to Superset (default: http://localhost:8088)

  2. Navigate to Data → Databases

  3. Click "+" to add a new database

  4. Fill in the connection parameters:

    1. Database: PostgreSQL

    2. SQLAlchemy URI: postgresql://${DB_USER}:${DB_PASS}@${DB_HOST}:${DB_PORT}/${DB_NAME}

    3. Display Name: Analytics Database

    4. Extra: {"engine_params": {"connect_args": {"sslmode": "require"}}}

  5. Click Test Connection to verify the connection

  6. Save the settings
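Passwords that contain characters such as `@`, `/`, or `:` break the SQLAlchemy URI unless they are percent-encoded. The sketch below shows one way to assemble the URI and the Extra JSON from the `DB_*` values used above. The helper names `build_sqlalchemy_uri` and `build_extra` are hypothetical, and the fallback values in the usage section are placeholders, not real credentials.

```python
import json
import os
from urllib.parse import quote


def build_sqlalchemy_uri(user, password, host, port, name):
    """Assemble a PostgreSQL SQLAlchemy URI with percent-encoded credentials."""
    # safe="" also encodes "/" so it cannot be mistaken for a path separator.
    user_enc = quote(user, safe="")
    pass_enc = quote(password, safe="")
    return f"postgresql://{user_enc}:{pass_enc}@{host}:{port}/{name}"


def build_extra(sslmode="require"):
    """Return the JSON string for the Extra field, enforcing the given SSL mode."""
    return json.dumps({"engine_params": {"connect_args": {"sslmode": sslmode}}})


if __name__ == "__main__":
    # Placeholder defaults are used when the DB_* variables are not set.
    uri = build_sqlalchemy_uri(
        os.environ.get("DB_USER", "user"),
        os.environ.get("DB_PASS", "password"),
        os.environ.get("DB_HOST", "localhost"),
        os.environ.get("DB_PORT", "5432"),
        os.environ.get("DB_NAME", "analytics"),
    )
    print(uri)
    print(build_extra())
```

Paste the printed URI and JSON into the SQLAlchemy URI and Extra fields respectively.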

Connection parameter reference

| Lakehouse parameter | Apache Superset setting location | Notes |
| --- | --- | --- |
| Host | DB_HOST in SQLAlchemy URI | The database server address provided in your welcome email |
| Port | DB_PORT in SQLAlchemy URI | Default is 5432 for PostgreSQL |
| Database name | DB_NAME in SQLAlchemy URI | Your assigned database name |
| Username | DB_USER in SQLAlchemy URI | Your database username |
| Password | DB_PASS in SQLAlchemy URI | Your secure database password |
| SSL mode | connect_args in Extra parameters | Set to require in the Extra JSON configuration |
| Schema | Dataset configuration | Specify schema (raw_business_data or raw_telematics_data) in each dataset |

Dashboard and chart import

  1. Clone the bi-integrations repository:

  2. In Superset, go to Settings → Import/Export

  3. Import the files in the following order:

    1. datasets.json - datasets

    2. charts.json - charts

    3. dashboards.json - dashboards

  4. After importing, update the database connections in each dataset
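The import order matters because charts reference datasets and dashboards reference charts. As a small pre-flight check, the sketch below confirms that the three export files exist and contain valid JSON, and returns them in import order. The function name `files_in_import_order` and the assumption that all three files sit in one directory are illustrative; adjust the path to wherever the cloned repository keeps them.

```python
import json
from pathlib import Path

# Import order from the steps above: datasets, then charts, then dashboards.
IMPORT_ORDER = ["datasets.json", "charts.json", "dashboards.json"]


def files_in_import_order(export_dir):
    """Return the export files in import order, failing early on a bad file."""
    ordered = []
    for name in IMPORT_ORDER:
        path = Path(export_dir) / name
        if not path.is_file():
            raise FileNotFoundError(f"missing export file: {path}")
        # json.loads raises ValueError (JSONDecodeError) on malformed content.
        json.loads(path.read_text(encoding="utf-8"))
        ordered.append(path)
    return ordered
```

Calling `files_in_import_order(".")` from the directory that holds the files either returns the three paths in order or raises an error naming the first problem.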

Troubleshooting

Database connection issues

  • Connection error: Verify the credentials and connection parameters in the SQLAlchemy URI

  • Firewall error: Ensure your IP address is added to the allowlist

  • SSL issues: Check that the Extra field sets sslmode to require, as described in the connection parameters
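To tell a firewall problem apart from a credentials problem, it helps to check whether the database port is reachable at all before involving Superset. The following is a minimal sketch using only the Python standard library. The helper names `parse_target` and `can_reach` are hypothetical, and the check only confirms that a TCP connection opens; it does not validate credentials or SSL.

```python
import socket
from urllib.parse import urlsplit


def parse_target(uri, default_port=5432):
    """Extract (host, port) from a SQLAlchemy URI, defaulting to the PostgreSQL port."""
    parts = urlsplit(uri)
    return parts.hostname, parts.port or default_port


def can_reach(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False
```

For example, `can_reach(*parse_target(uri))` returning False from the machine that runs Superset suggests an allowlist or network issue rather than wrong credentials. If Superset runs in Docker, run the check inside the container, since its network view can differ from the host's.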

Performance issues

  • Slow visualization loading:

    • Optimize SQL queries

    • Reduce the number of simultaneously displayed elements

    • Use result caching

  • High memory usage:

    • Increase Docker container resources

    • Optimize database queries
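Result caching is configured in `superset_config.py`. The fragment below is a sketch that assumes a Redis server at `redis://localhost:6379`; the URL, key prefixes, and timeouts are illustrative values, not settings taken from this guide. `CACHE_CONFIG` and `DATA_CACHE_CONFIG` are Superset configuration keys that accept Flask-Caching options.

```python
# superset_config.py (fragment)
# Assumption: Redis is reachable at redis://localhost:6379; adjust as needed.

# Cache for Superset metadata.
CACHE_CONFIG = {
    "CACHE_TYPE": "RedisCache",
    "CACHE_DEFAULT_TIMEOUT": 300,  # seconds
    "CACHE_KEY_PREFIX": "superset_meta_",
    "CACHE_REDIS_URL": "redis://localhost:6379/0",
}

# Cache for chart query results; a separate key prefix keeps the entries apart.
DATA_CACHE_CONFIG = {
    **CACHE_CONFIG,
    "CACHE_DEFAULT_TIMEOUT": 600,  # seconds
    "CACHE_KEY_PREFIX": "superset_data_",
}
```

For a dashboard that shows real-time vehicle status, keep the timeouts short enough that cached results do not hide recent changes.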

Other issues

The following steps resolve many other common issues:

  1. Check Superset logs:

  2. Restart containers:

  3. Clear browser cache

  4. Check Superset version and update if necessary
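For a Docker Compose installation, the log and restart steps above usually come down to `docker compose logs` and `docker compose restart`. The sketch below builds those command lines and prints them without running anything. The service name `superset` is an assumption; check the service names in your compose file, and run the commands from the directory that contains it.

```python
import shlex


def logs_command(service="superset", tail=200):
    """Build a command that shows the last `tail` log lines of one service."""
    return ["docker", "compose", "logs", "--tail", str(tail), service]


def restart_command(service=None):
    """Build a restart command for one service, or for all services if None."""
    cmd = ["docker", "compose", "restart"]
    return cmd + [service] if service else cmd


if __name__ == "__main__":
    # Print the commands for review; pass them to subprocess.run to execute.
    print(shlex.join(logs_command()))
    print(shlex.join(restart_command()))
```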

Next steps

After successfully connecting Apache Superset to your DataHub instance, we recommend that you:

  • Explore the available data schemas by reviewing the Schema overview section to better understand the data structure and relationships.

  • Start with simple queries focused on specific business entities before building complex dashboards. See our example queries for reference.

Support

For technical questions or requests for access to the demonstration database, please contact: [email protected]
