dotlake-community is a data ingestion pipeline for Polkadot-based blockchains that combines Substrate API Sidecar with a custom block ingest service.
It extracts and processes data from Polkadot-based networks through three key components:
- Substrate API Sidecar: REST service for blockchain data access
- Custom Block Ingest Service: Data processing and storage pipeline
- Apache Superset: Data visualization and analytics
Prerequisites:
- Docker and Docker Compose
- Access to a Substrate-based blockchain node (WSS endpoint)
- Sufficient storage space for blockchain data
To get started:
- Clone the repository:
git clone https://github.com/your-org/dotlake-community.git
cd dotlake-community
- Configure your settings in config.yaml:
relay_chain: Polkadot
chain: Polkadot
wss: wss://polkadot-rpc.dwellir.com
databases:
  - type: mysql
    host: xx.xx.xx.xx
    port: 3306
    name: dotlake_sidecar_poc
    user: *****
    password: ******
ingest_mode: live # live/historical
start_block: 1
end_block: 100
- Start the ingestion pipeline:
sh dotlakeIngest.sh
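To make the `ingest_mode`, `start_block`, and `end_block` settings more concrete, here is a minimal Python sketch of how a block range could be derived from them. The helper `blocks_to_ingest` is hypothetical and not part of dotlake-community. It assumes the range is inclusive and is only consulted in `historical` mode, neither of which is confirmed by the configuration above.

```python
# Hypothetical helper for illustration; not part of dotlake-community.
VALID_MODES = {"live", "historical"}


def blocks_to_ingest(config: dict) -> range:
    """Return the block numbers a historical run would cover.

    Assumption: start_block and end_block are inclusive and only
    matter when ingest_mode is "historical".
    """
    mode = config.get("ingest_mode")
    if mode not in VALID_MODES:
        raise ValueError(
            f"ingest_mode must be one of {sorted(VALID_MODES)}, got {mode!r}"
        )
    if mode == "live":
        # Live mode follows the chain head, so there is no fixed range.
        return range(0)
    start, end = int(config["start_block"]), int(config["end_block"])
    if start > end:
        raise ValueError("start_block must not exceed end_block")
    return range(start, end + 1)


config = {"ingest_mode": "historical", "start_block": 1, "end_block": 100}
blocks = blocks_to_ingest(config)
print(len(blocks), blocks[0], blocks[-1])  # prints: 100 1 100
```

Checking the mode and range before starting a long historical run surfaces typos in config.yaml early, rather than partway through ingestion.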
Substrate API Sidecar:
- Connects to blockchain node via WebSocket
- Exposes REST API on port 8080
- Provides standardized access to blockchain data
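As an illustration of the REST access described above, the following Python sketch reads a block from Sidecar's `/blocks/{blockId}` endpoint, where `head` selects the latest block. The host `localhost` is an assumption (only port 8080 is stated here), and the helper names `block_url` and `fetch_block` are hypothetical.

```python
import json
from urllib.request import urlopen

# Assumption: Sidecar is reachable on localhost; port 8080 is taken from the text above.
SIDECAR_URL = "http://localhost:8080"


def block_url(block_id="head", base_url=SIDECAR_URL) -> str:
    """Build the Sidecar URL for a block number, block hash, or "head"."""
    return f"{base_url.rstrip('/')}/blocks/{block_id}"


def fetch_block(block_id="head", base_url=SIDECAR_URL, timeout=10.0) -> dict:
    """Fetch one block as parsed JSON. Requires a running Sidecar instance."""
    with urlopen(block_url(block_id, base_url), timeout=timeout) as response:
        return json.load(response)


print(block_url(100))  # prints: http://localhost:8080/blocks/100
```

With the pipeline running, `fetch_block("head")` should return a dictionary with fields such as `number`, `hash`, and `extrinsics`. Sidecar encodes numeric values as strings, so convert them before doing arithmetic.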
The Custom Block Ingest Service processes blockchain data through multiple stages:
- Data extraction from Sidecar API
- Transformation and enrichment
- Storage in the chosen database:
  - MySQL
  - PostgreSQL
  - BigQuery
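The extract, transform, and store stages can be sketched in a few lines of Python. This is not the service's actual implementation: the sample block is a hand-written, simplified stand-in for a Sidecar `/blocks` response, the `extrinsics` table schema is hypothetical, and an in-memory SQLite database stands in for MySQL, PostgreSQL, or BigQuery so the example runs without external services.

```python
import sqlite3

# Hand-written, simplified sample in the shape of a Sidecar block response.
SAMPLE_BLOCK = {
    "number": "1",
    "extrinsics": [
        {"method": {"pallet": "timestamp", "method": "set"}, "success": True},
    ],
}


def transform(block: dict) -> list:
    """Flatten a block into (block_number, index, pallet, method, success) rows."""
    number = int(block["number"])  # Sidecar encodes numbers as strings
    return [
        (number, i, ext["method"]["pallet"], ext["method"]["method"], int(ext["success"]))
        for i, ext in enumerate(block["extrinsics"])
    ]


def load(conn: sqlite3.Connection, rows: list) -> None:
    """Store rows; the composite key keeps re-ingesting a block idempotent."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS extrinsics ("
        "block_number INTEGER, extrinsic_index INTEGER, "
        "pallet TEXT, method TEXT, success INTEGER, "
        "PRIMARY KEY (block_number, extrinsic_index))"
    )
    conn.executemany("INSERT OR REPLACE INTO extrinsics VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")  # stand-in for the configured database
load(conn, transform(SAMPLE_BLOCK))
print(conn.execute("SELECT * FROM extrinsics").fetchall())
# prints: [(1, 0, 'timestamp', 'set', 1)]
```

Keying rows on block number and extrinsic index means a block can be ingested twice, for example after a restart, without creating duplicates.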
Apache Superset:
- Custom visualization capabilities
- Direct connection to stored data
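Superset connects to databases through SQLAlchemy URIs, so one way to point it at the stored data is to build a URI from a database entry shaped like the one in config.yaml. The sketch below is an assumption-laden illustration: the helper `sqlalchemy_uri` is hypothetical, the `postgresql` type value is a guess (only `mysql` appears in the sample configuration), and the host, user, and password are made-up placeholders.

```python
from urllib.parse import quote

# Maps the config "type" value to a SQLAlchemy URI scheme.
# Assumption: only "mysql" is confirmed by the sample config.
SCHEMES = {"mysql": "mysql", "postgresql": "postgresql"}


def sqlalchemy_uri(db: dict) -> str:
    """Build a SQLAlchemy URI from one entry of the `databases` list."""
    try:
        scheme = SCHEMES[db["type"]]
    except KeyError:
        raise ValueError(f"unsupported database type: {db.get('type')!r}")
    # Percent-encode credentials so characters like "@" or "/" stay unambiguous.
    user = quote(str(db["user"]), safe="")
    password = quote(str(db["password"]), safe="")
    return f"{scheme}://{user}:{password}@{db['host']}:{db['port']}/{db['name']}"


example = {
    "type": "mysql",
    "host": "db.example.internal",  # placeholder host
    "port": 3306,
    "name": "dotlake_sidecar_poc",
    "user": "reader",               # placeholder credentials
    "password": "p@ss/word",
}
print(sqlalchemy_uri(example))
# prints: mysql://reader:p%40ss%2Fword@db.example.internal:3306/dotlake_sidecar_poc
```

A read-only database user is a sensible choice for the Superset connection, since dashboards only need to query the ingested tables.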
To contribute or modify:
- Fork the repository
- Create a feature branch
- Submit a pull request